Volume 23, 2019
Page(s): 350 - 386
Published online: 28 June 2019
Statistical estimation of conditional Shannon entropy
Steklov Mathematical Institute of Russian Academy of Sciences
* Corresponding author: email@example.com
Accepted: 28 November 2018
New estimates of the conditional Shannon entropy are introduced in the framework of a model describing a discrete response variable that depends on a vector of d factors having a density w.r.t. the Lebesgue measure in ℝ^d. Namely, the mixed-pair model (X, Y) is considered, where X and Y take values in ℝ^d and in an arbitrary finite set, respectively. Such models include, for instance, the well-known logistic regression. In contrast to the well-known Kozachenko–Leonenko estimates of unconditional entropy, the proposed estimates are constructed by means of certain spatial order statistics (or k-nearest neighbor statistics, where k = k_n depends on the number of observations n) and a random number of i.i.d. observations contained in balls of specified random radii. Asymptotic unbiasedness and L^2-consistency of the new estimates are established under simple conditions. The obtained results can be applied to the feature selection problem, which is important, e.g., for medical and biological investigations.
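To make the mixed-pair setting concrete, the following is a minimal sketch of a naive k-nearest-neighbor plug-in estimate of the conditional entropy H(Y | X), where X is continuous and Y is discrete. It is not the estimator constructed in the paper, which is based on random numbers of observations in balls of random radii. The function name `knn_conditional_entropy`, the Euclidean metric, and the ad hoc choice of counting each point among its own neighbors (to avoid taking the logarithm of zero) are all assumptions made only for this illustration.

```python
import math


def knn_conditional_entropy(xs, ys, k):
    """Naive k-NN plug-in estimate of H(Y | X) in nats (illustrative only).

    xs: list of points in R^d, each given as a tuple of floats
    ys: list of discrete labels, same length as xs
    k:  number of nearest neighbours, with 1 <= k < len(xs)
    """
    n = len(xs)
    if len(ys) != n:
        raise ValueError("xs and ys must have the same length")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")

    total = 0.0
    for i in range(n):
        # Euclidean distances from point i to every other point (brute force).
        dists = sorted(
            (math.dist(xs[i], xs[j]), j) for j in range(n) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        same = sum(1 for j in neighbours if ys[j] == ys[i])
        # Assumption: count the point itself, so the estimated
        # conditional probability of its own label is never zero.
        p_hat = (same + 1) / (k + 1)
        total -= math.log(p_hat)

    # Sample average of -log P_hat(Y_i | X_i).
    return total / n
```

When the label is a deterministic function of well-separated clusters of X, this sketch returns 0, and when Y is independent of X the value should be close to the unconditional entropy of Y. The brute-force neighbor search costs O(n² log n) and is meant for small samples only.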
Mathematics Subject Classification: 60F25 / 62G20 / 62H12
Key words: Shannon entropy / conditional entropy estimates / asymptotic unbiasedness / L^2-consistency / logistic regression / Gaussian model
© EDP Sciences, SMAI 2019