Shannon Entropy

InformationTheory.ShannonEntropy.Hist — Type
Hist(; bins::Tuple = (-1,))

A method for calculating Shannon entropy using histograms.

Fields

  • bins::Tuple: A tuple specifying the binning strategy for the histogram. If (-1,), the binning is determined automatically by StatsBase.fit. Otherwise, a tuple of bin edges for each dimension should be provided.
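A hypothetical construction sketch, assuming Hist is exported from the package; the bin edges shown are arbitrary illustrative values:

```julia
using InformationTheory

# Default: bins = (-1,), so StatsBase.fit picks the edges automatically.
h_auto = Hist()

# Explicit binning for 2-D data: one range of bin edges per dimension.
h_2d = Hist(bins = (range(-3, 3, length = 31), range(-3, 3, length = 31)))
```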
InformationTheory.ShannonEntropy.KSG — Type
KSG(; k::Int = 5)

A method for calculating Shannon entropy using the Kraskov–Stögbauer–Grassberger (KSG) estimator, a k-nearest-neighbor approach built on the Kozachenko–Leonenko entropy estimator.

Fields

  • k::Int: The number of nearest neighbors to consider for each point.
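A hypothetical construction sketch, assuming KSG is exported from the package:

```julia
using InformationTheory

ksg = KSG()         # default: k = 5 nearest neighbors
ksg10 = KSG(k = 10) # larger k: lower variance but typically higher bias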
InformationTheory.ShannonEntropy.shannon — Method
shannon(method::Hist, x::AbstractVector...)

Calculates the Shannon entropy of a set of variables x using a histogram-based method.

Arguments

  • method::Hist: The histogram-based Shannon entropy calculation method.
  • x::AbstractVector...: One or more vectors representing the data for which to calculate the entropy. Each vector is a dimension of the data.

Returns

  • H::Float64: The calculated Shannon entropy.

Details

The function first fits a histogram to the data x and approximates the probability density function (PDF) from the bin counts. The Shannon entropy is then obtained by numerically integrating -p(x) * log(p(x)) over the histogram bins, where p(x) is the approximated PDF.
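A usage sketch, assuming shannon and Hist are exported and the entropy is returned in nats; the data here are simulated standard-normal samples:

```julia
using InformationTheory, Random

Random.seed!(1)
x = randn(10_000)           # samples from N(0, 1)

H = shannon(Hist(), x)      # histogram-based estimate
# For N(0, 1) the analytic differential entropy is 0.5 * log(2π * e) ≈ 1.419 nats,
# so H should land near that value up to binning and sampling error.
```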

InformationTheory.ShannonEntropy.shannon — Method
shannon(method::KSG, x::AbstractVector...)

Calculates the Shannon entropy of a set of variables x using the KSG estimator.

Arguments

  • method::KSG: The KSG Shannon entropy calculation method.
  • x::AbstractVector...: One or more vectors representing the data for which to calculate the entropy. Each vector is a dimension of the data.

Returns

  • H::Float64: The calculated Shannon entropy.

Details

The function estimates the local probability density at each sample from the distance to its k-th nearest neighbor and averages the resulting log-density terms to obtain the entropy. Because it avoids explicit binning, this method scales better to high-dimensional data than histogram-based estimators.
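A usage sketch for multivariate data, assuming shannon and KSG are exported and entropy is returned in nats; each positional vector is one dimension of the joint distribution:

```julia
using InformationTheory, Random

Random.seed!(1)
x = randn(5_000)
y = randn(5_000)

# Joint entropy of the 2-D data (x, y).
H2 = shannon(KSG(k = 5), x, y)
# For independent N(0, 1) marginals the true joint entropy is
# 2 * 0.5 * log(2π * e) ≈ 2.838 nats.
```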
