public class DiscreteDistribution extends Object

Static utility methods for discrete distributions, represented as arrays of double values, which are assumed to be normalized such that the entries in a single array sum to 1.

| Constructor and Description |
|---|
| DiscreteDistribution() |
| Modifier and Type | Method and Description |
|---|---|
| static double | cosine(double[] dist, double[] reference) - Returns the cosine distance between the two specified distributions, which must have the same number of elements. |
| static double | entropy(double[] dist) - Returns the entropy of this distribution. |
| static double | KullbackLeibler(double[] dist, double[] reference) - Returns the Kullback-Leibler divergence between the two specified distributions, which must have the same number of elements. |
| static double[] | mean(Collection<double[]> distributions) - Returns the mean of the specified Collection of distributions, which are assumed to be normalized arrays of double values. |
| static double[] | mean(double[][] distributions) - Returns the mean of the specified array of distributions, represented as normalized arrays of double values. |
| static void | normalize(double[] counts, double alpha) - Normalizes, with Laplace smoothing, the specified double array, so that the values sum to 1 (i.e., can be treated as probabilities). |
| static double | squaredError(double[] dist, double[] reference) - Returns the squared difference between the two specified distributions, which must have the same number of elements. |
| static double | symmetricKL(double[] dist, double[] reference) - Returns the symmetrized Kullback-Leibler divergence between the two specified distributions. |
public static double KullbackLeibler(double[] dist, double[] reference)

Returns the Kullback-Leibler divergence between the two specified distributions, which must have the same number of elements. This is the sum over all i of dist[i] * Math.log(dist[i] / reference[i]). Note that this value is not symmetric; see symmetricKL for a symmetric variant.

Parameters:
    dist - the distribution whose divergence from reference is being measured
    reference - the reference distribution
Returns:
    the sum over all i of dist[i] * Math.log(dist[i] / reference[i])
See Also:
    symmetricKL(double[], double[])
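The formula above can be sketched in Java as follows. This is a hypothetical re-implementation for illustration (the class and method names are my own), not the library's source:

```java
// Hypothetical sketch of the KL divergence described above.
// Assumes both arrays have the same length and strictly positive entries.
public class KLSketch {
    /** Sum over all i of dist[i] * Math.log(dist[i] / reference[i]). */
    public static double klDivergence(double[] dist, double[] reference) {
        double sum = 0.0;
        for (int i = 0; i < dist.length; i++) {
            sum += dist[i] * Math.log(dist[i] / reference[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.9, 0.1};
        System.out.println(klDivergence(p, p)); // prints 0.0: identical distributions
        System.out.println(klDivergence(p, q)); // positive for differing distributions
    }
}
```

Note that a zero entry in either array produces NaN or an infinite term; the documentation above does not specify how the library handles that case.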
public static double symmetricKL(double[] dist, double[] reference)

Returns the symmetrized Kullback-Leibler divergence between the two specified distributions, which must have the same number of elements.

Parameters:
    dist - the distribution whose divergence from reference is being measured
    reference - the reference distribution
Returns:
    KullbackLeibler(dist, reference) + KullbackLeibler(reference, dist)
See Also:
    KullbackLeibler(double[], double[])
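A sketch of the symmetric variant, following the Returns clause above (hypothetical helper names, not library code):

```java
// Hypothetical sketch: the symmetric variant is the sum of the KL
// divergence taken in both directions, per the Returns clause above.
public class SymmetricKLSketch {
    static double kl(double[] dist, double[] reference) {
        double sum = 0.0;
        for (int i = 0; i < dist.length; i++) {
            sum += dist[i] * Math.log(dist[i] / reference[i]);
        }
        return sum;
    }

    /** Symmetric: symmetricKL(p, q) == symmetricKL(q, p). */
    public static double symmetricKL(double[] dist, double[] reference) {
        return kl(dist, reference) + kl(reference, dist);
    }

    public static void main(String[] args) {
        double[] p = {0.7, 0.3};
        double[] q = {0.4, 0.6};
        System.out.println(symmetricKL(p, q) == symmetricKL(q, p)); // prints true
    }
}
```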
public static double squaredError(double[] dist, double[] reference)

Returns the squared difference between the two specified distributions, which must have the same number of elements. This is the sum over all i of the square of (dist[i] - reference[i]).

Parameters:
    dist - the distribution whose distance from reference is being measured
    reference - the reference distribution
Returns:
    the sum over all i of (dist[i] - reference[i])^2
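The sum described above can be sketched as follows (a hypothetical re-implementation, not the library's source):

```java
// Hypothetical sketch of the squared-error computation described above.
public class SquaredErrorSketch {
    /** Sum over all i of (dist[i] - reference[i])^2. */
    public static double squaredError(double[] dist, double[] reference) {
        double sum = 0.0;
        for (int i = 0; i < dist.length; i++) {
            double diff = dist[i] - reference[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.25, 0.75};
        // (0.25)^2 + (-0.25)^2 = 0.0625 + 0.0625
        System.out.println(squaredError(p, q)); // prints 0.125
    }
}
```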
public static double cosine(double[] dist, double[] reference)

Returns the cosine distance between the two specified distributions, which must have the same number of elements. The distributions are treated as vectors in dist.length-dimensional space. Given the following definitions:

    v = the sum over all i of dist[i] * dist[i]
    w = the sum over all i of reference[i] * reference[i]
    vw = the sum over all i of dist[i] * reference[i]

the value returned is vw / (Math.sqrt(v) * Math.sqrt(w)).

Parameters:
    dist - the distribution whose distance from reference is being measured
    reference - the reference distribution
Returns:
    the cosine between dist and reference, considered as vectors

public static double entropy(double[] dist)
Returns the entropy of this distribution: the sum over all i of -(dist[i] * Math.log(dist[i])).

Parameters:
    dist - the distribution whose entropy is being measured
Returns:
    the sum over all i of -(dist[i] * Math.log(dist[i]))
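The cosine and entropy formulas above can be sketched together as follows (a hypothetical re-implementation for illustration; the class name is my own):

```java
// Hypothetical sketches of the cosine and entropy computations described above.
public class CosineEntropySketch {
    /** vw / (Math.sqrt(v) * Math.sqrt(w)), using the definitions v, w, vw above. */
    public static double cosine(double[] dist, double[] reference) {
        double v = 0.0, w = 0.0, vw = 0.0;
        for (int i = 0; i < dist.length; i++) {
            v += dist[i] * dist[i];
            w += reference[i] * reference[i];
            vw += dist[i] * reference[i];
        }
        return vw / (Math.sqrt(v) * Math.sqrt(w));
    }

    /** Sum over all i of -(dist[i] * Math.log(dist[i])). */
    public static double entropy(double[] dist) {
        double sum = 0.0;
        for (int i = 0; i < dist.length; i++) {
            sum += -(dist[i] * Math.log(dist[i]));
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        System.out.println(cosine(p, p)); // a vector's cosine with itself is 1 (up to rounding)
        System.out.println(entropy(p));   // ln 2, about 0.693, for a uniform pair
    }
}
```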
public static void normalize(double[] counts, double alpha)

Normalizes, with Laplace smoothing, the specified double array, so that the values sum to 1 (i.e., can be treated as probabilities). The effect of the smoothing is to ensure that all entries are nonzero; effectively, a value of alpha is added to each entry in the original array prior to normalization.

Parameters:
    counts - the array to be converted into a probability distribution
    alpha - the value to add to each entry prior to normalization

public static double[] mean(Collection<double[]> distributions)
Returns the mean of the specified Collection of distributions, which are assumed to be normalized arrays of double values.

Parameters:
    distributions - the distributions whose mean is to be calculated
See Also:
    mean(double[][])
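As a usage sketch under the documented behavior, the following re-implements normalize and mean(Collection) locally to show how raw counts might be smoothed into distributions and then averaged (hypothetical code, not the library's source):

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical sketch: smooth raw counts into distributions, then average them.
// These methods mimic the documented behavior; they are not the library code.
public class MeanSketch {
    /** Adds alpha to each entry, then rescales so the entries sum to 1. */
    public static void normalize(double[] counts, double alpha) {
        double total = 0.0;
        for (int i = 0; i < counts.length; i++) {
            counts[i] += alpha;
            total += counts[i];
        }
        for (int i = 0; i < counts.length; i++) {
            counts[i] /= total;
        }
    }

    /** Entry-wise mean of same-length distributions. */
    public static double[] mean(Collection<double[]> distributions) {
        double[] result = null;
        for (double[] dist : distributions) {
            if (result == null) {
                result = new double[dist.length];
            }
            for (int i = 0; i < dist.length; i++) {
                result[i] += dist[i] / distributions.size();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] a = {3, 1};   // raw counts, not yet probabilities
        double[] b = {0, 2};   // alpha keeps the zero entry nonzero
        normalize(a, 1.0);     // {4, 2} / 6
        normalize(b, 1.0);     // {1, 3} / 4
        System.out.println(Arrays.toString(mean(Arrays.asList(a, b))));
    }
}
```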
public static double[] mean(double[][] distributions)

Returns the mean of the specified array of distributions, represented as normalized arrays of double values. Will throw an "index out of bounds" exception if the distribution arrays are not all of the same length.

Parameters:
    distributions - the distributions whose mean is to be calculated

Copyright © 2015. All rights reserved.