1 KL divergence
For discrete probability distributions \(P\) and \(Q\), the KL divergence is defined as:
\[\text{KL}(P \| Q) = E_{\mathbf{x}\sim P}\left[\log P(\mathbf{x})-\log Q(\mathbf{x})\right]
\\
=\sum_{\mathbf{x}} P(\mathbf{x}) \log \frac{P(\mathbf{x})}{Q(\mathbf{x})}
\]
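To make the discrete definition concrete, here is a minimal Python sketch (NumPy assumed; the distributions P and Q are made-up example values) that evaluates the sum directly:

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)), in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with P(x) = 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Made-up example distributions over three outcomes
P = [0.5, 0.3, 0.2]
Q = [0.4, 0.4, 0.2]
print(kl_divergence(P, Q))  # ~0.0253 nats
```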
For continuous probability distributions, it is defined as:
\[\text{KL}(P \| Q) = \int p(\mathbf{x}) \log \frac{p(\mathbf{x})}{q(\mathbf{x})} d\mathbf{x}
\]
where \(p(\mathbf{x})\) is the probability density function of \(P\) and \(q(\mathbf{x})\) is the probability density function of \(Q\).
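For a concrete continuous example, the KL divergence between two univariate Gaussians has a well-known closed form, \(\text{KL}\big(\mathcal{N}(\mu_1,\sigma_1^2)\,\|\,\mathcal{N}(\mu_2,\sigma_2^2)\big)=\log\frac{\sigma_2}{\sigma_1}+\frac{\sigma_1^2+(\mu_1-\mu_2)^2}{2\sigma_2^2}-\frac{1}{2}\). The sketch below (parameter values are arbitrary examples) compares that closed form against a direct numerical evaluation of the integral above:

```python
import numpy as np

def kl_gaussian(mu1, s1, mu2, s2):
    """Closed-form KL(N(mu1, s1^2) || N(mu2, s2^2))."""
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5

def kl_numerical(mu1, s1, mu2, s2):
    """Approximate the integral of p(x) * log(p(x) / q(x)) on a wide grid."""
    x = np.linspace(-20.0, 20.0, 200_001)
    dx = x[1] - x[0]
    p = np.exp(-(x - mu1)**2 / (2 * s1**2)) / (s1 * np.sqrt(2 * np.pi))
    q = np.exp(-(x - mu2)**2 / (2 * s2**2)) / (s2 * np.sqrt(2 * np.pi))
    return float(np.sum(p * np.log(p / q)) * dx)

print(kl_gaussian(0.0, 1.0, 1.0, 2.0))   # ~0.4431
print(kl_numerical(0.0, 1.0, 1.0, 2.0))  # ~0.4431, agreeing with the closed form
```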
Properties of the KL divergence:
- Non-negativity: the KL divergence is always non-negative, \(\text{KL}(P \| Q) \geq 0\).
- Asymmetry: the KL divergence is not symmetric, i.e. \(\text{KL}(P \| Q) \neq \text{KL}(Q \| P)\) (illustrated numerically in the sketch after this list).
- Zero value: when \(P\) and \(Q\) are exactly the same, \(\text{KL}(P \| Q) = 0\).
- No triangle inequality: the KL divergence does not satisfy the triangle inequality in the usual sense.
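A minimal, self-contained check of the asymmetry property (the distributions are made-up example values, chosen without zero probabilities so no special handling is needed):

```python
import numpy as np

def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

P = [0.5, 0.3, 0.2]
Q = [0.1, 0.2, 0.7]
print(kl(P, Q))  # ~0.676
print(kl(Q, P))  # ~0.635 -> KL(P || Q) != KL(Q || P)
```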
2 Cross entropy
Cross-entropy is closely related to the KL divergence and can also be used to measure the difference between two distributions.
For discrete probability distributions \(P\) and \(Q\), the cross-entropy is defined as:
\[H(P,Q)=-E_{x\sim P}\left[\log Q(x)\right]=-\sum_i P(x_i)\log Q(x_i)
\]
For continuous probability distributions, it is defined as:
\[H(P,Q) = -\int p(\mathbf{x}) \log q(\mathbf{x}) d\mathbf{x}
\]
It can be seen that \(H(P,Q)=H(P)+D_\text{KL}(P \| Q)\), where \(H(P)\) is the entropy of \(P\).
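A quick numerical sanity check of this identity, again on made-up example distributions (all quantities in nats):

```python
import numpy as np

P = np.array([0.5, 0.3, 0.2])
Q = np.array([0.1, 0.2, 0.7])

H_P  = -np.sum(P * np.log(P))      # entropy H(P)
H_PQ = -np.sum(P * np.log(Q))      # cross-entropy H(P, Q)
KL   =  np.sum(P * np.log(P / Q))  # KL(P || Q)

print(H_PQ)      # ~1.705
print(H_P + KL)  # ~1.705, matching H(P,Q) = H(P) + KL(P || Q)
```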
Properties:
- Non-negativity.
- Like the KL divergence, the cross-entropy is not symmetric, i.e. \(H(P,Q)\neq H(Q,P)\).
- Taking the cross-entropy of a distribution with itself is equivalent to taking its entropy, \(H(P,P)=H(P)\) (both of the last two points are checked numerically in the sketch below).
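A short check of these two properties, reusing the same made-up example values:

```python
import numpy as np

def cross_entropy(p, q):
    """H(P, Q) = -sum_x P(x) * log Q(x), in nats."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(-np.sum(p * np.log(q)))

P = [0.5, 0.3, 0.2]
Q = [0.1, 0.2, 0.7]

print(cross_entropy(P, Q))  # ~1.705
print(cross_entropy(Q, P))  # ~1.437 -> H(P,Q) != H(Q,P)
print(cross_entropy(P, P))  # ~1.030, which equals the entropy H(P)
```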