Can KL divergence converge to 0, yet variance of log probabilities not converge to 0?

Let $p(x)$ be a fixed distribution.

Let $A, C > 0$ be constants. Let $\epsilon > 0$. Can we find an example of a distribution $q_{\epsilon}$ such that $\mathrm{KL}(p||q_{\epsilon}) < \epsilon$, yet $E[(\log (p(x)/q_{\epsilon}(x)))^2] \ge A \epsilon^C$?

If $p$ and $q_{\epsilon}$ were a coin flip or a geometric distribution, that would be even better.

I hope I am making myself clear. This is a rather hard question. Perhaps I can also get a hint if someone can tell me whether there is a name for the quantity $E[(\log (p(x)/q(x)))^2]$. (KL divergence is defined as $E[\log (p(x)/q(x))]$.)

All expectations are taken with respect to $p$. So, for example,

$E[(\log (p(x)/q(x)))^2] = \sum_x p(x) \left(\log (p(x)/q(x))\right)^2$
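For concreteness, the two quantities in the question can be computed for any pair of finite distributions. The following is a minimal sketch, assuming natural logarithms and distributions given as lists of probabilities; the function name `kl_and_second_moment` and the coin-flip values are illustrative assumptions, not part of the question.

```python
import math

def kl_and_second_moment(p, q):
    """Return (E_p[log(p/q)], E_p[(log(p/q))^2]) for finite pmfs p and q.

    Assumes natural logarithms and q[i] > 0 wherever p[i] > 0.
    """
    kl = 0.0
    second = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            log_ratio = math.log(pi / qi)
            kl += pi * log_ratio           # contributes to the KL divergence
            second += pi * log_ratio ** 2  # contributes to the second moment
    return kl, second

# Hypothetical coin-flip example: a fair coin against a slightly biased one.
kl, second = kl_and_second_moment([0.5, 0.5], [0.6, 0.4])
print(kl, second, second - kl ** 2)  # KL, second moment, variance of log ratio
```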

2019-05-18 20:57:52
Answers: 1

The idea is to change just one term of a distribution $p$ in a way that changes the distribution very little yet alters the variance of the log ratio greatly. To make this work, the term that gets changed moves at each step, taking us further and further into the tail of $p$, in the hope that this will ensure the convergence of the altered distributions to $p$ in the sense that the KL divergence between them goes to $0$.

Let $p$ be the geometric distribution

$$\Pr(x) = p(x) = 2^{-x}, x = 1, 2, \ldots$$

and for each $n = 1, 2, \ldots$ let

$$q_n(x) = \begin{cases} C_n\, p(x) & \text{if } x \ne n, \\ 2^{-f_n} & \text{if } x = n, \end{cases}$$

where $f_n$ is a sequence of positive real numbers to be determined and $C_n$ is a sequence of normalizing constants. The requirement that $q_n$ be a probability function determines $C_n$:

$$1 = \sum_x {q_n(x)} = C_n \sum_{x \ne n} {p(x)} + 2^{-f_n} = C_n \left(\sum_x {p(x)} - p(n)\right) + 2^{-f_n} $$

$$= C_n (1 - p(n)) + 2^{-f_n} = C_n(1 - 2^{-n}) + 2^{-f_n},$$

so that

$$C_n = (1 - 2^{-f_n}) / (1 - 2^{-n}).$$
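As a quick sanity check on the normalization, one can sum $q_n$ numerically. This is only a sketch: the names `q_n` and `total_mass` and the truncation point `x_max` are assumptions made for illustration (the tail beyond `x_max = 200` is far below double precision).

```python
import math

def q_n(x, n, f_n):
    """Perturbed geometric pmf: C_n * 2^(-x) off n, and 2^(-f_n) at x = n."""
    c_n = (1 - 2.0 ** (-f_n)) / (1 - 2.0 ** (-n))  # normalizing constant C_n
    if x == n:
        return 2.0 ** (-f_n)
    return c_n * 2.0 ** (-x)

def total_mass(n, f_n, x_max=200):
    """Sum q_n(x) over x = 1, ..., x_max (the remaining tail is negligible)."""
    return math.fsum(q_n(x, n, f_n) for x in range(1, x_max + 1))

print(total_mass(5, 25.0))  # should be 1 up to rounding error
```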

We can now compute the KL divergence:

$$\mathrm{KL}(p||q_n) = \sum_x {p(x) \log( p(x) / q_n(x))}.$$

Breaking this into a sum over $x \ne n$ (where $\log(p(x)/q_n(x)) = -\log(C_n)$) and the remaining term, as before, gives

$$= -\sum_x {p(x) \log(C_n)} + p(n) \log(C_n) + p(n) \log( 2^{-n} / 2^{-f_n} )$$

$$= -\log(C_n) \left(1 - p(n) \right) + p(n) \left( f_n - n \right)$$

$$= -\log(C_n) \left(1 - 2^{-n} \right) + 2^{-n} \left( f_n - n \right)$$

(using the logarithm base 2 for ease of calculation). Notice that the sequence $\log(C_n)$ converges to $0$ provided $f_n$ diverges, so the first summand goes to $0$. To get the whole expression to converge to zero we therefore require that $f_n - n = o(2^n)$; i.e., $f_n$ must not diverge too quickly.

A similar calculation produces the expectation of the squared log ratio:
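The KL divergence can also be checked numerically by summing $p(x)\log_2(p(x)/q_n(x))$ term by term and comparing with a closed form. The sketch below uses the fact that $\log_2(p(x)/q_n(x)) = -\log_2 C_n$ for $x \ne n$ and $f_n - n$ at $x = n$; the function names and the truncation `x_max` are assumptions for illustration, and $f_n$ is kept small enough that $2^{-f_n}$ does not underflow.

```python
import math

def c_n(n, f_n):
    """Normalizing constant C_n = (1 - 2^(-f_n)) / (1 - 2^(-n))."""
    return (1 - 2.0 ** (-f_n)) / (1 - 2.0 ** (-n))

def kl_direct(n, f_n, x_max=200):
    """KL(p || q_n) in bits, summed term by term over x = 1, ..., x_max."""
    c = c_n(n, f_n)
    terms = []
    for x in range(1, x_max + 1):
        p = 2.0 ** (-x)
        q = 2.0 ** (-f_n) if x == n else c * p
        terms.append(p * math.log2(p / q))
    return math.fsum(terms)

def kl_closed_form(n, f_n):
    """-log2(C_n) * (1 - 2^(-n)) + 2^(-n) * (f_n - n)."""
    return -math.log2(c_n(n, f_n)) * (1 - 2.0 ** (-n)) + 2.0 ** (-n) * (f_n - n)

# With f_n = n^2 the divergence shrinks as n grows.
for n in (5, 10, 20):
    print(n, kl_direct(n, float(n * n)), kl_closed_form(n, float(n * n)))
```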

$$ \sum_x {p(x) ( \log( p(x) / q_n(x)) )^2}$$

$$= \sum_x {p(x) (\log(C_n))^2} - p(n) (\log(C_n))^2 + p(n) \left(\log( 2^{-n} / 2^{-f_n} ) \right)^2$$

$$= (\log(C_n))^2 \left(1 - 2^{-n} \right) + 2^{-n} \left( f_n - n \right)^2 .$$

Again the first summand converges to zero. However, we can choose $f_n$ (subject to the previous restriction not to diverge too quickly) to make the right summand behave practically as we wish. Three regimes are illustrated by these choices:

  1. $f_n = n^2$. The right summand converges to $0$.

  2. $f_n = 2^{n/2} + n$. The right summand equals the constant $1$, and so converges to $1$.

  3. $f_n = (2^{n/2} + n)n$. The right summand (asymptotically equivalent to $n^2$) diverges.

In all cases $f_n - n = o(2^n)$ as required, ensuring that $\mathrm{KL}(p||q_n) \to 0$. Finally, the variance of $\log( p / q )$ behaves like the expectation of $(\log( p/q ))^2$ because the square of the expectation converges to $0$.

This shows that the asymptotic behavior of the variance of the log ratio is essentially independent of the convergence of $q$ to $p$ in the sense of the KL divergence.
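The three regimes can be tabulated numerically from the closed forms. This is a sketch under stated assumptions: base-2 logarithms, hypothetical helper names (`pow2_neg`, `moments`, `regimes`), and a guard that treats $2^{-t}$ as $0$ once it is far below double precision.

```python
import math

def pow2_neg(t):
    """2^(-t), returned as 0.0 once it is far below double precision."""
    return 0.0 if t > 1000 else 2.0 ** (-t)

def moments(n, f_n):
    """Return (KL, second moment, variance) of log2(p/q_n) under p."""
    c = (1 - pow2_neg(f_n)) / (1 - pow2_neg(n))  # normalizing constant C_n
    log_c = math.log2(c)
    kl = -log_c * (1 - pow2_neg(n)) + pow2_neg(n) * (f_n - n)
    second = log_c ** 2 * (1 - pow2_neg(n)) + pow2_neg(n) * (f_n - n) ** 2
    return kl, second, second - kl ** 2

regimes = {
    "f_n = n^2": lambda n: float(n * n),
    "f_n = 2^(n/2) + n": lambda n: 2.0 ** (n / 2) + n,
    "f_n = (2^(n/2) + n) n": lambda n: (2.0 ** (n / 2) + n) * n,
}

for name, f in regimes.items():
    for n in (10, 20, 40):
        kl, second, var = moments(n, f(n))
        print(f"{name:24s} n={n:2d} KL={kl:.3e} E[log^2]={second:.3e} Var={var:.3e}")
```

In each regime the KL column shrinks with $n$, while the second-moment column shrinks, stays near $1$, or grows, respectively.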

$q$, incidentally, is the probability distribution for a simple truncated experiment provided $f_n \ge n$ (which includes all three cases), up to the normalization, which amounts to shifting $f_n$ by the negligible quantity $\log(1 - 2^{-n} + 2^{-f_n})$. Let $X$ be the number of flips of a fair coin needed to obtain heads for the first time. If the number of flips is not $n$, accept $X$. Otherwise, in case $X = n$, generate a uniform random number $U$. If $U \lt 2^{n - f_n}$, accept the outcome $X$. Otherwise, re-run the same experiment recursively until eventually some value is accepted.
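The experiment can be sketched as a rejection sampler. The loop below replaces the recursion with iteration, which is equivalent; the names `sample_truncated` and `accepted_pmf` are assumptions for illustration. Accepted values have probability proportional to $2^{-x}$ for $x \ne n$ and to $2^{-f_n}$ at $x = n$, with overall acceptance probability $1 - 2^{-n} + 2^{-f_n}$.

```python
import random

def sample_truncated(n, f_n, rng):
    """Draw one accepted value from the truncated coin-flip experiment."""
    while True:
        x = 1
        while rng.random() < 0.5:  # tails with probability 1/2: flip again
            x += 1
        # Values other than n are always accepted; n only with prob 2^(n - f_n).
        if x != n or rng.random() < 2.0 ** (n - f_n):
            return x

def accepted_pmf(x, n, f_n):
    """Exact probability that the experiment returns x."""
    z = 1 - 2.0 ** (-n) + 2.0 ** (-f_n)  # overall acceptance probability
    weight = 2.0 ** (-f_n) if x == n else 2.0 ** (-x)
    return weight / z

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [sample_truncated(2, 4.0, rng) for _ in range(20000)]
print(draws.count(2) / len(draws), accepted_pmf(2, 2, 4.0))
```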

2019-05-21 02:26:39