Information Theory: Entropy and Mutual Information

Given a discrete r.v. $X$, where $X$ ranged in $\{a_1, \ldots, a_n\}$, $\mathbb{P}(X=a_k)=p_k$. Entropy $H(X)$ is defined as:

$$H(X)= - \sum_k p_k \log p_k$$

When regarded as a function of $\{p_k\}$, entropy satisfies the following properties:

  1. $H(p_1,\ldots,p_n)$ is continuous, and non-negative;
  2. $H(p_1,\ldots,p_n)$ is convex w.r.t. $(p_1,\ldots,p_n)$;
  3. $H(p_1,\ldots,p_n)$ has a unique maxima $(\frac{1}{n},\ldots,\frac{1}{n})$;
  4. $H(n):=H(\frac{1}{n},\ldots,\frac{1}{n})$ increases along with $n$;
  5. $H(p_1,\ldots,p_n)=H(p_1+\ldots+p_k,p_{k+1},\ldots,p_n)+(p_1+\ldots+p_k)H(p_{k+1}',\ldots,p_n')$.

Property 5 is so-called addictivity. That is, if we observe $X$ in two steps, firstly obtaining a value from $\{\hat{a},a_{k+1},\ldots,a_n\}$ and then another value from $\{a_1,\ldots,a_k\}$ if $\hat{a}$ selected, the entropy of the whole system should be sum of these two subsystems.

Note that a function satisfying property 1, 4, 5 must have a form of $H(\vec{p})= - C \sum_k p_k \log p_k$, which reveals that entropy function is unique.

Entropy measures the uncertainty of a random value. Intuitively, entropy reaches its maximum $\log n$ when all alphabets occur with same probability, and likewise has a minimum of $0$ if $p_k=1$ for some $k$.

Entropy also represents the smallest average length to encode a message. Say we have a message consisting of alphabets $a_1,\ldots,a_n$, occurring with probability $p_1,\ldots,p_n$. Now we want to assign a code (an $N$-ary string) to each alphabet, with no two codes sharing a same prefix. The length of the codes are denoted as $l_1,\ldots,l_n$. Shannon’s source coding theroem states that the average code length $\sum_k p_k l_k$ could not be less than $H(p_1,\ldots,p_n)$ (taking $N$ as logarithm base).

Read More


铁板烧

最近喜欢上了那家铁板烧。在寸土寸金的北京,那家店占据了食街不大的一角。开放式的厨房几乎就是它的全部。一圈细长的大理石柜台,被椅子围得严严实实。人们就此坐下,点菜,颇有一番日式的味道。隔着不高的玻璃,可以看到三个大厨在柜台另一边忙碌。食材触到铁板的「吱」响,伴随时不时的爆裂声,白色的烟汽袅起,混合着诱人的香味,好不热闹。

入冬后,食街上的人更多了。日落后,室外温度急剧下降,这时来热闹的地方吃点热热的东西,再合适不过了。独自点上一份肉,一份菜,一碟满满的炒饭,吃得满面油光,奢侈而满足。推门而出,白天发生的不愉快,或是接下来要加班的怠惰,多少缓解了一些。但每次吃完,衣服上,头发上总会带上一股味道——铁板烧的味道,油烟的味道。

有人不在乎这味道,觉得这是烟火味,是相关料理的灵魂,是生活的气息。在以前,我是很痛恨这种说法的。吃火锅,或是烧烤时,高温食材逸散出的分子,几乎无孔不入,附在衣服上、头发上,就像被泼了一身油。每一回都要努力地洗澡,再将衣服由里至外换一遍,方可除去这种味道。我不能忍受这样的味道,藉而讨厌相关的料理。但近几年,这种想法似乎在慢慢消失,我开始能接受这种味道。尽管还是接受不了烧烤,但这种铁板烧倒是可以,甚至慢慢喜欢上了。

讨厌烟火味,和睡前一定要洗澡一样,这些观念的背后似乎有着奇妙的机理。人们小时候的经历,或是大人的说教,或是自身的负面遭遇,仿佛在主导这些观念。怕小鸡的以前踩死过小鸡,不敢吃鱼的是被鱼刺卡过。但我的这个观念好像是自发产生的,没有听从任何人表达过这样的观点,而是打心底里就讨厌这种味道。

如此根深蒂固的观念,又是如何被改变的呢?大概是太忙碌,忙得无暇顾及这类事。烟汽熏天的铁板烧,是我不可多得的闲暇时光。生活于生存之上,生存尚不能满足,又谈何更高层次的吹毛求疵?

一个习惯的改变,或者是一种生活方式的丧失。


西郊线

西郊线真的是电车,那种拖着两个辫子,走得不快的电车。沿途的车站也是很小的车站。不到两米宽的站台,甚至比广州的 BRT 站更窄一些,立了一排凳子,仅此而已。半露天的车站,有遮雨棚,却没有真正意义的墙。透过一人多高的玻璃看,站台外沿长满了狗尾巴草,再往后铺开了大片的草甸和桦林,一直延伸到远处的香山,和仿佛水洗后的天空。午前的秋阳给眼前的一切都染上了愉悦的暖色。

Read More


Proof of the Gumbel Max Trick

Statement

Assume that $\alpha_1, \alpha_2, \ldots, \alpha_n$ satisify $\sum_k\alpha_k=1$. Define

$$Z=\arg\max_k\{\log\alpha_k+G_k\}$$

where $G_1,\ldots,G_n \text{ i.i.d.}\sim Gumbel(0,1)$, whose PDF and CDF are defined as

$$\begin{align} f(x)&=e^{-(x+e^{-x})} \\ F(x)&=e^{-e^{-x}}\end{align}$$

. Then $\mathbb{P}(Z=k)=\alpha_k$.

Proof

Set $u_k=\log{\alpha_k}+G_k$. We prove by direct calculations.

$$\begin{align} \mathbb{P}(Z=k)&=\mathbb{P}(u_k \geq u_j,\forall j \neq k) \\ &=\int_{-\infty}^\infty \mathbb{P}(u_k \geq u_j, \forall j \neq k|u_k)\mathbb{P}(u_k) du_k \\ &=\int_{-\infty}^\infty \prod_{j\neq k}\mathbb{P}(u_k \geq u_j|u_k)\mathbb{P}(u_k) du_k \\ &=\int_{-\infty}^\infty \prod_{j\neq k}e^{-e^{-u_k+\log \alpha_j}} e^{-(u_k-\log\alpha_k+e^{-(u_k-\log\alpha_k)})} du_k \\ &=\int_{-\infty}^\infty e^{-\sum_{j\neq k}\alpha_je^{-u_k}} \alpha_k e^{-(u_k+\alpha_k e^{-u_k})} du_k \\ &=\alpha_k \int_{-\infty}^\infty e^{-u_k-(\alpha_k+\sum_{j\neq k}\alpha_j)e^{-u_k}} du_k \\ &= \alpha_k \end{align}$$.

Application

The trick is commonly used in DL to make sampling over a discrete distribution differentiable.

References


Option::as_ref

Let’s consider the following function:

use std::ptr::NonNull;

fn transform<T>(option: &Option<NonNull<T>>) -> Option<&T> {
option.map(|x| unsafe { x.as_ref() })
}

The function transform takes an Option<NonNull<T>> as input, and converts the inner pointer to an immutable reference &T if possible. The method NonNull::as_ref() is marked unsafe so we need an unsafe block. The snippet causes an compilation error:

Read More


Rc, RefCell and Interior Mutability

Say we need a type Cursor<T> , which holds a mutable reference to T. A method .dup() duplicates the internal reference, wraps it in a new instance of Cursor<T> and returns. Such pattern exists commonly in database driver library. Users could hold multiple cursors simultaneously, with each owning a (mutable) reference to the same connection object.

One might implements with a primitive mutable reference:

struct Cursor<'a, T> {
obj: &'a mut T,
}

impl<'a, T> Cursor<'a, T> {
fn new(t: &'a mut T) -> Cursor<'a, T> {
Cursor { obj: t }
}

fn dup(&mut self) -> Cursor<T> {
Cursor { obj: self.obj }
}
}

fn main() {
let mut i = 1;
let mut cursor_a = Cursor::new(&mut i);
let _cursor_b = cursor_a.dup();
}

Perfect and neat, and luckily Rust compiler did not complain. Fresh Rustanceans would have to work hard for shutting up the compiler, especially when fighting with references.

The invocation of ::new() and .dup() are on separate lines. Now what about to chain up the constructor and .dup()? This time the compiler fails:

Read More


Visualizing Correlation

Say we have a matrix A of shape N x M , which can be viewed as a collection of N vectors of shape 1 x M . The code below gives us the correlation matrix of A :

A_corr = np.corrcoef(A)  # shape: (N, N)

To visualize it, just use plt.matshow(A_corr) .

If N is so large that the figure could not provide a clear insight, we might alternatively use histograms like this:

def corr_matrix_to_array(corr_mat):
N = corr_mat.shape[0]
return np.array([corr_mat[i][j] for i in range(1, N) for j in range(i + 1, N)])

plt.hist(corr_matrix_to_array(A_corr), bins=np.linspace(-1, 1, N_bins))

三月十日杂感

我还是说出了那句话。我们走到一起,已经有七天了。

起初并没有表白的想法。当晚看完电影后,两人都有微妙的感觉,但直到最后也没有表露心意。「就当做又一次平常的约电影好了」目送她进楼时我想。看电影时没有说出来,她回去了我就更说不出来了——我认为我们当时还是普通朋友关系,表白显得太唐突了。我兴奋又失落地回到宿舍。兴奋的是我们之间的关系貌似又近了一些,失落的是这样下去大概率是没有结果的。

但当晚,她接连发了几条动态,内容和语气都暗示了她对我的好感,只是和我一样,缺乏勇气。两个相互有好感却又没有恋爱经验的人(当然这是我后来知道的),就要这样还未开始便结束了彼此的关系,这怕是上天对胆小鬼的惩罚吧。

Read More


三月一日杂感

我是怎么想的呢?我应该是喜欢她了。她看起来像是一个单纯的女孩,长得可爱,有些呆,总是一副样子很努力的样子。我们有共同的话题,应该可以谈得来,虽然谈话常因为关系不够深入而戛然而止。

她是怎么想的呢?她专心起来很少注意周围,我们有许多次出现在同一个场所,但她很少注意到我。看她朋友圈,大学时应该是有过被人纠缠的烦恼。她成绩不是很理想,为未来烦恼着。

我希求的是一段相互平等的关系,两人的能力和价值观不应该有太大的差距,两人的走近不应该妨碍各自的理想。我不希望我的另一半为了我而放弃自己一直追求的东西。伴侣就是伴侣,不是任何一方的附属品,也不应被任何一方「占有」。如果坚持这个理念,我们的关系想来是不会长久的,至少我走的路,我要去的地方,她不会想去。

那么,就让这段关系不要开始?这或许是一个好选择。我们的认识源于偶然,一年中保持着模糊的朋友关系。进一步发展,如果她不接受,那我就会陷入尴尬的境地;就算接受,毕业时也是要告别的。没有开始,就不会有结束。

但或许,我只是个胆小鬼罢了。

天性の弱虫さ


二月十一日杂感
mybirth:bPNCBXyPaSNUzw/pnDHCvg==:XmV/Km36hgVGWzBaxi/vSA==:AShsE4U6aTxtO2IyCKoxDijDwP1Xp8Q+8c8y7XSpoMdIf7SU2NWOY+TgQgkPX69k0eW3KJPyDuH9C0mTerMlXmwgbtkxJ1YGfxYuMlXAu7w9VwpUB4Okt+7jbjR6M1rJ6zh0nUjGAjK0A3j4XZ5v/4DNvk1/IP7ScR7xxoRPPlGA7VlTeKVIk/+d6Jw7UmDl68Hvo5EdeU/3gpGx0d51uAweuXM2uGAFHjxzKsv14UIqzMJKN+DlmKdJXDUvtII1jLnQlFIn2jMd2IDJCRD1OTiyBsah611IWNUIJs+uRN9p1YNZW08NtaIuKtu/8GIvD+/0byUsf256I0VBGDqxZhONTx6IvmaBqsogRvIfxmK5nCzFVtiPf0OzAYUQ2Bclj2Daz0TNuB7ILWjzxWIMiQhRjznyHr5VO8nOUVLwWA4osU/8he4hmOcLvKE2PZiq9adf0YO+me1HaGlELQUQHOT7d80A8cAcihZugz8VWSyymZ4SyyKsIQETUnchNBsKfvK9Px3G4icVb2tVl5oChh47r2pqsOpacjbngivzDxQKIGanihve/sROysmHY1gs9Sy9ZefDIhIra2G4ZuqUuAgmoExJmyMRX2mhh/0RNn0ql7aDHMqnw1yI52YKHpIXjhI7gaR9PqGFZ1iDrboU9N8PA6KFKLdoArsFN265OkMXJZpSDI2ai4evImumivpe0vW6G5tN2HO21uHxTRj1h8AjWv6O+EVR75H9J9ERfoRfXxy1X+aXkQN3mYicMzNTcIlP5CM8F5j239qfPmhhZaQVyfLtkuOBrvH+IsDzN3d5MUkYKVgwpX4DABpg9d66RTeCx24UG9p/mOIp/4Pqsyh0HzwwOk3bxRsoPJIp/ceAh3VqlZ1zE1O5xaqwId1UGS74xZZKdo5WK6FJeWAiEO0ZHBYYajjr7aVhXf20JFh/laNOYTHwu5EsvUE7e6ZPzMCscJ+++JvHM3qVUN9zKKlg2hlnaWHhcPEEmfXsj1tWMvvxTiP/DTNaPiGKd5TmZ+YY7Ao32H5Z6YrGcfMKoSj1PLY0eA0IPwXQvCM7W7IfYxGS+7YqtXVpybgnyuhZgTePArN+UHEBicQasixf41plNpHyy6/KNUiP2QYOcqeerOcOgLzSnVL7bJk2saoL74de0YqlSZCD/slXgwg9IboaNiQEYVs87sKabWoyxnAHnIpnvz2AYRwxDGEdaS4KpDEIOQb3MVhtEiujbCuT9IOuaPI65VjHvsy7kQ6xAijCqLQqVKZ8Qxqc2RK5wUnp8wQJo/VaR0IK0h143FZwxfKE7cllRhrQCZRPZfCqfl1yxjoW8Sg+2M7mtdIYqZBGV2Rs3bQ1adob3ERzDiyaKKWI/VxZlsy7KCnOnETzGndk0m9zhIxvGV/xQl23fkw/Eg0PZxAzarDfBp9/swavhNvA3zAWaQG9smAypUxecwBrhoWefToPz5RjMj+CPcPrDzDYLJjh08rr51ku2zCf17lo9Oj/gJr5/QxNUojedpa3/hQoI0obuzNA0mdFKBXf7BevAFS3feveh0jcDBCM0UJ+Orjm3ChL2T3RdTjGeboUv1F5UtjvWfxj8ueOxJApSExIr9hgb+SjOAD/uE6YmymKEQansTUCzb9b1eceONNYTlpDGSlzk+DPKW5wpPfsK9K0csjLp3lFou8dOg1v7pwolJs+SBuDI+iLRHJusnjK0l2nWvnxrwIsKqFVo6/tlRWlZHHqKI1cpkF1Pu+b/F4a98IxmymeRhSaL2HzdWi7Asw0ND+XiGX0MswsCuoi24l2FAK5Tb9KgFWL08tgp3bco1jpzBj6kqIjEUcvAKukUVhBlIM3p0FBhSCR7JV/ZaaDPR3GorhYKujCli2XAV7jfgVGphjxUQ97iwCSdcs/kGENSzlKWM3ZRpCWfHOfcAq7aqbF9KThMWnSLRcQg69oivn5ejbpDmDVeIs6oKbISLHNOxwMvoQw7KBFLZhbclTlAbdrOk/W+vZBxjs6EqH54+eNoRtmN/qmR1MuPSsReBF6ZBHqNBl8L/JYyCH3e25o0kT2G1B1yxOsVy8ZPvkmxJJuaw7lBdNb4imOq8YH3hebigWFNuL7vMp/k0A3tTL6+tY8WtijiynsLdYISnRKW6FJwXRwUibS0UfVBZAdlRnjD3g1DOQv9kUX1Glzgj+ho2Sf8ToqTh3/KW51D13wnlOp33I5as3klQvrKqNjIP0tgp8l3lToMfXjIBF9YH9VJ73HRuHPol1Cm6mPp0jwEWJ6KPyJeQDRG8+Gfk24vKzem4yDkxW3iS/gzeB1nJMbntyPmxydxsOLxAs02w==
**THIS CONTENT IS ENCRYPTED**
 
DECRYPTING...