# Intuition of information theory

I am reading the book "Elements of Information Theory" by Cover and Thomas, and I am having trouble understanding the various ideas conceptually.
For example, I know that $H(X)$ can be interpreted as the average encoding length. But what does $H(Y|X)$ intuitively mean?
And what is mutual information? I read things like "It is the reduction in the uncertainty of one random variable due to the knowledge of the other". This doesn't mean anything to me, as it doesn't help me explain in words why $I(X;Y)=H(Y)-H(Y|X)$. Or explain the chain rule for mutual information.
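For concreteness, the identity can at least be checked numerically. This sketch (using a made-up joint distribution, purely for illustration) computes $H(Y)$, $H(Y|X)$, and their difference, so $I(X;Y)$ is literally the uncertainty about $Y$ that remains after averaging over what $X$ tells you:

```python
import math

# Hypothetical joint distribution p(x, y) over X in {0, 1}, Y in {0, 1}.
p = {(0, 0): 0.4, (0, 1): 0.1,
     (1, 0): 0.1, (1, 1): 0.4}

def H(dist):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)

# Marginals of X and Y.
px = {x: sum(q for (a, _), q in p.items() if a == x) for x in (0, 1)}
py = {y: sum(q for (_, b), q in p.items() if b == y) for y in (0, 1)}

# H(Y|X) = sum_x p(x) H(Y | X=x): the average uncertainty left in Y
# once the value of X is known.
HY_given_X = sum(
    px[x] * H({y: p[(x, y)] / px[x] for y in (0, 1)})
    for x in (0, 1)
)

# I(X;Y) = H(Y) - H(Y|X): how much seeing X shrinks our uncertainty about Y.
I_XY = H(py) - HY_given_X
print(round(H(py), 4), round(HY_given_X, 4), round(I_XY, 4))
```

Here $H(Y) = 1$ bit, but knowing $X$ leaves only about $0.72$ bits of uncertainty in $Y$, so the mutual information is the roughly $0.28$-bit gap between the two.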
I also encountered the data processing inequality, explained as saying that no clever manipulation of the data can improve the inferences that can be made from it: if $X \to Y \to Z$ forms a Markov chain, then $I(X;Y)\ge I(X;Z)$. If I had to explain this result to someone in words and say why it should be intuitively true, I would have absolutely no idea what to say. Even explaining how "data processing" is related to Markov chains and mutual information would baffle me.
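One concrete way to see the inequality: build a toy Markov chain $X \to Y \to Z$ in which $Z$ is computed from $Y$ alone (the "processing" step), and compare $I(X;Y)$ with $I(X;Z)$. The distributions and flip probabilities below are invented for illustration:

```python
import math

def mutual_information(pxy):
    """I(X;Y) in bits from a joint distribution given as {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), q in pxy.items():
        px[x] = px.get(x, 0) + q
        py[y] = py.get(y, 0) + q
    return sum(q * math.log2(q / (px[x] * py[y]))
               for (x, y), q in pxy.items() if q > 0)

# Hypothetical chain X -> Y -> Z: Y is a noisy copy of X, and Z is produced
# from Y alone (further "processing" through a second noisy map).
flip_xy, flip_yz = 0.1, 0.2   # assumed crossover probabilities
px = {0: 0.5, 1: 0.5}

pxy = {(x, y): px[x] * (1 - flip_xy if y == x else flip_xy)
       for x in (0, 1) for y in (0, 1)}

pxz = {}
for (x, y), q in pxy.items():
    for z in (0, 1):
        pz_given_y = 1 - flip_yz if z == y else flip_yz  # Z depends only on Y
        pxz[(x, z)] = pxz.get((x, z), 0) + q * pz_given_y

I_xy, I_xz = mutual_information(pxy), mutual_information(pxz)
print(round(I_xy, 4), round(I_xz, 4))
assert I_xy >= I_xz  # data processing: Z cannot know more about X than Y does
```

Each processing step can only add noise between $X$ and the output, so $I(X;Z)$ comes out strictly smaller than $I(X;Y)$ here; no choice of the second map could push it above $I(X;Y)$.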
I can imagine explaining a result in algebraic topology to someone since there is usually an intuitive geometric picture that can be drawn. But with information theory if I had to explain a result to someone at comparable level to a picture I would not be able to.
When I do problems it's just abstract symbolic manipulation and trial and error. I am looking for an explanation (not these "blah gives information about blah" explanations) of the various terms that will make the solutions to problems appear in a meaningful way.
Right now I feel like someone trying to do algebraic topology purely symbolically without thinking about geometric pictures.
Is there a book that will help cure my curse?
komizmtk
Christopher Olah wrote an excellent intuitive explanation of information theory called Visual Information Theory. It provides thoughtful visualizations for understanding these concepts.
In addition, there was a paper that introduced a tool for visualizing mutual information, The Mutual Information Diagram for Uncertainty Visualization, which may be useful.
Briana Petty
Since you have an intuitive understanding of entropy based on the compression theorem, you should look into the operational meaning of mutual information, which comes from the channel coding theorem.
It says that if you have a noisy channel described by a joint distribution $p(x,y)$, then it can transmit information encoded in $X$ to a receiving party with access to $Y$ at a rate of $I(X;Y)$ bits per symbol.
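As a standard illustration of that rate (not taken from the answer above): for a binary symmetric channel that flips each bit with probability $\varepsilon$, a uniform input gives $I(X;Y) = H(Y) - H(Y|X) = 1 - h_2(\varepsilon)$ bits per channel use, where $h_2$ is the binary entropy function.

```python
import math

def h2(p):
    """Binary entropy h2(p) in bits."""
    if p in (0, 1):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Binary symmetric channel with assumed crossover probability eps.
# With a uniform input X, I(X;Y) = 1 - h2(eps): the channel coding theorem
# says information can be sent reliably at (up to) this rate per channel use.
eps = 0.11
rate = 1 - h2(eps)
print(round(rate, 4))
```

At $\varepsilon = 0.11$ the rate is about half a bit per use; as $\varepsilon \to 0.5$ the output becomes independent of the input and the achievable rate drops to zero.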