Recommended reading: Notebooks of Shalizi

I enjoyed the ‘superstatistics’ and ‘information theory’ entries, and many others. There are even some on turbulence problems. For example, this one: “Why do physicists care about power laws so much?” :-)

whenever we see power law correlations, we assume
there must be something complex and interesting going on to produce them. (If
this sounds like the fallacy of affirming the consequent, that’s because it
is.) By a kind of transitivity, this makes power laws interesting in

or this:

badly are we infatuated that there is now a huge, rapidly growing literature
devoted to “Tsallis statistics” or “non-extensive […]”, which is a recipe for
modifying normal statistical
mechanics so that it produces power law distributions; and this, so far as I
can see, is its only good feature.

and this I sign (not tenured :-))

I will not attempt, here, to
support that sweeping negative verdict on the work of many people who have more
credentials and experience than I do.


Imagine that someone hands you a sealed envelope, containing, say, a
telegram. You want to know what the message is, but you can’t just open it up
and read it. Instead you have to play a game with the messenger: you get to
ask yes-or-no questions about the contents of the envelope, to which he’ll
respond truthfully. Question: assuming this rather contrived and boring
exercise is repeated many times over, and you get as clever at choosing your
questions as possible, what’s the smallest number of questions needed, on
average, to get the contents of the message nailed down?

This question actually has an answer. Suppose there are only a finite
number of messages (“Yes”; “No”; “Marry me?”; “In Reno, divorce final”; “All is
known stop fly at once stop”; or just that there’s a limit on the length of the
messages, say a thousand characters). Then we can number the messages from 1
to N. Call the message we get on this trial S. Since the game is
repeated many times, it makes sense to say that there’s a probability $p_i$ of
getting message number i on any given trial, i.e. Prob(S=i) = $p_i$. Now, the
number of yes-no questions needed to pick out any given message is, at most,
$\log{N}$, taking the logarithm to base two. (If you were allowed to
ask questions with three possible answers, it’d be log to the base three.
Natural logarithms would seem to imply the idea of there being 2.718… answers
per question, but nonetheless make sense mathematically.) But one can do
better than that: if message i is more frequent than message j (if
$p_i > p_j$), it makes sense to ask whether the message is i
before considering the possibility that it’s j; you’ll save time. One
can in fact show, with a bit of algebra, that the smallest average number of
yes-no questions is
\[ -\sum_{i}{p_i\log{p_i}}. \]
This gives us $\log{N}$ when all the $p_i$ are equal, which makes sense:
then there are no preferred messages, and the order of asking doesn’t make any
difference. The sum is called, variously, the information, the information
content, the self-information, the entropy or the Shannon entropy of the
message, conventionally written H[S].
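
To make the question-counting concrete, here is a minimal Python sketch (not from Shalizi's notebook). It computes the sum above in bits and compares it with the average number of yes-no questions under one specific questioning strategy, Huffman coding, which the quoted passage does not name; I am using it as an assumption about what "as clever as possible" looks like in practice. The function names `shannon_entropy` and `expected_questions` and the two example distributions are made up for illustration. For probabilities that are powers of 1/2 the two numbers agree exactly; for other distributions the Huffman average lies between the entropy and the entropy plus one.

```python
import heapq
import math


def shannon_entropy(probs):
    """Return -sum(p * log2(p)) in bits, skipping zero-probability messages."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_questions(probs):
    """Average number of yes-no questions under a Huffman strategy.

    Repeatedly merge the two least likely groups of messages. Each merge
    adds one question for every message in the merged group, so the merged
    probability is added to the running average.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


# Hypothetical example distributions over N = 4 messages.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.5, 0.25, 0.125, 0.125]

for name, probs in [("uniform", uniform), ("skewed", skewed)]:
    print(name, shannon_entropy(probs), expected_questions(probs))
```

With four equally likely messages both numbers come out to 2, i.e. $\log{N}$ to base two. With the skewed distribution both drop to 1.75, because asking about the most frequent message first saves questions on average.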
