http://www.cscs.umich.edu/~crshalizi/notebooks/

I enjoyed the ‘superstatistics’ and ‘information theory’ entries, and many others. There are even some about turbulence problems. For example this, from “Why do physicists care about power laws so much?” 🙂

whenever we see power law correlations, we assume there must be something complex and interesting going on to produce them. (If this sounds like the fallacy of affirming the consequent, that’s because it is.) By a kind of transitivity, this makes power laws interesting in themselves.

or this:

So badly are we infatuated that there is now a huge, rapidly growing literature devoted to “Tsallis statistics” or “non-extensive thermodynamics”, which is a recipe for modifying normal statistical mechanics so that it produces power law distributions; and this, so far as I can see, is its only good feature.

and this I would sign (I’m not tenured :-))

I will not attempt, here, to support that sweeping negative verdict on the work of many people who have more credentials and experience than I do.

Enjoy.

http://www.cscs.umich.edu/~crshalizi/notebooks/information-theory.html

Imagine that someone hands you a sealed envelope, containing, say, a telegram. You want to know what the message is, but you can’t just open it up and read it. Instead you have to play a game with the messenger: you get to ask yes-or-no questions about the contents of the envelope, to which he’ll respond truthfully. Question: assuming this rather contrived and boring exercise is repeated many times over, and you get as clever at choosing your questions as possible, what’s the smallest number of questions needed, on average, to get the contents of the message nailed down?

This question actually has an answer. Suppose there are only a finite number of messages (“Yes”; “No”; “Marry me?”; “In Reno, divorce final”; “All is known stop fly at once stop”; or just that there’s a limit on the length of the messages, say a thousand characters). Then we can number the messages from 1 to N. Call the message we get on this trial S. Since the game is repeated many times, it makes sense to say that there’s a probability p_i of getting message number i on any given trial, i.e. Prob(S = i) = p_i. Now, the number of yes-no questions needed to pick out any given message is, at most, log N, taking the logarithm to base two. (If you were allowed to ask questions with three possible answers, it’d be log to the base three. Natural logarithms would seem to imply the idea of there being 2.718… answers per question, but nonetheless make sense mathematically.) But one can do better than that: if message i is more frequent than message j (if p_i > p_j), it makes sense to ask whether the message is i before considering the possibility that it’s j; you’ll save time. One can in fact show, with a bit of algebra, that the smallest average number of yes-no questions is the sum over i of -p_i log p_i. This gives us log N when all the p_i are equal, which makes sense: then there are no preferred messages, and the order of asking doesn’t make any difference. The sum is called, variously, the information, the information content, the self-information, the entropy or the Shannon entropy of the message, conventionally written H[S].
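The formula in that passage is easy to check numerically. Here is a minimal sketch (the function name and the example distributions are my own, not Shalizi’s) that computes H[S] = Σ -p_i log₂ p_i and confirms that a uniform distribution over N messages gives log₂ N bits, while a skewed distribution needs fewer questions on average:

```python
import math

def shannon_entropy(probs):
    """Average number of yes/no questions needed to pin down the
    message, in bits: H[S] = sum_i -p_i * log2(p_i)."""
    # Terms with p_i = 0 contribute nothing (the limit p*log p -> 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform distribution over N = 8 messages: no preferred message,
# so H = log2(8) = 3 bits, just as the quote says.
print(shannon_entropy([1/8] * 8))  # 3.0

# Skewed distribution: asking about the likely messages first pays off,
# so fewer than log2(4) = 2 questions are needed on average.
print(shannon_entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75
```

The 1.75-bit answer corresponds to the obvious question order: “is it message 1?” (half the time one question suffices), then “message 2?”, and so on.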
