How can we measure the amount of information in a piece of text? We can’t just use its length: a long, repetitive document might contain less information than a short, informative one, even though it has more characters in it.
The presenter secretly writes down a word or sentence for the student to guess. Ensure the student doesn’t see what has been written. The example sentence here was used in an article about this idea, written in 1951 by Claude Shannon. However, you can use any sentence you like, as long as the student is likely to know the words in it.
Draw about 10 boxes across the top of a piece of paper, with each box big enough to write a letter in (the student or the presenter can do this).
I’ve written down a word/sentence I’d like you to guess. What do you think the first letter is?
If the student’s guess of the first letter is correct, write it at the top of the paper in the first box, otherwise write it underneath the box. In the example here, the student guessed "I" incorrectly, so it is written under the first box.
Sorry, that’s not it. I’ll write your guess down here so you don’t guess the same letter again. What’s your next guess? (Repeat this until they guess the first letter correctly.)
Each wrong guess is written underneath; in this case the wrong guesses are I, A, Y and O. The student then guesses T, which is right, so that is written in the first box. Note that they must guess each letter of the sentence correctly before moving on to the next one. Sometimes, when you can’t work out a letter, you may need to go through almost every letter of the alphabet!
What do you think the next letter is?
The student then guesses the second letter; of course, the first letter will give them a clue. In this case, they immediately guess that the next letter is H.
In this game, spaces and punctuation count as a "letter", so it can be better to say "what comes next?" rather than "what’s the next letter". You can draw a space by writing a line; in this example the student guessed a space after "THE" (wrong) and then after "THERE" (right).
Some letters will be easy to guess, others may take quite a few attempts. This is the point of the activity - the ones that are hardest to guess are the most interesting ones!
Continue allowing the student to guess until they have the whole word or sentence. You may need to use more paper! Discuss with the student what they think information is. Which of the letters in this guessing game had the most information in them? (The ones that were hard to guess.) Which didn’t have much information? (The ones that were obvious.)
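For presenters who would like to show the counting on a computer, here is a minimal sketch of a very simple "guesser". It is only an illustration and makes some assumptions that are not part of the activity: the name `count_guesses` and the fixed guessing order `GUESS_ORDER` (a space, then letters from roughly most to least common in English) are made up for this example. Unlike a student, this guesser ignores the letters that came before, so it does not get better as the sentence goes on.

```python
# Hypothetical guessing order: a space first, then letters from roughly most
# to least common in English. This order is an assumption for illustration.
GUESS_ORDER = " ETAOINSHRDLUCMFWYPVBGKJQXZ"


def count_guesses(sentence: str) -> list[tuple[str, int]]:
    """Return each character of the sentence with the number of guesses it took."""
    results = []
    for char in sentence.upper():
        if char not in GUESS_ORDER:
            raise ValueError(f"cannot guess {char!r}: not in GUESS_ORDER")
        # index() is zero-based, so the first guess in the order counts as 1.
        guesses = GUESS_ORDER.index(char) + 1
        results.append((char, guesses))
    return results


if __name__ == "__main__":
    for char, guesses in count_guesses("there is"):
        print(repr(char), guesses)
```

With this order, a space takes one guess and H takes nine. The characters that need the most guesses are the ones this guesser finds most surprising, which matches the idea that the hard-to-guess letters carry the most information.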
This kind of experiment is used by computer scientists to measure the amount of "information" in a document. Being able to make a good guess at what is coming next is the basis of working out how to reduce the amount of data used for sending text, photos and videos. (This is called compression, and is the underlying theory behind showing videos over the internet and storing lots of photos or songs on a mobile phone.)
If you had a 400-page book containing the phrase "blah, blah, blah" over and over, how easy would the letters be to guess using this game? (Probably very easy - there’s not much information there). What about a 400-page instruction manual for flying a plane? (Probably hard to guess a lot of words - it has a lot of information).
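The difference between the two books can be tried out with a compression program. The sketch below uses Python's standard `zlib` module to compress a very repetitive text and a hard-to-predict text of the same length. The texts are stand-ins chosen for this example: the second one is just random letters and spaces, not a real instruction manual, and the helper name `compressed_size` is made up.

```python
import random
import string
import zlib


def compressed_size(text: str) -> int:
    """Return the number of bytes zlib needs to store the text."""
    return len(zlib.compress(text.encode("utf-8")))


# A very repetitive "book" (a stand-in for the 400-page "blah" book).
repetitive = "blah, blah, blah. " * 200

# A hard-to-predict text of the same length: random letters and spaces from a
# seeded generator, used only as a stand-in for text that is hard to guess.
rng = random.Random(0)
unpredictable = "".join(
    rng.choice(string.ascii_lowercase + " ") for _ in range(len(repetitive))
)

if __name__ == "__main__":
    print(len(repetitive), compressed_size(repetitive))
    print(len(unpredictable), compressed_size(unpredictable))
```

Both texts have the same number of characters, but the repetitive one should shrink to a much smaller size, because almost every character in it is easy to predict.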
Explain that computer scientists measure information by how surprising a message (or book!) is. Telling you something that you know already doesn’t give you any information, because it isn’t surprising: for example, when a friend who always walks to school says "I walked to school today". If your friend said instead, "I got a ride to school in a helicopter today," that would be surprising, and would therefore give us a lot of information. In the same way, the number of guesses needed in the game (each one answered with a yes or no) indicates how much information there is in the text.
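For older students, the link between surprise and information can be turned into a number. A standard way to do this is to say that an event with probability p carries log2(1/p) bits of information, where one bit corresponds to one yes/no question. The sketch below applies that formula; the function name `information_bits` and the two probabilities are made up for illustration and are not measurements.

```python
import math


def information_bits(probability: float) -> float:
    """Information, in bits, of an event with the given probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be greater than 0 and at most 1")
    return math.log2(1 / probability)


if __name__ == "__main__":
    # Both probabilities are made-up values for illustration only.
    print(information_bits(0.99))   # friend walked to school, as usual
    print(information_bits(0.001))  # friend arrived by helicopter
```

An event that is certain to happen (probability 1) carries 0 bits, and a fair coin flip (probability 0.5) carries exactly 1 bit. The rarer the event, the more bits it carries.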
This kind of prediction is also used for auto-completing text while you are typing. The computer is guessing what is most likely to come next (based on what people usually type).
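A very small version of this kind of prediction can be sketched as follows. The program counts which character most often follows each character in some example text, then uses those counts to guess what comes next. This is only a toy illustration: real auto-complete systems are far more sophisticated, and the names `train` and `predict_next` and the example text are made up for this sketch.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional


def train(text: str) -> Dict[str, Counter]:
    """Count, for each character, which characters follow it in the text."""
    followers: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        followers[current][following] += 1
    return followers


def predict_next(followers: Dict[str, Counter], current: str) -> Optional[str]:
    """Return the character seen most often after `current`, or None if unseen."""
    counts = followers.get(current)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up example text; a real system would learn from far more text.
    model = train("the then there")
    print(predict_next(model, "t"))
    print(predict_next(model, "h"))
```

In the example text, "t" is always followed by "h" and "h" is always followed by "e", so those are the program's guesses. This is the same reasoning the student uses when they guess H straight after T.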