From mblum@cs.cmu.edu Thu Sep 20 12:13:58 2001
Date: Thu, 20 Sep 2001 12:05:53 -0400
From: Manuel Blum
To: hopper@cs.cmu.edu
Subject: crypto lecture 3 (with partial answers to some hw problems)
SECURITY and CRYPTOGRAPHY 15-827 19 SEP 01
Lecture #3
M.B.
4615 Wean
HANDOUT: First 36 pages of "The MEMORY BOOK" by Harry Lorayne and Jerry
Lucas, Ballantine Books, ISBN 0-345-41002-5 (1974; softcover edition 1996).
As you have noticed, this is not a standard class on Security and
Cryptography. Let me remind you again: The goal of this class is to write a
joint research paper in the area of Human Interactive Protocols (HIP).
This will include Human Oriented IDentification (HumanOID) and CAPTCHA.
Right now, we are still trying to get moving on HumanOID.
Our first goal for HumanOID is to generate a set of instructions enabling
"almost any" human that can read and write English, age 6 to 60, to
"easily" construct for herself an "unbounded" source of "personal"
CHALLENGE - RESPONSE pairs.
ALMOST ANY HUMAN means at least 1/100 of all people who can read and write
English. The actual fraction can be determined statistically.
EASILY CONSTRUCT means that after an hour of practice it takes only a few
minutes to construct a new challenge-response pair. UNBOUNDED means she can
construct at least one new pair a day.
PERSONAL means that the pairs should be unique to each human:
no two humans have the same pairs except with some extremely small probability.
CHALLENGE-RESPONSE means a CHALLENGE consisting of a word, a few words,
or a short sentence, and a RESPONSE consisting of a sentence or a
"random-looking" string of at least 6 characters.
RANDOM-LOOKING means that the string of at least 6 characters is no more
probable than any one of 10^6 other possible strings. (Note: This means
that the chance of an eavesdropper randomly choosing the right string is
less than 1/10^6. I am not asking for 1/26^6.)
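To make the gap between the two thresholds in the note concrete, here is a small arithmetic sketch. The function name guess_bits is an illustrative assumption, not something from the lecture; it simply converts "one of N equally likely strings" into bits.

```python
import math

def guess_bits(num_equally_likely: int) -> float:
    """Bits of guessing difficulty when an attacker must pick one of
    num_equally_likely equally probable strings."""
    return math.log2(num_equally_likely)

required = 10**6          # threshold asked for in the lecture
uniform_letters = 26**6   # 6 letters drawn uniformly from A-Z (not required)

print(f"10^6 strings: {guess_bits(required):.1f} bits")
print(f"26^6 strings: {guess_bits(uniform_letters):.1f} bits")
print(f"26^6 / 10^6 = {uniform_letters / required:.1f}")
```

The ratio printed on the last line shows how much slack the 10^6 requirement leaves for responses whose letters are not uniformly random.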
The pairs should have the property that an EAVESDROPPER who overhears any
subset of these pairs cannot COMPUTE the correct response to even one new
challenge, except by some small chance probability, like 10^-6.
EAVESDROPPER = a person who overhears or otherwise has access to a subset
of challenge-response pairs.
COMPUTE means that the eavesdropper has a powerful computer at his
disposal:
   a CRAY for a day.
   1000 workstations for a month.
   full access to the web.
I am grateful to Scott CROSBY for spending an hour with me after the last
class to point out two things:
1. I had better get the problem much better defined. This is important in
order to have a chance to decide whether a solution IS POSSIBLE or NOT.
2. Access to the web invalidates a great many of my own personally
proposed challenge-response pairs.
I am grateful to Rachel RUE for pointing out that The MEMORY BOOK
(handout), and indeed memory builders in general, are an important resource
for this project.
The Memory Book gives methods for remembering a great many things including
people's names. It can be turned to our purposes.
Challenge: My Polish student who studied at ETH Switzerland.
Hidden: Bartosz Przydatek -> Bar-tek, Prince of data ->
Response: BARTEKPOD
Challenge: Chinese student I invited to Guang Zhou and CMU.
Hidden: Ke YANG -> YANG Ke ->
A Connecticut Yankee in King Arthur's court.
Response: ACYIKAC
Challenge: My PhD student Minnesota born and raised.
Hidden: Nick Hopper -> Nickels hopping on a table.
Response: NHOAT
Challenge: My Guatemalan PhD student
Hidden: Luis von Ahn -> Don't lose your fountain Pen!
Response: DLYFP
Challenge: The Reverend Charles Dodgson
Hidden: Lewis Carroll -> My son's middle name. My wife's
middle name. ->
Response: MSMNMWMN
The idea: Give a CHALLENGE that evokes HIDDEN possibly web-searchable info,
then turn that into a PRECISE RESPONSE of related memorable information.
PRECISE means that it can be written down in ascii and a computer can
check for a simple exact correspondence.
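One way a computer might perform such an exact check is sketched below. Everything beyond "exact correspondence" is an assumption made for illustration: the normalization rule (ignore case, spaces and punctuation), the choice to store a SHA-256 digest rather than the response itself, and the names normalize, digest, PAIRS and check.

```python
import hashlib
import hmac

def normalize(response: str) -> str:
    # Assumption: case, spacing and punctuation are ignored, so that
    # "No charge." and "no charge" count as the same response.
    return "".join(ch for ch in response.upper()
                   if ch.isascii() and ch.isalnum())

def digest(response: str) -> bytes:
    # Assumption: the checker keeps only a hash of each response.
    return hashlib.sha256(normalize(response).encode("ascii")).digest()

# Hypothetical store: challenge text -> digest of the expected response.
PAIRS = {
    "My Guatemalan PhD student": digest("DLYFP"),
    "Your mother's sister Klara.": digest("No charge."),
}

def check(challenge: str, response: str) -> bool:
    """Return True only if the response matches the stored one exactly
    (after normalization)."""
    expected = PAIRS.get(challenge)
    if expected is None:
        return False
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(expected, digest(response))

print(check("My Guatemalan PhD student", "dlyfp"))
print(check("My Guatemalan PhD student", "DLYFQ"))
```

A plain hash of a 6-character response is easy to brute-force offline, so this sketch only illustrates the exact-match check, not secure storage.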
EXAMPLES:
I asked my wife to give me 3 responses to the following 3 challenges:
Challenge: Your father
Response: Coney Island
Challenge: Your mother
Response: Orchid
Challenge: Your sister
Response: Kosovo
It's not clear to me just how long she'll be able to remember it, but I'll
find out. I've made notes in my calendar to ask for her responses in a
day, a week, a month, a year and 10 years from now. The purpose in this is
twofold: to check how well she does and to solidify her memory.
I asked my 84 year old mother to give the nicknames of her mother's (my
grandmother's) 10 siblings, and a tidbit for each. No problem. For example,
CHALLENGE: Your mother's sister Gusta.
HIDDEN: I was the flower girl at her wedding.
RESPONSE: flower girl
CHALLENGE: Your mother's sister Klara.
HIDDEN: Her husband was the doctor in his town. He never
charged relatives.
RESPONSE: No charge.
CHALLENGE: Your mother's brother Moishe.
HIDDEN: He was the best (most generous) of them.
If I needed anything, I would go to him.
RESPONSE: The best.
We are still far from a "virtually infinite" number of challenge-response
pairs.
HOMEWORK SET #2
(To be turned in to Nick Hopper hopper@cs before class next Tuesday.)
HW 2.1
Give 10 challenge-response pairs. Each response should have at least 6
random-looking characters. (The definition of random-looking is given
above.)
You should store these pairs and check your memory in a day, a week, and a
month from now. At any moment in time in this course, you may be asked to
reply to 1 or more of your random challenges.
HW 2.2
Suggest a virtually infinite source of personal challenge-response pairs.
For both problems below, give an "excellent approximation" that works well
in the limit and also works well for small numbers.
HW 2.3 (BIRTHDAY PARADOX).
In a world with d days per year, what is the probability pr that no two
people in a class of p people have the same birthday?
* Give an exact formula
* Substitute p = sqrt(d) and show that in the limit as d -> infinity, the
correct answer is quite pretty:
* Give an approximation that is easy to compute on a calculator for very
large d yet also works "well" for small numbers, like d=10.
HW 2.4 (COUPON COLLECTOR'S PROBLEM)
Give an exact or very good approximate solution to this problem (see its
statement below) that you can use on a simple calculator. Your solution
CC(n) should be correct in the limit as n -> infinity, in the sense that
the ratio of your approximation to the actual value of CC(n) should go to
1. In addition, your approximation should give good results for small n,
like n=10.
Recall definitions from last lecture:
BIRTHDAY PARADOX
QUESTION: In a world with n days per year, how many people should one
invite to a party so that there is a roughly 50% chance that at least 2
people have the same birthday?
ANSWER: (1.2)*sqrt(n)
COUPON COLLECTOR'S PROBLEM
QUESTION: A cereal box contains one of n coupons, each coupon chosen
uniformly at random (i.e. each coupon is equally likely to appear in a
box). How many cereal boxes should one expect to buy in order to get all n
coupons?
ANSWER: approximately n*(lg n).
QUESTION: What base?
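For experimenting with the coupon collector's question, a minimal sketch follows. It uses the standard fact that the expected number of boxes is n*(1 + 1/2 + ... + 1/n), and checks it against a seeded simulation. The function names, the seed and the trial count are illustrative choices, not part of the lecture.

```python
import math
import random

def expected_boxes(n: int) -> float:
    """Exact expectation: n * (1 + 1/2 + ... + 1/n)."""
    return n * sum(1.0 / k for k in range(1, n + 1))

def simulate_boxes(n: int, rng: random.Random) -> int:
    """Buy boxes until all n coupons have been seen; return the count."""
    seen = set()
    boxes = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))  # each coupon equally likely
        boxes += 1
    return boxes

def average_boxes(n: int, trials: int, seed: int = 0) -> float:
    """Average of simulate_boxes over a number of independent trials."""
    rng = random.Random(seed)
    return sum(simulate_boxes(n, rng) for _ in range(trials)) / trials

n = 10
print("exact expectation:", expected_boxes(n))
print("simulated average:", average_boxes(n, trials=2000))
print("n * ln(n)        :", n * math.log(n))
print("n * log2(n)      :", n * math.log2(n))
```

Running it for several values of n and comparing the last two lines against the exact expectation is one way to approach the "What base?" question empirically.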
In a class of 35 students, the probability that all 35 have different
birthdays is

   (365/365) * (364/365) * ... * ((365-35+1)/365) = approx .185

In a class of 40, it is approx .108
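The product above is easy to evaluate directly. The sketch below does so for any class size p and year length d, and also finds the smallest class for which a shared birthday is at least as likely as not, for comparison with the (1.2)*sqrt(n) rule quoted earlier. The function names are illustrative assumptions.

```python
import math

def prob_all_distinct(p: int, d: int = 365) -> float:
    """Probability that p people all have different birthdays
    in a world with d equally likely days per year."""
    prob = 1.0
    for i in range(p):
        prob *= (d - i) / d
    return prob

def smallest_class_with_collision(d: int = 365) -> int:
    """Smallest p for which a shared birthday has probability >= 1/2."""
    p = 1
    while prob_all_distinct(p, d) > 0.5:
        p += 1
    return p

print("p = 35:", prob_all_distinct(35))
print("p = 40:", prob_all_distinct(40))
print("smallest p with >= 50% chance of a match:",
      smallest_class_with_collision())
print("1.2 * sqrt(365) =", 1.2 * math.sqrt(365))
```

The first two printed values should agree with the two figures above to within rounding.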