\documentclass{llncs}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amsfonts}
\makeatletter
\newcommand{\suc}[2]
{\mbox{\bf Succ}^{#1}_{#2}}
\newcommand{\DD}
{\mathcal{C}}
\newcommand{\SE}
{SE}
\newcommand{\SD}
{SD}
\newcommand{\RS}
{RS}
\newcommand{\KK} {\mathcal{K}}
\newcommand{\WW}
{\mathcal{W}}
\newcommand{\OO}
{\mathcal{O}}
\newcommand{\II}
{\mathcal{I}}
\newcommand{\TT}
{\mathcal{T}}
\newcommand{\UU}
{\mathcal{U}_I}
\newcommand{\PP}{\mathcal{P}}
\newcommand{\find}{{\mathcal{P}1}}
\newcommand{\labl}{{\mathcal{P}2}}
\newcommand{\CC} {\mathcal{C}}
\newcommand{\LL} {\mathcal{L}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
\providecommand{\LyX}{L\kern-.1667em\lower.25em\hbox{Y}\kern-.125emX\@}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Textclass specific LaTeX commands.
%\theoremstyle{plain}
\newtheorem{thm}{Theorem}
\newtheorem{cor}{Corollary}
%\numberwithin{equation}{section} %% Comment out for sequentially-numbered
%\numberwithin{figure}{section} %% Comment out for sequentially-numbered
%\theoremstyle{definition}
\spnewtheorem{defn}{Definition}{\bfseries}{}
%\newtheorem{lemma}[thm]{Lemma}
\newtheorem{cons}{Construction}
\spnewtheorem*{defn*}{Definition}{\bfseries}{}
\newcommand{\comment}[1]{}
\spnewtheorem*{prob}{Problem Family}{\bfseries}{}
\makeatother
\begin{document}
\title{CAPTCHA:\\Using Hard AI Problems For Security}
\author{Luis von Ahn \inst{1} \and Manuel Blum \inst{1} \and Nicholas J.
Hopper \inst{1}
\and John
Langford \inst{2}}
\institute{Computer Science Dept., Carnegie Mellon University,
Pittsburgh PA 15213, USA \and IBM T.J. Watson Research Center, Yorktown
Heights NY 10598, USA}
\date{}
\maketitle
%\vspace{-5ex}
\begin{abstract} We introduce {\sc captcha}, an automated test that humans
can pass, but current computer programs can't pass: any program that
has high success over a {\sc captcha} can be used to solve an unsolved
Artificial Intelligence (AI) problem. We provide several novel
constructions of {\sc captcha}s. Since {\sc captcha}s have many
applications in practical security, our approach introduces a new
class of hard problems that can be exploited for security
purposes. Much like research in cryptography has had a positive impact
on algorithms for factoring and discrete log, we hope that the use of
hard AI problems for security purposes allows us to advance the field
of Artificial Intelligence. We introduce two families of AI problems
that can be used to construct {\sc captcha}s and we show that
solutions to such problems can be used for steganographic
communication. {\sc captcha}s based on these AI problem families,
then, imply a win-win situation: either the problems remain unsolved
and there is a way to differentiate humans from computers, or the
problems are solved and there is a way to communicate covertly on some
channels.
\end{abstract}
%\vspace{-1ex}
\section{Introduction}
A {\sc captcha} is a program that can generate and grade tests that: (A)
most humans can pass, but (B) current computer programs can't pass. Such
a program can be used to differentiate humans from computers and has
many applications for practical security, including (but not limited to):
\begin{itemize}
\item {\bf Online Polls.} In November 1999, {\it slashdot.com} released an
online poll asking which was the best graduate school in computer science
(a dangerous question to ask over the web!). As is the case with most
online polls, IP addresses of voters were recorded in order to prevent
single users from voting more than once. However, students at Carnegie
Mellon found a way to stuff the ballots by using programs that voted for
CMU thousands of times. CMU's score started growing rapidly. The next
day, students at MIT wrote their own voting program and the poll became a
contest between voting ``bots''. MIT finished with 21,156 votes, Carnegie
Mellon with 21,032, and every other school with fewer than 1,000. Can the
result of any online poll be trusted? Not unless the poll requires that
only humans can vote.
\item {\bf Free Email Services.} Several companies (Yahoo!, Microsoft,
etc.) offer free email services, most of which suffer from a specific
type of attack: ``bots'' that sign up for thousands of email accounts
every minute.
This situation can be improved by requiring users to prove they
are human before they can get a free email account. Yahoo!, for
instance, uses a {\sc captcha} of our design to prevent bots from
registering for accounts. Their {\sc captcha} asks users to read a
distorted word such as the one shown below (current computer programs are
not as good as humans at reading distorted text).
%\vspace{-4ex}
\begin{figure}[h]
\begin{center}
\includegraphics*[scale=.39]{vonahn-ez.ps}
\caption{The Yahoo! {\sc captcha}. }
\end{center}
\end{figure}
%\vspace{-3ex}
\item {\bf Search Engine Bots.} Some web sites don't want to be
indexed by search engines.
There is an {\it
html} tag to prevent search engine bots from reading web pages, but the
tag doesn't guarantee that bots won't read the pages; it only serves
to say ``no bots, please''. Search engine bots, since they usually belong
to large companies, respect web pages that don't want to allow them in.
However, in order to truly {\it guarantee} that bots won't enter a web
site, {\sc captcha}s are needed.
\item {\bf Worms and Spam.} {\sc captcha}s also offer a plausible
solution against email worms and spam: only accept an email if you
know there is a human behind the other computer. A few companies, such
as {\it www.spamarrest.com}, are already marketing this idea.
\item {\bf Preventing Dictionary Attacks.} Pinkas and Sander \cite{PINKAS}
have suggested using {\sc captcha}s to prevent dictionary
attacks in password systems. The idea is simple: prevent a computer from
being able to iterate through the entire space of passwords by requiring a
human to type the passwords.
\end{itemize}
\noindent The goals of this paper are to lay a solid theoretical
foundation
for {\sc captcha}s, to introduce the concept to the cryptography
community, and to present several novel constructions.
%\vspace{-1ex}
\subsection*{Lazy Cryptographers Doing AI}
Note that from a mechanistic point of view, there is no way to {\em prove}
that a program cannot pass a test which a human can pass, since there is a
program --- the human brain --- which passes the test. All we can
do is to present evidence that it's hard to write a program that can pass
the test. In this paper, we take an approach familiar to cryptographers:
investigate state-of-the-art algorithmic developments having to do with
some problem, assume that the adversary does not have algorithms for that
problem that are much better than the state-of-the-art algorithms, and
then prove a reduction between passing a test and exceeding the
performance of state-of-the-art algorithms. In the case of ordinary
cryptography, it is assumed (for example) that the adversary cannot factor
$1024$-bit
integers in any reasonable amount of time. In our case, we assume that the
adversary cannot solve an Artificial Intelligence problem with higher
accuracy than what's currently known to the AI community. This approach,
if it
achieves widespread adoption, has the beneficial side effect of inducing
security researchers, as well as otherwise malicious programmers, to
advance the field of AI (much like computational number theory has been
advanced since the advent of modern cryptography).
%\vspace{-1ex}
\begin{quote}
{\it A {\sc captcha} is a cryptographic protocol whose underlying hardness
assumption is based on an AI problem.}
\end{quote}
%\vspace{-1ex}
\noindent An important component of the success of modern
cryptography is the practice of stating, very precisely and clearly,
the assumptions under which cryptographic protocols are secure. This
allows the rest of the community to evaluate the assumptions and to
attempt to break them. In the case of Artificial Intelligence, it's
rare for problems to be precisely stated, but using them for security
purposes forces protocol designers to do so. We believe that precisely
stating unsolved AI problems can accelerate the development of
Artificial Intelligence: most AI problems that have been precisely
stated and publicized have eventually been solved (take chess as an
example). For this reason it makes practical sense for AI problems that
are used
for security purposes to also be {\it useful}. If the underlying AI
problem is useful, a {\sc captcha} implies a win-win situation: either
the {\sc captcha} is not broken and there is a way to differentiate humans
from computers, or the {\sc captcha} is broken and a useful AI
problem is solved. Such is not the case for most other
cryptographic assumptions: the primary reason algorithms for factoring
large numbers are useful is that factoring has applications in cryptanalysis.
In this paper we will present constructions of {\sc captcha}s based on
certain AI problems and we will show that solving the {\sc captcha}s
implies solving the AI problems. The AI problems we chose have several
applications, and we will show that solutions to them can be used, among
other things, for steganographic communication (see Section 5).
%\vspace{-1ex}
\subsection*{Related Work}
The first mention of ideas related
to ``Automated Turing Tests'' seems to appear in an unpublished manuscript
by
Moni Naor \cite{NAOR}. This excellent manuscript contains some of the
crucial notions and intuitions, but gives no proposal for an Automated
Turing Test, nor a formal definition. The first practical example of an
Automated Turing Test was the system developed by Altavista \cite{BRODER}
to prevent ``bots'' from automatically registering web pages. Their system
was based on the difficulty of reading slightly distorted characters and
worked well in practice, but was only meant to defeat off-the-shelf
Optical Character Recognition (OCR) technology. (Coates et al.\
\cite{COATES}, inspired by our work, and Xu et al.\ \cite{JXU} developed
similar systems and provided more concrete analyses.) In 2000
\cite{CAPTCHA_WEB}, we introduced the notion of a {\sc captcha} as well as
several practical proposals for Automated Turing Tests.
This paper is the first to conduct a rigorous investigation of Automated
Turing Tests and to address the issue of {\it proving} that it is
difficult to write a computer program that can pass the tests. This, in
turn, leads to a discussion of using AI problems for security purposes,
which has never appeared in the literature. We also introduce the
first Automated Turing Tests not based on the difficulty of Optical
Character Recognition. A related general interest paper
\cite{CAPTCHA_CACM} has been accepted by {\it Communications of the ACM}.
That paper reports on our work, without formalizing the notions or
providing security guarantees.
\section{Definitions and Notation}
Let $\DD$ be a probability distribution. We use $[\DD]$ to denote the
support of $\DD$.
If $P(\cdot)$ is a probabilistic program, we will denote by $P_r(\cdot)$
the deterministic program that results when $P$ uses random coins $r$.
Let $(P,V)$ be a pair of probabilistic interacting programs.
We denote the output of $V$ after the interaction between $P$ and $V$ with
random coins $u_1$ and $u_2$, assuming this interaction terminates, by
$\langle P_{u_1}, V_{u_2} \rangle$
(the subscripts are omitted in case the
programs are deterministic). A program $V$ is called a {\it test} if
for all $P$ and
$u_1,u_2$, the interaction between $P_{u_1}$ and $V_{u_2}$ terminates and
$\langle P_{u_1}, V_{u_2} \rangle\in\{accept,reject\}$.
We call $V$ the {\em verifier} or {\em tester} and any $P$
which interacts with $V$ the {\em prover}.
\begin{defn}
Define the {\em success} of an entity $A$ over a test $V$ by
$$\suc{V}{A} = \Pr_{r,r'}[\langle A_{r},V_{r'}\rangle=accept].$$
\noindent We assume that $A$ can have precise knowledge of how $V$ works;
the
only piece of information that $A$ can't know is $r'$, the
internal randomness of $V$.
\end{defn}
\subsection*{CAPTCHA}
Intuitively, a {\sc captcha} is a test $V$
over which most humans have success close to 1, and for which {\it it is
hard to write a computer program that has high success over $V$.}
We will say that it is hard to write a computer
program that has high success over $V$ if {\it any program that has high
success over $V$ can be used to solve a hard AI problem.}
\begin{defn}
A test $V$ is said to be {\em $(\alpha, \beta)$-human executable} if at
least an $\alpha$ portion of the human population has success greater
than $\beta$ over $V$.
\end{defn}
Notice that a statement of the form ``$V$ is $(\alpha, \beta)$-human
executable'' can only be proven empirically. Also, the success of
different groups of humans might depend on their native language or
sensory disabilities: color-blind individuals, for instance, might
have low success on tests that require the differentiation of colors.
\begin{defn}
An {\it AI problem} is a triple ${\mathcal P}=(S,D,f)$,
where $S$ is a set of problem instances, $D$ is a probability distribution over
the problem set $S$, and $f:S\rightarrow \{0,1\}^*$ answers the instances.
Let $\delta\in (0,1]$.
We require that for an $\alpha>0$ fraction of the humans $H$,
$\Pr_{x \leftarrow D}[H(x)=f(x)]>\delta$.
\end{defn}
\begin{defn}
An AI problem $\PP$ is said to be {\em $(\delta,\tau)$-solved} if there
exists a program $A$, running in time at most $\tau$ on any input from
$S$, such that
$$\Pr_{x \leftarrow D,r}[A_r(x) = f(x)] \ge \delta\ .$$
($A$ is said to be a $(\delta, \tau)$ solution to $\PP$.) $\PP$ is
said to be a {\em $(\delta, \tau)$-hard AI problem} if no current
program is a $(\delta,\tau)$ solution to $\PP$, and the AI community
agrees it is hard to find such a solution.
\end{defn}
\begin{defn}
An $(\alpha, \beta, \eta)$-{\sc captcha} is a test $V$ that is $(\alpha,
\beta)$-human executable, and which has the following property:
\begin{quote}
There exists a $(\delta,\tau)$-hard AI problem ${\mathcal P}$ and a
program $A$, such that if $B$ has success greater than $\eta$ over $V$
then $A^B$ is a $(\delta, \tau)$ solution to ${\mathcal P}$. (Here
$A^B$ is defined to take into account $B$'s running time too.)
\end{quote}
\end{defn}
We stress that $V$ should be a {\it program} whose code is publicly
available.
\subsection*{Remarks}
\begin{enumerate}
\item The definition of an AI problem as a triple $(S,D,f)$ should not be
inspected with a philosophical eye. We are not trying to capture all the
problems that fall under the umbrella of Artificial Intelligence.
We want the definition to be easy to understand, we want
some AI problems to be captured by it, and we want the AI community to
agree that these are indeed hard AI problems. More complex definitions can
be substituted for Definition 3 and the rest of the paper remains
unaffected.
\item A crucial characteristic of an AI problem is that a certain fraction
of the human population be able to solve it. Notice that we don't impose a
limit on how long it would take humans to solve the problem. All that we
require is that some humans be able to solve it (even if we have to assume
they will live hundreds of years to do so). The case is not the same
for {\sc captcha}s. Although our definition says nothing about how long it
should take a human to solve a {\sc captcha}, it is preferable for humans
to be able to solve {\sc captcha}s in a very short time. {\sc
captcha}s which take a long time for humans to solve are probably useless
for all practical purposes.
\comment{\item A {\sc captcha} implies a win-win situation: either the AI
problem
is truly hard and the {\sc captcha} remains a valid way to differentiate
humans from computers, or the {\sc captcha} is broken and a previously
unsolved AI problem is solved. Because of this, it is preferable that {\sc
captcha}s be based on {\it useful} AI problems.}
\end{enumerate}
\subsection*{AI Problems as Security Primitives}
Notice that we define {\it hard} in terms of the consensus of a community:
an AI problem is said to be hard if the people working on it agree that
it's hard. This notion should not be surprising to cryptographers: the
security of most modern cryptosystems is based on assumptions agreed upon
by the community (e.g., we {\it assume} that $1024$-bit integers can't be
factored). The concept of a hard AI problem as a foundational assumption,
of course, is more questionable than $P \neq NP$, since many people in the
AI community agree that all hard AI problems are eventually going to be
solved. However, hard AI problems may be a more reasonable assumption
than the hardness of factoring, given the possibility of constructing a
quantum computer. Moreover, even if factoring is shown to be hard in an
asymptotic sense, picking a concrete value for the security parameter
usually means making an assumption about current factoring algorithms: we
only {\it assume} that current factoring algorithms that run in current
computers can't factor $1024$-bit integers. In the same way that AI
researchers believe that all AI problems will be solved eventually, we
believe that at some point we will have the computational power and
algorithmic ability to factor $1024$-bit integers. (Shamir and Tromer
\cite{Shamir}, for instance, have proposed a machine that could factor
1024-bit integers; the machine would cost about ten million dollars in
materials.)
An important difference between popular cryptographic primitives and AI
problems is the notion of a security parameter. If we believe that an
adversary can factor $1024$-bit integers, we can use $2048$-bit integers
instead. No such concept exists for hard AI problems. AI problems, as we
have defined them, do not deal with asymptotics. However, as long as there
is a small gap between human and computer ability with respect to some
problem, this problem can potentially be used as a primitive for security:
rather than asking the prover to solve the problem once, we can ask it
to solve the problem twice. If the prover gets good at solving the
problem twice, we can ask it to solve the problem three times, etc.
There is an additional factor that simplifies the use of hard AI problems
as security primitives. Most applications of {\sc captcha}s require the
tests to be answered within a short time after they are presented. If a
new program solves the hard AI problems that are currently used, then a
different set of problems can be used, and the new program cannot affect
the security of applications that were run before it was developed.
Compare this to encryption schemes: in many applications the
information that is encrypted must remain confidential for years, and
therefore the underlying problem must be hard against programs that run
for a long time, and against programs that will be developed in the
future.\footnote{We thank one of our anonymous Eurocrypt reviewers for
pointing this out.}
We also note that not all hard AI problems can be used to construct a {\sc
captcha}. In order for an AI problem to be useful for security purposes,
there needs to be an {\it automated} way to generate problem instances
along with their solution. The case is similar for computational problems:
not all hard computational problems yield cryptographic primitives.
\subsection*{Who Knows What?}
Our definitions imply that an adversary attempting to write a program that
has high success over a {\sc captcha} knows exactly how the {\sc captcha}
works. The only piece of information that is hidden from the adversary is
a small amount of randomness that the verifier uses in each
interaction.
This choice greatly
affects the nature of our definitions and makes the problem of creating
{\sc captcha}s more challenging. Imagine an Automated Turing Test that
owns a large secret book written in English and to test an entity $A$ it
either picks a paragraph from its secret book or generates a paragraph
using the best known text-generation algorithm, and then asks $A$ whether
the paragraph makes sense (the best text-generation algorithms cannot
produce an entire paragraph that would make sense to a human being). Such
an Automated Turing Test might be able to distinguish humans from
computers (it is usually the case that the best text-generation algorithms
and the best algorithms that try to determine whether something makes
sense are tightly related). However, this test cannot
be a {\sc captcha}:
an adversary with knowledge of the secret book could achieve high success
against this test without advancing the algorithmic state of the art.
{\it We do not allow {\sc captcha}s
to base their security on the secrecy of a database or a piece of
code.}
\subsection*{Gap Amplification}
We stress that {\it any} positive gap between the success of humans and current
computer programs against a {\sc captcha} can be amplified to
a gap arbitrarily close to $1$ by {\it serial} repetition.
The case for
parallel repetition is more complicated and is addressed by Bellare,
Impagliazzo and Naor in \cite{BIN}.
Let $V$ be an $(\alpha, \beta, \eta)$-{\sc captcha}, and let $V^m_k$ be
the test that results by repeating $V$ $m$ times in series (with fresh
new randomness each time) and accepting only if the prover passes $V$
more than $k$ times. Then for any $\epsilon > 0 $ there exist $m$ and
$k$ with $0 \leq k \leq m$ such that $V^m_k$ is an $(\alpha,
1-\epsilon, \epsilon)$-{\sc captcha}.
In general, we will have $m = O\!\left((\beta - \eta)^{-2} \ln
(1/\epsilon)\right)$ and sometimes much smaller. Since {\sc captcha}s involve
human use, it is desirable to find the smallest $m$ possible. This
can be done by solving the following optimization problem:
\[
\min_{m} \left\{ \exists k: \sum_{i=k+1}^m {m \choose i}\beta^i
(1-\beta)^{m-i} \geq 1-\epsilon \textrm{ \& } \sum_{i=k+1}^m {m \choose
i}\eta^i (1-\eta)^{m-i} \leq \epsilon \right\}
\]
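This optimization can be solved by brute-force search over $m$ and a witness threshold $k$. A minimal Python sketch under our reading of the conditions (humans with per-test success $\beta$ must pass $V^m_k$ with probability at least $1-\epsilon$, while a program with per-test success $\eta$ must pass with probability at most $\epsilon$; the search bound \texttt{m\_max} is an arbitrary cutoff we introduce):

```python
from math import comb

def pass_prob(m, k, p):
    # Probability that a prover with per-test success p passes V^m_k,
    # i.e. passes more than k of m independent runs of V.
    return sum(comb(m, i) * p**i * (1 - p)**(m - i)
               for i in range(k + 1, m + 1))

def min_repetitions(beta, eta, eps, m_max=1000):
    # Smallest m (with a witness k) such that V^m_k is an
    # (alpha, 1-eps, eps)-CAPTCHA in the sense of the text.
    for m in range(1, m_max + 1):
        for k in range(m):
            if pass_prob(m, k, beta) >= 1 - eps and pass_prob(m, k, eta) <= eps:
                return m, k
    return None
```

For example, with $\beta = 0.95$, $\eta = 0.10$ and $\epsilon = 0.01$, five serial repetitions with threshold $k=2$ suffice.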
Notice that amplifying a gap can roughly be thought of as increasing the
security parameter of a {\sc captcha}: if the best computer program now
has success $0.10$ against a given {\sc captcha} (for example), then we
can ask the prover to pass the {\sc captcha} twice (in series) to reduce
the best computer program's success probability to $0.01$.
%\vspace{-1ex}
\section{Two AI Problem Families}
In this section we introduce two families of AI problems that can be used
to construct {\sc captcha}s. The section can be viewed as a precise
statement of the kind of hardness assumptions that our cryptographic
protocols are based on. We stress that solutions to the problems will also
be shown to be useful.
For the purposes of this paper, we define an {\it image} as an $h\times w$
matrix ($h$ for height and $w$ for width) whose entries are {\it pixels}.
A pixel is defined as a triple of integers $(R,G,B)$, where $0\leq R,G,B
\leq M$ for some constant $M$. An {\it image transformation} is a
function that takes as input an image and outputs another image (not
necessarily of the same width and height). Examples of image
transformations include: turning an image into its black-and-white
version, changing its size, etc.
Let $\II$ be a distribution on images and $\TT$ be a distribution on image
transformations. We assume for simplicity that if $i,i'\in [\II]$
and $i\neq i'$ then $T(i)\neq
T'(i')$ for any $T,T'\in [\TT]$. (Recall that $[\II]$ denotes the support
of $\II$.)
\begin{prob}[$\find$] Consider the following experiment: choose an image
$i\leftarrow \II$ and choose a transformation $t\leftarrow \TT$; output
$t(i)$. $\find_{\II,\TT}$ consists of writing a program that takes $t(i)$
as input
and outputs $i$ (we assume that the program has precise knowledge of
$\TT$ and $\II$). More formally, let $S_{\II,\TT}=\{t(i)$ : $t\in [\TT]$
and
$i\in
[\II]\}$, $D_{\II,\TT}$ be the distribution on $S_{\II,\TT}$ that results
from
performing the above experiment and $f_{\II,\TT}:S_{\II,\TT} \rightarrow
[\II]$ be
such that $f_{\II,\TT}(t(i))=i$. Then $\find_{\II,\TT}
=(S_{\II,\TT},D_{\II,\TT},f_{\II,\TT})$.
\end{prob}
\begin{prob}[$\labl$]
In addition to the distributions $\II$ and $\TT$, let $L$ be a finite
set of ``labels''. Let $\lambda : [\II] \rightarrow L$ compute the label
of an image. The set of problem instances is $S_{\II,\TT} = \{t(i)$ : $
t \in [\TT]$ and $i
\in
[\II]\}$, and the
distribution on instances $D_{\II,\TT}$ is the one induced by choosing
$i\leftarrow \II$ and $t\leftarrow \TT$. Define $g_{\II,\TT,\lambda}$ so
that
$g_{\II,\TT,\lambda}(t(i)) = \lambda(i)$. Then $\labl_{\II,\TT,\lambda} =
(S_{\II,\TT}, D_{\II,\TT}, g_{\II,\TT,\lambda})$ consists of writing a
program that takes $t(i)$ as input and outputs $\lambda(i)$.
\end{prob}
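As a toy illustration of $\labl$ (not one of the hard instantiations discussed later), one can take ``images'' to be strings, transformations to be simple string distortions, and $\lambda$ the map to the underlying word. The word list and transformations below are our own stand-ins, chosen so that distinct words never collide under any pair of transformations, matching the simplifying assumption made above:

```python
import random

# Toy stand-ins: the support of I is a word list (uniform here), the
# support of T is a small set of string "distortions", and lambda(i)
# is the underlying word itself.
WORDS = ["secure", "captcha", "prover"]
TRANSFORMS = [str.upper,               # capitalize every letter
              lambda s: s[::-1],       # reverse the string
              lambda s: "*".join(s)]   # interleave asterisks

def sample_instance(rng=random):
    # Draw i <- I and t <- T; the instance is t(i), the answer lambda(i) = i.
    i = rng.choice(WORDS)
    t = rng.choice(TRANSFORMS)
    return t(i), i
```

A solver for this instance is a program that recovers the word from its distortion; the hard instantiations replace strings by images and these distortions by image transformations.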
\subsection*{Remarks}
\begin{enumerate}
\item
Note that a $(\delta, \tau)$ solution $A$ to an instance of
$\find$ also yields a $(\delta', \tau + \tau')$ solution to an
instance of $\labl$ (where $\delta'\geq \delta$ and $\tau' \le \log
|[\II]|$ is the time that it takes to compute $\lambda$),
specifically, computing $\lambda(A(x))$. However, this may be
unsatisfactory for small $\delta$, and we might hope to do better by
restricting to a smaller set of labels. Conversely, $\find$ can be
seen as a special case of $\labl$ with $\lambda$ the identity function
and $L = [\II]$.
Formally, problem families $\find$ and $\labl$ can be shown to
be isomorphic. Nonetheless, it is useful to make a distinction here
because in some applications it appears unnatural to talk about
labels.
\item We stress that in all the instantiations of $\find$ and $\labl$ that
we consider, $\II,\TT$ and, in the case of $\labl$, $\lambda$,
will be such that humans have no problem solving
$\find_{\II,\TT}$ and $\labl_{\II,\TT,\lambda}$.
That is, $[\TT]$ is a set of transformations that humans can easily undo
in $[\II]$. Additionally, it has to be possible to perform all the
transformations in
$[\TT]$ using
current computer programs.
\item There is nothing specific to {\em images} about these
problem definitions; any other space of objects which humans recognize
under reasonable transformations (e.g., organized sounds such as music
or speech, animations, {\em et cetera}) could be substituted without
changing our results.
\item
It is easy to build a $(\delta_{\II}, \tau_{\II})$-solution to
$\find_{\II,\TT}$,
where $\delta_{\II}=\max\{\Pr_{j\leftarrow \II}[j=i]: i\in [\II]\}$ and
$\tau_{\II}$ is the time that it takes to describe an element of
$[{\II}]$, by always guessing the image with the highest probability in
$\II$. Similarly,
it is easy to build a $(\delta_{\II, \lambda},
\tau_{\II,\lambda})$-solution to
$\labl_{\II,\TT,\lambda}$, where
$\delta_{\II,\lambda}=\max\{\Pr_{j\leftarrow \II}[\lambda(j)=\lambda(i)]:
i\in [\II]\}$ and $\tau_{\II,\lambda}$ is the time that it takes to
describe a label in $L$. Therefore, we restrict our attention
to finding solutions to $\find_{\II,\TT}$ where $\delta>\delta_{\II}$ and
solutions to $\labl_{\II,\TT,\lambda}$ where
$\delta>\delta_{\II,\lambda}$.
\end{enumerate}
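The trivial $(\delta_{\II}, \tau_{\II})$-solution of the last remark can be sketched as follows (representing $\II$ by an explicit probability table is our simplification for illustration):

```python
def baseline_find_solver(dist):
    # Trivial solver for P1: ignore the transformed image and always
    # output the most probable image under I. It succeeds with
    # probability delta_I = max_i Pr_{j <- I}[j = i].
    guess = max(dist, key=dist.get)
    return lambda transformed_image: guess
```

For instance, `baseline_find_solver({"cat": 0.5, "dog": 0.3, "owl": 0.2})` always answers `"cat"`, succeeding with probability $0.5$; this is why only solutions with $\delta > \delta_{\II}$ are of interest.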
\subsection*{Hard Problems in $\find$ and $\labl$}
We believe that $\find$ and $\labl$ contain several hard problems. For
example, the {\sc captcha} shown in Section 1 (and other {\sc captcha}s based on the
difficulty of reading slightly distorted text) could be defeated using
solutions to $\labl$. To see this, let $W$ be a set of images of words in
different fonts. All the images in $W$ should be undistorted and contain
exactly one word each. Let $\II_W$ be a distribution on $W$, let $\TT_W$
be a distribution on image transformations, and let $\lambda_W$ map an
image to the word that is contained in it. A solution to
$\labl_{\II_W,\TT_W,\lambda_W}$ is a program that can defeat a {\sc
captcha} such as the one that Yahoo! uses (assuming $\TT_W$ is the same
set of
transformations they use). So the problem of determining the word in a
distorted image is an instantiation of $\labl$ (it can be easily seen to
be an instantiation of $\find$ too). Reading slightly distorted text has
been an open problem in machine vision for quite some time. (For a good
overview of the difficulties of reading slightly distorted text, see
\cite{NAGY}.)
But $\find$ and $\labl$ are much more general, and reading slightly
distorted text is a somewhat easy instance of these problems. In
general it will not be the case that the problem reduces to matching
$26 \cdot 2 + 10 = 62$ different characters (uppercase and lowercase letters
plus the digits).
The hardness of problems in $\find$ and $\labl$ mostly relies on $\TT$. In
particular, it should be computationally infeasible to enumerate all of
the elements of $[\TT]$, since $\II$ will normally be such that
enumeration of $[\II]$ is feasible. Thus we are mainly interested in
$(\delta, \tau)$ solutions where $\tau \ll | [\TT] |$, while $\tau > |
[\II] |$ may sometimes be acceptable. In addition to the size of the
transformation set, the character of the transformations is also
important: it is necessary to defeat many simple checks such as color
histogram comparisons, frequency domain checks, etc.
Since instantiations of $\find$ and $\labl$ have never been
precisely stated and published as challenges to the AI and security
communities, there is no way to tell if they will withstand the test of
time. For now we refer the reader to {\it www.captcha.net} for
examples of $\II$'s and $\TT$'s which are believed to be good candidates.
Any instantiation of $\find$ and $\labl$ for security purposes
requires that the precise $\II$ and $\TT$ be published and thoroughly
described.
\section{Two Families of {\sc captcha}s}
We now describe two families of {\sc captcha}s whose security is based
on the hardness of problems in $\find$ and $\labl$. Notice that if
$\find_{\II,\TT}$ is $(\delta, \tau)$-hard then $\find_{\II,\TT}$ can
be used to construct a {\sc captcha} trivially: the verifier simply
gives the prover $t(i)$ and asks the prover to output $i$. According
to our definition, this would be a perfectly valid {\sc
captcha}. However, it would also be a very impractical one: if $[\II]$
is large, then humans would take a long time to answer. The {\sc
captcha}s we present in this section can be quickly answered by
humans. The first family of {\sc captcha}s, {\sc matcha}, is somewhat
impractical, but the second family, {\sc pix}, is very practical and
in fact several instantiations of it are already in use.
\subsection{MATCHA}
A {\sc matcha} instance is described by a triple $M = (\II,\TT, \tau)$, where
$\II$ is a distribution on images and $\TT$ is a distribution on image
transformations that can be easily computed using current computer
programs. {\sc matcha} is a {\sc captcha} with the following property:
{\it any program that has high success over $M = (\II,\TT,\tau)$ can be used to
solve $\find_{\II,\TT}$.}
The {\sc matcha} verifier starts by choosing a transformation
$t\leftarrow\TT$. It then flips a fair coin. If the result is
heads, it picks $k\leftarrow \II$ and sets $(i,j)=(k,k)$. If the result is
tails, it picks $j \leftarrow \II$ and $i \leftarrow U([\II]-\{j\})$, where
$U(S)$ is the uniform distribution on the set $S$. The {\sc matcha}
verifier sends the prover $(i,t(j))$ and sets a timer to expire in time
$\tau$; the prover responds with $res \in \{0,1\}$. Informally, $res=1$
means that $i=j$, while $res=0$ means that $i \neq j$. If the verifier's
timer expires before
the prover responds, the verifier rejects. Otherwise, the verifier makes
a decision based on the prover's response $res$ and whether $i$ is equal
to $j$: \begin{itemize} \item If $i=j$ and $res = 1$, then {\sc matcha}
accepts. \item If $i=j$ and $res = 0$, then {\sc matcha} rejects. \item
If $i\neq j$ and $res = 1$, then {\sc matcha} rejects. \item If $i\neq j$
and $res = 0$, then {\sc matcha} plays another round. \end{itemize}
\noindent In the last case, {\sc matcha} starts over (with a fresh new set
of random coins): it flips another fair coin, picks another pair
of images $(i,j)$ depending on the outcome of the coin, etc.
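The verifier's decision procedure can be sketched in Python. This is a simplification: the timer is omitted, $\II$ is taken to be uniform over an explicit list, and all names are our own illustrative choices:

```python
import random

def matcha_verify(images, t, prover, rng=random):
    # One MATCHA interaction (timer omitted). `images` stands in for the
    # support of I (taken uniform for simplicity), `t` is the transformation
    # drawn from T, and `prover` maps (i, t(j)) to res in {0, 1}.
    while True:
        if rng.random() < 0.5:                  # heads: matching pair
            k = rng.choice(images)              # k <- I
            i, j = k, k
        else:                                   # tails: non-matching pair
            j = rng.choice(images)              # j <- I
            i = rng.choice([x for x in images if x != j])  # i <- U([I]-{j})
        res = prover(i, t(j))
        if i == j:
            return res == 1                     # accept iff prover says match
        if res == 1:
            return False                        # reject: false match claimed
        # i != j and res == 0: play another round with fresh randomness
```

A prover that correctly decides whether $t(j)$ is a transformation of $i$ is always accepted; a prover that answers the opposite of the truth is rejected in the first round.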
\\ \\
{\bf Remarks}
\begin{enumerate}
\item It is quite easy to write a computer program that has success
probability $1/2$ over {\sc matcha} by simply always answering $res = 1$.
For most
applications, a success probability of $1/2$ is unacceptable. In
order for {\sc matcha} to be of practical use, the test has to be repeated
several times.
\item Our description of {\sc matcha} contains an obvious asymmetry: when
$i=j$, {\sc matcha} presents the prover with $(i,t(i))$ for $i\leftarrow
\II$, and when $i\neq j$ {\sc matcha} presents $(i,t(j))$, where $i$ is
chosen {\it uniformly} from the set $[\II]-\{j\}$. This gives the prover a
simple strategy to gain advantage over $M$: if $i$ seems to come from
$\II$, guess that $t(j)$ is a transformation of $i$; otherwise guess that
it isn't. The reason for the asymmetry is to make the proof of Lemma 1
easier to follow. We note that a stronger {\sc captcha} can be built by
choosing $i$ from the distribution $\II$ restricted to the set
$[\II]-\{j\}$.
\item The intuition for why {\sc matcha} plays another round when $i\neq
j$ and $res=0$ is that we are trying to convert high success against {\sc
matcha} into high success in solving $\find$; a program that solves
$\find$ by comparing $t(j)$ to every image in $[\II]$ will
find that most of the images in $[\II]$ differ
from $t(j)$.
\item
In the following Lemma we assume that a program with
high success over $M$ always terminates with a response in time at
most $\tau$. Any program which does not satisfy this requirement can
be rewritten into one which does, by stopping after $\tau$ time and
sending the response $1$, which never decreases the success probability.
We also assume that the unit of time that {\sc matcha}
uses is the same as one computational step.
\end{enumerate}
\begin{lemma}
Any program that has success greater than $\eta$
over $M=(\II,\TT, \tau)$ can be used to $(\delta, \tau|[\II]|)$-solve
$\find_{\II,\TT}$,
where
$$\delta\geq \frac{\eta}{1+2|[\II]|(1-\eta)}.$$ \end{lemma}
\begin{proof} Let $B$ be a program that runs in time at most $\tau$ and has success
$\sigma_B\geq\eta$ over $M$. Using $B$ we construct a program $A^B$
that is a $(\delta, \tau|[\II]|)$-solution to $\find_{\II,\TT}$.
The input to $A^B$ will be an image, and the
output will be another image. On input
$j$, $A^B$ will loop over the entire database of images of $M$ (i.e., the
set $[\II]$), each time feeding $B$ the pair of images $(i,j)$, where $i\in [\II]$.
Afterwards, $A^B$ collects all the images $i\in [\II]$ on which $B$ returned $1$
(i.e., all the images in $[\II]$ that $B$
thinks $j$ is a transformation of). Call the set of these images
$S$. If $S$ is empty, then $A^B$ returns an element chosen uniformly
from $[\II]$. Otherwise, $A^B$ picks an element
$\ell$ of $S$ uniformly at random and returns it.
We show that $A^B$ is a $(\delta,\tau|[\II]|)$-solution to $\find$.
Let $p_0 = \Pr_{\TT,\II,r}[B_r(i,t(i)) =
0]$ and let $p_1 =
\Pr_{\TT,j\leftarrow \II, i,r}[B_r(i,t(j)) = 1]$. Note that $$\Pr_{r,r'}[\langle
M_r,B_{r'}\rangle=reject]
= 1-\sigma_B = \frac{p_0}{2} + \frac{p_1 + (1-p_1)(1-\sigma_B)}{2} =
\frac{p_0
+ p_1}{1 + p_1}\ ,$$
\noindent which gives $\sigma_B \le 1-p_0$ and, since we may assume
$\sigma_B \ge 1/2$ (a program that always answers $res=1$ achieves success
$1/2$), also $p_1 \le 2(1 - \sigma_B)$.
Hence:
\begin{eqnarray}
\Pr_{\TT,\II,r}[A_r^B(t(j))=j]
& \geq &
\sum_{s=1}^{|[\II]|}\Pr_{\TT,\II,r}\!\left[A_r^B(t(j))=j \,\middle|\, |S|=s\right]\Pr_{\TT,\II}[|S|=s]
\label{eq:bayes} \\
& = &
\sum_{s=1}^{|[\II]|} \frac{1-p_0}{s}\Pr_{\TT,\II}[|S|=s]
\label{eq:def_of_F} \\
& \geq & \frac{1-p_0}{E_{\TT,\II}[|S|]}
\label{eq:jensen} \\
& \geq & \frac{1-p_0}{1+|[\II]|p_1}
\label{eq:linexp} \\
& \geq & \frac{\sigma_B}{1+2|[\II]|(1-\sigma_B)}
\label{eq:above}
\end{eqnarray}
\eqref{eq:def_of_F} follows by the definition of the procedure $A^B$,
\eqref{eq:jensen} follows by Jensen's inequality and the fact that
$f(x) = 1/x$ is convex, \eqref{eq:linexp} follows because
$$E_{\TT,\II}[|S|] \leq 1 + \sum_{i \neq j} \Pr_{\II,r}[B(i,t(j)) = 1] $$
and \eqref{eq:above} follows by the inequalities for $p_0$, $p_1$ and
$\sigma_B$ given above. This completes the proof.
\end{proof}
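The procedure $A^B$ in the proof is simple enough to state as code. A hedged Python sketch (illustrative only: `B` is queried as a black box, and the timer bookkeeping that yields the $\tau|[\II]|$ running time is elided):

```python
import random

def solve_find(B, images, tj):
    """The procedure A^B from the proof of Lemma 1.

    Given a transformed image t(j), query B on every pair (i, t(j)) with
    i in [I]; collect the images on which B answers 1 (the set S); return
    a uniform element of S, or a uniform element of [I] if S is empty.
    """
    S = [i for i in images if B(i, tj) == 1]
    if not S:
        return random.choice(images)
    return random.choice(S)
```

The uniform choice inside $S$ is what produces the $1/s$ factor in the derivation, and the empty-$S$ fallback can only add to the success probability, which is why the first step there is an inequality.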
\begin{theorem} If $\find_{\II,\TT}$ is $(\delta,\tau|[\II]|)$-hard and $M =
(\II,\TT,\tau)$ is $(\alpha, \beta)$-human executable, then $M$ is a
$(\alpha,\beta, \frac{(1 + 2|[\II]|)\delta}{1 + 2|[\II]|\delta})$-{\sc
captcha} (the bound obtained by inverting Lemma 1). \end{theorem}
\subsection{PIX}
An instance $\labl_{\II,\TT,\lambda}$ can sometimes be used almost
directly as a {\sc captcha}. For instance, if $\II$ is a distribution
over images containing a single word and $\lambda$ maps an image to
the word contained in it, then $\labl_{\II,\TT,\lambda}$ can be used
directly as a {\sc captcha}. Similarly, if all the images in $[\II]$
are pictures of simple concrete objects and $\lambda$ maps an image to
the object that is contained in the image, then
$\labl_{\II,\TT,\lambda}$ can be used as a {\sc captcha}.
Formally, a {\sc pix} instance is a tuple $X = (\II,\TT,L,\lambda, \tau)$. The {\sc
pix} verifier works as follows. First, $V$ draws $i \leftarrow \II$, and
$t \leftarrow \TT$. $V$ then sends to $P$ the message $(t(i), L)$, and
sets a
timer for $\tau$. $P$ responds with a label $l \in L$. $V$ accepts if $l
= \lambda(i)$ and its timer has not expired, and rejects otherwise.
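The verifier $V$ just described can be sketched in a few lines of Python (a sketch with illustrative stand-ins for $(\II,\TT,L,\lambda,\tau)$; sampling is uniform for simplicity):

```python
import random
import time

def pix_verify(images, transforms, labels, label_of, tau, prover):
    """The PIX verifier: draw i <- I and t <- T, send (t(i), L) to the
    prover, and accept iff it returns lambda(i) before the timer tau expires."""
    i = random.choice(images)            # i <- I (uniform stand-in)
    t = random.choice(transforms)        # t <- T
    start = time.monotonic()
    answer = prover(t(i), labels)        # prover replies with some l in L
    if time.monotonic() - start > tau:   # timer expired: reject
        return False
    return answer == label_of(i)
```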
\begin{theorem}
If $\labl_{\II,\TT,\lambda}$ is $(\delta,\tau)$-hard and
$X = (\II,\TT,L,\lambda,\tau)$ is $(\alpha,\beta)$-human executable, then
$X$ is a $(\alpha,\beta,\delta)$-{\sc captcha}.
\end{theorem}
Various instantiations of {\sc pix} are in use at major internet
portals such as Yahoo! and Hotmail. Other, less conventional ones, such as {\sc
Animal-PIX}, can be found at {\it www.captcha.net}. {\sc Animal-PIX}
presents the prover with a distorted picture of a common
animal (like the one shown below) and asks it to choose among twenty
different possibilities (monkey, horse, cow, {\it et cetera}).
%\vspace{-4ex}
\begin{figure}[h]
\begin{center}
\includegraphics*[scale=.35]{vonahn-pix.ps}
\caption{{\sc Animal-Pix}.}
\end{center}
\end{figure}
%\vspace{-9ex}
\section{An Application: Robust Image-Based Steganography}
We detail a useful application of
$(\delta,\tau)$-solutions to
instantiations of $\find$ and $\labl$ (other than reading slightly
distorted text, which was mentioned before). We hope to convey by this
application that our problems were not chosen just because they can
create {\sc captcha}s but because they in fact have applications related
to security. Our problems also serve to illustrate the need
for better AI in security. Areas such as Digital Rights
Management, for instance, could benefit from better AI:
a program that can find slightly distorted versions of original songs or
images on the world wide web would be a very useful tool for copyright
owners.
There are many applications of solutions to $\find$ and $\labl$ that we
don't mention here.
$\find$, for instance, is interesting in its own right and a solution for
the instantiation when $\II$ is a distribution on images of works of art
would benefit museum curators, who often have to answer questions such as
``what
painting is this a photograph of?''
\subsection*{Robust Image-Based Steganography.}
{\em Robust Steganography} is concerned with the problem of covertly
communicating
messages on a public channel which is subject to modification by a
restricted adversary. For example, Alice may have some distribution on
images which she is allowed to draw from and send to Bob; she may wish to
communicate additional information with these pictures, in such a way that
anyone observing her communications cannot detect this additional
information. The situation may be complicated by an adversary who
transforms all transmitted images in an effort to remove any hidden
information. In this section we will show how to use $(\delta,
\tau)$-solutions to instantiations of $\find_{\II,\TT}$ or
$\labl_{\II,\TT,\lambda}$ to implement a secure robust steganographic
protocol for image channels with distribution $\II$, when the adversary
chooses transformations from $\TT$.
Note that if we require security
for arbitrary $\II,\TT$, we will require a $(\delta,\tau)$-solution to
$\find$ for arbitrary $\II,\TT$; if no solution works for arbitrary
$(\II,\TT)$ this implies the existence of specific $\II,\TT$ for which
$\find$ is still hard. Thus either our stegosystem can be implemented by
computers for arbitrary image channels or there is a (non-constructive)
hard AI problem that can be used to construct a {\sc captcha}.
The results of this subsection can be seen as providing an implementation
of the ``supraliminal channel'' postulated by Craver \cite{Cra98}.
Indeed, Craver's observation that the adversary's transformations should
be
restricted to those which do not significantly impact human interpretation
of the images (because the adversary should not unduly burden ``innocent''
correspondents) is what leads to the applicability of our hard AI
problems.
\subsection*{Steganography Definitions}
Fix a distribution over images $\II$, and a set of keys $\KK$. A {\em
steganographic protocol} or {\em stegosystem} for $\II$ is a pair of
efficient probabilistic algorithms $(SE,SD)$ where $SE : \KK \times
\{0,1\}
\rightarrow [\II]^\ell$ and $SD : \KK \times [\II]^\ell \rightarrow
\{0,1\}$, which have the additional property that
$\Pr_{K,r,r'}[SD_r(K,SE_{r'}(K,\sigma)) \ne \sigma]$ is negligible (in
$\ell$ and $|K|$) for any $\sigma \in \{0,1\}$. We will describe a
protocol for transmitting a single bit $\sigma \in \{0,1\}$, but it is
straightforward to extend our protocol and proofs by serial composition to
any message in $\{0,1\}^\ast$ with at most linear decrease in security.
\begin{defn}
A stegosystem is {\em steganographically secret} for $\II$ if the
distributions $\{ SE_r(K, \sigma) : K \leftarrow \KK, r \leftarrow
\{0,1\}^\ast\}$ and $\II^\ell$ are computationally indistinguishable
for any $\sigma \in \{0,1\}$.
\end{defn}
Steganographic secrecy ensures that an eavesdropper cannot distinguish
traffic produced by $SE$ from $\II$. Alice, however, is worried about a
somewhat malicious adversary who transforms the images she transmits to
Bob. This adversary is restricted by the fact that he must transform the
images transmitted between many pairs of correspondents, and may not
transform them so severely that they become unrecognizable to humans, since he
must not disrupt the communications of legitimate correspondents. Thus the
adversary's actions, on seeing the image $i$, are restricted to selecting
some transformation $t$ according to a distribution $\TT$, and replacing
$i$ by $t(i)$. Denote by $t_{1\ldots\ell} \leftarrow \TT^\ell$ the action
of independently selecting $\ell$ transformations according to $\TT$, and
denote by $t_{1\ldots\ell}(i_{1\ldots\ell})$ the action of element-wise
applying $\ell$ transformations to $\ell$ images.
\begin{defn} A stegosystem $(SE,SD)$ is {\em steganographically robust against
$\TT$}
if it is steganographically secret and $$\Pr_{t_{1\ldots\ell}\leftarrow
\TT^\ell,r,r',K}[SD_r(K,t_{1\ldots\ell}(SE_{r'}(K,\sigma))) \ne \sigma]$$ is
negligible (in $|K|$ and $\ell$) for any $\sigma \in \{0,1\}$. \end{defn}
Let $F : \KK \times \{1,\ldots,\ell\} \times L \rightarrow \{0,1\}$ be a
pseudorandom function family. We assume that $\lambda : [\II] \rightarrow
L$ is efficiently computable and $A: S_\labl \rightarrow L$ is a
$(\delta, \tau)$-solution to $\labl_{\II,\TT,\lambda}$ as defined above
(recall that $\find$ is a special case of $\labl$),
i.e. $A$ operates in time $\tau$ and $$\Pr_{t,i,r}[A_r(t(i)) = \lambda(i)]
\ge \delta\ .$$
\noindent Let $c = \Pr_{i,j\leftarrow \II}[\lambda(i) = \lambda(j)]$. We
{\it
require} that $c < 1$ (that is, we require that there is enough
variability in the labels of the images to be useful for communication).
Notice that $L$ can simply be equal to $[\II]$ and $\lambda$ can be the
identity function (in case the images in $[\II]$ have no labels as in
$\find$).
We prove in the Appendix that
the following construction
is an efficient, robust stegosystem for $\TT$.
\begin{cons} \label{cons:stego} \end{cons}
\begin{center}
\begin{minipage}[t]{3in}
\begin{tabbing}
{\bf Procedure} $SE$: \\
{\bf Input:} $K \in \KK$, $\sigma \in \{0,1\}$\\
for \= $j = 1 \ldots \ell$ do \+ \\
draw $d_0 \leftarrow \II$, $d_1 \leftarrow \II$\\
if\= \ $F_K(j,\lambda(d_0)) = \sigma$ then \+\\
set $i_j = d_0$\-\\
el\=se\+\\
set $i_j = d_1$\-\-\\
{\bf Output:} $i_1, i_2, \ldots, i_\ell$
\end{tabbing}
\end{minipage} \hspace{0.25in} \vline \hspace{0.25in}
\begin{minipage}[t]{3in}
\begin{tabbing}
{\bf Procedure} $SD$: \\
{\bf Input:} $K \in \KK$, $i'_{1\ldots\ell} \in [\II]^\ell$ \\
for \= $j = 1 \ldots \ell$ do \+ \\
set $\sigma_j = F_K(j,A(i'_j))$ \- \\
{\bf Output:} $majority(\sigma_1,\ldots,\sigma_\ell)$
\end{tabbing}
\end{minipage}
\end{center}
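For concreteness, Construction~\ref{cons:stego} transcribes directly into code. The following is a sketch, not the paper's artifact: the pseudorandom function family $F$ is modeled by HMAC-SHA256 truncated to one bit, and `draw_image`, `label_of`, and the solver `A` are placeholders for a draw from $\II$, for $\lambda$, and for a $(\delta,\tau)$-solution to $\labl_{\II,\TT,\lambda}$.

```python
import hashlib
import hmac

def F(K, j, label):
    """One-bit PRF F_K(j, label), modeled here by HMAC-SHA256 (an
    illustrative stand-in for any pseudorandom function family)."""
    digest = hmac.new(K, f"{j}|{label}".encode(), hashlib.sha256).digest()
    return digest[0] & 1

def SE(K, sigma, draw_image, label_of, ell):
    """Procedure SE: encode the bit sigma as a sequence of ell images."""
    cover = []
    for j in range(1, ell + 1):
        d0, d1 = draw_image(), draw_image()           # two draws from I
        cover.append(d0 if F(K, j, label_of(d0)) == sigma else d1)
    return cover

def SD(K, received, A):
    """Procedure SD: recover sigma_j = F_K(j, A(i'_j)), then majority vote."""
    bits = [F(K, j, A(img)) for j, img in enumerate(received, start=1)]
    return int(sum(bits) > len(bits) / 2)
```

With no adversarial transformation and a correct solver, each recovered bit agrees with $\sigma$ with probability roughly $\frac{1}{2} + \frac{1}{4}(1-c)$, so the majority vote fails only with probability negligible in $\ell$.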
\begin{proposition}
Construction~\ref{cons:stego} is steganographically secret and
robust for $\II,\TT$.
\end{proposition}
The proof of Proposition 1 is similar in flavor to those of Hopper,
Langford and von Ahn
\cite{HLvA02} and relies on the fact that when $A$ returns the correct
solution on received image $i'_j$, the recovered bit $\sigma_j$ is equal
to the intended bit $\sigma$
with
probability approximately $\frac{1}{2} + \frac{1}{4}(1-c)$ and otherwise
$\sigma_j=\sigma$ with probability 1/2; therefore the probability that the
majority of the $\sigma_j$ are incorrect is negligible in $\ell$. For
details, see the Appendix.
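The per-position bias claimed above can be checked numerically. A Monte Carlo sketch under simplifying assumptions (labels drawn uniformly so that $c = 1/|L|$, $f$ modeled as a fresh random bit per label, and a solver that is correct with probability exactly $\delta$ and otherwise triggers an independent coin):

```python
import random

def per_bit_success(delta, labels, trials=200_000):
    """Estimate Pr[sigma_j = sigma] for one position of the stegosystem;
    the analysis predicts 1/2 + (delta/4)*(1 - c) with c = Pr[l0 = l1]."""
    hits = 0
    for _ in range(trials):
        sigma = random.getrandbits(1)
        f = {}                                        # lazy random function f(l)
        def bit(l):
            if l not in f:
                f[l] = random.getrandbits(1)
            return f[l]
        d0, d1 = random.choice(labels), random.choice(labels)
        i = d0 if bit(d0) == sigma else d1            # the encoder's choice
        if random.random() < delta:                   # solver correct: reads f(l(i))
            guess = bit(i)
        else:                                         # wrong label: independent coin
            guess = random.getrandbits(1)
        hits += (guess == sigma)
    return hits / trials
```

With $\delta = 1$ and four equiprobable labels ($c = 1/4$), the estimate concentrates near $\frac{1}{2} + \frac{1}{4}\cdot\frac{3}{4} = 0.6875$, matching the formula above.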
\\
\\
{\bf Remarks}
\begin{enumerate}
\item Better solutions to $\labl_{\II,\TT,\lambda}$ imply more efficient
stegosystems: if
$\delta$ is larger, then $\ell$ can be smaller and fewer images need to be
transmitted to send a bit secretly and robustly.
\item Since we assume that $\labl_{\II,\TT,\lambda}$ (or, as it might be the
case, $\find_{\II,\TT}$) is easy for humans, our protocol {\em could} be
implemented as a cooperative effort between the human recipient and the
decoding procedure (without the need for a solution to $\find_{\II,\TT}$
or $\labl_{\II,\TT,\lambda}$). However, decoding each bit of the secret
message requires classifying many images, so a human would likely
give up well before any sizeable hidden message
could be extracted (this is especially true when we are dealing with
$\find_{\II,\TT}$ and a large set $[\II]$: a human would have to search
the entire set $[\II]$ as many as $\ell$ times for each transmitted bit).
Thus to be {\em practical}, a $(\delta,\tau)$-solution (for small $\tau$)
to $\find_{\II,\TT}$ or $\labl_{\II,\TT,\lambda}$ will be required.
\end{enumerate}
%\vspace{-1ex}
\section{Discussion and Closing Remarks}
\subsection*{Interaction with the AI community}
A primary goal of the {\sc captcha} project is to serve as a challenge to
the Artificial Intelligence community. We believe that having a
well-specified set of goals will contribute greatly to the advancement of
the field. A good example of this process is the recent progress in
reading distorted text images driven by the {\sc captcha} in use at
Yahoo!. In response to the challenge provided by this test, Mori and
Malik \cite{MM02} have developed a program which can pass the test with
probability roughly $0.8$. Although there is no formal proof that a
program passing this {\sc captcha} can read text under other
distributions of image transformations, Mori and Malik claim that their
algorithm represents significant progress in the general area of text
recognition; it
is encouraging to see such progress.
For this reason,
it is important that even Automated Turing Tests without
formal reductions attempt to test
ability in general problem domains; and even though these tests may
have specific weaknesses, it is important that AI researchers
attempting to pass them strive for solutions that generalize.
\subsection*{Other AI problem domains}
The problems defined in this paper are of a similar character: both
exploit the advantage humans hold in sensory processing. It is an open
question whether {\sc captcha}s in other areas can be constructed. The
construction of a {\sc captcha} based on a text domain such as text
understanding or generation is an important goal for the project (as {\sc
captcha}s based on sensory abilities can't be used on sensory-impaired
human beings). As mentioned earlier, the main obstacle to designing these
tests seems to be the similar levels of program ability in text generation
and understanding.
Logic problems have also been suggested as a basis for {\sc captcha}s and
these present similar difficulties, as generation seems to be
difficult. One possible source of logic problems is the set proposed by
Bongard \cite{BP} in the 1970s; indeed, \cite{CAPTCHA_WEB} presents a test
based on this problem set. However, recent progress in AI has also
yielded programs which solve these problems with very high success
probability, exceeding that of humans.
\subsection*{Conclusion}
We believe that the fields of cryptography and artificial intelligence
have much to contribute to one another. {\sc captcha}s represent a small
example of this possible symbiosis. Reductions, as they are used in
cryptography, can be extremely useful for the progress of algorithmic
development. We encourage security researchers to create {\sc captcha}s
based on different AI problems.
\subsection*{Acknowledgments}
We are grateful to Udi Manber for his suggestions. We also thank
Lenore Blum, Roni Rosenfeld, Brighten Godfrey, Moni Naor, Henry Baird and the anonymous Eurocrypt
reviewers for helpful discussions and comments. This work was partially
supported by the National Science Foundation (NSF) grants CCR-0122581 and
CCR-0085982 (The Aladdin Center). Nick Hopper is also partially
supported by an NSF graduate research fellowship.
\begin{thebibliography}{2}
\bibitem{CAPTCHA_WEB} Luis von Ahn, Manuel Blum, Nicholas J. Hopper and
John Langford. The CAPTCHA Web Page: {\tt http://www.captcha.net}. 2000.
\bibitem{CAPTCHA_CACM} Luis von Ahn, Manuel Blum and John Langford.
Telling Humans and Computers Apart (Automatically) or How Lazy
Cryptographers do AI. To appear in {\it Communications of the ACM.}
\bibitem{BIN}
Mihir Bellare, Russell Impagliazzo and Moni Naor. Does Parallel Repetition
Lower the Error in Computationally Sound Protocols? In {\it 38th IEEE
Symposium on Foundations of Computer Science (FOCS' 97)}, pages 374-383.
IEEE Computer Society, 1997.
\bibitem{BP} Mikhail M. Bongard. {\it Pattern Recognition}.
Spartan Books, Rochelle Park NJ, 1970.
\bibitem{COATES} A. L. Coates, H. S. Baird, and R. J. Fateman. Pessimal
Print: A Reverse Turing Test. In {\it Proceedings of the International
Conference
on Document Analysis and Recognition (ICDAR' 01)}, pages 1154-1159.
Seattle WA, 2001.
\bibitem{Cra98} Scott Craver. On Public-key Steganography in the Presence
of an Active Warden. In {\it Proceedings of the Second International
Information Hiding Workshop}, pages 355-368. Springer, 1998.
\bibitem{HLvA02} Nicholas J. Hopper, John Langford and Luis von Ahn.
Provably Secure Steganography. In {\it Advances in Cryptology, CRYPTO'
02}, volume 2442 of {\it Lecture Notes in Computer Science}, pages 77-92.
Santa Barbara, CA, 2002.
\bibitem{BRODER} M. D. Lillibridge, M. Abadi, K. Bharat, and A. Broder.
Method for selectively restricting access to computer systems. Technical
report, US Patent 6,195,698. Applied April 1998 and Approved February
2001.
\bibitem{MM02} Greg Mori and Jitendra Malik. Breaking a Visual
CAPTCHA. Unpublished Manuscript, 2002. Available electronically:
{\tt http://www.cs.berkeley.edu/\~{ }mori/gimpy/gimpy.pdf}.
\bibitem{NAOR} Moni Naor. Verification of a human in the loop or
Identification via the Turing Test. Unpublished Manuscript, 1997.
Available electronically:
{\tt http://www.wisdom.weizmann.ac.il/\~{ }naor/PAPERS/human.ps}.
\bibitem{PINKAS} Benny Pinkas and Tomas Sander. Securing Passwords Against
Dictionary Attacks. In {\it Proceedings of the ACM Computer and Security
Conference (CCS' 02)}, pages 161-170. ACM Press, November 2002.
\bibitem{NAGY} S. Rice, G. Nagy, and T. Nartker. {\it Optical Character
Recognition: An Illustrated Guide to the Frontier.} Kluwer Academic
Publishers, Boston, 1999.
\bibitem{Shamir} Adi Shamir and Eran Tromer. Factoring Large Numbers with
the TWIRL Device. Unpublished Manuscript, 2003. Available electronically:
{\tt http://www.cryptome.org/twirl.ps.gz}.
\bibitem{JXU} J. Xu, R. Lipton and I. Essa. Hello, are you human.
Technical Report GIT-CC-00-28, Georgia Institute of Technology, November
2000.
\end{thebibliography}
\appendix
\section{Proof of Proposition 1}
\noindent
{\bf Lemma 1.} {\em Construction~\ref{cons:stego} is steganographically
secret.}
\begin{proof}
Consider for any $1 \le j \le \ell$ and $x\in [\II]$ the probability
$\rho^j_x$ that $i_j = x$, i.e. $\rho^j_x = \Pr[i_j = x]$. The image $x$ is
returned in the $j$th step only under one of the following conditions:
\begin{enumerate}
\item $D_0$: $d_0 = x$ and $F_K(j,\lambda(d_0)) = \sigma$\ ; or
\item $D_1$: $d_1 = x$ and $F_K(j,\lambda(d_0)) = 1-\sigma$
\end{enumerate}
Note that these events are mutually exclusive, so that $\rho^j_x =
\Pr[D_0] + \Pr[D_1]$. Suppose that we replace $F_K$ by a random
function $f : \{1,\ldots,\ell\} \times L \rightarrow \{0,1\}$. Then we
have that $\Pr_{f,d_0}[D_0] = \frac{1}{2}\Pr_{\II}[x]$ by
independence of $f$ and $d_0$, and $\Pr_{f,d_1}[D_1] =
\frac{1}{2}\Pr_{\II}[x]$, by the same reasoning. Thus $\rho^j_x =
\Pr_{\II}[x]$
when $F_K$ is replaced by a random function. Further, for a random
function the $\rho^j_x$ are all independent. Thus for any
$\sigma \in \{0,1\}$ we see that $SE(K,\sigma)$ and $\II^\ell$ are
computationally indistinguishable, by the pseudorandomness of $F$.
\end{proof}
\noindent
{\bf Lemma 2.} {\em Construction~\ref{cons:stego} is steganographically robust for $\TT$.}
\begin{proof}
Suppose we establish a constant bound $\rho > \frac{1}{2}$ such that for all $j$,
$\Pr[\sigma_j=\sigma] > \rho$. Then by a Chernoff bound we will have
that $\Pr[majority(\sigma_1,\ldots,\sigma_\ell) \ne \sigma]$ is negligible,
proving the lemma.
Consider replacing $F_K$ for a random $K$ with a randomly chosen function $f:
\{1,\ldots,\ell\} \times L \rightarrow \{0,1\}$ in $SE,SD$. We would like to assess
the probability $\rho_j = \Pr[\sigma_j = \sigma]$. Let $A_j$ be the
event $A(i'_j) = \lambda(i_j)$ and $\overline{A_j}$ be the event $A(i'_j) \ne
\lambda(i_j)$, and write $\lambda(d_b^j)$ as $l_b^j$. We have the following:
\begin{align}
\Pr_{f,i_j,t}[\sigma_j = \sigma] &= \Pr[\sigma_j = \sigma | A_j]\Pr[A_j] + \Pr[\sigma_j =
\sigma | \overline{A_j}]\Pr[\overline{A_j}] \\
&= \delta \Pr[\sigma_j = \sigma | A_j ] + \frac{1}{2}(1-\delta) \\
&= \delta\Pr[f(j,l_0^j) = \sigma \text{\ or\ }
(f(j,l_0^j) = 1 - \sigma \text{\ and\ } f(j,l_1^j) =
\sigma)] + \frac{1-\delta}{2} \\
&= \delta(\frac{1}{2} + \Pr[f(j,l_0^j) = 1-\sigma \text{\ and\ } f(j,l_1^j) = \sigma \text{\ and\ } l_0^j \ne l_1^j])
+ \frac{1-\delta}{2} \\
&= \delta(\frac{1}{2} + \frac{1}{4}(1 - c)) + \frac{1-\delta}{2} \\
&= \frac{1}{2} + \frac{\delta}{4}(1-c)
\end{align}
Here (7) follows because if $A(i'_j) = l \ne \lambda(i_j)$ then
$\Pr_f[f(j,l) = \sigma] = \frac{1}{2}$, and (8) follows because if
$A(i'_j) = \lambda(i_j)$, then $f(j,A(i'_j)) = \sigma$ iff
$f(j,\lambda(i_j)) = \sigma$; the expression results from expanding the event
$f(j,\lambda(i_j)) = \sigma$ with the definition of $i_j$ in the encoding
routine.
For any constant $\delta > 0$, a Chernoff bound implies that the
probability of decoding failure is negligible in $\ell$ when $F_K$ is
a random function. Thus the pseudorandomness of $F_K$ implies that
the probability of decoding failure for Construction~\ref{cons:stego}
is also negligible, proving the lemma.
\end{proof}
\end{document}