\section{Changes in transfer time and rank}
\label{sec:drdt}

A client using a highly ranked server is interested in warning signs
that may indicate that the server's ranking has changed dramatically.
The client cannot measure rankings without measuring all of the mirror
servers; it can only observe the transfer times it is experiencing on
the currently chosen server.  The natural question then is what, if
any, relationship exists between the changes in transfer time a client
observes and the changes in rank the server experiences.  Our study
shows that while a relationship does exist, it is weak.

The approach we present here is to estimate the cumulative probability
of rank changes over increasing changes in observed transfer times.
We have also examined the cumulative probability of rank changes over
increasing {\em percentage} changes in transfer time, and we have found
results similar to those presented in this section.

Consider a single mirror server.  From the point of view of a single
client using the set of mirrors, we have a sequence of samples of that
server's transfer times and their corresponding ranks.  We form the
cross product of these samples and select a random subset of 100,000
of these sample pairs.  For each pair of samples in the subset, we
compute the difference between the two transfer times and the
difference between the two ranks.
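
The differencing step can be written out as follows.  This is only a
notational sketch: the symbols $\Delta d_{ij}$ and $\Delta r_{ij}$ are
introduced here for illustration, with $d_{t_i}$ and $r_{t_i}$ denoting
the transfer time and rank of the sample taken at time $t_i$.
\[
\Delta d_{ij} = d_{t_i} - d_{t_j}, \qquad
\Delta r_{ij} = r_{t_i} - r_{t_j}
\]
Assuming the cross product is taken over ordered pairs, it contains
both $(i,j)$ and $(j,i)$.  Since $\Delta d_{ji} = -\Delta d_{ij}$ and
$\Delta r_{ji} = -\Delta r_{ij}$, every difference in the full cross
product is accompanied by its negation, and only the random
subsampling can break this balance.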

\begin{figure*}
\small
\centerline{
\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|}
\hline
 & \multicolumn{5}{c|}{Changes in transfer time (seconds)} & \multicolumn{5}{c|}{Changes in rank} \\
\cline{2-11}
Dataset/Doc & Mean & StdDev & Median & Min & Max & Mean & StdDev & Median & Min & Max \\
\hline
Apache/5 & 	0.0039 & 	8.4087 	& 	0 & 	-123.8000 &  226.4100&	 0.0091 &   4.2022 &    0	&   -10	 &   10 \\
Apache/6 &  -0.0810	&  10.3995	&  0	&-295.7300	& 267.8000	&  0.0010 &   4.1948	&   0	&  -10	 &  10\\
Apache/7 &  -0.0503	&   9.0000	&   0	&-292.5000	& 205.9000	&  -0.0621	&   4.0177	&   0	&  -10	 &  10\\
Apache/8 &  -0.5457	&  25.4940	&  0	&-285.3000	& 276.5000	&  -0.0196	&   3.8818	&   0	&  -10	 &  10\\
Apache/9 &  -0.1912	&  31.8086	&  0.1 &-278.0100	& 287.7000	&  -0.0367	&   3.8072	&   0	&  -10	 &  10\\
Mars/0  &   0.1068	&   6.0450	&   0	&-227.9100	& 221.4600	&   0.0244	&   7.1711	&   0	&  -17	 &  17\\
Mars/1   &  0.1218	&   8.1173	&  0	& -184.0000	        & 232.5900	&   0.1330	&   7.0711	&   0	&  -17	 &  17\\
Mars/2  &  0.1189	&  14.3483	&  0		&-285.2000	& 287.4000	&  -0.0685	&   7.1195	&   0	&  -17	 &  17\\
Mars/3 & -0.0226	&  17.5260	&  0		&-253.6000	& 282.1000	&  -0.0038	&   7.0849	&   0	&  -17	 &  17\\
Mars/4 &    0.3308	&  34.5870	&  0		&-286.9000	& 288.6000	&   0.0194	&   7.0992	&   0	&  -17	 &  17\\
News/0 &    0.0282	&  17.1793	& 0      	&-298.8300	& 293.8300	&  -0.0316	&   5.8363	&   0		&  -14	 &  14\\
\hline
\end{tabular}
}
\normalsize
\caption{Summary statistics of changes in transfer time and changes in corresponding ranks.}
\label{fig:drdtsum}
\end{figure*}

Figure~\ref{fig:drdtsum} shows the summary statistics of these changes
in transfer time and corresponding rank.  We see that the mean and
median changes in both quantities are almost exactly zero.  The
distributions of these changes are also quite symmetric about zero.
For this reason, we concentrate on absolute changes in transfer time
and rank.  

After taking absolute values, we count occurrences of value pairs to
estimate the joint cumulative probability of absolute changes in rank
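and absolute changes in transfer time, $P[|r_{t_i}-r_{t_j}| \leq R
\wedge |d_{t_i}-d_{t_j}| \leq D]$, where $r_{t_i}$ and $d_{t_i}$ are
the rank and transfer time of the sample taken at time $t_i$, $R$ is a
bound on the absolute rank change, and $D$ is a bound on the absolute
change in transfer time.  Since changes in rank are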
categorical, we can then trivially compute the cumulative probability
of an absolute change in rank {\em given} an absolute change in
transfer time of $D$ or smaller.  Notationally, this is
$P[|r_{t_i}-r_{t_j}| \leq R \mid |d_{t_i}-d_{t_j}| \leq D]$.  We
aggregate the probabilities from all clients for each dataset and
document to obtain the point of view of a random client interacting
with a random server within the set of mirrors.  The reader may object
that this scheme also aggregates changes happening at all time scales.
This is true.  However, recall from Section~\ref{sec:ranktime} that
changes in rank are virtually independent of time scale.
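
The step from the joint to the conditional probability can be spelled
out as follows.  This is a sketch based on the standard definition of
conditional probability, not a record of the exact procedure used.
The shorthand $\Delta r_{ij} = r_{t_i}-r_{t_j}$ and
$\Delta d_{ij} = d_{t_i}-d_{t_j}$, the symbol $R_{\max}$ (the largest
possible absolute rank change in a dataset), and the count $N(R,D)$
(the number of sample pairs with $|\Delta r_{ij}| \leq R$ and
$|\Delta d_{ij}| \leq D$) are introduced here for illustration.
\begin{eqnarray*}
P[|\Delta r_{ij}| \leq R \mid |\Delta d_{ij}| \leq D]
 & = & \frac{P[|\Delta r_{ij}| \leq R \wedge |\Delta d_{ij}| \leq D]}
            {P[|\Delta d_{ij}| \leq D]} \\
 & \approx & \frac{N(R,D)}{N(R_{\max},D)}
\end{eqnarray*}
The denominator needs no separate estimate.  Because no absolute rank
change can exceed $R_{\max}$, the event $|\Delta r_{ij}| \leq R_{\max}$
always holds, so $P[|\Delta d_{ij}| \leq D]$ equals the joint
probability evaluated at $R = R_{\max}$.  The estimate is defined
whenever $N(R_{\max},D) > 0$.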

\begin{figure}
\centerline{
\begin{tabular}{c}
\epsfxsize=3in
\epsfbox{drdt.cumprobdrgivendt.apache9.epsf} \\
\end{tabular}
} 
\caption{Cumulative probability of rank change given a change in transfer time of $D$ or less ($P[|r_{t_i}-r_{t_j}| \leq R \mid |d_{t_i}-d_{t_j}| \leq D]$) for Apache/4, plotted for several different values of $D$. All other plots are similar.}
\label{fig:drdtnews} 
\end{figure}

Figure~\ref{fig:drdtnews} shows a representative plot of the
cumulative probability for the Apache/4 dataset.  The plots for all
of the datasets are similar.  To read the plot, pick a bound on the
change in transfer time, follow the corresponding curve horizontally to
the maximum rank change of interest, and then read the cumulative
probability from the vertical axis.  For example, for a transfer time
change of 128 seconds or less, about 90\% of rank changes are four or
less.

We can see that large changes in transfer time are more likely than
small changes to indicate large rank changes.  The curves for
increasingly large changes in transfer time shift toward the right
(toward larger rank changes).  However, the difference is slight.  For
example, a rank change of three or smaller is 90\% probable with a
change in transfer time of one second or smaller, while a change of
transfer time of up to 128 seconds reduces the probability only to
80\%.  This is typical of the Apache data, and the relationship is
even less pronounced for the other data.


\begin{figure}
\centerline{
\epsfxsize=3in
\epsfbox{drdtlinearfit.news10sec.epsf}
} 
\caption{Changes in rank versus changes in transfer time (limited to
$\pm 10$ seconds) for the News/0 dataset.  Note the poor linear
fit ($R^2=0.36$).  There is little relationship between changes in
transfer time and changes in ranking.}
\label{fig:drdtlinfit} 
\end{figure}

Another way to see the limited relationship of changes of rank to
changes in transfer time is to plot rank changes against their
corresponding transfer time changes.  Figure~\ref{fig:drdtlinfit}
shows a representative plot for the News/0 data, where we have focused
on transfer time changes in the $[-10,10]$ range.  We have fit a least
squares line to the data and have found that the relationship is
marginal at best.  The $R^2$ value for the line is only 0.36.  For a
wider range of transfer times, the fit is even worse.  Examining each
client and server pair of all the datasets individually, we find that
only 10\% of combinations yielded $R^2$ values greater than 0.5, fewer
than 1\% yielded $R^2$ values greater than 0.8, and the highest $R^2$
value was only 0.88.  Clearly, there is only a limited relationship
between changes in transfer time and changes in rank.
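
The $R^2$ values quoted above can be interpreted through the usual
definition of the coefficient of determination.  The exact formula
used is not restated here, so the following assumes the standard one
for a least squares line with an intercept.  The symbols are
introduced for illustration: $x_k$ and $y_k$ are the change in
transfer time and the change in rank for the $k$th sample pair,
$\hat{y}_k$ is the value of the fitted line at $x_k$, and $\bar{y}$ is
the mean of the $y_k$.
\[
R^2 = 1 - \frac{\sum_k (y_k - \hat{y}_k)^2}{\sum_k (y_k - \bar{y})^2}
\]
Under this assumption, $R^2$ is the square of the correlation
coefficient between the $x_k$ and the $y_k$.  An $R^2$ of 0.36 then
corresponds to a correlation of magnitude $\sqrt{0.36} = 0.6$, and the
fitted line leaves $1 - 0.36 = 0.64$, or 64\%, of the variance in rank
changes unexplained.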

