Newsgroups: comp.ai,comp.ai.genetic,comp.ai.neural-nets,comp.programming
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!ix.netcom.com!netcom.com!timin
From: timin@netcom.com (Mitchell E. Timin)
Subject: TSP CONTEST RESULTS - Final Writeup, part 1
Message-ID: <timinCzEyx1.Kr@netcom.com>
Summary: Final Report fo the TSP contest
Keywords: TSP contest
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
Date: Thu, 17 Nov 1994 13:31:49 GMT
Lines: 212
Xref: glinda.oz.cs.cmu.edu comp.ai:25265 comp.ai.genetic:4320 comp.ai.neural-nets:20202 comp.programming:12841



                 Results of the TSP Competition
                   (Part 1 of Final Writeup)

This announcement concerns the TSP contest which was announced this
past spring and summer.  (For the curious reader who is not familiar
with this contest, e-mail to timin@netcom.com asking for the original
contest announcement.)  

First Place in this Competition goes to Henning Klaskala of Germany!

Second Place goes to Jukka Honkanen of Finland!

Third Place goes to Frank Sven Nestel of Germany.

Fourth Place is for Randy Saint of Texas, U.S.A.

There is a three-way tie for fifth place between:
     Chad Hurwitz
     Juergen Klose
     The Koeln group

The above will receive a total of about $1500 retail value of
software, (Street Wizard city/county maps of the U.S.A, by Adept
Computer Solutions)  Part 2 of this writeup, to appear soon, will
have information about the contestants, their methods, and their
prizes.

A total of fifteen testable programs were received.  (Several other 
programs were received but they were not testable for a variety of
reasons, i.e. they didn't work right!)  The programs were each tested
with three sets of data, for 47, 110 and 111 points.  (Details of the
method of generating these files is given below, as is the method of
ranking the results to determine the winners.)  They were each run
several times to find out if their results were randomly variable. 
Mr. Klaskala's program was of this type; its execution time was
constant for a given problem file, but the routes differed from run
to run.  I ran his program seven times for each input file and used
the simple average of the route cost for evaluation.  All of the
other leading programs were deterministic, giving the same route and
execution time for every run.  Timing was done with a stop watch,
manually.  The same computer was used for all runs; it is a 486DX66
with 8 megabytes of RAM and VESA local bus.  Microsoft Smartdrv was
active so that files could be read from RAM.  All programs ran
under DOS.

As examples of the performance of some of the better programs, Mr. 
Honkanen's program required only 20.3 seconds to find a route with
111 points, and the cost of that route was 88370.  Here is the route:
30 49 21 53 90 89 72 13 10 105 96 37 107 12 104 52 60 7 25 62 97 46
45 67 64 20 86 26 78 33 103 63 18 75 73 76 38 40 41 5 19 99 39 24 4
87 70 102 61 35 59 11 17 77 98 0 71 28 94 56 85 83 51 47 92 81 55 2
16 36 43 48 23 110 54 101 82 22 27 80 57 93 32 34 68 108 15 91 79 29
84 14 100 95 8 1 50 6 69 88 42 44 9 74 3 65 109 106 58 66 31 

Mr. Klaskala's program only required 1.5 seconds to find the
following route, with a cost of 88106:
30 104 52 60 7 62 25 37 107 12 21 49 53 90 89 72 13 79 91 29 34 32 15
108 68 84 14 100 95 8 1 69 6 50 58 44 42 88 93 57 80 27 106 22 65 109
71 28 74 3 101 82 54 110 92 47 51 85 83 94 56 9 102 61 35 59 11 77 0
98 17 87 70 39 24 4 99 19 5 41 38 40 76 73 81 55 2 16 36 18 75 43 48
63 103 20 86 26 33 78 64 67 23 45 46 97 66 105 96 10 31 

(The seven runs of the program had costs ranging from 87445 to 88480.)

The lowest cost route found for the 111 point case was done by Mr.
Klose's program in 83.3 seconds; its cost was 86800.  The route is:
30 104 10 13 79 90 89 72 53 49 21 52 60 7 37 107 12 25 62 97 46 45 67
64 20 86 26 78 33 103 63 18 75 36 16 55 2 92 81 73 76 38 40 41 51 47
85 83 94 56 9 102 61 35 50 6 69 8 1 95 100 14 59 11 17 87 70 39 24 4
99 19 5 0 98 77 71 28 74 3 65 109 44 42 88 58 57 93 32 34 91 29 84 68
108 15 80 27 106 22 101 82 54 110 43 48 23 66 105 96 31

Here are the lowest cost routes found for the other two problems:
For the 47 point case the lowest cost found was by Mr. Honkanen's
program in 2.9 seconds with this route:
39 31 25 43 18 4 20 42 5 15 12 17 33 21 35 0 14 37 13 27 6 46 22 41 38
11 29 40 32 2 1 16 23 26 10 8 19 36 28 9 34 3 45 44 7 24 30 
The cost is 45750.

For the 110 point case the lowest cost found was by Mr. Klose's
program in 89.5 seconds with this route:
107 58 35 87 76 12 79 67 52 77 60 98 5 39 97 84 85 0 103 105 53 99 74
86 93 55 40 16 81 57 8 78 19 42 96 88 29 65 100 32 30 13 46 1 73 4 51
82 25 10 34 20 69 2 43 27 56 37 14 15 64 61 28 11 92 47 54 33 91 75
26 68 3 71 104 36 63 41 22 23 90 50 21 24 72 9 80 108 45 6 89 17 109
31 95 66 49 59 106 83 94 38 101 7 70 18 48 44 62 102
The cost is 87882

The three problem files, for 47, 110 and 111 points, are available by
return e-mail, as is the initial contest announcement explaining how
to interpret these files.

                      Problem Generation

Each problem, i.e. cost matrix, was generated by a program which
takes N (the number of points enroute) as a command line parameter.
(The program was implemented in Borland Turbo C++, ver. 3.0 for DOS.)
The program outputs an ASCII file with N on the first line, followed
by a line of costs from the starting point to each point enroute,
then a line of costs from each point enroute to the ending point, and
finally the N X N matrix of point-to-point costs.  Here is an output
of the program when N = 7:  (The numbers will change with each run.)

7
3305 1515 2358 773 2255 3660 2710 
3259 3231 3824 2478 1503 3594 2345 
0 4132 3307 3315 2781 832 1331 
3965 0 2225 1338 2860 4174 3488 
3178 2253 0 2389 3319 3093 3292 
3442 1436 2513 0 2204 3469 2671 
2693 3011 3453 2192 0 2892 1832 
703 4192 3069 3677 3041 0 1687 
1504 3637 3329 2744 1650 1776 0 

This program begins by choosing random points in three dimensional
space.  (The Borland random() function was used.)  The points have
integer valued coordinates and are within a rectangular region of
dimensions 4000 X 4000 X 1000.  The values of x and y range from
-2000 to +2000; z goes from 0 to 1000.  N points enroute are chosen,
plus the starting point and ending point, hence N + 2 sets of three
coordinate values are so chosen.  These values are converted to
double floating point so that the following calculations are done
in floating point:

Then for each pair of points the following steps are performed:
1.  The three dimensional Euclidean distance is computed. 
    (call this cost1)
2.  cost2 = 2000 * (dist/2000)^.85 is computed.  This has the effect
    of making travel between nearby points somewhat more expensive
    per unit distance, and travel between distant points somewhat
    less expensive, per unit distance.
3.  A small random cost is added, producing cost3.  The amount added
    is uniformly distributed between 0 and 295.   
4.  An amount is added that is a function of the x and y coordinates
    of the two points, as follows:  Imagine a kind of a "mountain"
    sitting on the x - y plane.  This mountain is a tetrahedron with
    a peak height of 1000.  Its triangular base, with zero altitude,
    spans the domain of the x and y values.  It reaches zero where
    x = 2000, and along the line from (-2000,0) to (2000,-2000),
    and along the line from (-2000,0) to (2000, 2000).  The amount
    added depends on the (x,y) location of the midpoint of a line
    segment joining the two points.  If this midoint is not under the
    mountain then nothing is added; if the midpoint is under the
    mountain then the height of the mountain over the midpoint is the
    amount added.
5.  The double floating value is truncated to produce an integer cost.
     
               Evaluating and Ranking the Programs

The procedure described below was repeated for each the the three
problem files (47, 110 and 111 points).  Each program was run several
times to determine the elapsed time, the route found, and the cost of
the route.  A separate program, was run on each output file to make
sure that the cost of the route was the same as that calculated by
the program being tested.  Timing, with a stopwatch, was done on the
second and later runs, so that the problem file would be read from
RAM rather than disk.  (Microsoft Smartdrv was assumed to handle that.)

Since we were interested in both short run times and low route cost, a
quadratic penalty function was applied to both of these.  In the case
of the cost it was a quadratic function of the excess cost over that
of the lowest cost found by any program.  Hence the combined penalty
function was:

         P(C,T) = K * T^2 + (C - C0)^2

Where C is cost, T is time, and C0 is the lowest known cost.
K determines the relative importance of time compared to cost.  When
K is large this favors very fast programs that may only achieve
mediocre costs.  If K is small this favors programs that achieve very
low cost, even if they spend considerable time searching for the best
route.

It was decided to use three different values of K.  Our intermediate
value was chosen such that a cost 5% greater than the lowest known
cost would have the same penalty as a running time of 120 seconds.
(That was for the 110 and 111 point cases; 60 seconds was used for
the 47 point case.)  Another value of K was obtained by multiplying
the first value by 5, and the third value was obtained by dividing
the first value by 5.  Three sets of three K values were thus
calculated, one set each for 47, 110 and 111 points.

The three penalty functions were applied to each competitor, and for
each of the three problem files.  The best programs are of course
those with the smallest calculated values of the penalty functions.
The programs were ranked using each of the three penalty functions.
In order to produce one ranking list instead of three, a single
combined merit function was defined by normalizing and inverting the
function values for a given K, so that the best program would have a
score of 1000.  This is the formula:

     score = 1000 * (P - med(P)) / (min(P)-med(P))

Where P is the penalty function for a fixed K, med(P) is a median
value of P from the group of ten best performing programs, and min(P)
is the lowest value of P.  The best program has a score of 1000, the
others have lower scores.  Each program has three scores, one for
each K.  Finally the three scores for each program are simply summed
to arrive at a total, and this total is the score used to rank the
programs.  (The "score" is negative for programs with P < med(P); 
such scores are converted to zero before summing.)  This procedure
was applied separately for the three problems.

Here are the run times and route costs for the top three contestants:

               47 pt.         110 pt.         111 pt.
            cost   time     cost   time     cost   time   
Klaskala    45980  0.5      88689  1.5      87808  1.5   
Honkanen    45750  2.9      88753  21.3     88370  20.3            
Nestel      46235  1.2      88829  32.5     86835  32.1   
