Lecture 10/2/97 MIDTERM NEXT THURS (10/9) OPEN BOOK + 1 SHEET OF NOTES o Sorting overview o selection sort 1 handout: old midterm o insertion sort o Using the qsort library function ----------------------------------------------------------------------------- What is the problem: Take an array of data and rearrange it to be in numerical or alphabetical order. Why is this useful? There are many examples of interesting applications of sorting. Here are some examples: Storing a static set of names, to allow efficient search (using binary search). Finding all duplicates in a set: Sort it, then scan it looking for identical adjacent elements in it. One good way of creating the WordGraph from assignment 2: to find pairs of words that differ in just their ith letter, sort them using a comparison function that ignores the ith letter. Then, adjacent words will be adjacent in the list. Graphics: often need to sort objects by x or y coordinates Besides being a basic problem, we will spend a fair amount of time on sorting in this course because it illustrates useful algorithmic techniques. Today: talk about two simple methods: selection sort and insertion sort. Also: how to use the qsort standard-library function. General scenario: have a bunch of objects, each one with a KEY. You want to sort the objects in order by KEY. E.g., sort a bunch of student records based on name. For simplicitly, will mainly talk about sorting arrays of numbers. SELECTION SORT -------------- Find the smallest item. Put into position 0 (swap with item already in position 0). Find second-smallest item, and put into position 1, etc. // selection sort. Say A[] is an array of n integers. void ssort(int A[], int n) { for(int i=0; i < n; ++i) { // find the ith smallest int min,j; for(min=i, j=i+1; j < n; ++j) { if (A[j] < A[min]) min = j; } int temp=A[i]; A[i] = A[min]; A[min] = temp; // swap them } } E.g., say we had array: 8 3 1 5 2 What do we get? 1 3 8 5 2 1 2 8 5 3 1 2 3 5 8 1 2 3 5 8 # comparisons performed = (n-1)+(n-2)+...+1 = n(n-1)/2 = Theta(n^2) # comparisons if already sorted? Theta(n^2) (drawback) "in place" -- a sorting algorithm that uses no extra storage for objects (except perhaps a little for swapping) Selection sort is "in place" since just needs one extra place for storing object (temp). INSERTION SORT -------------- Insertion sort is what you might use for sorting playing cards. Idea: sort first two. Then put third into correct place, then 4th into correct place, etc. EG: sort 3 5 9 2 7 6 first 3 already sorted. Then get 2 and move to correct spot 2 3 5 9 7 6 Then move 7: 2 3 5 7 9 6 Then move 6: 2 3 5 6 7 9. Idea: prefix always sorted. C++ code: void isort(int A[], int n) { for(int i=1; i < n; ++i) { int current = A[i], j; for(j=i-1; (j >=0) && (current < A[j]); --j) { A[j+1] = A[j]; } A[j+1] = current; } } In place? Yes What's worst-case running time? THETA(n^2). Might be as bad as 1+2+3+...+n like in Selection sort. What about "average case": if numbers in a random order. --> still THETA(n^2) since on average, still have to move halfway back. Another easy way to see this: Think of n as big, like 1000. last 500 (n/2) numbers each have to move on average back of at least 250 (n/4). What about "best case". Best case means best possible arrangement of input. --> best case is O(n). --> already sorted. * Insertion sort is good for sorting arrays that are "almost" sorted. Next time: talk about some better methods. But first..... The SUPER-PRACTICAL approach to sorting: ---------------------------------------- -- use the built in qsort function. This uses "quicksort" that we'll talk about next time. How to use qsort: void qsort( , , , *b, negative if *a<*b and 0 if *a == *b.> ) For example: #include struct Person { char *name; int age; }; Person arr[]; // an array of Person structs. int CompareNames(Person *a, Person *b) { // compare by name return strcmp(a->name, b->name); } int CompareAges(Person *a, Person *b) { // compare by age return (a->age - b->age); } // sort arr by name (say "len" is number of elements in arr): qsort(arr, len, sizeof(Person), CompareNames); // sort arr by age: qsort(arr, len, sizeof(Person), CompareAges);