CARNEGIE MELLON UNIVERSITY

15-826 - Multimedia databases and data mining

Spring 2007

Homework 1 - Due: Tuesday Feb. 6, 1:30pm, in class.

Important:

Q1: SQL [30 pts]

Retrieve the following tables and load them into a DBMS of your choice (MS-Access is recommended, but any other is acceptable):

These tables (in comma-separated ASCII format, derived from Sean Lahman's Baseball Database) describe historical data about American baseball teams and players:

Given this data, please answer the following queries (feel free to use views):

  1. [10 pts] Find the highest salary ever paid, and report the teamID, year and amount. In case of a tie, report all tied entries.
  2. [10 pts] Report the first and last name of all left-handed players who played for PIT in 1985.
  3. [10 pts] List the names of all stadiums the New York Yankees have ever played in.
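The tie requirement in query 1 is the part that is easy to get wrong, so here is a minimal sketch of one way to handle it, run through Python's built-in sqlite3 module. The table name `salaries`, its columns (`yearID`, `teamID`, `playerID`, `salary`) and the rows are all made up for illustration; the actual homework tables may be named and laid out differently.

```python
import sqlite3

# Hypothetical schema and toy rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salaries (yearID INTEGER, teamID TEXT, playerID TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?, ?, ?)",
    [
        (2000, "AAA", "p1", 100),
        (2001, "BBB", "p2", 250),
        (2002, "CCC", "p3", 250),  # ties with the row above
        (2003, "AAA", "p4", 180),
    ],
)

# Compare every row against the global maximum, so that all tied rows
# are returned rather than an arbitrary single one.
query = """
    SELECT teamID, yearID, salary
    FROM salaries
    WHERE salary = (SELECT MAX(salary) FROM salaries)
    ORDER BY yearID
"""
top_rows = conn.execute(query).fetchall()
print(top_rows)  # [('BBB', 2001, 250), ('CCC', 2002, 250)]
```

The scalar subquery is standard SQL, so the same pattern should carry over to most DBMSs; a "sort and keep the first row" approach would drop one of the tied entries.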

For each query both print out and e-mail:

Caution: The ASCII data files have MS-Windows/DOS end-of-line termination. On Unix/Linux, use "dos2unix" to convert them.

 

Q2: Z-order [10 pts]

Write the code to compute the z-value of a 2-d point, as well as the inverse. You may use C/C++, Perl, Java or Python. If you'd like to use a different language, please ask the TA first.
  1. [2.5 pts] zorder should return the z-value of the given (x,y) point. The command-line syntax should be:
    zorder -n <order-of-curve> <xvalue> <yvalue>
    Thus:
    zorder -n 2 0 0 # should return '0'
    zorder -n 3 0 1 # should return '1'
    • Note that these examples determine the orientation of the Z-curve.
  2. [2.5 pts] izorder should give the inverse. The command-line syntax should be:
    izorder -n <order-of-curve> <zvalue>
    Thus:

  3. [2.5 pts] Give the results of your programs on this input file. Make sure you echo the input, so that it is clear which answer refers to which question.
  4. [2.5 pts] Using your izorder, plot a z-curve (perhaps using gnuplot) of order 7 (128x128 grid) and hand in the plot.
Hand in a hard copy of your source code, and also e-mail it in.
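As an illustration of the bit interleaving involved, here is a minimal Python sketch. It assumes the orientation implied by the two zorder examples above: (0,1) maps to 1, so y supplies the least significant bit of each bit pair and x the next one. The helper `curve_points` is a hypothetical name added here for the plotting step; the command-line wrapper (parsing `-n` and the arguments) is left out.

```python
def zorder(n, x, y):
    """Return the z-value of (x, y) on a curve of order n (a 2**n x 2**n grid)."""
    side = 1 << n
    if not (0 <= x < side and 0 <= y < side):
        raise ValueError("coordinates must lie in [0, 2**n)")
    z = 0
    for i in range(n):
        z |= ((y >> i) & 1) << (2 * i)      # y bit goes to the even position
        z |= ((x >> i) & 1) << (2 * i + 1)  # x bit goes to the odd position
    return z


def izorder(n, z):
    """Return the (x, y) point whose z-value is z on a curve of order n."""
    if not 0 <= z < (1 << (2 * n)):
        raise ValueError("z-value must lie in [0, 4**n)")
    x = y = 0
    for i in range(n):
        y |= ((z >> (2 * i)) & 1) << i      # even bits rebuild y
        x |= ((z >> (2 * i + 1)) & 1) << i  # odd bits rebuild x
    return x, y


def curve_points(n):
    """Return all grid cells in increasing z-value order (hypothetical helper)."""
    return [izorder(n, z) for z in range(1 << (2 * n))]


print(zorder(2, 0, 0))  # 0
print(zorder(3, 0, 1))  # 1
print(izorder(3, 1))    # (0, 1)
```

Writing the output of `curve_points(7)` to a file, one "x y" pair per line, should give data that gnuplot can draw as a connected curve with a `plot ... with lines` command.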


NOTES:

 

Q3: R-trees [60 pts]

You are required to extend the capabilities of an R-tree package and implement two different algorithms for closest-pair queries. You may use C/C++, Perl, Java or Python. If you'd like to use a different language, please ask the TA first.



Last updated by Christos Faloutsos, Jan. 24, 2007.