Project #3 - Introduction to Hadoop
Due: Tuesday, November 16, 2010 at 11:59PM

Overview

The purpose of this assignment is to get you set up with the tools of the trade. You are asked to work your way through the Hadoop tutorial and to perform some basic tasks. The rest of Project #3 will involve a much more extravagant application of Hadoop.

Grading

This assignment is 20% of Project 3's grade.

No Partners

As the only goal of this assignment is to "turn a very small wheel once", you are asked to do it by yourself. Unless you can do something small-ish on your own, you will have trouble contributing as an equal partner in the next lab.

Links

Cluster Information

You can log into any ghcXX.ghc.andrew.cmu.edu machine, where XX is 01 through 81, or 86.

The Hadoop home directory is here: /usr/local/cs/hadoop-0.20.2/

The JobTracker is here: http://ghc82.ghc.andrew.cmu.edu:50030/
The name node is here: http://ghc82.ghc.andrew.cmu.edu:50070/
The scheduling queue can be found here: http://ghc82.ghc.andrew.cmu.edu:50030/scheduler
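
If it helps to keep the cluster details above in one place, they could be collected in a small Python sketch like the one below. The hostnames, ports, and Hadoop home are copied from this page. The names (login_hosts, WEB_UI_HOST, WEB_UIS, HADOOP_BIN) are made up for illustration, and the bin/hadoop location is an assumption based on the stock Hadoop 0.20.2 layout, so check it on the cluster before relying on it.

```python
import os.path

# Values copied from the cluster information on this page.
HADOOP_HOME = "/usr/local/cs/hadoop-0.20.2/"
WEB_UI_HOST = "ghc82.ghc.andrew.cmu.edu"  # hypothetical name for the host serving the web pages


def login_hosts():
    """Return the login hostnames: ghc01 through ghc81, plus ghc86."""
    numbers = list(range(1, 82)) + [86]
    # %02d zero-pads single digits, so 1 becomes "01".
    return ["ghc%02d.ghc.andrew.cmu.edu" % n for n in numbers]


# Assumption: the launcher script lives at bin/hadoop under the Hadoop home,
# as in a stock Hadoop 0.20.2 install. Verify this on the cluster.
HADOOP_BIN = os.path.join(HADOOP_HOME, "bin", "hadoop")

# Web pages listed above, keyed by hypothetical short names.
WEB_UIS = {
    "jobtracker": "http://%s:50030/" % WEB_UI_HOST,
    "namenode": "http://%s:50070/" % WEB_UI_HOST,
    "scheduler": "http://%s:50030/scheduler" % WEB_UI_HOST,
}

if __name__ == "__main__":
    hosts = login_hosts()
    print(len(hosts), "login hosts, from", hosts[0], "to", hosts[-1])
    print("hadoop launcher:", HADOOP_BIN)
    for name, url in sorted(WEB_UIS.items()):
        print(name, url)
```

Note that ghc82, which serves the JobTracker and name node pages, is not in the list of login machines.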

Your Tasks

Submission

We're Here To Help!

As always, remember: we're here to help!