MineNet 2007
Workshop on Mining Network Data

June 12, 2007
San Diego, CA

Held in conjunction with ACM SIGMETRICS '07 and FCRC '07


8:50am-9:00am Welcome by Workshop Chairs

9:00am-10:00am Keynote Address

Inference in Network and Systems Management
Speaker: Albert Greenberg (Microsoft Research)

In this talk, we discuss challenges and approaches towards managing huge operational IP networks, seeking high levels of performance and reliability, while the network undergoes continuous evolution in scale, in architecture, and in technology. An important component of the work is capturing and mining massive and problematic granular usage, inventory, workflow, and control data, for real-time and non-real-time applications. Some of the more difficult problems occur at the seams between network and system layers, where unforeseen dependencies and problems lead to anomalous and sometimes egregious results. Specific applications are discussed for IP/Optical networks, IP/MPLS networks, and enterprise networks. Inference proves a powerful tool for fault localization. We then move to speculations on how to get to the next level, the next "9" in reliability, via more direct measurement, and approaches that fuse a variety of techniques.

10:00am-10:15am Coffee Break

10:15am-11:20am Paper Session I: Security and Anomaly Detection

A three-tier IDS via data mining approach
Tsong Song Hwang, Tsung-Ju Lee, Yuh-Jye Lee (National Taiwan Univ. of Science and Technology)

Identifying and tracking suspicious activities through IP grey space analysis
Yu Jin, Zhi-Li Zhang (Univ. of Minnesota), Kuai Xu (Yahoo! Inc), Feng Cao (Cisco Systems), Sambit Sahu (IBM Research)

ANEX: A novel density-based measure for anomaly exposure
Daniela Brauckhoff, Martin May, Bernhard Plattner (ETH Zurich)

12:30pm-1:45pm Lunch Break

1:45pm-2:35pm Invited Talk

Anomaly Detection in Large Network Using Approximation Techniques
Speaker: Nina Taft (Intel Berkeley)

A tremendous enthusiasm for amassing enormous amounts of network measurement data has spurred the development of numerous applications that incorporate data mining techniques. In this talk we question the hidden assumption in these applications that one needs to collect "all the data all the time". We consider this question in the context of an anomaly detection application. We study the popular "Subspace method detector", which is based on principal component analysis (PCA). This method normally collects data from many parts of the network, centralizes the data, and then analyzes it to uncover anomalies. In our research, we ask whether we can reduce the amount of data needed. Can we still do anomaly detection accurately without all the data?

To avoid backhauling large amounts of data through networks, we present a framework that couples filtering at local monitors with centralized detectors that can operate on approximate views of the global data (i.e., network state). We show that the errors made by the central detector - due to the use of approximate data - can be upper bounded using matrix perturbation theory. The challenge is to design the filtering parameters; these are determined by target bounds on detection errors and the criteria being tracked for detection. Our approximate anomaly detector can detect anomalies with 80 to 90% less data than the original method, and incurs less than a 1% reduction in detection accuracy. Finally, we comment on issues and future directions for data reduction in the context of anomaly detection.
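
As a rough illustration of the idea in this abstract, the sketch below pairs a monitor-side filter with a PCA-style central detector that only sees the filtered view. It is not the speaker's implementation. The link loads, the filter slack `delta`, the single-component normal subspace, and the threshold rule (three times the largest training error) are all assumptions chosen to keep the example small, and the function names are hypothetical.

```python
import math
import random

def local_filter(series, delta):
    """Monitor-side filter: report a reading only when it drifts more than
    delta from the last reported value. Returns (central view, messages sent)."""
    view, sent, last = [], 0, None
    for x in series:
        if last is None or abs(x - last) > delta:
            last, sent = x, sent + 1
        view.append(last)  # the center keeps using the last reported value
    return view, sent

def top_component(rows, iters=100):
    """Leading principal direction of mean-centered rows, by power iteration."""
    n = len(rows[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [0.0] * n
        for r in rows:
            s = sum(a * b for a, b in zip(r, v))
            for j in range(n):
                w[j] += s * r[j]  # accumulates (X^T X) v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def spe(row, mean, v):
    """Squared prediction error: energy left after removing the projection
    onto the one-dimensional normal subspace spanned by v."""
    c = [a - m for a, m in zip(row, mean)]
    s = sum(a * b for a, b in zip(c, v))
    return sum((a - s * b) ** 2 for a, b in zip(c, v))

def run_demo(delta=1.0, spike=30.0, seed=0):
    rng = random.Random(seed)
    loads = [10.0, 20.0, 30.0, 40.0]  # hypothetical per-link base loads
    steps = 400
    # Every link follows the same slow cycle plus small noise.
    links = [[L * (1 + 0.5 * math.sin(2 * math.pi * t / 200)) + rng.gauss(0, 0.1)
              for t in range(steps + 1)] for L in loads]
    links[0][steps] += spike  # surge on link 0 only, at the final time step

    filtered = [local_filter(s, delta) for s in links]
    sent = sum(count for _, count in filtered)
    rows = [list(r) for r in zip(*(view for view, _ in filtered))]
    train, test_row = rows[:steps], rows[steps]

    # The central detector is fitted on the approximate view only.
    mean = [sum(col) / steps for col in zip(*train)]
    centered = [[a - m for a, m in zip(r, mean)] for r in train]
    v = top_component(centered)
    # Placeholder threshold (an assumption), not a statistical test.
    threshold = 3.0 * max(spe(r, mean, v) for r in train)
    return {
        "fraction_sent": sent / (len(loads) * (steps + 1)),
        "threshold": threshold,
        "spe_last_normal": spe(train[-1], mean, v),
        "spe_anomaly": spe(test_row, mean, v),
        "anomaly_flagged": spe(test_row, mean, v) > threshold,
    }

if __name__ == "__main__":
    print(run_demo())
```

Raising `delta` in this toy setup sends fewer readings to the center but makes its view coarser, which is the trade-off the abstract describes. The savings printed here come from synthetic data and say nothing about the figures reported in the talk.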

2:40pm-3:30pm Paper Session II: Infrastructure and Experience

Authentication Anomaly detection: A case study on a VPN
Michael J. Chapple, Nitesh Chawla, Aaron Striegel (Univ. of Notre Dame)

Building a prototype for Network Measurement Virtual Observatory
Peter Matray, Istvan Csabai, Peter Haga, Jozsef Steger, Laszlo Dobos, Gabor Vattay (Eotvos University)

3:30pm-4:00pm Coffee Break

4:00pm-5:30pm Paper Session III: Traffic Classification and Monitoring

A Markovian Signature-based approach to IP traffic classification
Hamza Dahmouni, Sandrine Vaton (ENST Bretagne), David Rosse (France Telecom)

Byte Me: A case for byte accuracy in traffic classification
Jeffrey J Erman (Univ. of Calgary), Anirban Mahanti (IIT, Delhi), Martin Arlitt (HP Labs)

SIP-based VoIP traffic behavior profiling and its applications
Hun Jeong Kang, Zhi-Li Zhang (Univ. of Minnesota), Supranamaya Ranjan, Antonio Nucci (Narus Inc)

Real-time monitoring of SIP infrastructure using message classification
Arup Acharya, Nilanjan Banerjee, Bikram Sengupta, Xiping Wang, Charles P. Wright (IBM Research)