Priya Narasimhan, and pictures of some of her research projects (football engineering, YinzCam, Trinetra)

Problem diagnosis (or fingerpointing) involves instrumenting systems to yield meaningful data, detecting errors and/or failures within these systems, and ascertaining their root cause, i.e., the underlying fault. Fingerpointing is difficult because the distributed interactions, protocols and inter-component dependencies in computer systems can cause a problem to change "shape" or manifestation, leading to potential red herrings in problem determination. We are developing a variety of techniques for automated fingerpointing in distributed systems such as Hadoop, PVFS and Lustre. The aim is to perform online and offline root-cause analyses in order to identify a faulty node or process, diagnose the source of the problem, and report it to the user or administrator through some form of visualization. To localize the origin of a problem in large distributed systems, we have developed techniques for log analysis, black-box performance analysis and hardware performance-counter analysis.
More information: Fingerpointing research group website
Follow us on Twitter: @CloudAtCMU
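
One simple way to picture black-box fingerpointing is peer comparison: in a cluster of similar nodes, a node whose metric strays far from its peers is a suspect. The sketch below is only an illustration under that assumption and is not the group's actual technique. The function name, the threshold and the sample CPU figures are all hypothetical.

```python
from statistics import median

def flag_outlier_nodes(metric_by_node, threshold=3.0):
    """Return the nodes whose metric deviates strongly from the peer median."""
    values = list(metric_by_node.values())
    center = median(values)
    # Median absolute deviation: a spread estimate that a single bad node
    # cannot distort much.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half of the peers sit exactly on the median, so any
        # node that differs from it at all is flagged.
        return sorted(n for n, v in metric_by_node.items() if v != center)
    return sorted(n for n, v in metric_by_node.items()
                  if abs(v - center) / mad > threshold)

# Hypothetical per-node CPU utilisation (percent) over the same time window.
samples = {"node1": 41.0, "node2": 39.5, "node3": 40.2,
           "node4": 93.7, "node5": 40.8}
suspects = flag_outlier_nodes(samples)
print(suspects)
```

A median-based spread is used here rather than a mean and standard deviation so that the faulty node itself does not inflate the baseline it is compared against.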

Implementing online software upgrades (changes to the behavior, configuration, code, data or topology of an executing distributed application) is challenging in distributed systems. This functionality is essential for enabling the self-regulating, autonomic management and maintenance of enterprise computer systems. We are addressing the challenges of maintaining the existing (and potentially unknown) dependencies between distributed components and services; handling API evolution; performing upgrades across mutually distrustful administrative domains; transferring state (which may require long-running data migration and conversion tasks executing in parallel with regular requests for the same data); assessing and minimizing the impact of upgrades on the running services while improving the value of the infrastructure according to well-defined metrics; and tolerating faults during the upgrade process.
Follow us on Twitter: @CloudAtCMU

YinzCam allows fans, right from their seats during the game, to create and view their own instant replays (through short and long rewinds of live video), catch the action from 4-8 different live camera angles (e.g., Follow-Crosby Cam for the Pittsburgh Penguins), get automated instant replays seconds after a play has happened (from all the different camera angles), and get real-time statistics, game rules, player rosters, etc., all on their own Wi-Fi-enabled smartphones, and all without violating broadcast rights. YinzCam was deployed as a pilot for the Pittsburgh Penguins, a National Hockey League (NHL) team in the United States, for 40 home games from October 2008 to May 2009 (including the 2009 Stanley Cup playoffs and the Stanley Cup Final). YinzCam is platform-agnostic and was supported on the fans' own smartphones (as a browser-based service, without requiring any software installation on the phone), including the iPhone, the iPod Touch, the BlackBerry Bold, the Nokia N95, the Samsung Omnia, the T-Mobile Android G1, the HTC Touch Pro, and about 25 different Wi-Fi-enabled Windows Mobile phones.
More information: YinzCam website
Follow us on Twitter: @yinzcam

We are developing multiple mobile cloud-computing middleware platforms to enable new large-scale mobile-cluster applications. Smartphones and other wireless mobile devices are steadily gaining compute power, networking capability, memory, storage, etc. The aim is to leverage mobile devices as the nodes of a large-scale cloud-computing infrastructure and to provide the middleware for these devices to work together seamlessly, in a peer-to-peer manner, to support a variety of new mobile-cluster applications. Hyrax is a platform that we have developed for large-scale data-intensive computing on mobile devices. Hyrax is based on MapReduce, is derived from Hadoop and runs on the Android platform. Agora is a sensor middleware platform that we have developed to support interoperability and ease of use in developing sensor-network applications that encompass a variety of hardware and software architectures.
Follow us on Twitter: @CloudAtCMU

Our research group is focused on improving the viewing experience, refereeing, scouting, and sports performance aspects of (American) football through engineering and research. Our approach is to use a synergistic combination of sensors, communication protocols, computer vision, and machine learning techniques to provide enhanced tracking and motion analysis during practice or games. We have developed a smart football that can be used to track the trajectory and landing position of the football in the field of play. We have also developed embedded coaching aids to help running backs, quarterbacks, wide receivers and punt kickers train reproducibly and independently of their coaches. The resulting data can also be used to indicate the performance of an individual player.
More information: Football Engineering website
Follow us on Twitter: @SportsTechAtCMU

Trinetra aims to develop cost-effective, smartphone-enabled assistive technologies to provide visually impaired people with greater independence and an enhanced quality of life in their daily activities. The broad objective is to harness the collective capability of diverse networked embedded devices to support location-aware and context-aware applications, including first-responder support, building navigation, retail shopping, smart transportation, etc. To date, we have researched and developed a portable barcode-based solution involving an Internet- and Bluetooth-enabled smartphone to aid grocery shopping at the Carnegie Mellon campus convenience store, Entropy. We have extended this to assist both sighted and visually impaired commuters with their transportation and commute-planning needs, using a smartphone to convey notifications of arrivals, departures, etc. We have also developed a phone-based currency identifier for the visually impaired.
More information: Trinetra website
Follow us on Twitter: @AssistTechAtCMU

This is a more recent research effort to connect individuals with their government and its functioning through technology. The central idea is to put control into people's hands by allowing them to view their city/state/federal government in action, and to provide them with more direct and ready means to communicate issues of concern to them. Working closely with the Pittsburgh City Council, we have developed the first mobile app for e-government, which allows the residents of Pittsburgh to submit auto-geotagged visual complaints (potholes, graffiti, fallen trees, etc.) from their cellphones directly into their City's 311 system for resolution.
Follow us on Twitter: @cityZenMobile


MEAD: Real-Time Fault-Tolerant Middleware
MEAD enhances distributed middleware (CORBA and Java) applications with dependability, including: 1) transparent, yet reconfigurable, fault tolerance at runtime, 2) configuration advice to tailor an application's fault tolerance to its reliability and resource needs, 3) proactive fault tolerance based on failure prediction, 4) resource-aware system adaptation to failures, and 5) enabling distributed, fault-tolerant applications to live realistically with nondeterminism. The significant contributions of MEAD included its analysis of the three-way trade-offs between resources, timeliness and fault tolerance. MEAD was also unique in exploiting compile-time program analysis and run-time dynamic analysis to provide consistent and efficient (albeit lazy) replication, even for nondeterministic multithreaded CORBA/Java applications.

Survivable Distributed Systems -- Vajra, Elephant, Thema, Immune
The Elephant work focuses on live updates of intrusion detection systems, such as Snort. The Thema work focuses on Byzantine-fault tolerance for multi-tier distributed applications based on Web Services. Vajra focuses on benchmarking the survivability of various distributed infrastructures (such as Castro-Liskov BFT, Immune, Fleet, etc.) through fault-injection of benign and malicious failures. Immune was a collaborative research effort with Prof. Kim P. Kihlstrom that led to the development of a survivable infrastructure for CORBA applications. Immune enables CORBA applications to continue operating, despite faults that occur within the system, as well as intrusions or malicious/Byzantine attacks that damage the underlying system. Majority voting on the traffic between replicated CORBA objects, value fault detection, and secure multicast protocols (which employ message digests and digital signatures) are Immune's building blocks.
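
The majority voting and value-fault detection described above can be pictured with a small sketch. It is an illustration only, not Immune's protocol: it compares SHA-256 digests of the replies from replicas and leaves out the digital signatures and secure multicast that Immune relies on. The function name `vote` and the replica names and payloads are hypothetical.

```python
import hashlib
from collections import Counter

def vote(replies):
    """Majority-vote over replica replies.

    replies maps a replica name to the raw bytes it sent.
    Returns (winning_payload, suspected_replicas), or (None, all replicas)
    when no payload is backed by a strict majority.
    """
    if not replies:
        return None, []
    # Compare fixed-size digests rather than full payloads.
    digests = {name: hashlib.sha256(payload).hexdigest()
               for name, payload in replies.items()}
    digest, count = Counter(digests.values()).most_common(1)[0]
    if count * 2 <= len(replies):
        # No strict majority: nothing can be trusted on this round.
        return None, sorted(replies)
    winner = next(replies[n] for n, d in digests.items() if d == digest)
    # Value-fault detection: replicas that disagree with the majority.
    suspects = sorted(n for n, d in digests.items() if d != digest)
    return winner, suspects

# Hypothetical replies from three replicas of the same object.
replies = {"replica_a": b"balance=100",
           "replica_b": b"balance=100",
           "replica_c": b"balance=999"}
result, faulty = vote(replies)
print(result, faulty)
```

With three replicas, a single corrupted reply is outvoted and its sender is reported; with only two disagreeing replicas, the sketch declines to pick a winner.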

Eternal: Transparent Fault-Tolerant Middleware
Eternal is a transparent fault-tolerant infrastructure that supports reliable CORBA/Java applications, without requiring any modification to the application, the OS or the middleware. Eternal provides support for active and passive replication, overcomes the nondeterminism inherent in multithreaded CORBA/Java applications, and provides gateways to support external clients. The key contributions of this research work are the support for strong replica consistency, the sanitization of nondeterministic multithreading, and most importantly, the transparency of the fault tolerance. This transparency frees CORBA application programmers from worrying about the difficult issues of reliability, and allows them to focus on their area of expertise: the application. This also leads to considerable savings in development time because, as soon as the application logic is ready, fault tolerance is available to be deployed "out-of-the-box" at run-time. Eternal provided transparent fault tolerance to different implementations of CORBA: VisiBroker (Borland), Orbix (Iona), CORBAplus (Expersoft), TAO (Washington University, St. Louis), e*ORB (Vertel), omniORB2 (AT&T Laboratories, UK), ORBacus (Object-Oriented Concepts) and ILU (Xerox PARC). The understanding and insights gained from Eternal significantly influenced the Fault-Tolerant CORBA Standard.

Last updated: 7 September 2009, Priya Narasimhan