15-821/18-843: Mobile and Pervasive Computing (IoT)

Fall 2025

Project Descriptions (Updated 2025-08-20-08:06)

Title

Mentor

Students

Description

1

Tiered Offload for Mobile Sensing Platforms

Mihir Bala

1-2 students

Mobile sensing platforms, like drones, have seen huge technological advancement in recent years. However, the hardware on these platforms is still limited by weight and power constraints, restricting the types of AI models they can use onboard. One solution is edge computing: by offloading computation to a network-proximal cloudlet, mobile devices can supplement their onboard hardware with powerful GPU support. Unfortunately, network offloading isn't free; it costs power, adds latency, and consumes bandwidth. How do we find an optimal balance? In this project, you will design a tiered offload system that chooses when to offload and how to fuse results from local and remote computation.
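
As a rough illustration of the decision problem, here is a minimal sketch of a tiered offload policy. All names (Detection, should_offload, fuse) and thresholds are hypothetical, not part of any existing system: a cheap onboard model runs first, and the cloudlet is consulted only when the local result is not confident enough and the link budget allows it.

    # Minimal sketch of a tiered offload policy; every name and threshold here
    # is hypothetical and only illustrates the decision/fusion structure.
    from dataclasses import dataclass

    @dataclass
    class Detection:
        label: str
        confidence: float

    def should_offload(local: Detection, rtt_ms: float,
                       conf_threshold: float = 0.7,
                       latency_budget_ms: float = 200.0) -> bool:
        """Offload only when the local model is unsure and the round trip fits the latency budget."""
        return local.confidence < conf_threshold and rtt_ms < latency_budget_ms

    def fuse(local: Detection, remote: Detection | None) -> Detection:
        """Trust the remote (larger) model when available and more confident, else keep the local result."""
        if remote is None:
            return local
        return remote if remote.confidence >= local.confidence else local

    # Example: a weak local detection over a fast link triggers an offload.
    local = Detection("vehicle", 0.45)
    if should_offload(local, rtt_ms=35.0):
        remote = Detection("truck", 0.91)   # would come back from the cloudlet GPU
        print(fuse(local, remote))

A real system would base the decision on richer signals (battery state, bandwidth, model accuracy profiles), but the basic decide-then-fuse structure stays the same.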

Required skills: Python

Learned skills: Mobile networking, edge AI deployment, PyTorch

2

Modernizing the Sinfonia server backend

Jan Harkes

1-2 students

Sinfonia is a software system to discover nearby compute resources and deploy application backends for edge-native applications using Kubernetes.

The current implementation was written in 2020 around Python 3.6. It does not use modern Python features, and it depends on libraries that have since been deprecated or are no longer maintained. Because it uses a thread for each target during the discovery and deployment process, it does not scale well, and it serializes various discovery and deployment steps that could be done in parallel. While the current implementation is quite small (under 2000 lines combined for Tier-1 and Tier-2), a new implementation with FastAPI is likely to be even smaller.


The goal of this project is to create a modernized version of Sinfonia that:

  • Uses modern Python async frameworks, i.e. FastAPI + asyncio + aiohttp (see the sketch after this list).
  • Uses modern Python features, e.g. dataclasses, type hints, etc.
  • Uses modern dependencies, avoiding obscure and/or deprecated libraries.
  • Maintains compatibility with the existing clients.
  • Scales better.
  • Is easier to maintain and extend.
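
To give a feel for the intended style, the following is a minimal sketch (not the actual Sinfonia code) of how a FastAPI + asyncio + aiohttp rewrite could query all Tier-2 targets concurrently instead of using a thread per target; the endpoint path and the TARGETS list are illustrative only.

    # Illustrative sketch of async discovery: all targets are queried in
    # parallel, so total time is bounded by the slowest target, not the sum.
    import asyncio
    import aiohttp
    from fastapi import FastAPI

    app = FastAPI()

    TARGETS = ["http://tier2-a.example/metrics", "http://tier2-b.example/metrics"]

    async def query_target(session: aiohttp.ClientSession, url: str) -> dict:
        """Fetch one target's resource report; errors simply mark the target unavailable."""
        try:
            async with session.get(url, timeout=aiohttp.ClientTimeout(total=2)) as resp:
                return {"target": url, "report": await resp.json()}
        except Exception:
            return {"target": url, "report": None}

    @app.get("/discover")
    async def discover() -> list[dict]:
        async with aiohttp.ClientSession() as session:
            return await asyncio.gather(*(query_target(session, t) for t in TARGETS))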

3

Extend Live Learning in Hawk to support audio

Jan Harkes

1-2 students

Hawk is a Live Learning system where an initial (bootstrap) model is trained on a small example set (20 positive / 100 negative examples).  This model is then iteratively improved as more positive examples are uncovered from a large (unlabeled) dataset.  Hawk currently supports image datasets, radar datasets and video datasets. The goal of this project is to add support for audio data to Hawk. This will involve:

  • Identifying interesting audio datasets and creating a Hawk retriever to extract samples for inferencing and training (a sketch of the fragment-extraction idea follows this list).  Examples include:
  • Adding support for a baseline audio recognition/detection model for training and inferencing that can be finetuned for a specific use case.
  • Adding support for labeling audio fragments in the Hawk labeling UI.
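
As a starting point for the retriever bullet, here is an illustrative sketch of the fragment-extraction idea. The class and method names are hypothetical and do not reflect Hawk's actual retriever interface; the point is simply that long recordings get split into fixed-length fragments that can be scored and labeled individually.

    # Hypothetical audio retriever sketch (not Hawk's real API): split long
    # recordings into fixed-length fragments for per-fragment inference.
    import torchaudio

    class AudioFragmentRetriever:
        def __init__(self, fragment_seconds: float = 2.0):
            self.fragment_seconds = fragment_seconds

        def fragments(self, path: str):
            """Yield (start_time_seconds, tensor) pairs of fixed-length audio fragments."""
            waveform, sample_rate = torchaudio.load(path)   # shape: (channels, samples)
            step = int(self.fragment_seconds * sample_rate)
            for start in range(0, waveform.shape[1] - step + 1, step):
                yield start / sample_rate, waveform[:, start:start + step]

    # Each fragment could then be passed to the baseline audio model, e.g.:
    # for t, frag in AudioFragmentRetriever().fragments("recording.wav"):
    #     score = model(frag)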

4

Air Traffic Control for Edge-enabled Autonomous Drone Swarm

Tom Eiszler

1 or 2 students

The SteelEagle project takes lightweight COTS drones and imbues them with the ability to perform autonomous active-vision tasks by leveraging edge computing. The mission-centric design of SteelEagle allows a user to perform various tasks utilizing multiple drones while abstracting away some of the complexity of the underlying drone platforms. This project will focus on developing a system for air traffic control (ATC) to be employed by the swarm controller component at runtime. This dynamic ATC should take into account the mission parameters and the current telemetry/status from the drones involved in the mission to ensure that drones don't collide with each other. For example, the swarm controller ATC might preempt the patrol task of drone B when drone A enters its flight space because it is tracking a moving target. This ATC will be tested using simulation tools that we have built.
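
To make the preemption example concrete, here is a minimal sketch of the kind of separation check the swarm-controller ATC could run on each telemetry update. The field names, thresholds, and positions are illustrative and not part of the existing SteelEagle code.

    # Illustrative separation check: preempt the lower-priority (patrolling)
    # drone when it gets too close to a drone that is tracking a target.
    import math

    def distance_m(a: dict, b: dict) -> float:
        """Approximate 3-D separation in meters for small lat/lon differences."""
        dx = (a["lon"] - b["lon"]) * 111_320 * math.cos(math.radians(a["lat"]))
        dy = (a["lat"] - b["lat"]) * 110_540
        dz = a["alt"] - b["alt"]
        return math.sqrt(dx * dx + dy * dy + dz * dz)

    def resolve_conflict(tracking: dict, patrolling: dict, min_sep_m: float = 20.0) -> str | None:
        """Return a preemption decision when separation drops below the minimum."""
        if distance_m(tracking, patrolling) < min_sep_m:
            return "preempt_patrol"   # e.g. command the patrolling drone to hold or climb
        return None

    drone_a = {"lat": 40.4433, "lon": -79.9436, "alt": 30.0}   # tracking a moving target
    drone_b = {"lat": 40.4434, "lon": -79.9437, "alt": 30.0}   # on patrol
    print(resolve_conflict(drone_a, drone_b))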

Technologies: Docker, Python, ZMQ, potentially some UI development in Streamlit

5

Exploring Mobile AI Accelerators for Real-time Vision Tasks

Qifei Dong

1-2 students

Computational offload was first used to enable computationally intensive services on Size, Weight, and Power (SWaP) optimized mobile devices over a quarter century ago. Since then, these devices have incorporated multi-core CPUs, GPUs, and tensor processing units. In light of such advances, our research demonstrated that edge computing is still needed for most mobile real-time AI. However, the landscape is continually changing, with ever more capable mobile platforms released every year and more effective AI techniques pushing the state of the art. The question of whether mobile computing capabilities are sufficient for real-time AI will need to be revisited regularly as hardware capabilities, optimizations for NPUs, and accuracy expectations all change over time.

The goal of this project is to implement an Android application that performs one (or more) of the following real-time vision tasks of your choice: Object Detection, Image Segmentation, or Multi-Object Tracking (MOT). (You are welcome to choose more challenging tasks like real-time sign language recognition.) Your app will be able to use the hardware accelerator dedicated to deep neural network (DNN) inference to perform the task locally. You will also implement an "edge" version that always offloads the AI workload to a cloudlet. Finally, you will run measurements and compare the accuracy and performance of the two approaches. You can choose to develop on a Google Pixel 9 Pro (with TPU) or a OnePlus 13 (with Qualcomm Hexagon NPU), and you are free to use any pre-trained DNN models available online.
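
As one possible shape for the cloudlet side of the "edge" version, here is a minimal sketch that runs a pre-trained torchvision detector on a received image and reports server-side latency. It is illustrative only; in practice you might build this on the Gabriel offloading framework mentioned elsewhere in this document, and substitute whichever DNN and task you selected.

    # Illustrative cloudlet-side detection sketch; the model and I/O path are
    # placeholders for whatever offload transport and DNN you actually use.
    import time
    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image

    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    def detect(jpeg_path: str) -> dict:
        """Run one detection pass and report the server-side latency."""
        img = to_tensor(Image.open(jpeg_path).convert("RGB"))
        start = time.perf_counter()
        with torch.no_grad():
            out = model([img])[0]
        return {"boxes": out["boxes"].tolist(),
                "labels": out["labels"].tolist(),
                "latency_ms": (time.perf_counter() - start) * 1000}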

This project will help you develop skills in computer vision, deep learning on mobile devices, and building edge-native applications. You should be familiar with deep learning, Python, and Android. Basic knowledge of computer vision will be helpful as well.

6

Edge-native Indoor Navigation with Advanced Visual-Geometric SLAM

Jingao Xu

1 or 2 students

Visual SLAM (Simultaneous Localization and Mapping) is a key technology that enables devices to understand their location while simultaneously building a map of unknown environments using camera input. It plays a crucial role in indoor navigation systems where GPS is unavailable. This project aims to develop a next-generation indoor navigation system for GHC floors 6-9 using Visual Geometry Grounded Transformer (VGGT) based SLAM methods, which offer significant improvements over traditional approaches.


The project involves three main components: (1) building dense 3D point cloud maps of GHC using the currently open-sourced VGGT-SLAM; (2) aligning these point clouds with official floor plans to create room-level semantic navigation information; and (3) implementing image-based localization on top of (1) and (2), so that users/robots can determine their location and navigate to target destinations by simply taking a photo.
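
For component (2), one simple starting point is to estimate a 2-D similarity transform (scale, rotation, translation) that maps SLAM map coordinates into the floor-plan frame from a handful of manually picked correspondences (doors, corners). The sketch below is illustrative only; all point values are made up.

    # Illustrative Umeyama-style alignment of SLAM map coordinates (meters)
    # to floor-plan coordinates (pixels) from a few correspondences.
    import numpy as np

    def fit_similarity(src: np.ndarray, dst: np.ndarray):
        """Least-squares fit of dst ~= scale * R @ src + t (both arrays are Nx2)."""
        mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
        sc, dc = src - mu_s, dst - mu_d
        cov = dc.T @ sc / len(src)
        U, D, Vt = np.linalg.svd(cov)
        S = np.eye(2)
        if np.linalg.det(U @ Vt) < 0:
            S[1, 1] = -1.0
        R = U @ S @ Vt
        scale = np.trace(np.diag(D) @ S) / ((sc ** 2).sum() / len(src))
        t = mu_d - scale * R @ mu_s
        return scale, R, t

    # Made-up example: three landmarks seen both in the SLAM map and floor plan.
    slam_pts = np.array([[0.0, 0.0], [4.2, 0.1], [4.0, 6.3]])
    plan_pts = np.array([[120.0, 80.0], [540.0, 95.0], [515.0, 710.0]])
    s, R, t = fit_similarity(slam_pts, plan_pts)
    mapped = (s * (R @ slam_pts.T)).T + t   # SLAM points expressed in floor-plan pixels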

Following the edge computing paradigm, images will be offloaded to edge servers for processing, leveraging computational resources beyond mobile device capabilities.


Students will gain experience with modern VGGT and SLAM techniques, 3D computer vision, semantic mapping, and edge-native application development using our Gabriel offloading framework.

Students should have basic familiarity with Python, Android development, and DNN-based (ViT, DINO, etc.) computer vision concepts. This project provides hands-on experience with state-of-the-art CV advances while building a practical indoor navigation solution.

7

Running Large Models on Small, Mobile Devices

Babu Pillai

1 or 2 students

LLMs and other large AI models are currently at the forefront of AI research, pushing the bounds of what AI systems are capable of doing. Such models have been developed and trained on large machines with multiple GPUs and assume highly capable machines for deployment. Modern mobile devices often include GPUs and neural network accelerators, though with less capability and capacity than large AI machines. Can such models be deployed on small, relatively resource-poor mobile devices? This project seeks to deploy some models on mobile devices using the latest available research software for inference on mobile hardware, employing techniques such as quantization to make large models fit. The goal is to demonstrate execution of a large model on the mobile device and evaluate it in terms of both accuracy and speed relative to an offloaded inference engine running on a large GPU machine.
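
As a small, hedged illustration of one such technique, the sketch below applies PyTorch dynamic int8 quantization to a toy model that merely stands in for a real large model. On an actual phone you would more likely use an on-device runtime (e.g. ExecuTorch, TFLite, or llama.cpp), but the size-versus-fidelity trade-off is the same idea.

    # Illustrative dynamic quantization sketch: int8 weights shrink the model
    # roughly 4x while keeping the same calling interface.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))

    def param_mbytes(m: nn.Module) -> float:
        return sum(p.numel() * p.element_size() for p in m.parameters()) / 1e6

    quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

    x = torch.randn(1, 4096)
    print("fp32 params:", round(param_mbytes(model), 1), "MB")
    print("int8 output shape:", quantized(x).shape)   # same interface, smaller stored weights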

8

Casual Time-Lapse Photography

Babu Pillai

1 or 2 students

Time-lapse photography is an invaluable aid in the study of relatively slow phenomena, such as the movement of ice or the growth of plants. However, setting up the equipment to capture a time-lapse video can be quite challenging, as this involves long-term installation of weather-, tamper-, and theft-resistant equipment. In this project, we seek to build a system for capturing time-lapse video in a more casual way, without installation of specialized equipment. Instead, we wish to leverage images captured by mobile devices when they happen to view the scene of interest. These images may be crowdsourced or captured by autonomous vehicles or drones that pass by the scene frequently. The key deliverable of this project is a cloud-based service to collect the images and merge them into a time-lapse video. There are two key challenges. First, the images will not be from precisely the same camera position or with the same camera parameters (lighting, angle of view, etc.); the cloud service will need to register the images against each other and may need to transform or warp images to match as closely as possible using various image matching and computer vision techniques. Second, the images will not be spaced evenly in time; this may require discarding some images while interpolating between others to produce a temporally accurate and smooth final video. A potential final demo may be a time-lapse of a sprouting plant, constructed from images captured by multiple devices/people in an informal crowdsourced way.
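
As a sketch of the registration challenge, the snippet below warps one contributed image onto a chosen reference frame using ORB features and a RANSAC-estimated homography in OpenCV. File names are placeholders, and this is one reasonable starting point rather than the required approach.

    # Illustrative registration step: align a crowdsourced image to a reference
    # frame so the scene stays fixed across the time-lapse.
    import cv2
    import numpy as np

    ref = cv2.imread("reference.jpg")
    new = cv2.imread("contributed.jpg")

    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY), None)
    k2, d2 = orb.detectAndCompute(cv2.cvtColor(new, cv2.COLOR_BGR2GRAY), None)

    # Match descriptors and keep the geometrically consistent ones via RANSAC.
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d2, d1)
    src = np.float32([k2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # The warped image is now aligned to the reference and can become one frame
    # of the time-lapse; uneven timestamps still need interpolation or dropping.
    aligned = cv2.warpPerspective(new, H, (ref.shape[1], ref.shape[0]))
    cv2.imwrite("aligned.jpg", aligned)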

9

3-D Localization of Distributed Edge Devices in a Cloudlet System

Alex Gaudio and Asim Smailagic

1 or 2 students

A common problem in mobile computing with a network of edge devices is mapping the 3-D locations of the edge devices in relation to each other. Possible approaches to 3-D localization include the analysis of RF signals (e.g. WiFi, NFC, etc.), sound, and other technologies. The localization system can be active, such as emitting an impulse with one network device and receiving the impulse response on other network devices, or passive, such as observing ambient signals simultaneously with all sensors and coordinating a location based on that distributed observation. Technologies also differ greatly in their robustness to environmental noise and in how precisely the system can localize a node. The task in this project is to develop a 3-D localization technology of your choice that can identify where an edge device node is within a cloudlet system of similar devices. We encourage a hardware implementation of the project (e.g. with STM32 or ESP32 devices). It is your responsibility to review the technologies, define an approach and expected resolution (e.g. cubic centimeters), and develop a 3-D localization technology wherein each node can identify where it is relative to other nodes. The nodes don't need to be continuously connected, but an expected output of the system when all nodes are synchronized should be a 3-D visualization of all node locations. All processing should be feasible to perform on embedded hardware, such as an STM32, ESP32, Raspberry Pi, or mobile phone.
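
As a hedged sketch of one possible back-end step: once pairwise distances have been estimated (for example, from acoustic time-of-flight), a node's 3-D position relative to nodes with known positions can be solved by nonlinear least squares. All coordinates and distances below are made up for illustration.

    # Illustrative trilateration: estimate one node's 3-D position from
    # measured distances to nodes whose positions are already known.
    import numpy as np
    from scipy.optimize import least_squares

    anchors = np.array([[0.0, 0.0, 0.0],    # known node positions, meters
                        [3.0, 0.0, 0.0],
                        [0.0, 4.0, 0.0],
                        [0.0, 0.0, 2.5]])
    measured = np.array([2.9, 2.3, 3.1, 2.2])   # estimated distances to the unknown node

    def residuals(p):
        """Difference between predicted and measured distances for candidate position p."""
        return np.linalg.norm(anchors - p, axis=1) - measured

    fit = least_squares(residuals, x0=np.array([1.0, 1.0, 1.0]))
    print("estimated position (m):", fit.x.round(2))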