Hi, I’m Nirav!

I’m a fourth-year Ph.D. student in the Computer Science Department (CSD) at Carnegie Mellon University (CMU), where I’m fortunate to be advised by Prof. Justine Sherry. My research area is Networking, and I’m part of the Systems, Networking, and Performance Lab (SNAP) at CMU; I’m secondarily affiliated with CONIX and CMU’s Parallel Data Lab (PDL). My research interests broadly lie at the intersection of systems and performance modeling (i.e. building high-performance networked systems and proving theoretical properties about their performance and stability).

Prior to starting graduate school, I completed my undergraduate degree (B.A.Sc.) in Computer Engineering at the University of Toronto, Canada, in 2018. Despite my apparent penchant for cold, capricious weather, I spent most of life in Mumbai, India. In my spare time, I enjoy reading (fiction, philosophy), imbibing copious amounts of espresso, and playing chess, golf, and soccer.


Algorithmic complexity attacks (ACAs) are a class of denial-of-service (DoS) attacks where an attacker uses a small amount of adversarial traffic to induce a large amount of work in the target system, pushing the system into overload and causing it to drop packets from innocent users. ACAs are particularly dangerous because, unlike volumetric DoS attacks, ACAs don't require a significant network bandwidth investment from the attacker. Today, network functions (NFs) on the Internet must be painstakingly designed and engineered on a case-by-case basis to mitigate the debilitating impact of ACAs. Further, the resulting designs tend to be overly conservative in their attack mitigation strategy, limiting the innocent traffic that the NF can serve under common-case operation.

In this work, we propose a general framework to make any NF more resilient to ACAs without the limitations of prior approaches. Our framework, SurgeProtector, uses the NF's scheduler to mitigate the impact of ACAs using a very traditional scheduling algorithm: weighted-shortest-job-first (WSJF). To evaluate SurgeProtector, we propose a new metric of vulnerability called the Displacement Factor (DF), which quantifies the 'harm per unit effort' which an adversary can inflict on the system. We provide novel, adversarial analysis of WSJF and show that any system using this policy has a worst-case DF of only a small constant, where traditional schedulers place no upper bound on the DF. Illustrating that SurgeProtector is not only theoretically, but practically robust, we integrate SurgeProtector into an open source intrusion detection system (IDS). Under simulated attack, the SurgeProtector-augmented IDS suffers 90-99% lower innocent traffic loss than the original system.
  author = {Atre, Nirav and Sadok, Hugo and Chiang, Erica and Wang, Weina and Sherry, Justine},
  title = {SurgeProtector: Mitigating Temporal Algorithmic Complexity Attacks using Adversarial Scheduling},
  year = {2022},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  booktitle = {Proceedings of the 2022 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM)}
Kernel-bypass network APIs, which allow applications to circumvent the kernel and interface directly with the NIC hardware, have recently emerged as one of the main tools for improving application network performance. However, allowing applications to circumvent the kernel makes it impossible to use tools (e.g., tcpdump) or impose policies (e.g., QoS and filters) that need to consider traffic sent by different applications running on a host. This makes maintainability and manageability a challenge for kernel-bypass applications. In response we propose Kernel On-Path Interposition (KOPI), in which traditional kernel dataplane functionality is retained but implemented in a fully programmable SmartNIC. We hypothesize that KOPI can support the same tools and policies as the kernel stack while retaining the performance benefits of kernel bypass.
  author = {Sadok, Hugo and Zhao, Zhipeng and Choung, Valerie and Atre, Nirav and Berger, Daniel S. and Hoe, James C. and Panda, Aurojit and Sherry, Justine},
  title = {We Need Kernel Interposition over the Network Dataplane},
  year = {2021},
  isbn = {},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3458336.3465281},
  doi = {10.1145/3458336.3465281},
  booktitle = {Proceedings of the Workshop on Hot Topics in Operating Systems},
  pages = {152--158},
  month = may,
  series = {HotOS '21}
Intrusion Detection and Prevention Systems (IDS/IPS) are among the most demanding stateful network functions. Today's network operators are faced with securing 100Gbps networks with 100K+ concurrent connections by deploying IDS/IPSes to search for 10K+ rules concurrently. In this paper we set an ambitious goal: Can we do all of the above in a single server? Through the Pigasus IDS/IPS, we show that this goal is achievable, perhaps for the first time, by building on recent advances in FPGA-capable SmartNICs. Pigasus' design takes an FPGA-first approach, where the majority of processing, and all state and control flow are managed on the FPGA. However, doing so requires careful design of algorithms and data structures to ensure fast common-case performance while densely utilizing system memory resources. Our experiments with a variety of traces show that Pigasus can support 100Gbps using an average of 5 cores and 1 FPGA, using 38x less power than a CPU-only approach.
  author = {Zhao, Zhipeng and Sadok, Hugo and Atre, Nirav and Hoe, James and Sekar, Vyas and Sherry, Justine},
  title = {Achieving 100Gbps Intrusion Prevention on a Single Server},
  booktitle = {14th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 20)},
  year = {2020},
  address = {Banff, Alberta},
  url = {https://www.usenix.org/conference/osdi20/presentation/zhao-zhipeng},
  publisher = {USENIX Association},
  month = nov,
Caches are at the heart of latency-sensitive systems. In this paper, we identify a growing challenge for the design of latency-minimizing caches called delayed hits. Delayed hits occur at high throughput, when multiple requests to the same object queue up before an outstanding cache miss is resolved. This effect increases latencies beyond the predictions of traditional caching models and simulations; in fact, caching algorithms are designed as if delayed hits simply didn’t exist. We show that traditional caching strategies – even so called ‘optimal’ algorithms – can fail to minimize latency in the presence of delayed hits. We design a new, latency-optimal offline caching algorithm called BELATEDLY which reduces average latencies by up to 45% compared to the traditional, hit-rate optimal Belady’s algorithm. Using BELATEDLY as our guide, we show that incorporating an object’s ‘aggregate delay’ into online caching heuristics can improve latencies for practical caching systems by up to 40%. We implement a prototype, Minimum-AggregateDelay (MAD), within a CDN caching node. Using a CDN production trace and backends deployed in different geographic locations, we show that MAD can reduce latencies by 12-18% depending on the backend RTTs.
  author = {Atre, Nirav and Sherry, Justine and Wang, Weina and Berger, Daniel S.},
  title = {Caching with Delayed Hits},
  year = {2020},
  isbn = {9781450379557},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3387514.3405883},
  doi = {10.1145/3387514.3405883},
  booktitle = {Proceedings of the 2020 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM)},
  pages = {495–513},
  numpages = {19},
  keywords = {Caching, Belatedly, Delayed hits},
  location = {Virtual Event, USA},
  series = {SIGCOMM ’20}

Research Talks

SurgeProtector: Mitigating Temporal Algorithmic Complexity Attacks (ACAs) using Adversarial Scheduling
Caching with Delayed Hits

Industry Experience

Summer, 2020 (4 months)
Summer, 2018 (4 months)
Sep, 2016 - Aug, 2017 (1 year)
Summer, 2016 (4 months)