ML for ML: Learning Cost Semantics by Experiment
Carnegie Mellon University
Advisor: Prof. Jan Hoffmann
It is an open problem in static resource-bound analysis to connect high-level resource bounds with the actual execution time and memory usage of compiled machine code. We employed machine-learning techniques, such as linear regression, to derive a cost semantics for a high-level source language that approximates the execution time (including garbage collection) and the memory behavior of programs on a specific hardware platform. The proposed technique starts by fixing a cost semantics for the source language in which certain constants are unknown. To learn the constants for a specific hardware platform, a machine-learning algorithm measures the resource cost of a set of training programs and compares the measurements with the predictions of the cost semantics. The quality of the learned cost model is evaluated by comparing its predictions with the measured cost on a set of independent control programs. We implemented this technique for a subset of OCaml using Inria's OCaml compiler; the prediction error on the control programs was about 13%.
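The learning step can be sketched as an ordinary least-squares fit: each training program contributes a vector of construct-execution counts, and the unknown per-construct cost constants are recovered from the measured running times. The construct names, counts, and noise level below are illustrative stand-ins, not the actual experimental setup.

```python
import numpy as np

# Hypothetical cost model: running time = (construct-execution counts) . (cost constants).
# Columns stand for, e.g., arithmetic ops, function calls, allocations (illustrative).
true_costs = np.array([1.0, 5.0, 12.0])  # unknown in practice

rng = np.random.default_rng(0)
# Each row: construct-execution counts for one training program.
counts = rng.integers(10, 1000, size=(50, 3)).astype(float)
# Measured running times with noise (stands in for real measurements).
measured = counts @ true_costs + rng.normal(0.0, 2.0, size=50)

# A least-squares fit recovers the per-construct cost constants.
learned, *_ = np.linalg.lstsq(counts, measured, rcond=None)
print(learned)
```

With enough training programs the fit averages out the measurement noise; the learned constants land close to the true ones even though no single measurement is exact.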
Angelic Verification in the Presence of Aliasing
Microsoft Research, Bangalore
Guides: Akash Lal and Shuvendu Lahiri
Program verification is the act of proving the correctness of programs with respect to a specification using formal logic. We call the verification angelic because we make angelic assumptions about the environment, i.e. we report a bug only when there is no specification of the environment under which the assertion holds. Before performing verification, we run an alias analysis to discharge trivial assertions, which improves scalability. We instrument the program using compiler optimizations such as common-subexpression elimination and global value numbering to improve the precision of the alias analysis without sacrificing scalability. This is followed by a context-insensitive, flow-insensitive alias analysis. Using this approach, we successfully discharge 95% of the assertions.
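As a rough illustration of how a context-insensitive, flow-insensitive alias analysis discharges trivial assertions, here is a minimal Andersen-style points-to sketch. The statement encoding and variable names are invented for illustration; this is not the Microsoft Research implementation.

```python
# Flow-insensitive points-to analysis over two statement forms:
#   (p, 'addr', x)  meaning  p = &x
#   (p, 'copy', q)  meaning  p = q
def points_to(stmts):
    pts = {}       # variable -> set of locations it may point to
    copies = []
    for lhs, op, rhs in stmts:
        if op == 'addr':
            pts.setdefault(lhs, set()).add(rhs)
        else:
            copies.append((lhs, rhs))
    # Propagate copy constraints to a fixpoint (order-independent,
    # hence flow-insensitive).
    changed = True
    while changed:
        changed = False
        for lhs, rhs in copies:
            before = len(pts.setdefault(lhs, set()))
            pts[lhs] |= pts.get(rhs, set())
            changed |= len(pts[lhs]) != before
    return pts

pts = points_to([('p', 'addr', 'a'), ('q', 'addr', 'b'), ('r', 'copy', 'p')])
# p and q have disjoint points-to sets, so they can never alias and an
# assertion like assert(p != q) is discharged statically.
print(pts['p'] & pts['q'], pts['r'])
```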
Termination of Initialized Integer Linear Programs
Senior Undergraduate Thesis, IIT Bombay
Guides: Prof. Supratik Chakraborty and Prof. Akshay S
Linear programs are simple while-loop programs in which the loop guard and the update are linear functions of the program variables, which are initialized with given values. We are studying whether termination of such programs is decidable. Transition systems, especially Petri nets, closely resemble linear programs. We have successfully simulated such programs using several classes of Petri nets: classical nets with reset and inhibitor arcs, simultaneous disjoint-transfer nets, time Petri nets, and timed-arc Petri nets with age invariants. Unfortunately, we have been unable to prove that these nets are well-structured transition systems, so proving termination or coverability for them remains nontrivial.
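A minimal instance of such a program, with an invented guard and update, makes the shape of the problem concrete. Bounded simulation of course cannot decide termination in general; it only witnesses termination when the guard eventually fails.

```python
# A hypothetical initialized integer linear loop:
#   x, y := 7, -3
#   while x + y > 0:  x, y := x - 2*y, y + 1
# Guard and update are both linear in the program variables.
def run(x, y, max_steps=1000):
    steps = 0
    while x + y > 0 and steps < max_steps:
        x, y = x - 2 * y, y + 1
        steps += 1
    # Return the steps taken and whether the guard actually failed
    # (as opposed to hitting the simulation bound).
    return steps, (x + y <= 0)

print(run(7, -3))  # this instance terminates after 9 iterations
```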
Pure Nash Equilibria in Stochastic Games
Research & Development Project, IIT Bombay
Guides: Prof. Krishna Shankar and Prof. Ashutosh Trivedi
Ummels and Wojtczak initiated the study of the existence of Nash equilibria satisfying specific probability bounds in simple stochastic multi-player games. They showed that deciding the existence of a pure-strategy Nash equilibrium in which a fixed player wins with probability 1 is undecidable for games with 9 players, and that the problem remains undecidable for finite-strategy Nash equilibria with 14 players. We improved these undecidability results by showing that the PureNE and FinNE problems remain undecidable for 5 or more players. We also showed that for the quantitative versions of these problems, i.e. the existence of a Nash equilibrium in which a fixed player wins with probability p ∈ [0, 1), four players suffice for undecidability. We submitted a paper on these results to LATA 2015.
Robust Method to find Layout Similarity between Documents
Summer Internship (2013), Adobe Research, Noida
Guide: Ram Bhushan Agrawal
Document layout represents the arrangement of visual elements on a page. The algorithm begins by segmenting the layout into blocks of two types: text blocks and non-text blocks. The blocks are then matched by similarity, measuring the overlap between blocks and the Manhattan distance between their positions, and combining these features into a single metric. Unmatched blocks are then coalesced into larger blocks and matched again, this time with a coalescing penalty. The algorithm terminates once the penalty exceeds the computed similarity. We filed a US patent application for this algorithm with the United States Patent and Trademark Office.
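The block-matching metric can be sketched as a weighted combination of area overlap and center-to-center Manhattan distance. The rectangle encoding, the weighting, and the distance normalization below are hypothetical choices for illustration, not the patented metric.

```python
# Blocks are axis-aligned rectangles (x1, y1, x2, y2).
def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def manhattan(a, b):
    # Manhattan distance between block centers.
    ca = ((a[0] + a[2]) / 2, (a[1] + a[3]) / 2)
    cb = ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)
    return abs(ca[0] - cb[0]) + abs(ca[1] - cb[1])

def similarity(a, b, alpha=0.5):
    # Normalized overlap (fraction of the larger block covered) combined
    # with a distance term that decays as blocks drift apart.
    area = max((a[2] - a[0]) * (a[3] - a[1]), (b[2] - b[0]) * (b[3] - b[1]))
    ov = overlap_area(a, b) / area if area else 0.0
    return alpha * ov + (1 - alpha) / (1 + manhattan(a, b))

print(similarity((0, 0, 10, 10), (2, 2, 12, 12)))
```

Two identical blocks score alpha + (1 - alpha) = 1; disjoint, distant blocks score near 0, which gives the coalescing penalty something meaningful to be compared against.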
Average Case Analysis of Standard Reachability Algorithm on Büchi Games
Summer Internship (2012), IST Austria, Klosterneuburg
Guide: Prof. Krishnendu Chatterjee
Büchi games are standard game graphs with Büchi objectives, where a designated set of vertices must be visited infinitely often. The classical algorithm for Büchi games has quadratic worst-case time complexity, but experiments suggest that the average running time is actually linear. We developed a working model to simulate Büchi games, derived expressions for the probability of reaching the whole graph in linear time, and maximized this probability as a function of the target-set size. The target sizes obtained this way correctly matched the experimental data, and we investigated whether the probability mass of the target sizes not covered by this region is infinitesimally small.
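The core subroutine of the classical algorithm is the attractor computation: the set of vertices from which player 0 can force a visit to a target set. The sketch below uses the standard textbook fixpoint on an invented graph encoding; it is not our simulation model.

```python
# succ: vertex -> list of successors; player0: set of player-0 vertices;
# target: the set to be reached. Remaining vertices belong to player 1.
def attractor(succ, player0, target):
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v, nxt in succ.items():
            if v in attr:
                continue
            if v in player0:
                # Player 0 needs only one successor inside the attractor.
                ok = any(u in attr for u in nxt)
            else:
                # Player 1 is forced in only if every successor is inside.
                ok = all(u in attr for u in nxt)
            if ok:
                attr.add(v)
                changed = True
    return attr

succ = {1: [2], 2: [3], 3: [3], 4: [3, 1], 5: [5, 3]}
print(sorted(attractor(succ, {1, 2, 3}, {3})))
```

Vertex 5 (player 1) escapes via its self-loop, so it stays outside the attractor; the classical Büchi algorithm iterates this computation, which is where the quadratic worst case comes from.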