Mastering PhD

I am extremely passionate about my PhD at CMU. I would like it to be a systematic work of art, so I am binding my experience to a concrete framework.


I love and am invested in the grey of becoming an excellent researcher. I like to think of it as the process of accumulating the five infinity stones: Algorithms, Collaboration, Data, Development, and Infrastructure. To me, the PhD is an excellent opportunity to earn the #aukaat and perhaps even get a head start on some of these infinity stones, which I wish to harness when going for the others. Long term, I see myself as Researcher++.


I subscribe to the idea that one needs to identify the big ideas and then execute well against them.
In my PhD, my big argument is that we are still not certain that we can deploy AI as a key enabling technology. I believe there are three big problems the technology side of AI needs to solve:
(1) The Scalability Problem
(2) The Flexibility Problem
(3) The Explainability Problem

Any technology that aspires to be part of the human toolbox needs, I argue, to solve these three problems before getting into interesting (and important) topics like ethics and bias. Sadly, I don't see AI addressing them. Hence, in my thesis I propose a framework to address all three problems simultaneously. Read more and hit me up with arguments!


The following requisites follow from the philosophy above:
(1) Working Knowledge: To work on language processing, one needs complete working knowledge of at least one domain and competitive working knowledge of most domains in language technologies. FALCON is my attempt in this direction.
(2) Application Knowledge: To work on the rich, diverse set of tasks in language processing, one needs both working knowledge and application knowledge of different domains. I aim to address this through fruitful collaborations.
(3) Exhaustive Experimentation Skills: Any meaningful research involving human deployment requires large-scale, exhaustive experimentation skills. To me, this means the ability to build scalable infrastructure from the ground up: for instance, everything from hacking together a GPU machine to massively parallelizing work on AWS.

Since it is impractical for a student to acquire all the requisites at once, one needs to strategize the path. To me, this is a beautiful design problem where we can naturally apply engineering principles. For example, we can start acquiring some skills early (like Docker-ing, working with crowds, etc.) and do things in parallel (by collaborating).

Post with suggestions for prospective students: Optimizing returns from coursework in LTI