Computer Science Thesis Proposal

  • Gates Hillman Centers
  • Traffic21 Classroom 6501
  • Ph.D. Student
  • Computer Science Department
  • Carnegie Mellon University

Efficient Deep Learning

Training deep neural networks is a computationally intensive process, and given the proven utility of deep learning, efficiency is an important concern. In this thesis, we will review our previous related work on reducing communication overhead in distributed deep learning, speeding up learning by boosting the error gradients, and implementing neural networks efficiently on GPUs. We propose a new and simple method for layer-wise training of deep neural networks that allows for the incremental addition of layers, so that the final architecture need not be known in advance. In conjunction, we explore a novel optimization method for non-linear regression problems that uses error deltas instead of gradients and performs very well in simulations. We will investigate how this algorithm compares to gradient descent and how it may be applied to training neural networks. Our end goal is to make deep network training faster, simpler, and less reliant on expert knowledge.
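The incremental-depth idea above can be caricatured in a few lines. The sketch below is an assumption-laden illustration, not the proposed method: each new hidden layer here keeps fixed random weights once added (the actual proposal presumably trains new layers), and only a ridge-regression readout is refit after each layer is appended. The toy task, layer sizes, and all names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression task (a hypothetical stand-in, not from the proposal)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X) + 0.1 * rng.normal(size=(200, 1))

def fit_readout(H, y, lam=1e-3):
    """Ridge-regression readout fitted on top of the current frozen features."""
    Hb = np.hstack([H, np.ones((H.shape[0], 1))])  # append a bias column
    return np.linalg.solve(Hb.T @ Hb + lam * np.eye(Hb.shape[1]), Hb.T @ y)

def predict(H, W):
    Hb = np.hstack([H, np.ones((H.shape[0], 1))])
    return Hb @ W

layers = []   # frozen hidden layers added so far, as (W, b) pairs
H = X         # representation fed to the next layer / readout
for depth in range(1, 4):
    # Grow the network by one hidden layer; earlier layers stay frozen,
    # so the final depth need not be fixed in advance.
    W_h = rng.normal(scale=1.0, size=(H.shape[1], 16))
    b_h = np.zeros(16)
    layers.append((W_h, b_h))
    H = np.tanh(H @ W_h + b_h)
    W_out = fit_readout(H, y)            # only the readout is (re)trained here
    mse = float(np.mean((predict(H, W_out) - y) ** 2))
    print(f"depth {depth}: train MSE = {mse:.4f}")
```

The point of the sketch is the control flow, not the numbers: the architecture grows one layer at a time, and after each growth step only the newest parameters (here, the readout) are optimized.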

Thesis Committee:
Roger B. Dannenberg (Co-Chair)
Bhiksha Raj (Co-Chair)
Zico Kolter
Ruslan Salakhutdinov
Douglas Eck (Google Brain/Magenta)


