Research

Explainable AI

My thesis research focuses on what we mean when we attribute properties like explainability, interpretability, and intelligibility to an artificial intelligence. In approaching the problem, I argue that technology and design are co-dependent: for any technology to be adopted, end-users need to trust and understand it, and what counts as trust or understanding varies with each user's aims. Can we define interpretability axiomatically? How do we enable trust and understanding when each end-user brings different biases that must be taken into account? How can we efficiently communicate knowledge between a human and a machine? In seeking answers to these questions, I am laying out a philosophical foundation for future work on the topic.

Bounding Box Classifier

This is one approach to a model that is inherently 'explainable'. Every design decision favors the simpler of the available options. For instance, we find boxes in 2D subspaces of the data, which makes visualization straightforward. These subspaces are also axis-aligned in the native feature space, so the machine is constrained to reason in the same space as the humans who created the data; no feature transformations are allowed. By building different ensembles of these 2D boxes, we can construct different kinds of robust learning models.
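To make the idea concrete, here is a minimal sketch of an ensemble of axis-aligned 2D boxes. It is illustrative only and does not reproduce the released tool's API: the class name `BoxEnsembleSketch`, its `fit`/`predict` methods, and the choice to fit one box per feature pair around the positive class are all assumptions made for this example.

```python
import numpy as np
from itertools import combinations


class BoxEnsembleSketch:
    """Illustrative ensemble of axis-aligned 2D bounding boxes (not the released API).

    For every pair of original features, fit one box around the positive-class
    points in that 2D subspace. A new point is scored by the fraction of boxes
    that contain it, so every vote traces back to two named, untransformed features.
    """

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        pos = X[y == 1]          # assumed binary labels, with 1 as the positive class
        self.boxes_ = []
        for i, j in combinations(range(X.shape[1]), 2):
            # The box is just the per-feature min/max of the positive class in
            # the (i, j) subspace; no feature transformations are applied.
            lo = pos[:, [i, j]].min(axis=0)
            hi = pos[:, [i, j]].max(axis=0)
            self.boxes_.append((i, j, lo, hi))
        return self

    def predict(self, X, threshold=0.5):
        X = np.asarray(X, dtype=float)
        votes = np.zeros(len(X))
        for i, j, lo, hi in self.boxes_:
            # A box votes for a point when both coordinates fall inside it.
            inside = np.all((X[:, [i, j]] >= lo) & (X[:, [i, j]] <= hi), axis=1)
            votes += inside
        return (votes / len(self.boxes_) >= threshold).astype(int)
```

Because each box lives in a plane spanned by two original features, any individual vote can be drawn as a rectangle over a scatter plot of those two features, which is the visualization benefit described above.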


Recently, the Auton Lab published an open-source version of the tool. You can check it out here: Bounding Box Classifier

I recently gave a talk on the subject, which you can view below. In it, I summarize the bounding box classifier and show a few use cases where we can leverage simple structure in the data.