Explaining Decision Trees is Harder Than You Would Think!
March 29, 2023 (GHC 8102)

Abstract: For a long time, the folklore wisdom within the Machine Learning community has been that “decision trees are the most interpretable models”. In recent years, however, this position has been challenged by studying the interpretability of models through a formal lens: by casting well-defined explanations as computational problems, we can study their computational complexity over different classes of models, thus establishing which models are easier or harder to explain (and hence to interpret). Surprisingly, decision trees are not as interpretable as folklore wisdom would suggest. In this talk I’ll explain several results in this area that I have published over the years (NeurIPS 2020, 2021, and 2022), showing that computing minimum-size explanations for decision trees is NP-hard, and that this hardness persists for probabilistic certificates, approximate certificates, and under the lens of parameterized complexity. These results on decision trees will be contrasted with analogous results for neural networks and linear models. Finally, if time allows, I will briefly mention new unpublished results on PAC-explanations, as well as sharp inapproximability results under the assumption that NP is not contained in QP (quasi-polynomial time).
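
To make the computational problem behind the talk concrete, here is a minimal sketch (not from the talk itself) of computing a minimum sufficient-reason explanation for a toy decision tree over binary features. The tree encoding, the helper names, and the example instance are all illustrative assumptions; the brute-force search over feature subsets hints at why the minimum-size version of the problem is hard in general.

```python
from itertools import combinations, product

# Toy decision tree over binary features x0, x1, x2, encoded as nested
# tuples (feature_index, left_subtree, right_subtree), where the left
# branch means the feature is 0 and the right branch means it is 1.
# A leaf is just its label, 0 or 1. This encoding is illustrative.
TREE = (0,
        (1, 0, 1),   # x0 = 0: predict x1
        (2, 0, 1))   # x0 = 1: predict x2

def predict(tree, x):
    """Walk from the root to a leaf following the instance's feature values."""
    while isinstance(tree, tuple):
        feature, left, right = tree
        tree = right if x[feature] == 1 else left
    return tree

def is_sufficient(tree, x, subset):
    """A feature subset S is a sufficient reason for instance x if every
    completion of x that agrees with x on S gets the same prediction."""
    target = predict(tree, x)
    free = [i for i in range(len(x)) if i not in subset]
    for values in product([0, 1], repeat=len(free)):
        y = list(x)
        for i, v in zip(free, values):
            y[i] = v
        if predict(tree, y) != target:
            return False
    return True

def minimum_sufficient_reason(tree, x):
    """Brute-force search over all feature subsets, smallest first.
    Exponential in the number of features, which is consistent with the
    NP-hardness of computing minimum-size explanations discussed in the talk."""
    n = len(x)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if is_sufficient(tree, x, set(subset)):
                return set(subset)

x = (1, 0, 1)                              # instance: x0=1, x1=0, x2=1
print(predict(TREE, x))                    # 1
print(minimum_sufficient_reason(TREE, x))  # {0, 2}
```

For this toy instance, fixing x0 = 1 and x2 = 1 already forces the prediction to 1 regardless of x1, so {0, 2} is a minimum sufficient reason. On trees with many features, this exhaustive search blows up, and the hardness results mentioned above show that, unless widely believed complexity assumptions fail, no algorithm can avoid this blow-up in the worst case.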