Boolean Is Not Too Restrictive: Learning Optimal Decision Trees via MaxSAT
My Session Status
The goal of this talk is to argue that the answer to this dilemma is positive in more cases than one might expect. We use decision trees as a case study. They are highly accurate yet interpretable machine learning models and are commonly used in sensitive applications. The majority of decision tree learning problems are NP-hard and as such can theoretically be solved using SAT. However, decision tree learning typically requires support for non-binary data and objectives, which are challenging to encode in the boolean world of SAT.
We present a set of techniques and encodings that overcome these challenges, enabling us to solve certain variants of this problem for the first time and improve the state of the art for others. We further discuss the benefits of an exact approach to learning decision trees, including modularity between components, theoretical guarantees, and support for user-specified constraints. We conclude with an overview of experimental results demonstrating the practical effectiveness of the approach.
References
Pouya Shati, Eldan Cohen, and Sheila McIlraith. “Optimal decision trees for interpretable clustering with constraints”. In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence. 2023, pp. 2022–2030.
Pouya Shati, Eldan Cohen, and Sheila McIlraith. “SAT-Based Approach for Learning Optimal Decision Trees with Non-Binary Features”. In: 27th International Conference on Principles and Practice of Constraint Programming (CP 2021). Schloss Dagstuhl-Leibniz-Zentrum f ̈ur Informatik. 2021.
Florent Avellaneda. “Efficient inference of optimal decision trees”. In: AAAI Conference on Artificial Intelligence (AAAI). 2020, pp. 3195–3202.
Hao Hu. “Interpretable Machine Learning Models via Maximum Boolean Satisfiability”. PhD thesis. INSA de Toulouse, 2022.