The vision of automated machine learning (AutoML) is to enable non-expert users to apply machine learning technology and to support expert users in their daily business by relieving them of tedious tasks. However, AutoML systems tend to function as black boxes with few or no means of interaction. I envision AutoML tools that allow for interaction with their users, making the AutoML process more transparent by providing information about the learning problem and the observations made along the way, thereby synergizing human experience and intuition with the efficient and effective algorithms of AutoML.
Research on automated machine learning (AutoML) is intrinsically motivated by efficiency: obtaining well-performing machine learning solutions as quickly as possible. However, efficiency does not necessarily entail sustainability. On the contrary, AutoML methods typically optimize for a single objective, exploiting the available resources to the maximum degree. In my research, I therefore focus on methods that use the available resources responsibly and consider sustainability on three levels: the machine learning pipelines configured by AutoML methods, the AutoML process itself, and across multiple AutoML processes.
In algorithm selection, we aim to predict which algorithm should be used for which input, which can result in immense speedups for computationally hard problems. To this end, observations of algorithm runtimes need to be collected, which are by definition expensive to obtain; some algorithm executions may even take virtually forever. In such cases, algorithm runs are terminated early, resulting in right-censored data that needs to be treated in a principled way.
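One common way of treating such right-censored runtimes, sketched below under the assumption of a fixed per-run cutoff, is PAR10-style imputation, where terminated runs are penalized with a multiple of the cutoff time. The function name and the penalty factor are illustrative choices, not a specific method from my work:

```python
import numpy as np

def impute_censored_runtimes(runtimes, censored, cutoff, penalty=10):
    """PAR10-style imputation: replace right-censored runtimes
    (runs terminated at the cutoff) with penalty * cutoff.

    runtimes: observed runtimes; censored entries equal the cutoff
    censored: boolean mask, True where the run was terminated early
    """
    runtimes = np.asarray(runtimes, dtype=float)
    censored = np.asarray(censored, dtype=bool)
    imputed = runtimes.copy()
    imputed[censored] = penalty * cutoff
    return imputed

# Example: three runs with a 60-second cutoff; the second run
# was cut off, so it is replaced by 10 * 60 = 600 seconds.
imputed = impute_censored_runtimes([12.3, 60.0, 41.7],
                                   [False, True, False], cutoff=60.0)
```

Such imputed values can then feed a standard regression-based algorithm selector, although they systematically underestimate the true (unknown) runtimes, which is one reason survival-analysis-style treatments are attractive.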
Multi-label classification denotes a supervised learning setting in which not a single class label but a subset of class labels is predicted. In my research, I have extensively investigated automated machine learning methods for configuring multi-label classifiers. In the course of this work, I have studied how to assess the quality of predictions and how the AutoML perspective may benefit the field of multi-label classification as a whole. For instance, I could demonstrate that selecting base learners individually for every label in binary relevance learning can yield significant performance improvements for label-wise averaged macro measures. This work was awarded the Frontier Prize for the most visionary contribution at the Intelligent Data Analysis (IDA) conference.
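The idea of per-label base learner selection in binary relevance can be sketched as follows. The two toy classifiers and the holdout-based selection criterion are purely illustrative assumptions, not the actual method or learner portfolio from that work:

```python
import numpy as np

# Two illustrative base learners: each fits on (X, y) and
# returns a predict(X) function.
def majority(X_tr, y_tr):
    label = int(y_tr.mean() >= 0.5)
    return lambda X: np.full(len(X), label)

def nearest_centroid(X_tr, y_tr):
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    return lambda X: (((X - c1) ** 2).sum(axis=1)
                      < ((X - c0) ** 2).sum(axis=1)).astype(int)

def fit_label_wise_br(X_tr, Y_tr, X_val, Y_val, learners):
    """Binary relevance with per-label learner selection: for each
    label, fit every candidate and keep the one with the best
    holdout accuracy on that label."""
    models = []
    for j in range(Y_tr.shape[1]):
        fitted = [fit(X_tr, Y_tr[:, j]) for fit in learners]
        accs = [(m(X_val) == Y_val[:, j]).mean() for m in fitted]
        models.append(fitted[int(np.argmax(accs))])
    return models  # one (possibly different) model per label
```

Because each label gets its own selection, labels with very different characteristics (e.g., imbalance or decision boundary shape) can end up with different base learners, which is the source of the macro-measure improvements mentioned above.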
Reliability is an increasingly demanded property of machine learning applications, and one way of improving reliability is to ensure that machine learning models can express their uncertainty truthfully. In this sense, predictions made by machine learning models should be neither over- nor underconfident but well-calibrated. In my research, I investigate how good calibration can be ensured through automated machine learning.
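As a minimal illustration of what "well-calibrated" means, the following sketch computes a binned expected calibration error (ECE) over the positive-class probabilities of a binary classifier: in each confidence bin, the average predicted probability should match the empirical fraction of positives. The equal-width binning is one common choice among several:

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary classification: the weighted average
    gap between mean predicted probability and the empirical
    positive rate, taken over equal-width probability bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        # include the left edge only for the first bin
        mask = (probs > lo) & (probs <= hi)
        if i == 0:
            mask |= probs == lo
        if mask.any():
            gap = abs(probs[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap
    return ece
```

A perfectly calibrated model scores zero; an overconfident model (high probabilities, lower accuracy) scores high, which makes ECE a natural candidate for a secondary objective in AutoML.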
Furthermore, the labels of the data may themselves be subject to uncertainty, and labeling processes should allow such uncertainty to be expressed.
Another topic I find fascinating is artificial evolution and how it can be interwoven with machine learning algorithms. In my research, I have worked on methods that synergize active learning and coevolution, devised (co-)evolutionary algorithms to optimize machine learning models, and studied evolutionary algorithms as optimizers for automated machine learning more generally.
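As a toy illustration of evolutionary optimization in this spirit, here is a (1+1) evolution strategy hill-climbing a hypothetical validation-score surrogate; the fitness function, mutation strength, and budget are all illustrative assumptions, not a specific algorithm from my work:

```python
import random

def one_plus_one_es(fitness, x0, sigma=0.5, iters=200, seed=0):
    """(1+1)-ES: mutate the parent with Gaussian noise and keep
    the offspring only if it is at least as fit (elitism)."""
    rng = random.Random(seed)
    x, fx = list(x0), fitness(x0)
    for _ in range(iters):
        y = [xi + rng.gauss(0, sigma) for xi in x]
        fy = fitness(y)
        if fy >= fx:  # never discard the best candidate seen so far
            x, fx = y, fy
    return x, fx

# Example: maximize a concave surrogate score with optimum at (1, 1),
# standing in for, say, two continuous hyperparameters.
score = lambda p: -sum((pi - 1.0) ** 2 for pi in p)
best, val = one_plus_one_es(score, [0.0, 0.0])
```

Because the loop only evaluates `fitness` and never needs gradients, the same skeleton applies to non-differentiable AutoML objectives such as cross-validated pipeline performance.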