This blog post has originally been published on automl.org
By Koen van der Blom, Holger Hoos, Alex Serban, Joost Visser
In our global survey among teams that build ML applications, we found ample room for increased adoption of AutoML techniques. While AutoML is adopted at least partially by more than 70% of teams in research labs and tech companies, for teams in non-tech and government organisations adoption is still below 50%. Full adoption remains limited to under 10% in research and tech, and to less than 5% in non-tech and government.
Software engineering for machine learning
With the inclusion of ML techniques in software, the risks and requirements of the software also changes. In turn, the best ways to maintain and develop software with ML components is different from traditional software engineering (TSE). We call this new field software engineering for machine learning (SE4ML).
Based on a study of the academic and “grey” literature (the latter comprises blog articles, presentation slides and white papers), we identified best practices for SE4ML, and composed a recommended reading list on the topic. These practices include the use of AutoML techniques to improve the use of ML components in software. All practices are described in our practice catalogue. We encourage readers to have a look and to send us their suggestions for additions.
Figure 1: Engineering best practices for machine learning.
Having a solid overview of the best practices, we set up a survey to find out the extent to which these SE4ML practices have already been adopted. We asked teams working on software with machine learning components to which degree they followed each practice in their work. In the resulting data we saw that tech companies have the highest adoption of the new ML related best practices, while research labs have the highest adoption of AutoML. Overall, more practices are adopted by teams that are larger and more experienced, with practices originating from TSE seeing slightly lower adoption than the new ML specific practices.
Effects of best practice adoption
Besides the adoption of best practices, we also investigated the effects of the practices. This resulted in insights into which specific practices relate to which desired outcomes. For instance, we found that software quality is influenced most by continuous integration, automated regression tests, and static analysis. On the other hand, agility is primarily affected by automated model deployment, having a cohesive- multi-disciplinary team, and enabling parallel training experiments. With these insights into the effectiveness of different practices, we hope to increase practice adoption and improve the quality of software with ML components.
A brief overview of additional survey results can be found in our piece titled “The 2020 State of Engineering Practices for Machine Learning”, and in our technical publication “Adoption and Effects of Software Engineering Best Practices in Machine Learning”.
How about automated ML?
A recent line of research advancements in ML focuses on automated machine learning (AutoML), an area that aims to automate (parts of) the construction and use of ML pipelines to enable a wider audience to make effective and responsible use of ML, without needing to become an expert in the field. We took a closer look at our survey results and found that, compared to non-tech companies and governmental organisations, research labs and tech companies are ahead in the adoption of AutoML practices (see Figure 2).
Figure 2: AutoML adoption by organisation type.
While overall AutoML adoption is similar across continents, non-profit and low-tech organisations see higher adoption in Europe than in North America. We also found that teams with multiple years of experience are more likely to adopt AutoML techniques. Finally, across the board, there is significant room to increase adoption of AutoML, but this is especially true for non-tech companies and governmental organisations.
More detailed information on our findings regarding AutoML adoption can be found in our piece titled “The 2020 State of AutoML Adoption”.
Based on feedback from and interviews with participants, we recently revised our survey to learn more. Specifically, in our latest version of the survey, we ask in more detail about responsible use of ML and about the adoption of different AutoML techniques such as automated feature selection and neural architecture search. You can help us by taking the 10-minute survey and by spreading the word. If you want to stay up to date with our progress, keep an eye on our website.