Engineering best practices for Machine Learning


The list below gathers a set of engineering best practices for developing software systems with machine learning (ML) components.

These practices were identified by engaging with ML engineering teams and reviewing relevant academic and grey literature. We run an ongoing global survey of ML engineering teams to measure the adoption of these practices.

The practices are grouped into six categories, illustrated in the diagram above and listed below.

Data

"Whoever owns the data pipeline will own the production pipeline for machine learning." --Chip Huyen

Training

"No amount of experimentation can ever prove me right; a single experiment can prove me wrong." --Albert Einstein

Coding

"You can’t be an AI expert these days and not have some grounding in software engineering." --Grady Booch

Deployment

"If your model isn’t deployed into production, does it really exist?" --Anonymous

Team

"If you want to go fast, go alone; but if you want to go far, go together." --African proverb, allegedly

Governance

"Where there is great power there is great responsibility." --Winston Churchill