Engineering best practices for Machine Learning


The list below gathers a set of engineering best practices for developing software systems with machine learning (ML) components.

These practices were identified by engaging with ML engineering teams and reviewing relevant academic and grey literature. We are continuously running a global survey among ML engineering teams to measure the adoption of these practices.

The practices are grouped into six categories (Data, Training, Coding, Deployment, Team, and Governance), as illustrated in the diagram above, and listed below.

The practices are labeled with their difficulty, their effects, and the requirements for trustworthy ML they help to satisfy.

Data

"Whoever owns the data pipeline will own the production pipeline for machine learning." --Chip Huyen

Training

"No amount of experimentation can ever prove me right; a single experiment can prove me wrong." --Albert Einstein

Coding

"You can’t be an AI expert these days and not have some grounding in software engineering." --Grady Booch

Deployment

"If your model isn’t deployed into production, does it really exist?" --Anonymous

Team

"If you want to go fast, go alone; but if you want to go far, go together." --African proverb, allegedly

Governance

"Where there is great power there is great responsibility." --Winston Churchill

Index | Practice | Category | Difficulty
1 | Use Sanity Checks for All External Data Sources | Data | Medium
2 | Check that Input Data is Complete, Balanced and Well Distributed | Data | Basic
3 | Test for Social Bias in Training Data | Data | Advanced
4 | Write Reusable Scripts for Data Cleaning and Merging | Data | Basic
5 | Ensure Data Labelling is Performed in a Strictly Controlled Process | Data | Basic
6 | Prevent Discriminatory Data Attributes Used As Model Features | Data | Advanced
7 | Use Privacy-Preserving Machine Learning Techniques | Data | Advanced
8 | Make Data Sets Available on Shared Infrastructure (private or public) | Data | Basic
9 | Share a Clearly Defined Training Objective within the Team | Training | Basic
10 | Capture the Training Objective in a Metric that is Easy to Measure and Understand | Training | Basic
11 | Test all Feature Extraction Code | Training | Medium
12 | Assign an Owner to Each Feature and Document its Rationale | Training | Medium
13 | Actively Remove or Archive Features That are Not Used | Training | Medium
14 | Employ Interpretable Models When Possible | Training | Advanced
15 | Peer Review Training Scripts | Training | Medium
16 | Enable Parallel Training Experiments | Training | Basic
17 | Automate Feature Generation and Selection | Training | Advanced
18 | Automate Hyper-Parameter Optimisation | Training | Medium
19 | Automate Configuration of Algorithms or Model Structure | Training | Advanced
20 | Continuously Measure Model Quality and Performance | Training | Basic
21 | Assess and Manage Subgroup Bias | Training | Advanced
22 | Use Versioning for Data, Model, Configurations and Training Scripts | Training | Basic
23 | Share Status and Outcomes of Experiments Within the Team | Training | Basic
24 | Run Automated Regression Tests | Coding | Medium
24 | Use The Most Efficient Models | Training | Not ranked
25 | Use Continuous Integration | Coding | Advanced
26 | Use Static Analysis to Check Code Quality | Coding | Medium
27 | Assure Application Security | Coding | Advanced
28 | Automate Model Deployment | Deployment | Medium
29 | Enable Shadow Deployment | Deployment | Medium
30 | Continuously Monitor the Behaviour of Deployed Models | Deployment | Medium
31 | Perform Checks to Detect Skew between Models | Deployment | Medium
32 | Enable Automatic Roll Backs for Production Models | Deployment | Medium
33 | Log Production Predictions with the Model's Version and Input Data | Deployment | Medium
34 | Provide Audit Trails | Deployment | Advanced
35 | Use A Collaborative Development Platform | Team | Basic
36 | Work Against a Shared Backlog | Team | Medium
37 | Communicate, Align, and Collaborate With Others | Team | Basic
38 | Decide Trade-Offs through Defined Team Process | Team | Medium
39 | Establish Responsible AI Values | Governance | Advanced
40 | Perform Risk Assessments | Governance | Advanced
41 | Enforce Fairness and Privacy | Governance | Medium
42 | Inform Users on Machine Learning Usage | Governance | Advanced
43 | Explain Results and Decisions to Users | Governance | Advanced
44 | Provide Safe Channels to Raise Concerns | Governance | Advanced
45 | Have Your Application Audited | Governance | Advanced
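
As an illustration of the first two Data practices in the list above (sanity checks for external data sources, and checking that input data is complete and balanced), the following is a minimal sketch rather than a prescribed implementation. The function name `sanity_check`, its parameters, and the 90% majority threshold are assumptions chosen for this example; they do not come from the practice descriptions themselves.

```python
from collections import Counter


def sanity_check(rows, required_fields, label_field, max_majority_share=0.9):
    """Return a list of problems found in `rows` (a list of dicts).

    Hypothetical helper: the name, parameters, and default threshold
    are illustrative assumptions, not part of the catalogue.
    """
    if not rows:
        return ["data source returned no rows"]

    problems = []

    # Completeness: every required field must be present and non-empty.
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append(f"row {i}: missing value for '{field}'")

    # Balance: flag the data set when a single label dominates.
    labels = Counter(
        row[label_field]
        for row in rows
        if row.get(label_field) not in (None, "")
    )
    if labels:
        label, count = labels.most_common(1)[0]
        share = count / sum(labels.values())
        if share > max_majority_share:
            problems.append(f"label '{label}' covers {share:.0%} of rows")

    return problems


if __name__ == "__main__":
    sample = [
        {"age": 31, "label": "yes"},
        {"age": None, "label": "yes"},
    ]
    for problem in sanity_check(sample, ["age"], "label"):
        print(problem)
```

A check like this would typically run at the point where external data enters the pipeline, so that incomplete or heavily skewed batches are rejected before training. Real projects usually need distribution checks beyond a single majority-share threshold.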