Engineering best practices for Machine Learning

The list below gathers engineering best practices for developing software systems with machine learning (ML) components.

These practices were identified by engaging with ML engineering teams and reviewing relevant academic and grey literature. We run an ongoing global survey of ML engineering teams to measure how widely these practices are adopted.

The practices are grouped into six categories, as illustrated in the diagram above, and listed below.

The practices are labeled with their difficulty, their effects, and the requirements for trustworthy ML they help to satisfy.

Data

"Whoever owns the data pipeline will own the production pipeline for machine learning." --Chip Huyen

Training

"No amount of experimentation can ever prove me right; a single experiment can prove me wrong." --Albert Einstein

Coding

"You can’t be an AI expert these days and not have some grounding in software engineering." --Grady Booch

Deployment

"If your model isn't deployed into production, does it really exist?" --Anonymous

Team

"If you want to go fast, go alone; but if you want to go far, go together." --African proverb, allegedly

Governance

"Where there is great power there is great responsibility." --Winston Churchill

Index	Practice	Category	Difficulty
1	Use Sanity Checks for All External Data Sources	Data	Medium
2	Check that Input Data is Complete, Balanced and Well Distributed	Data	Basic
3	Test for Social Bias in Training Data	Data	Advanced
4	Write Reusable Scripts for Data Cleaning and Merging	Data	Basic
5	Ensure Data Labelling is Performed in a Strictly Controlled Process	Data	Basic
6	Prevent Discriminatory Data Attributes Used As Model Features	Data	Advanced
7	Use Privacy-Preserving Machine Learning Techniques	Data	Advanced
8	Make Data Sets Available on Shared Infrastructure (private or public)	Data	Basic
9	Share a Clearly Defined Training Objective within the Team	Training	Basic
10	Capture the Training Objective in a Metric that is Easy to Measure and Understand	Training	Basic
11	Test all Feature Extraction Code	Training	Medium
12	Assign an Owner to Each Feature and Document its Rationale	Training	Medium
13	Actively Remove or Archive Features That are Not Used	Training	Medium
14	Employ Interpretable Models When Possible	Training	Advanced
15	Peer Review Training Scripts	Training	Medium
16	Enable Parallel Training Experiments	Training	Basic
17	Automate Feature Generation and Selection	Training	Advanced
18	Automate Hyper-Parameter Optimisation	Training	Medium
19	Automate Configuration of Algorithms or Model Structure	Training	Advanced
20	Continuously Measure Model Quality and Performance	Training	Basic
21	Assess and Manage Subgroup Bias	Training	Advanced
22	Use Versioning for Data, Model, Configurations and Training Scripts	Training	Basic
23	Share Status and Outcomes of Experiments Within the Team	Training	Basic
24	Run Automated Regression Tests	Coding	Medium
24	Use The Most Efficient Models	Training	Not ranked
25	Use Continuous Integration	Coding	Advanced
26	Use Static Analysis to Check Code Quality	Coding	Medium
27	Assure Application Security	Coding	Advanced
28	Automate Model Deployment	Deployment	Medium
29	Enable Shadow Deployment	Deployment	Medium
30	Continuously Monitor the Behaviour of Deployed Models	Deployment	Medium
31	Perform Checks to Detect Skew between Models	Deployment	Medium
32	Enable Automatic Roll Backs for Production Models	Deployment	Medium
33	Log Production Predictions with the Model's Version and Input Data	Deployment	Medium
34	Provide Audit Trails	Deployment	Advanced
35	Use A Collaborative Development Platform	Team	Basic
36	Work Against a Shared Backlog	Team	Medium
37	Communicate, Align, and Collaborate With Others	Team	Basic
38	Decide Trade-Offs through Defined Team Process	Team	Medium
39	Establish Responsible AI Values	Governance	Advanced
40	Perform Risk Assessments	Governance	Advanced
41	Enforce Fairness and Privacy	Governance	Medium
42	Inform Users on Machine Learning Usage	Governance	Advanced
43	Explain Results and Decisions to Users	Governance	Advanced
44	Provide Safe Channels to Raise Concerns	Governance	Advanced
45	Have Your Application Audited	Governance	Advanced
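
To make the first two Data practices in the table more concrete, the following is a minimal sketch of what a sanity check and a balance check could look like. It assumes records arrive as plain Python dictionaries; the schema, the field names (`age`, `income`, `label`) and both function names are hypothetical and chosen only for illustration. They are not part of the catalogue itself.

```python
from collections import Counter

# Hypothetical schema: field name -> (expected type, (low, high) bounds or None).
# A bound of None on either side means "unbounded" on that side.
SCHEMA = {
    "age": (int, (0, 120)),
    "income": (float, (0.0, None)),
    "label": (str, None),
}


def sanity_check(rows, schema=SCHEMA):
    """Return a list of human-readable problems found in `rows`."""
    problems = []
    for i, row in enumerate(rows):
        for field, (expected_type, bounds) in schema.items():
            value = row.get(field)
            if value is None:
                problems.append(f"row {i}: missing '{field}'")
                continue
            if not isinstance(value, expected_type):
                problems.append(
                    f"row {i}: '{field}' has type {type(value).__name__}, "
                    f"expected {expected_type.__name__}"
                )
                continue
            if bounds is not None:
                low, high = bounds
                too_low = low is not None and value < low
                too_high = high is not None and value > high
                if too_low or too_high:
                    problems.append(
                        f"row {i}: '{field}'={value} outside [{low}, {high}]"
                    )
    return problems


def label_balance(rows, label_field="label"):
    """Return each label's share among the rows that carry a label."""
    counts = Counter(
        row[label_field] for row in rows if row.get(label_field) is not None
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    sample = [
        {"age": 30, "income": 1000.0, "label": "a"},
        {"age": 200, "income": 5.0, "label": "a"},  # age out of range
        {"age": 25, "label": "b"},                  # income missing
    ]
    for problem in sanity_check(sample):
        print(problem)
    print(label_balance(sample))
```

In a real pipeline such checks would typically run before any external data reaches training, and a non-empty problem list or a heavily skewed label share would block the run or raise an alert. The thresholds for doing so are a project-specific decision.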