Instead of deploying a model straight into production, one can assess its quality and performance on production data without letting it make final decisions. This involves deploying a model to “shadow” or “compete” with the model in production, and routing incoming data to both models. The model that is already deployed still handles all decisions until the shadow model has been assessed and promoted to production.
Using shadow models allows teams to catch unintended behaviour before it reaches production – for instance, behaviour caused by skews between training data and live production data. However, it introduces more complexity in the deployment infrastructure. Luckily, tool support for shadow and canary deployments has already matured.
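The routing described above can be sketched in a few lines. This is a minimal illustration, not a production serving setup: `production_model`, `shadow_model`, and `handle_request` are hypothetical names, and the models are stand-in callables. The key properties are that only the production model's output becomes the final decision, the shadow model's output is only logged for later comparison, and a failing shadow model never affects the live response.

```python
import logging

# Hypothetical stand-ins for the two models; in practice these would be
# calls to two deployed model endpoints.
def production_model(features):
    return sum(features) > 1.0   # current decision logic

def shadow_model(features):
    return sum(features) > 0.5   # candidate being evaluated

shadow_log = []  # collected for offline comparison of the two models

def handle_request(features):
    """Send the request to both models; only the production model's
    output is returned as the final decision."""
    decision = production_model(features)
    try:
        shadow_decision = shadow_model(features)
        shadow_log.append({
            "features": features,
            "production": decision,
            "shadow": shadow_decision,
            "agree": decision == shadow_decision,
        })
    except Exception:
        # A failing shadow model must never break the live path.
        logging.exception("shadow model failed")
    return decision
```

Once enough traffic has passed through, the logged agreement rate (and any ground-truth labels that arrive later) can be used to decide whether the shadow model is safe to promote.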
- Machine Learning Logistics
- TFX: A TensorFlow-based Production-Scale ML Platform
- Versioning for end-to-end machine learning pipelines