When Trusting 11 Nines Nearly Broke a Production ML Pipeline
When Trusting 11 Nines Nearly Broke Training: A Production Incident

We had a model training cluster that churned through terabytes of image tiles every week.