Jon Peck is a full-stack developer with two decades of industry experience who now focuses on bringing scalable, discoverable, and secure machine-learning microservices to developers across a wide variety of platforms via Algorithmia.com.
Over the next 18 months, companies will complete the R&D phase of their AI/ML investments and begin deploying their models and algorithms to production. How well you execute that deployment will separate the organizations that see an ROI on AI from those that don't. This talk introduces the best practices of the tech companies already deploying, the tech stack required, and the organizational rhythms needed to succeed. It is ideal for engineers and leadership to attend together.
You’ve trained machine learning models on your data, but how do you put them into production? When you have tens of thousands of model versions, each written in any mix of frameworks (R/Java/Ruby/scikit-learn/Caffe/TensorFlow on GPUs, etc.) and exposed as REST API endpoints, and your users love to chain algorithms and run ensembles in parallel... how do you maintain a latency of less than 20ms on just a few servers?
Although AI has been a hot topic lately, with constant advances in what is possible, there has been far less discussion of the infrastructure and scaling challenges that come with it. As co-founder of Algorithmia, I’ve built, deployed, and scaled thousands of algorithms and machine learning models, using every kind of framework (from scikit-learn to TensorFlow). We’ve seen many of the challenges faced in this area, and in this talk I’ll share some insights into the problems you’re likely to face, and how to approach solving them.
In brief, we’ll examine the need for, and implementations of, a complete “Operating System for AI”: a common interface through which different algorithms can be used and combined, and a general architecture for serverless machine learning that is discoverable, versioned, scalable, and shareable.
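To make the idea of a common interface concrete, here is a minimal sketch, not Algorithmia's actual API. It assumes a hypothetical in-process registry in which every algorithm is a plain callable addressed by a "name/version" string. The names `REGISTRY`, `register`, `call`, `chain`, and `ensemble` are all invented for illustration. Chaining and parallel ensembles then reduce to a few lines each.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

# Hypothetical registry (an assumption for illustration): maps "name/version"
# to a callable that takes and returns JSON-like data.
Algorithm = Callable[[Any], Any]
REGISTRY: Dict[str, Algorithm] = {}


def register(name: str, version: str, fn: Algorithm) -> None:
    """Publish an algorithm under an explicit version so callers can pin it."""
    REGISTRY[f"{name}/{version}"] = fn


def call(ref: str, payload: Any) -> Any:
    """Invoke any registered algorithm through the same interface."""
    return REGISTRY[ref](payload)


def chain(refs: List[str], payload: Any) -> Any:
    """Feed the output of each algorithm into the next one."""
    for ref in refs:
        payload = call(ref, payload)
    return payload


def ensemble(refs: List[str], payload: Any) -> List[Any]:
    """Run several algorithms on the same input in parallel, results in order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda ref: call(ref, payload), refs))


# Toy algorithms standing in for real models.
register("tokenize", "1.0.0", lambda text: text.lower().split())
register("count", "1.0.0", lambda tokens: len(tokens))
register("count", "2.0.0", lambda tokens: len(set(tokens)))

total = chain(["tokenize/1.0.0", "count/1.0.0"], "The cat the hat")
both = ensemble(["count/1.0.0", "count/2.0.0"], ["the", "cat", "the", "hat"])
```

In a real serverless deployment each `call` would presumably be a network request to an isolated, independently scaled container rather than a local function call. The caller-facing contract (a versioned name in, data out) would stay the same.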