Serverless real-time machine learning inference with AWS

For a machine learning project, usually it is divided into two main categories: research and production. For research ML project, the model would be created and used locally on a researcher’s machine. For a production ML project, a deployment would be involved. Usual pattern is to create a service to load a model, accept input, then return a prediction. Production ML is also divided into two main patterns: batch or real-time....

November 28, 2023 · 3 min · Karn Wong