Collaboration model for data science projects

Many data science teams struggle to implement end-to-end machine learning projects. It’s a very common phenomenon, so if you are experiencing this, you are not alone. Having worked in every stage of the data science project lifecycle, in addition to regular web service deployments, this is how I think we should collaborate. (Diagram: collaboration model between teams.) Note: The diagram does not signify the order of communication. Rather, it shows the communication pathways between teams....

January 20, 2024 · 2 min · Karn Wong

Should data scientists deploy models to production?

Over the years I’ve heard stories of data teams struggling to deploy machine learning models to production. Clearly there is a pattern here. This article is my reflection on the matter. So what’s the problem? Data scientists, by definition, create mathematical models from data so that some unknowns can become known. This is colloquially known as “prediction.” For example, if you have sales data from last year, you can use it to forecast next year’s sales performance....

December 30, 2023 · 2 min · Karn Wong

Serverless real-time machine learning inference with AWS

Machine learning projects usually fall into two main categories: research and production. In a research ML project, the model is created and used locally on a researcher’s machine. A production ML project also involves a deployment. The usual pattern is to create a service that loads a model, accepts input, then returns a prediction. Production ML itself follows two main patterns: batch or real-time....

November 28, 2023 · 3 min · Karn Wong
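
The service pattern mentioned in the excerpt above (load a model, accept input, return a prediction) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the post: the linear "model", the `features` field, and the request/response shape are hypothetical, and the `handler(event, context)` signature simply follows the usual convention for AWS Lambda Python functions.

```python
import json

# Hypothetical stand-in for a trained model: a linear model with fixed weights.
# A real service would load a serialized model from disk or object storage.
MODEL = {"weights": [0.5, -0.25], "bias": 1.0}


def load_model():
    """Return the model; called once at import time so warm invocations reuse it."""
    return MODEL


_model = load_model()


def predict(features):
    """Compute a linear prediction for one list of numeric features."""
    weights = _model["weights"]
    if len(features) != len(weights):
        raise ValueError("expected %d features" % len(weights))
    return sum(w * x for w, x in zip(weights, features)) + _model["bias"]


def handler(event, context=None):
    """Accept a JSON body with a 'features' list and return a JSON prediction."""
    try:
        body = json.loads(event.get("body") or "{}")
        prediction = predict(body["features"])
    except (KeyError, ValueError, TypeError) as exc:
        # Malformed JSON, a missing field, or a wrong feature count is a client error.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

Loading the model at module level rather than inside `handler` means the cost is paid once per process instead of once per request, which is generally the point of keeping the load step separate from the predict step.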