You don’t need to be a great programmer, but you do need to be a good one, with the right habits and industry skills.

Before going into details, some thoughts!

I have often heard Data Scientists say, "My job is not to make code ready for productionalization." That is fair at companies that have a proper team with clearly defined roles. But lots of companies (including big ones) are still in the process of adopting AI/ML at scale.

For companies in the AI/ML adoption phase: as a data scientist, your job is not merely to run production pipelines and models as ad hoc Jupyter notebooks or R scripts.

Some easy steps a Data Scientist can take:

  • Move away from ad hoc Jupyter / R scripts as soon as the business is consuming your results.
  • Keep the code clean: keep comments brief, avoid printing variables, and remove blocks of code that are no longer useful.
  • Use functions to abstract complexity and to improve reusability, readability, and testability.
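To make the steps above concrete, here is a minimal sketch of what a notebook cell might look like once it is moved into small functions. Everything in it is hypothetical and only for illustration: the function names, the `customer_id` and `spend` fields, and the `spend_ratio` feature are assumptions, not part of any real pipeline. It uses only the Python standard library.

```python
import logging
from statistics import mean

logger = logging.getLogger(__name__)


def drop_missing(rows, required_keys):
    """Return only the rows that have a non-None value for every required key."""
    return [
        row for row in rows
        if all(row.get(key) is not None for key in required_keys)
    ]


def add_spend_ratio(rows):
    """Return new rows with a 'spend_ratio' field: spend divided by mean spend."""
    if not rows:
        return []
    average = mean(row["spend"] for row in rows)
    if average == 0:
        raise ValueError("mean spend is zero; cannot compute a ratio")
    # Build new dicts so the caller's input rows are left unchanged.
    return [{**row, "spend_ratio": row["spend"] / average} for row in rows]


def prepare_features(rows):
    """Run the whole preparation step; log a summary instead of printing variables."""
    cleaned = drop_missing(rows, required_keys=("customer_id", "spend"))
    logger.info("Kept %d of %d rows after cleaning", len(cleaned), len(rows))
    return add_spend_ratio(cleaned)


if __name__ == "__main__":
    # Hypothetical input rows, standing in for data loaded in a notebook.
    raw = [
        {"customer_id": 1, "spend": 10.0},
        {"customer_id": 2, "spend": None},
        {"customer_id": 3, "spend": 30.0},
    ]
    features = prepare_features(raw)
```

Each function does one thing, so it can be reused in another pipeline and tested on its own, and the logging call replaces the stray print statements that tend to pile up in notebooks.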

Productionalization of ML products is essential. Thinking about scaling is necessary!

A couple of next steps that you should take:

  • Learn about Model Training - Real-time vs. Batch
  • Learn about Model Serving - online streaming using Kafka / Spark, building API modules, and on-device (edge) modeling, depending on the business use case
  • Learn about Model Monitoring and maintenance.

Embrace the technology - Docker, Kubernetes, Continuous Integration and Deployment, Monitoring tools.

You don’t need to be an expert in all these technologies. Work with your software engineering team - teamwork makes the dream work!

Next Step:

Introductory Session for API Building
