Roles in an ML team align closely with the organization's hierarchy and how its projects are designed.
There is quite a difference between managing and delivering an Agile or Waterfall project and delivering an ML project. You see, the outputs of an ML project are not as clearly defined as they are in conventional projects. Many organizations use Machine Learning to manage and improve operations. While ML projects vary in scale and complexity and so require different data science teams, their general structure is the same. Organizations face challenges in scaling DS/ML projects because they lack the requisite skills, collaboration, tooling and know-how to create and manage a robust, production-grade DS/ML/AI pipeline.
Often, Data Scientists have to wear too many hats due to a dearth of talent across other roles in any DS/ML project.
Through 2023, the ML engineer role will be the fastest-growing role in the AI/ML space. Gartner estimates that today there is one ML engineer for every 10 data scientists, and that this ratio will likely change to between 5 and 10 by 2023.
Three Core Roles in a Machine Learning Team
- Data Engineers
Data Engineers make the appropriate data available for Data Scientists. They focus on data integration, modelling, optimization, quality and self-service. Their responsibility is to prepare all the necessary data in a form that is consumable for their colleagues. They generally create a Data Lake for this purpose. AWS, Kafka, Airflow and databases are some of their key skills.
- Data Scientists
Based on the inputs from the engineers, Data Scientists identify use cases. They are responsible for determining appropriate datasets. They design algorithms and experiments, and build AI models. One of the key questions a data scientist asks is 'How can we use this data to build a machine learning model for predicting something?' Python, Machine Learning and SQL are some of the key skills for this role.
- ML Engineers
ML Engineers deploy ML/AI models, scaling them effectively and ensuring production readiness, and they maintain a continuous feedback loop. In the core team, they act as the glue between data scientists and data engineers, operations (DevOps, DataOps, MLOps), and business unit leaders. Their focus is more on engineering than on modeling.
Product Manager
A product manager is someone responsible for developing products. Their goal is to make sure that the team is building the right thing. They are typically less technical than the rest of the team: they don't focus on the implementation aspects of a problem, but rather on the problem itself. Product managers do a lot of planning; they need to understand the problem, come up with a solution, assign the proper resources, and make sure the solution is implemented in a timely manner. To accomplish this, PMs need to know what's important and plan the work accordingly.