Why Data Science Projects Fail To Deliver

Posted by GUPTA, Gagan
Published: July 6, 2021


According to Gartner analyst Nick Heudecker, over 85% of data science projects fail. A report from Dimensional Research indicated that only 4% of companies have succeeded in deploying ML models to a production environment. For many companies, implementing data science across their businesses can prove difficult, if not daunting. Evidence suggests that the gap is widening between organizations that successfully gain value from data science and those that struggle to do so.

It is tempting to think the problem lies with data and processing, and that is not wrong: these are certainly challenges. But there are bigger problems. So let's explore some big data failure examples and dive into what drives these failures. Working as a technical manager at the interface between R&D and commercial operations has given me insight into the traps that lie in our path, and a number of factors drive failure.

Not having the Right Talent

Finding, hiring, and retaining top tech talent is never easy, and the competition for qualified data talent is especially fierce. Data science and analytics skills are the second most difficult skill set to find. For nearly two years, there has been a widespread talent shortage in the data science space; one widely cited report put the shortage at more than 150,000 individuals with data science skills. While the interdisciplinary approach of data science projects involves various subject matter experts such as mathematicians, data engineers, and many others, data scientists are often the most critical and the most difficult to recruit. As a result, companies are having a difficult time implementing and scaling their projects, which in turn slows time to production. Additionally, many companies cannot afford the large teams required to run multiple projects simultaneously. For ETL, hire data engineers; for reporting, hire BI analysts. Don't mix the roles.

Your Data is as good as your Data Governance

Without data governance, you don't have a data science project. End of story. Many companies lack data infrastructure or do not have data of sufficient volume or quality. Data quality and data management issues are critical because AI and ML projects rely heavily on good-quality data, yet this data can be challenging to collect, create, or purchase. Having multiple (or zero) versions of the truth is one of the most common problems that organizations face. It is not that organizations don't have data, but rather that they don't properly marshal it into an environment where it can be analyzed and modeled. Where data governance is lacking, poor data quality and integrity often inhibit the success of analytics projects.
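
As a rough illustration of the "multiple versions of the truth" problem described above, the following minimal Python sketch flags keys whose values disagree (or are missing) across sources. The record layout, the field names (customer_id, revenue), and the source names (crm, billing) are hypothetical, chosen only for illustration; a real governance process involves far more than this one check.

```python
from collections import defaultdict

def find_conflicts(records, key_field, value_field):
    """Return {key: {source: value}} for keys whose sources disagree.

    `records` is a list of dicts, each with a "source" entry plus the
    key and value fields. All names here are illustrative assumptions.
    """
    seen = defaultdict(dict)
    for rec in records:
        seen[rec[key_field]][rec["source"]] = rec.get(value_field)

    conflicts = {}
    for key, by_source in seen.items():
        # More than one distinct value (None counts as "missing") means
        # there is no single version of the truth for this key.
        if len(set(by_source.values())) > 1:
            conflicts[key] = dict(by_source)
    return conflicts

# Hypothetical extracts of the same metric from two systems.
records = [
    {"source": "crm", "customer_id": 1, "revenue": 1200},
    {"source": "billing", "customer_id": 1, "revenue": 1200},
    {"source": "crm", "customer_id": 2, "revenue": 800},
    {"source": "billing", "customer_id": 2, "revenue": 950},
    {"source": "crm", "customer_id": 3, "revenue": None},
    {"source": "billing", "customer_id": 3, "revenue": 400},
]

conflicts = find_conflicts(records, "customer_id", "revenue")
for customer_id, values in sorted(conflicts.items()):
    print(customer_id, values)
```

Running a check like this before modeling starts makes disagreements between systems visible early, when they are still a governance question rather than a modeling bug.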

Poor Project Management

Scope, timing, budget, and quality are critical components of any IT project. Failure to meet one or more of these measures is why the majority of IT projects are challenged or fail outright. Why do we think analytics and data science projects are any different? They also involve writing code, and they are experimental by nature, involving trial and error. No IT project is exempt from these four pressures. In my personal experience, if you use Agile in model development, keep sprints to two weeks (time-boxed) and get to an MVP as soon as possible. This helps keep everyone focused and on track, in a measurable and budgeted way.

It's also beneficial to have a team member who understands internal business operations to ensure the project remains aligned with the original business goals.

Deployment

By the traditional definition, a project ends when the scope is delivered. For data science, this is often the deployment phase. More often than not, deployment and operations are not given appropriate consideration. This can happen because data science teams do not have an architectural view of how their projects will be integrated into the production pipeline, which is typically managed by IT teams. The IT teams, in turn, have limited insight into the actual data science development process and how a soon-to-be-developed project will fit into their environments. This misalignment often results in one-off data science projects that don't deliver business value. Ensure that you have the proper staffing and focus to allow your models to continue to add value beyond the initial deployment.

Lack of support by key stakeholders

Data science projects often impact many departments across the enterprise. Without the support and commitment of key stakeholders to implement changes, projects could be hindered or fail outright. Without established and clear methodologies for data science project management, organizations often resort to ad hoc project management processes which can lead to miscommunication, working in silos, inefficient information sharing, missed steps, and misinformed analyses.

Overly complex models

Recently, I was working on an AI project for a client in Canada. I learned that the data scientists there tended to create complex models when a simple one could be just as good, or at times even superior. In many data science projects, there is an inclination to complicate the problem statement and create solutions that are similarly complex. This practice takes focus away from the big picture and diverts from the correct solution. Deliver and implement the core first. Stay highly flexible. Don't just take any request at face value; rather, dive deeper to truly understand the underlying problem that needs to be solved. Always follow K.I.S.S., and live within the defined scope.
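
One way to put the K.I.S.S. advice above into practice is to adopt a more complex model only when it beats the simpler one by a meaningful margin on held-out data. The sketch below is a minimal illustration of that rule, not a prescribed method: the function name pick_model, the 2% threshold, the model names, and the error values are all hypothetical.

```python
def pick_model(candidates, min_gain=0.02):
    """Pick the simplest model that is not clearly beaten by a more complex one.

    `candidates` is a list of (name, validation_error) pairs ordered from
    simplest to most complex. A more complex model replaces the current
    choice only if it lowers the error by at least `min_gain` (relative).
    """
    if not candidates:
        raise ValueError("need at least one candidate model")
    best_name, best_error = candidates[0]
    for name, error in candidates[1:]:
        # Relative improvement over the current choice.
        gain = (best_error - error) / best_error if best_error > 0 else 0.0
        if gain >= min_gain:
            best_name, best_error = name, error
    return best_name

# Hypothetical held-out errors, for illustration only (not measured results).
candidates = [
    ("logistic_regression", 0.120),
    ("gradient_boosting", 0.119),
    ("deep_ensemble", 0.118),
]
chosen = pick_model(candidates, min_gain=0.02)
print(chosen)
```

With these illustrative numbers, neither of the more complex models improves on the simple one by 2%, so the simple model is kept. The threshold itself is a business decision and should reflect what a given gain is actually worth.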

Conclusion

I can't personally verify whether data science projects indeed fail 85% of the time. However, given data science's experimental nature and unique challenges, I wouldn't be surprised. Two critical challenges for organizations are the sheer time needed to complete AI and ML projects, usually on the order of months, and the severe shortage of qualified talent available to handle such projects. AutoML platforms address this by automating manual and iterative steps and enabling data science teams to rapidly test new features and validate models.

Although the failure rate of data science projects has historically been high, that doesn't mean your organization's projects must meet the same fate.

To succeed in today's business climate, companies must leverage data science automation to gain greater agility and faster, more accurate decision-making.

Looking forward to seeing you soon. Till then, keep learning!
