Why Managing Data Scientists Is Different

Successfully managing a data science team requires skills and philosophies that are different from those that arise in managing other groups of smart professionals. It’s wise to be aware of the potential organizational frictions and trade-offs that can crop up.

While businesses are hiring more data scientists than ever, many struggle to realize the full organizational and financial benefits from investing in data analytics. This is forcing some managers to think carefully about how units with analytics talent are structured and managed.

How can organizations realize the promise of the evolving disciplines that we broadly call analytics?

Although financial firms were among the first to recruit “quants” to use sophisticated mathematical models and high-powered computing hardware, analytics groups have now taken hold in areas ranging from health care to political campaigns to retailing to sports. Organizations like these can benefit from the insights gained by financial services firms on how best to manage teams doing advanced analytics. Managing such teams requires skills and philosophies that are different from those needed to manage other groups of smart professionals.

Rather than just involving oversight and planning, managing a data science research effort tends to be a dynamic and self-correcting process; it is difficult to plan precisely either a project’s timing or final outcomes. For those unused to this type of work, this process can seem quite messy — an unexpected contrast to a field that, from the outside, seems to epitomize the rule of reason and the preeminence of data.

Compounding the friction that this uncertainty generates is the highly technical nature of quantitative research, which can strain relationships between data science teams and other business units. In most organizations, the consumers of data mining or analytic modeling are line managers. Because many of these managers aren’t trained in data science, they can’t easily evaluate the technical details of a project; as a result, they can’t judge the quality of the research or determine whether a project should take as long as it does. The reverse is true as well: Less experienced data scientists sometimes ignore the rich business experience that line managers could offer them and thus miss out on essential insights that would improve the result or shorten the research process.

Given the high potential for mutual misunderstandings, discussions of analytics projects risk devolving into debates about time and cost. Unfortunately, time and cost discussions often obscure unintended trade-offs in analytic quality, which can have material financial and business consequences.

This brings us to a second challenge: Work in the field of analytics is highly sensitive to what is sometimes called the “TCQ triangle.” When businesses undertake new projects, they typically try to balance three factors — time, cost, and quality. In general, it’s impossible to maximize all three of these attributes at the same time. If you want to do a project quickly and cheaply, quality is likely to suffer; if you want to do the project quickly and to a high standard, it will probably cost more; if you want to do the project cheaply but at high quality, it will take longer.

Analytics teams often find themselves in the middle of implicit organizational debates about which of these factors matter most. Because businesses are accustomed to measuring time and cost (entire departments may exist to do this), these are the two dimensions over which businesses typically look to economize.

But, with analytics projects, savings along these dimensions very often come at the expense of technical quality. In principle, there is nothing wrong with making this compromise, provided decision makers understand the trade-offs. However, because of the technical nature of data science, many managers do not fully appreciate this trade-off. Risks to the organization grow when managers suspect that a data scientist’s caveats about quality are simply “academic” concerns without long-term business consequences.

Organizations ignore the costs of low quality in analytics projects at their peril. Data science within a business must be governed by prudent financial discipline, and most experienced data scientists understand this. In many cases, though, lowering quality to save money or time is not an abstract or aesthetic issue. Rather, skimping on quality can have long-term implications. For example, it can make the difference between well-informed decisions and decisions that have key blind spots, or between strategies that are profitable and those that destroy value. A wrong answer is often worse than no answer at all.

On the other hand, it’s important to keep in mind that just because it’s possible to develop elaborate predictions doesn’t mean that a company should always invest in refining analyses to deliver the highest analytic quality that is technically achievable; precision for the sake of precision shouldn’t be the goal. Not every project requires highly nuanced outputs or exceedingly robust performance; for some, a rough-and-ready solution that is directionally correct is sufficient.

This is perhaps the most important point of this post: Differentiating between domains that exhibit low error tolerance (where poor-quality analysis can lead to decisions that are outright wrong) and those that are more forgiving requires collaboration between data scientists and the business teams with whom they work. It also requires an organizational awareness of the options that are available and the costs and trade-offs they imply.

Analytic models and machine learning approaches do not take ideological positions. Reducing quality, cost or time is neither good nor bad. Decisions and expectations about methods and costs and timing are inherently business- and domain-specific.

Against this backdrop, it is clear that a significant component of a data scientist’s organizational role is to educate the organization about what is possible and, at the same time, to help other decision makers understand the consequences of reduced quality, shorter timelines, and smaller budgets that may result from different options. Because of their specialized backgrounds and expertise, data scientists are often uniquely positioned to inform these discussions within the organization.

Comments (2)
Eric Skarsdale
This article is extraordinary. I've rarely seen such a blatant example of special pleading. "Data Scientists are different. Our managers don't understand what we do. We don't always know how long something's going to take before we start and it's difficult to evaluate the success after we're finished." I'm paraphrasing, of course, but I don't think unreasonably. As if all of these statements don't apply to a huge number of professionals in a vast number of domains.

As for the statement that "Analytic models and machine learning approaches do not take ideological positions", this is just breathtakingly naïve. Anyone with the slightest awareness of the debates around algorithmic bias (e.g. something like this https://www.theguardian.com/technology/2017/apr/13/ai-programs-exhibit-racist-and-sexist-biases-research-reveals) would be embarrassed to write that.

As ever I'm reminded of the old line - "What's the difference between a data scientist and a data analyst? About $50,000 and a sense of entitlement."
Bart Hamers
Using scrum project management together with design thinking techniques can ease the TCQ dilemma in data science projects. Scrum puts extra attention on the added value of more advanced analytics. Design thinking stimulates the team to include new techniques, while forcing them to test their brilliant ideas against the 'real world' through rapid prototyping.