Big Idea: Competing With Data & Analytics

Balance Efficiency With Transparency in Analytics-Driven Business

The ubiquity of algorithms in daily life raises questions about ethics, transparency, and who’s keeping tabs on how those algorithms work.

Sam Ransbotham June 27, 2017 Reading Time: 4 min

Topics

Competing With Data & Analytics

How does data inform business processes, offerings, and engagement with customers? This research looks at trends in the use of analytics, the evolution of analytics strategy, optimal team composition, and new opportunities for data-driven innovation.

More in this series

Twitter Facebook Linkedin

We have disturbingly little idea how many of the algorithms that affect our lives actually work. We consume their output, knowing little about the ingredients and recipe. And as analytics affects more and more of our lives and organizations, we need more transparency. But this transparency may be a bitter pill for businesses to swallow.

In 1906, Upton Sinclair’s The Jungle described the oppressed life of immigrant workers, specifically those in the meatpacking industry in Chicago. Sinclair’s intent in portraying the working conditions of a powerless class may have been to inspire political change. However, the graphic depictions of unsanitary food preparation helped bring transparency to manufacturing processes through the story’s nauseating clarity. The book heavily influenced the creation of regulatory oversight through organizations that eventually became the U.S. Food and Drug Administration.

We might be similarly horrified if we knew what evils lurked in the hearts of business algorithms in use today.

Some examples are lesser evils. Google search is widely used, but details about the order (and inclusion) of pages in its results aren’t public. Credit scores directly affect our finances, but the specific algorithms used to calculate them are secret. And the use of analytics to create algorithms is spreading rapidly to judicial processes, advertising, hiring, and many other daily decisions.

But these are the oxymoronic obvious unknowns. There may be greater evils lurking beneath the surface. The internal operations of businesses have always been a bit murky to consumers. There are algorithms in use within organizations that we as consumers don’t know that we don’t know about — preferential treatments, pricing differences, service prioritization, routing sequences, internal ratings, and so on. There is little opportunity even to know these algorithms exist, much less the analytical results on which they are based.

It actually makes sense that we lack good ways to see how analytical results are produced. Companies want to protect their intellectual property — this is their secret sauce. Whatever advantage companies get from data does not come without effort. Given the considerable investment underlying that effort, companies would certainly be reluctant to give away their hard-earned insights embedded in algorithms. Why would they even consider it?

The difficulty, as in The Jungle, is that others consume what is produced.

Topics

Competing With Data & Analytics

About the Author

Sam Ransbotham is an associate professor of information systems at the Carroll School of Management at Boston College and the MIT Sloan Management Review guest editor for the Data and Analytics Big Idea Initiative. He can be reached at sam.ransbotham@bc.edu and on Twitter @ransbotham.

Tags:

Data Collection Transparency

I agree with the general tendency of this post but I miss the other side of the coin, the positives: 1. the use of algorithms makes it possible to have this deeper discussion on ethics in decisioning that wasn't possible before (and that we definitely need to have) because all the algorithms were hidden in the brains of humans. An algorithm doesn't have to be automated, decision rules have been in place for years. Psychology has shown (and the former practice of knowledge elicitation in expert systems) these algorithms are very difficult if not impossible to make explicit. One of the issues currentlky about algorithms is the bias that comes from data. A lot of these biases (discriminatory effects f.e.) are the result of of these human algorithms from the past and the machine driven algorithms are 'rediscovering' what was actually happening. So Yes, there'a a danger for negative side effects but because of algorithms we can now see what negative side effects our historic decision making processes had and we can have a discussion about them and how to prevent that in the future. 2. The use of automated algorithms makes it possible to be more consistent in the decision making processes. In the past decision based on human judgment could turn out differently for cases that were exactly the same. Human decision making can be influenced by so many factors (the weather, mood, the way you look, ...) that there was no consistency in judging all cases against the same criteria. The use of automated algorithms can remove that inequality. 3. The pressure to use higher quality data brings an improvement in the way we make decisions. We tend to forget that in the past decisions were made on even worse data. 4. Most automated algorithms are developed with a clear process that is built around scientific data analysis approaches. This leads to algorithms that are based on a process of critical thinking, testing & validation where in the past the decision rules were often based on assumptions, prejudices, biases (in decision making & and in the limited analyses that were done at best) There's also a set of negatives that I'm missing: 1. We tend to discuss this as if each algorithm is acting independently. They never are. Usually on the point of decision several algorithms come together and are molded into a (let's call it) a mega-algorithm. For example: when a Telco tries to prevent people canceling a subscription (churn prevention) they not only look at the propensity for someone to cancel. They often also look at the propensity for someone to accept the retention offer, the expected customer value, the credit risk, and more. They then try to optimize this. 2. Our actions influence the outcome of the model and therefor we need to be very much aware of the limited time a model maybe used and we may need to take our action into account in our final algorithm. To take the same example of above: when our churn prevention algorithm predicts that a customer is going to churn and we take an action to prevent that from happening we change the reality the model is built on. Therefor we will need to re-visit our model after each cycle. Another example is credit scores: when a business is predicted to have a high risk of failure or a high risk of paying late and a decision is being made to not extend credit we might accelerate the prediction made by the model while if the decision had been made to extend the credit the prediction in the next cycle could have changed. 3. Probably the most important warning in all of this is that we will never have all the data on everything available despite all the marketing promises of big data vendors and others. And all the data we use is already biased: decisions & choices have been made on what to measure, how to measure it, how to store it and how to make it available. There's much more but these are the points I wanted to make today.

Add a comment Cancel reply

You must sign in to post a comment.

First time here? Sign up for a free account: Comment on articles and get access to many more articles.

Comment (1)

Jaap Vink

August 21, 2017

There's also a set of negatives that I'm missing:
1. We tend to discuss this as if each algorithm is acting independently. They never are. Usually on the point of decision several algorithms come together and are molded into a (let's call it) a mega-algorithm. For example: when a Telco tries to prevent people canceling a subscription (churn prevention) they not only look at the propensity for someone to cancel. They often also look at the propensity for someone to accept the retention offer, the expected customer value, the credit risk, and more. They then try to optimize this.
2. Our actions influence the outcome of the model and therefor we need to be very much aware of the limited time a model maybe used and we may need to take our action into account in our final algorithm. To take the same example of above: when our churn prevention algorithm predicts that a customer is going to churn and we take an action to prevent that from happening we change the reality the model is built on. Therefor we will need to re-visit our model after each cycle. Another example is credit scores: when a business is predicted to have a high risk of failure or a high risk of paying late and a decision is being made to not extend credit we might accelerate the prediction made by the model while if the decision had been made to extend the credit the prediction in the next cycle could have changed.
3. Probably the most important warning in all of this is that we will never have all the data on everything available despite all the marketing promises of big data vendors and others. And all the data we use is already biased: decisions & choices have been made on what to measure, how to measure it, how to store it and how to make it available.

There's much more but these are the points I wanted to make today.

Topics

Competing With Data & Analytics

Topics

Competing With Data & Analytics

About the Author

Tags:

More Like This

Add a comment Cancel reply

Comment (1)

Jaap Vink