Using Unstructured Data to Tidy Up Credit Reporting

Greg Jones, vice president of Enterprise Data & Analytics at Equifax, explains how the information and data solutions provider is beginning to incorporate unstructured data from sources such as social media to better round out individual profiles.

Topics

Competing With Data & Analytics

How does data inform business processes, offerings, and engagement with customers? This research looks at trends in the use of analytics, the evolution of analytics strategy, optimal team composition, and new opportunities for data-driven innovation.

The consumer credit reporting agencies in the U.S. — especially the big three of Equifax, Experian, and TransUnion — help consumers and society in general by aligning costs with risk. Analytics is now helping by reducing uncertainties in the alignment.

The credit data housed in the reporting agencies traditionally focuses on people’s personal credit and payment history, down to details about how promptly they’ve repaid loans and when they were late on a payment. Companies that grant credit, ranging from mortgages to car loans to credit card limits, use the agencies’ information to decide what products to offer and on what terms. People with “clean” histories may get better terms; people with smudged financial backgrounds may not. But other dirt, in the form of inaccuracies and incomplete information, leads to uncertainty and costs everyone.

Greg Jones is one of the data specialists tasked with making the data as clean as possible. As vice president of Enterprise Data & Analytics at Equifax — a global provider founded in 1899 that generates 158 billion credit-score updates per month and operates or has investments in 19 countries — Jones says he’s “accountable for our enterprise Search Match and Entity Resolution systems.”

In a conversation with Sam Ransbotham, an associate professor of information systems at the Carroll School of Management at Boston College and the MIT Sloan Management Review guest editor for the Data and Analytics Big Idea Initiative, Jones explains that Equifax is expanding its sourcing of data to include unique data assets and exploring social media and other unstructured data sources, and that this expansion has the potential to make individual profiles even more exact, improving the market for everyone.

Let’s start by hearing about what your team at Equifax is doing in analytics.

We ingest data into our systems, and some of that data is not always clean or straightforward. What we do is use a set of deterministic business rules and then conditional probability theory to determine if a record belongs to a certain person.

On the one hand, we look at nice, beautifully structured data from a credit card company, where the information is of very high quality. On the other, we could look at information that is from other sources, like some type of unstructured or semi-structured data. We figure out how to extract the important information using text analytics and other mapping techniques. We look at this unstructured data and the question is, how can we figure out [whether] that information actually belongs to this particular person?

That’s where we use more of the conditional probability, and some graph-type database work. Properly ingesting and linking data is core to what Equifax does as an insights company. Our products are conceptualized, designed, and implemented with the sole purpose of helping consumers and businesses make more informed decisions.
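The two-stage matching Jones describes (deterministic rules first, then conditional probability) can be illustrated with a minimal sketch. It assumes a Fellegi-Sunter-style log-likelihood score, which the interview does not confirm as Equifax's method. The field names, the probabilities in FIELD_PROBS, the "ssn" rule, and the threshold are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative, not Equifax's values):
# m = P(field agrees | same person), u = P(field agrees | different people)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "zip_code": (0.90, 0.05),
}


def normalize(value):
    """Lowercase and strip all whitespace so trivial differences are ignored."""
    return "".join(str(value).lower().split())


def deterministic_match(a, b):
    """Assumed business rule: an identical, non-empty ID means the same person."""
    return bool(a.get("ssn")) and a.get("ssn") == b.get("ssn")


def match_weight(a, b):
    """Sum log-likelihood ratios over the fields present in both records."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        va, vb = a.get(field), b.get(field)
        if va is None or vb is None:
            continue  # a missing field contributes no evidence either way
        if normalize(va) == normalize(vb):
            total += math.log2(m / u)  # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def same_person(a, b, threshold=8.0):
    """Apply the deterministic rule first, then fall back to the score."""
    if deterministic_match(a, b):
        return True
    return match_weight(a, b) >= threshold
```

With these assumed numbers, two records that agree on last name, birth date, and ZIP code clear the threshold, while agreement on last name and ZIP code alone does not. In practice the probabilities would be estimated from data rather than set by hand.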

So data and analytics is really the heart of what you do.

That’s right. Traditionally, Equifax and the other credit bureaus were taking historical data, information like how you paid your credit card bill, if you filed bankruptcy, that kind of thing, and we were using it to predict risk repayment. That was what our entire business model was built on, predicting risk repayment for credit cards, for mortgages, whatever.

Now, we’re helping companies answer more difficult questions.

For instance, our Equifax Workforce Solutions business is built on the premise of acquiring human resources data (place of employment, date of hire, salary, etc.) and being able to authenticate information in applications for things like the Affordable Care Act subsidies people are eligible for when they get healthcare [insurance].

Or, say I apply for a job, and I tell you I make $500,000 a year, and you want to verify that information. In the past, you would have to call our HR department, our HR department would have to make sure that you’re authorized to know this, research our payroll records, complete paperwork. It was a whole operational nightmare for people that needed to verify this information. We used our core capabilities of being able to aggregate data and provide information so that we could authenticate electronically and say, “Hey, we actually have this real information that says that this person actually does make $500,000 a year and does work at Equifax.”

That’s a category of novel use, or different use of what you’ve traditionally had data on. But you also mentioned looking at social media data, for example, which is a source of new data.

Yes, we’re looking at how we can start to integrate unstructured and semi-structured data in order to do more real-time analytics on the people that we have within our database.

One hypothesis would be to segment customers based on new insight gleaned from unstructured data. So when our customer says, “I’m going to give Greg Jones a mortgage,” we can say, “I know he paid his bills on time for the past seven years, but we have some signals that say that he falls within a high-risk category, that he’s going to leave the country, and so you may want to manage that loan differently.”

One of the important things is being able to link all of these data assets together. I think that at the end of the day accurately adding new data and attributes and applying advanced analytics can help make any decision better.

Certainly there’s a privacy element in here as well. Can these things be done in a privacy-protecting way?

From an Equifax perspective, we’re not going to risk a consumer’s privacy and we don’t want to risk our core business. If we’re going to improve a model by a tiny percent, but it’s going to push us closer to that line, we’re not going to do it. We work closely with regulators and industry groups to ensure we are exceeding privacy requirements. Our primary focus is on being a trusted steward of data and keeping people’s information secure. Secondarily, it is to use that data in a manner that is legal, responsible, and helps improve consumers’ lives and businesses’ risk.

You mentioned some of the abilities to predict things from social media, but does it give you a 1% improvement over your existing models or has it turned out to be much better at predicting? Any sense of how this new world of unstructured data compares with our old world of structured data in terms of its informativeness?

From a risk model perspective, if you look in general there’s probably not a whole lot of radical improvement that’s ever going to happen. Risk models today are so incredibly predictive — they’re really good.

It’s really about the other types of behaviors and the other types of ways that financial institutions or other businesses want to interact with their customers.

Here’s an example. Say that we have existing data about you, about your spouse, about your parents, about your kids, about all of your relatives — all individually. But we don’t know who your parents are, we don’t know who your brother is, we don’t know who your aunt and uncle are or who your friend Johnny is.

The exciting thing is being able to take our existing data and being able to say, “Now I know definitively there is a relationship between these individuals, and what does that tell me about one of them.”

It’s using existing data to uncover really interesting relationships and identify things that maybe we didn’t have the capabilities to do before.

So, the data was there, but your ability to link it to other data is what’s made an improvement. Knowing that Entity A and Entity B are in the same house or something else or are connected.

Right, and understanding that relationship between those people, a place, or a thing.

I think our next step is how do we take those things and help our customers help their customers make their lives better.
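The idea of discovering that separately held records are connected can be sketched as a small graph problem. The example below is an illustration only, not a description of Equifax's systems: it assumes records are linked whenever they share an exact value for a hypothetical "address" or "phone" field, and it groups them with a union-find structure.

```python
from collections import defaultdict


def link_records(records, keys=("address", "phone")):
    """Group record ids that share any value for the given (assumed) keys."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Walk up to the group's root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    first_seen = {}  # (key, value) -> id of the first record carrying it
    for r in records:
        for key in keys:
            value = r.get(key)
            if not value:
                continue  # missing values never create a link
            token = (key, value)
            if token in first_seen:
                union(first_seen[token], r["id"])
            else:
                first_seen[token] = r["id"]

    groups = defaultdict(set)
    for r in records:
        groups[find(r["id"])].add(r["id"])
    return sorted(sorted(g) for g in groups.values())
```

Records 1 and 3 in a data set might share only a phone number while records 1 and 2 share only an address; the function still places all three in one group because links are transitive. A real system would also need fuzzy matching and some notion of link confidence, which this sketch omits.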

That makes sense. If I walk into a bank that’s got incomplete information, the bank will give me a rate that reflects the worst case, not the best case. Accurate data means getting a rate that’s more appropriate for that risk profile.

Yes, and it goes beyond accuracy to efficiency. Let’s take two banks. Let’s take Bank A and Bank B and say Bank A uses one company for their credit reports and Bank B uses a different company.

If a consumer comes into Bank A to apply for a credit card and they say, “Oh, thank you for your application. We’ll get back to you in three days,” and Bank B says, “Thank you for your application. Here’s your answer,” you’re going to have more consumers migrate over to Bank B. That’s going to impact the banks, obviously, but it’s also going to impact the credit reporting companies, to the point that both companies will start offering same-day answers.

At the end of the day, people think that credit bureaus and information services are relatively commoditized. My focus is to create a compelling differentiator between us and the other credit reporting companies by enabling our customers to provide the most efficient, the most predictive, and the most accurate experience for their customers, and that improves their service, positively differentiates them from their competitors while reducing their risk.

Reprint #:

57205

Comment (1)
Chris Reich
This whole premise disgusts me. That my social life online would be monitored and 'data' aggregated to improve the customer experience for someone checking me out is repugnant.

We are giving up privacy allowing a credit reporting agency to practice a form of digital psychology. No, we need consumer protection from this potentially damaging intrusion.