The SAP HANA Center of Excellence is focused on helping our customers create systems of innovation, and typically that requires using historical data to produce a solution based on advanced analytics or machine learning. Whenever we start working on a project, I inevitably ask the customer the following question: “How important is it that you understand how the predictive model works?” Customers, especially those new to predictive analytics, have no idea how to answer this, but before we proceed I insist they provide some guidance.

If you proceed without guidance, you’re likely to get into trouble, because the answer to this question helps you choose not only an appropriate algorithm but also the predictors you build into your analytical data set. Proceeding on a project without an answer to this fundamental question is always a mistake, and it’s important to walk your customer through the issues.

Deployment: Knowledge Versus Embedded in a Process

At SAP, our point of view is that the best return on investment is when you build predictive models that are deployed into your operational systems. Classification models, for example, generate probabilities which, combined with rules, are used to make decisions. The consumers of the probabilities are your operational systems.
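As a rough illustration of how a probability and a rule combine into a decision, here is a minimal Python sketch. Everything in it is hypothetical: the `Customer` fields, the `decide_offer` function, the 0.6 threshold, and the scores are invented for the example and do not come from any SAP product or real model.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    purchase_probability: float  # hypothetical score from a classification model
    opted_out: bool              # business constraint, not a model output


def decide_offer(customer: Customer, threshold: float = 0.6) -> str:
    """Combine a model probability with simple business rules."""
    # Rule 1: never contact customers who opted out, whatever the score says.
    if customer.opted_out:
        return "no_offer"
    # Rule 2: only make the offer when the model is confident enough.
    if customer.purchase_probability >= threshold:
        return "send_offer"
    return "no_offer"


# Made-up customers and scores, purely for illustration.
customers = [
    Customer("A", 0.82, False),
    Customer("B", 0.82, True),
    Customer("C", 0.35, False),
]

decisions = {c.customer_id: decide_offer(c) for c in customers}
print(decisions)
```

Note that customers A and B have the same score but get different decisions. The operational system consumes the probability, but the rule has the final say.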

In business-to-consumer businesses the focus is often on marketing and sales systems (like campaign or customer relationship management) that run cross-sell or upsell campaigns. These systems use probabilities and business rules to optimize offers for customers and prospects, and often the customer has no idea that a predictive model is behind the personalized offer. In these situations, how important is it that a human (a salesperson or marketing manager, for example) knows how or why the algorithm makes a decision?

Explanation: Correlation Versus Causation

The answer varies. There are people who simply don’t care how the model works, and all they want is the most predictive model possible. At the other end of this continuum is the person who must understand how the model works, and what the key business drivers for the model are (for example, the most important variables, and why they are predictive).

These people resist deploying predictive models until you can explain how the model works in sufficient detail. Humans have a psychological need to understand cause and effect, but often it’s difficult to provide this type of explanation with the data one has available for model building.

Data scientists understand that machine learning is really measuring which predictors are highly correlated with our targets (what we’re trying to predict), but our end customers (the clients) want to make an inferential leap to causation. This is especially difficult when what we are trying to predict is human behavior, because Homo sapiens are flawed information processors.
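To make the correlation point concrete, the sketch below ranks two candidate predictors by the strength of their Pearson correlation with a binary target. The field names (`purchases_last_90_days`, `age`) and all values are made up for illustration, and real model building would use far more data and more robust measures than a single correlation coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical target: 1 = customer responded to the offer, 0 = did not.
target = [1, 1, 0, 0, 1, 0]

# Hypothetical predictors for the same six customers.
predictors = {
    "purchases_last_90_days": [5, 4, 0, 1, 6, 0],
    "age": [34, 51, 45, 29, 40, 38],
}

# Rank predictors by absolute correlation with the target, strongest first.
ranked = sorted(
    predictors,
    key=lambda name: abs(pearson(predictors[name], target)),
    reverse=True,
)

for name in ranked:
    print(f"{name}: r = {pearson(predictors[name], target):.2f}")
```

The ranking says which field moves with the target in this toy data set. It says nothing about why, which is exactly the gap between what the model measures and the causal story the client wants.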

The psychological and behavioral economic research is full of examples of people behaving in ways that aren’t in their own best interest. In psychology, we’re taught the following axiomatic principle:

The best predictor of future behavior is past behavior

Inevitably, it is past behavioral predictors that are the most important fields in predictive models, and this is why your transactional and interactive data is the most valuable. However, these types of fields are often unsatisfying for the client who wants to understand causation.

Practical Guidelines

  • Customers/clients need guidance on what is achievable given what one is trying to predict and the data domains that are available.
  • “All models are wrong, but some are useful,” said George Box. A model is simply a mathematical representation of the way the world works, and there is always going to be error. Clients need to accept this, and this requires a certain tolerance for ambiguity.
  • Predictive models don’t override the natural laws of science (like gravity), but they can supplement our understanding of how people/objects/systems function in the real world under different conditions and constraints.
  • Causation is great, but correlation ain’t half bad!

For more on this subject, read the other blogs in the Predictive Thursdays series.
