As we develop and discuss our predictive models, we like to point out that one of the key benefits is that they help underwriters avoid unfounded bias. The models are developed based on data, not on intuition or anecdotal experiences. They represent, ideally, an optimal weighting of a set of input variables.
In contrast, the underwriter has built up a set of experiences that form a basis for an intuitive understanding of risk quality. Some of these intuitions are well-founded; others are purely coincidental, and not at all predictive. The goal of the model, in our experience, is to give the underwriter a solid, consistent base of understanding using a well-defined list of input variables. This frees the underwriter to consider everything else that falls outside that small list of inputs.
When presenting the model to a savvy group of underwriters, we almost always get these interesting questions:
- Which variables does the model use?
- Why are those variables selected?
- Why is this other variable NOT in the model?
The easy answer, of course, is that we listen to the data – we let statistics determine which variables are important, and which are not, and make our selections accordingly. But what happens when the data says a variable is important, and we disagree? As model builders, should we substitute our intuition for solid indications? Or should we listen to the data, and include variables to which we object?
This is where model building starts to become an art, as well as a science. The answer to this question lies in understanding the intended application of the model and using that understanding to determine how strictly to follow the “pure” indications of the data.
Here’s an example: What are some of the variables that underwriters would universally say are predictive of future loss? Probably every underwriter would agree that past losses are predictive of future losses. Given a choice between two identical exposures, one with no prior losses and one with several high-severity prior losses, every underwriter will choose the first risk.
When we model workers’ compensation risks, we find that the data backs this up: prior loss history is important. Technically speaking, prior loss history is correlated with future loss ratio. However, there is a bit of a surprise here: Prior loss severity is inversely correlated with future loss ratio.
Let me restate that to be clear: Policies with large losses in prior years have LOWER future loss ratios.
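To make “inversely correlated” concrete, here is a minimal sketch in Python. The numbers are made up purely for illustration and are not drawn from any carrier’s data. The variable names (`prior_max_claim`, `future_loss_ratio`) and the use of a simple Pearson correlation are assumptions for this sketch, not a description of how the actual models are built.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical, made-up values for six policies (NOT real data).
# prior_max_claim: largest single claim in prior years, in dollars.
# future_loss_ratio: loss ratio observed in the following policy period.
prior_max_claim = [0, 5_000, 20_000, 75_000, 150_000, 400_000]
future_loss_ratio = [0.72, 0.70, 0.66, 0.61, 0.58, 0.52]

r = pearson(prior_max_claim, future_loss_ratio)

# A negative coefficient is what "inversely correlated" means here:
# as prior severity goes up, the future loss ratio tends to go down.
print(f"correlation between prior severity and future loss ratio: {r:.2f}")
```

In this toy data the coefficient comes out negative by construction. In practice, the sign and strength of the relationship would have to be established on real policy data, with the other model variables controlled for.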
It gets worse: those lower future loss ratios are on a manual basis, that is, measured against manual premium before experience rating and other pricing adjustments. This means that, all else being equal, an underwriter should prefer (or give credit to) accounts with large losses. What’s going on here? Honestly, we don’t know with certainty why this is true, but we see it across multiple carriers, so we believe that it holds generally. Some possible explanations:
- Employees at accounts with large losses are more careful in the future
- The accounts put safety equipment in place
- The accounts receive loss control services and conduct safety training
- The accounts adjust their scope of operations to eliminate the source of the prior loss
In the end, it probably doesn’t matter why; the data is clear. Past severity is predictive of the future loss ratio. So what do we do with this? Do we put the claim severity variable into the model?
The only thing more important than having a model that works is having a model that the underwriter understands and trusts. Without that trust, the underwriter will simply ignore the model indications – or will only listen to the model when it agrees with their prior intuition. This is equivalent to having no model at all. If we include this variable in a final model, we will be explicitly advising an underwriter to give preference to accounts with high-severity claims. This will NOT engender trust.
So we exclude past claim severity. Yes, we are deliberately removing a variable that shows predictive power, because our end goal is not merely model power, but impacting the bottom line through underwriter adoption. We also do one more thing: we share all of these details with the underwriters, so they’re aware not only of what’s IN the model but also of what’s NOT IN the model.
When considering the human element, we’ve found that it’s just as important to talk about what’s missing from the model as what’s present. At the end of the day, we want the underwriter to believe that the model is doing the best job possible with the variables it includes. At the same time, we want the underwriter to focus on what’s not in the model, because that is where the true expertise and value of underwriting will be found, alongside predictive analytics.
ABOUT THE AUTHOR:
Bret Shroyer, FCAS, is an actuary and VP of Services for Valen Analytics. He serves as an advocate for Valen’s clients, bridging the gap that sometimes grows between technical modeling and client service teams, executives, actuaries, and underwriters. He helps turn the models into success stories by translating how data analytics helps people make better decisions and deliver tangible results.
Bret joined Valen in 2014 after serving as SVP of Reinsurance at Willis Re for five years. From 2006 to 2008, Bret served as CFO of an environmental consulting and construction firm. Immediately prior to this, Bret held numerous positions including Senior Actuary, Underwriting Director, and Predictive Modeling Manager during his ten-year tenure at Travelers.
Bret earned a B.A. in Mathematics from the University of St. Thomas in St. Paul, Minnesota, and is a Fellow of the Casualty Actuarial Society.