The paper deals with the various applications of data mining in the insurance sector. The insurance industry business model, along with the challenges faced by the industry, is studied to find the critical factors affecting the industry. Almost all of the factors identified can be addressed with data mining techniques to improve the performance of the companies operating in the industry. The paper analyses the techniques used (classification, clustering, association, etc.) for different aspects of the insurance business such as predicting claims, customer churn, fraudulent claims, acquiring new customers and error prediction in claims processing. The following table gives a summary of the techniques used by insurance companies:
Business Aspect | Technique Used
Predicting Claims | Classification
Customer Churn | Clustering, Classification, Association
Fraudulent Claims | Clustering, Classification
Acquiring New Customers | Classification (Major), Clustering
Error Prediction in Processing | Classification
1. Insurance Industry - Business Model
Insurance is defined as the act, system, or business of insuring property, life, one's person, etc., against loss or harm arising in specified contingencies, as fire, accident, death, disablement, or the like, in consideration of a payment proportionate to the risk involved. (Dictionary.com) Insurance companies protect the customer against adverse losses (mostly monetary in nature) that arise during the period in which the customer holds the insurance, with the exception of life insurance, where a nominee is paid a lump sum on the death of the insured. The insurance companies charge an amount (known as the premium) for this protection. The business model of insurance companies works on the premise that the revenue generated from premiums is greater than the claims they will have to fulfill plus the other costs the company incurs.
Profits = Premium - Claims - Underwriting Costs
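The formula above can be illustrated with a small calculation. This is only a sketch: the function name and all figures are hypothetical and are not taken from any insurer's accounts.

```python
def underwriting_profit(premium, claims, underwriting_costs):
    """Profit = Premium - Claims - Underwriting Costs, as in the formula above."""
    return premium - claims - underwriting_costs


# Hypothetical totals for one product over one period, in the same currency unit.
profit = underwriting_profit(premium=1_000_000, claims=700_000, underwriting_costs=200_000)
print(profit)  # 100000
```

A negative result would indicate that the premium charged for the product does not cover its claims and costs.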
The risk of damage is transferred from the insured to the insurance company, for which the insurance company charges the insured a fee. Where individual claims are likely to run into high amounts, insurance companies serve the clients by forming a consortium and sharing the risk amongst themselves (e.g. insurance against fire and other hazards at a factory). Insurance companies differ in how they treat claims arising from acts of God: some companies process such claims, while others may not. The products offered by insurance companies can be broadly classified as:
Life Insurance
Life insurance is a contract providing for payment of a sum of money to the person entitled to receive it on the death of the insured person. In India, life insurance differs significantly from other insurance products in that, at maturity of the policy (the time when the policy ends), the policyholder receives a certain sum which depends on the policy structure, i.e. premium, installment period for premium, interest rates, etc. The dependent is paid the same amount on the death of the insured person.
Health Insurance
The insurance company will pay the hospital bills in case the insured is hospitalized.
Auto Insurance
The product offers insurance against auto accidents, covering both damage to the vehicles involved in the accident and health costs incurred due to bodily injury of the passengers of the car or of others hurt in the accident. A particular product might include one or all of the above features depending on the product type. In most countries, third party auto insurance is compulsory.
Others
Other insurance can include insurance against fire, marine loss or any innovative product that helps the consumer mitigate risks from untoward events. Policies of this kind are generally held by businesses and not by individuals. Only home insurance, which falls under this category, can be classified as retail insurance.
The volumes are very high when it comes to life insurance, health insurance and auto insurance (collectively called retail insurance), while they are significantly lower for insurance against fire and other such events. In contrast, the premium is relatively lower for life insurance, health insurance and auto insurance and comparatively higher for insurance for businesses. This follows directly from the fact that the amount payable in case of a claim is much higher for insurance against marine loss or fire. Our literature survey on data mining applications in the insurance industry found that, while data mining techniques are widely used in retail insurance, they do not seem to have been used extensively in other cases. Hence, we have limited the scope of our study to applications of data mining in retail insurance.
2. Challenges faced by Insurance Companies
The following are some of the challenges faced by the insurance industry:
2.1 Premium
The decision on the premium to be charged is one of the major issues faced by insurance companies. As discussed in the previous section, the claims and the underwriting costs should be less than the total premium received by the company for a particular product it is selling in the market. The classical approaches to this issue were basic and advanced statistical models.
2.2 Fraudulent Claims
Fraudulent claims are one of the major concerns faced by insurance companies. The total costs associated with fraudulent claims can be as high as billions of dollars for the entire industry combined. Insurance companies have been actively trying to bring down the percentage of fraudulent claims by detecting them.
2.3 Customer Attrition
With the commoditization of health, life, auto and home insurance, customer attrition is a major issue faced by these companies. The cost of retaining an existing customer is far less than that of bringing in a new one. It is very important for insurance companies to attract and retain customers in order to exploit the advantages of high volumes. To do so, insurance companies need detailed insight into the reasons for customer attrition.
2.4 Capital & Risk Management
Capital and risk management is the backbone of the insurance industry. Short-term liquidity plays an important role in the insurance sector, as it is required to pay any claims that may arise during the period. Insurance companies would like to avoid a situation resembling a bank run, which could arise if they ran out of short-term liquidity.
Apart from deciding the premium, very few of these problems could be solved earlier with advanced statistical techniques. Data mining, which is not restricted to advanced statistics and is often seen as an 'art', provides solutions to the above problems. Association rules can be used to detect fraudulent claims, customer churn techniques can be used to limit customer attrition, and predictive techniques can be used for efficient capital and risk management. These are just the major issues addressed by data mining in the insurance industry; the next section covers the problems faced by the companies, along with the data mining techniques used to overcome them, in detail. [1]
3. Application of Data Mining Tools to Solve Business Problems in Insurance
3.1 Prediction of Claims
Prediction of claims is the estimation of the amount of money/claims that the insurance company has to pay its policy holders. This is important in determining the corresponding premium that is set for the policy.
3.1.1 Predictive modelling
Predictive modelling can be used to determine the estimated value of the claim (as settlement of a claim is usually delayed) by using various factors such as severity of the claim, time to settlement, and the effect of inflation and interest rates. [2] These factors serve as proxies for latent variables such as "risk-seeking" or "risk-averse", which are useful in making actuarial predictions.
Modelling in life insurance is difficult compared to other types of insurance (such as auto insurance) as the risk factors for mortality change over time. Also, the low frequency of claims requires data to be collected over a longer time horizon (historical claims data), which is not always available. [3] Predictive modelling in insurance can be used to score claims based on the expected size of the claim settlement, thus enabling the insurance company to allocate resources to high-priority/higher value claims. [4]
Predictive modelling can lead to potential savings across the following dimensions:
Proactive management by identifying claims with potential for leakage at an early stage.
Identification of practices that increase the claim settlement payments. [5]
[Figure not available. Source: "Transforming claims through predictive modeling", Insurance Agenda, Ernst & Young]
3.1.2 Classification & Regression model
The value of the claims can be estimated by using the historical records of actual claims to create a training data set, and in turn predicting the claims for the test data set. Vehicle characteristics are used to predict the insurance claim payments, and the interaction between non-vehicle characteristics and vehicle characteristics is also studied.
The prediction of the claim amount can be done by first using a classification technique to predict the instances with no claims and then applying a regression technique to the remainder of the data set (as the output required is a continuous variable). Clustering is also a useful tool for reducing the number of dimensions/factors considered, by grouping the variables (alternatively, Principal Component Analysis can be used for this).
In this setup the linear model for regression always remains the same, and any improvement in the estimation of claims depends entirely on the method of classification that is applied. Decision tree, random forest and linear discriminant analysis (LDA) can be applied in this case. According to [6], other classification methods such as Naïve Bayes, K-nearest neighbours, neural networks and the Support Vector Machine (SVM) algorithm cannot be combined with regression and applied in this case. [6]
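The two-stage idea described above (classify first, then regress on the instances predicted to have a claim) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited study: the records are hypothetical, a single-threshold decision stump stands in for a full decision tree, and the function names are invented for the example.

```python
from statistics import mean

# Hypothetical training records: (vehicle_age_years, had_claim, claim_amount)
train = [
    (1, 0, 0.0), (2, 0, 0.0), (3, 0, 0.0),
    (6, 1, 1200.0), (8, 1, 1600.0), (10, 1, 2000.0),
]


def fit_stump(rows):
    """Stage 1: choose the age threshold that misclassifies the fewest records."""
    best_threshold, best_errors = None, len(rows) + 1
    for threshold in sorted({age for age, _, _ in rows}):
        errors = sum((age >= threshold) != bool(label) for age, label, _ in rows)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def fit_line(rows):
    """Stage 2: least-squares line of claim amount on age, claimants only."""
    xs = [age for age, label, _ in rows if label]
    ys = [amount for _, label, amount in rows if label]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


threshold = fit_stump(train)
slope, intercept = fit_line(train)


def predict_claim(age):
    """Predict 0 for instances classified as no-claim, else use the regression."""
    if age < threshold:
        return 0.0
    return intercept + slope * age
```

With the hypothetical data, `predict_claim(2)` returns 0.0 because the record is classified as no-claim, while `predict_claim(7)` falls through to the regression stage.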
3.1.3 Logistic regression
Bodily injury insurance claims can remain unsettled for years, which necessitates estimating the provision for the Reported But Not Settled (RBNS) claims reserve (an estimate of the outstanding liabilities of the insurance company). For bodily injury claims, logistic regression can be used to predict the severity of the injuries and to find the probability of each expected severity at each stage of the claims process, thereby enabling the estimation of reserves/claim liabilities. [7]
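As a rough illustration of how a fitted logistic regression could feed a reserve estimate, the sketch below turns a predicted probability of a severe injury into an expected liability for one open claim. The coefficients, the two predictors and the cost figures are all hypothetical assumptions chosen for the example; they do not come from the cited study.

```python
import math


def severe_probability(claimant_age, days_open,
                       intercept=-3.0, b_age=0.02, b_days=0.01):
    """Logistic model: P(severe) = 1 / (1 + exp(-z)); coefficients are hypothetical."""
    z = intercept + b_age * claimant_age + b_days * days_open
    return 1.0 / (1.0 + math.exp(-z))


def expected_reserve(p_severe, cost_severe, cost_minor):
    """Probability-weighted liability for a single open claim."""
    return p_severe * cost_severe + (1.0 - p_severe) * cost_minor


p = severe_probability(claimant_age=50, days_open=200)
reserve = expected_reserve(p, cost_severe=100_000, cost_minor=10_000)
```

Summing `expected_reserve` over all open claims would give one simple estimate of the RBNS reserve under these assumptions.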
3.1.4 CART (Decision tree)
In workers' compensation insurance, CART can be used to select the significant predictor variables out of a given list of potential predictor variables. When compared with the logistic regression method, the CART model is relatively better suited to this application. But when the number of predictor variables is small and most of the variables are numeric, logistic regression might perform better than the CART model. The CART model can also be combined with multivariate adaptive regression splines (MARS) to create hybrid models which can help in predicting hospital cost in health insurance. [8]
3.1.5 Recommendation
Insurance companies generally design policies on the basis of statistical models developed using historical probabilities of a claim. The insurance company will generally select a premium such that the outflow of cash from claims will be lower than the inflow of cash from premiums. With the advent of data mining techniques that help in predicting claims, insurance companies can arrive at more accurate predictions of the likely outflow of cash in claims. This may be used to reduce the premium, if the company is predicted to make sufficient profit from a particular product, thus attracting more customers. Conversely, if the company is predicted to make a loss, the premium can be set at the level at which the product becomes profitable.
3.2 Customer Churn Management
Customer churn management is important in the insurance industry because the cost of acquiring new customers is high and severe competition among insurance companies makes it crucial for a company to focus on retaining existing customers.
Customer churn management includes three steps:
Determining churn customers.
Determining the reasons why customers move to other insurance providers.
Deciding on the policies which can help in reducing customer churn.
[Figure not available. Source: "Predictive modeling for life insurance", Deloitte Consulting LLP]
Data mining techniques can be used in customer churn management in the following ways:
3.2.1 Clustering
Clustering can be used to segment customers based on risk, to determine the characteristics of customers and to determine the probability of customer churn based on these characteristics. K-means clustering can be used for the cluster analysis. On the basis of the segments obtained, companies can target customers in different segments with different marketing plans, thereby increasing profitability. Clustering can also be used to develop new products or to cross-sell products by providing insight into customer behavior. For example, certain customers might hold two different insurance products at the same time and fall in the same cluster due to other demographic and psychographic factors. The company can then bundle the two products and offer the bundle to these customers if such a product does not already exist in the company's portfolio. [9] Clustering can also be used for discovering the key factors associated with customer retention. [10]
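The K-means segmentation mentioned above can be sketched in a few lines of plain Python. The customer records, the choice of two features and the initial centroids are hypothetical; a real analysis would normally scale the features first so that one attribute (here, age) does not dominate the distance.

```python
def kmeans(points, centroids, iterations=10):
    """Basic K-means: assign each point to its nearest centroid, then recompute."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
            )
            clusters[nearest].append(point)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (age, annual premium in thousands)
customers = [(25, 1.0), (27, 1.2), (30, 1.1), (55, 3.0), (58, 3.4), (60, 3.2)]
centroids, segments = kmeans(customers, centroids=[customers[0], customers[-1]])
```

Each resulting segment can then be profiled (for example by average premium or churn rate) before a marketing plan is designed for it.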
3.2.2 Predictive modelling
In life insurance, while cross-marketing life insurance products to existing customers (holding other policies) the company may be at risk of losing the customer if the post-underwriting offer is not suited for the customer. Hence predictive modelling can be used to conduct a review of the target customer segment and determine the customers who should receive the offer for an additional life insurance policy.
3.2.3 Decision tree/CART
Decision trees (CART) can be used to determine the churn rate and to understand the reasons for customer churn. [11]
3.2.4 Association
Studies have shown that customers holding two policies are more likely to renew their policies than customers with a single policy, and that customers with three or more policies with the same company are less likely to switch to other providers than customers holding fewer than three policies. This indicates that firms can retain customers by offering discounts or by selling packages of policies, which can be determined by market basket analysis (to find the policies which are frequently purchased together). [12]
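Market basket analysis over policy holdings can be sketched with the usual support, confidence and lift measures. The portfolios below are hypothetical, and the function name is invented for the illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical policy holdings, one set per customer.
portfolios = [
    {"auto", "home"},
    {"auto", "home", "life"},
    {"auto"},
    {"health", "life"},
    {"auto", "home"},
]

item_counts = Counter(item for held in portfolios for item in held)
pair_counts = Counter(
    pair for held in portfolios for pair in combinations(sorted(held), 2)
)


def rule_stats(antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(portfolios)
    together = pair_counts[tuple(sorted((antecedent, consequent)))]
    support = together / n
    confidence = together / item_counts[antecedent]
    lift = confidence / (item_counts[consequent] / n)
    return support, confidence, lift


support, confidence, lift = rule_stats("home", "auto")
```

A lift above 1 suggests that the two policies are bought together more often than chance would predict, which makes them candidates for a bundled offer.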
3.2.5 Recommendation
Customer churn is an important concern for an insurance company because of the fierce competition in the insurance industry. Almost all industries use data mining techniques to reduce customer churn, and the techniques used do not differ significantly across industries. In churn management, the analysis built on top of the data mining results matters more than in direct marketing, where the result is simply binary (the customer responds or does not). Marketing techniques have to be developed after detailed analysis of the results so that customer churn is reduced.
3.3 Acquiring New Customers
With the commoditization of the insurance industry, identifying potential clients has become an important function. Marketing expenses form the bulk of the underwriting costs incurred by the company, and customer acquisition cost is generally high compared to customer retention cost. The traditional approach involved expanding sales force operations to acquire new customers, thus incurring costs. To reduce this customer acquisition cost to some extent, the data mining tools used in direct marketing can be applied to identify potential clients for the company. Direct marketing involves sending mail to the customer or having the sales force contact the customer. With a low conversion rate (generally less than 10%), insurance companies incur a significant cost in contacting these potential customers. The use of data mining techniques improves the efficiency of the direct marketing campaign.
3.3.1 Predictive Modeling
The basic target selection processes (predictive modeling) can be divided into two types: segmentation and scoring methods. The segmentation technique divides customers into segments on the basis of similarity in relevant features, while the scoring approach assigns a score to every customer, where the higher the score, the more likely the customer is to respond. [13] The data used for predictive modeling is generally the same: product portfolio, demographics, frequency of last purchase, etc.
Using data on responses to previous direct marketing campaigns undertaken by the company, the following techniques can be used to improve the hit rate (i.e. the response from contacted customers) of direct marketing campaigns:
Statistical Regression Techniques
Neural Networks
Decision Trees
CART
CHAID
Fuzzy modeling
More recently, fuzzy clustering has been used for target selection using recency, frequency and monetary value. [14]
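The scoring approach can be illustrated with a plain recency, frequency and monetary value (RFM) score. Note that this is a simple rank-based score and not the fuzzy clustering method mentioned above; the customer records and the equal weighting of the three components are hypothetical assumptions.

```python
# Hypothetical customers:
# (customer_id, days_since_last_purchase, purchases_in_period, total_premium_paid)
customers = [
    ("A", 30, 4, 5000.0),
    ("B", 400, 1, 800.0),
    ("C", 90, 2, 2500.0),
    ("D", 700, 1, 300.0),
]


def rank_scores(values, higher_is_better=True):
    """Rank-based score per value: 1 is worst, larger is better; ties share a score."""
    order = sorted(set(values), reverse=not higher_is_better)
    return [order.index(value) + 1 for value in values]


recency = rank_scores([c[1] for c in customers], higher_is_better=False)
frequency = rank_scores([c[2] for c in customers])
monetary = rank_scores([c[3] for c in customers])

# Equal-weight RFM score; customers with the highest scores are contacted first.
totals = {c[0]: r + f + m for c, r, f, m in zip(customers, recency, frequency, monetary)}
targets = sorted(totals, key=totals.get, reverse=True)[:2]
```

Contacting only the top-scoring customers is what raises the hit rate relative to mailing the whole list.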
3.3.2 Clustering
Clustering technique can be used effectively to identify new customer segments which should be targeted for a particular product. Considering the example of life insurance, these companies can create a special catalogue for various customers by dividing them on the basis of demographics like age, income, occupation and sex. The target population for a particular policy may not be the same for another policy. A particular segment of customers might be suitable for life security policy (young employees with college degrees) while others might be suitable for tax benefit policies (higher qualification and higher income). [15]
3.3.3 Recommendation
Clustering and Classification techniques can be jointly used to improve marketing communication to acquire new customers. The efficiency of the direct marketing campaign can be improved by combining the two techniques. Segmentation of customers will help in finding customers who are suitable for a particular product. Once customers have been segmented, then the likely response to a direct marketing campaign can be predicted using classification techniques. Thus, a customized communication can be sent to the individual customer.
3.4 Fraud Detection
One of the major challenges that insurance companies face is fraudulent claims. A fraudulent claim is an insurance claim in which the customer fabricates events or damage in order to obtain compensation. Every year insurance companies lose billions of dollars to fraudulent claims. Research indicates that losses due to fraud have increased significantly in the last decade or so. [16] However, it is also very expensive to perform due diligence and evaluate each claim in depth. It would be of great help to insurance companies if there were ways to find out whether a customer is likely to be committing fraud. This is where data mining tools are of use. Data mining techniques can search through millions of insurance claims to find patterns that are likely to be associated with fraudulent claims. With the amount of computing power at hand, these tools can spot even minute differences in claim forms to reveal billing discrepancies or abnormal payoffs.
The following data mining techniques have been used by quite a few insurance companies to detect fraud:
3.4.1 K-means clustering
Classification and clustering techniques have traditionally been seen as effective methods to detect fraud. K-means clustering, especially, has seen a large number of variations developed over the years to determine whether a claim is fraudulent. Other clustering techniques are deemed to be more expensive. Clusters are formed by minimizing the distance between data points, where the distance is calculated by assigning weights (determined using historical data) to various attributes. Finally, the clusters can be used to determine whether a data point/claim is likely to be fraudulent based on which cluster it falls into.
The following are a few enhancements to clustering, based on evolutionary algorithms, that have been developed over the years:
GA based K-means algorithm, also called GA K-means algorithm. This uses Darwinian genetic algorithm to generate K-means clusters.
Momentum type Particle Swarm Optimization algorithm is used to generate MPSO-K means clusters.
3.4.2 Other classification techniques
Classification is done, based on a set of pre-specified categories to determine which class a given data point falls under.
Bayesian Belief Network: For the purpose of detecting fraud, two Bayesian networks are used: One network to model behaviour under an assumption that a particular claim is fraudulent and another which assumes that the claim is legal/valid. The fraud network is setup by using expert knowledge whereas the legal network is setup by using data on prior legal claims. [17]
Based on the attributes of a claim, we can then determine to what extent a claim can be classified as fraudulent or legal. Bayes' theorem is used for this:
P(output = fraud | E) = P(output = fraud) P(E | output = fraud) / P(E)
where
E = measurement of attributes
P(E) = P(output = fraud) P(E | output = fraud) + P(output = legal) P(E | output = legal)
and P(output = fraud | E) = 1 - P(output = legal | E)
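The Bayes' theorem calculation above can be worked through numerically. The prior and the two likelihoods below are hypothetical; in the approach described, they would come from the fraud network and the legal network respectively.

```python
def posterior_fraud(prior_fraud, likelihood_fraud, likelihood_legal):
    """P(fraud | E) from P(fraud), P(E | fraud) and P(E | legal)."""
    prior_legal = 1.0 - prior_fraud
    evidence = prior_fraud * likelihood_fraud + prior_legal * likelihood_legal  # P(E)
    return prior_fraud * likelihood_fraud / evidence


# Hypothetical inputs: 1% of claims are fraudulent; the observed attributes E
# appear in 80% of fraudulent claims and in 5% of legal claims.
p_fraud = posterior_fraud(prior_fraud=0.01, likelihood_fraud=0.8, likelihood_legal=0.05)
p_legal = 1.0 - p_fraud
```

Even with a strong indicator, the posterior stays well below the 80% likelihood because fraud is rare in the prior, which is why the low ratio of fraudulent to legal claims matters for model evaluation.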
Decision trees: Decision tree data mining consists of developing a decision tree using training data and then applying this decision tree on new data points to find out which class they belong to. Decision trees are usually a set of IF THEN statements in which the preconditions are logically ANDed, i.e. all the conditions have to be satisfied for an output to be shown as one of the classes.
The decision tree development starts with a single node representing the data set. The two outputs in the analysis would be fraud and legal. If all the instances at a node belong to the same class (for example, the same type of fraud), the node becomes a leaf labelled with that class. [18] Otherwise, the attribute that best separates the data into individual classes is selected, using Entropy, Gini Index or Classification Error to measure the degree of impurity of the resulting split. Entropy is the negative sum, over all classes, of the probability of each class multiplied by the logarithm of that probability. When all instances fall into a single class, the values of Entropy, Gini Index and Classification Error are 0. They reach their maximum value when all the classes are equally probable.
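The three impurity measures named above have standard definitions, sketched here for a list of class probabilities at a node. The function names are chosen for the illustration.

```python
import math


def entropy(probabilities):
    """Entropy in bits: sum of -p * log2(p) over classes with p > 0."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


def gini_index(probabilities):
    """Gini index: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in probabilities)


def classification_error(probabilities):
    """Classification error: 1 minus the probability of the majority class."""
    return 1.0 - max(probabilities)


pure_node = [1.0, 0.0]    # every claim at the node is in one class
mixed_node = [0.5, 0.5]   # fraud and legal equally likely
```

For the pure node all three measures are 0; for the evenly mixed two-class node they take their maximum values (1 bit for entropy, 0.5 for the other two).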
Also, after every analysis and prediction done, the models have to be compared based on the accuracy of predictions and the cost associated, as in the following matrix.
Prediction \ Actual | Fraud | Legal
Fraud | cost = number of hits * average cost per investigation | cost = number of false alarms * (average cost per investigation + average legal cost per claim)
Legal | cost = number of misses * average cost per claim | cost = number of correct rejections * average cost per claim
Source: Detecting Auto Insurance Fraud by Data Mining Techniques, by Rekha Bhowmik, University of Texas.
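The cost matrix can be turned into a single total for comparing models. In this sketch the counts and unit costs are hypothetical, and the "average legal cost per claim" and "average cost per claim" of the matrix are treated as one parameter, which is a simplifying assumption.

```python
def total_cost(hits, false_alarms, misses, correct_rejections,
               cost_per_investigation, cost_per_claim):
    """Sum of the four cells of the prediction/actual cost matrix."""
    return (
        hits * cost_per_investigation
        + false_alarms * (cost_per_investigation + cost_per_claim)
        + misses * cost_per_claim
        + correct_rejections * cost_per_claim
    )


# Hypothetical outcome of one model on 1,000 claims.
model_cost = total_cost(
    hits=20, false_alarms=30, misses=10, correct_rejections=940,
    cost_per_investigation=200, cost_per_claim=1000,
)
```

Computing this total for each candidate model on the same test set gives a cost-based ranking alongside plain accuracy.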
3.4.3 Example
There are a few applications available like the SAS Enterprise Miner, which can apply data mining techniques to help solve business problems. One such example of a health insurance fraud detection system is discussed below. [19]
This example deals with a public health care organization that wants to track fraudulent claims. In general, checking claims is a costly task and mostly happens only if the organization receives a tip about fraudulent activity. The data contains a number of attributes such as the amount of the claim, hospital details, treatment records, etc. There is also a column indicating whether the claim is fraudulent or legal. Because neural networks generally provide little or no feedback about how the target variable is related to the input variables, regression and decision trees were used for this analysis.
One other concern was the low ratio of fraudulent to legal claims. To find the best fit using the limited number of data points for fraudulent claims, the analysis was done based on node purity and misclassification at a split, or using a CHAID-type Chi-square splitting criterion, which splits only when a significance threshold is reached. Lift charts and confusion matrices are used to evaluate the models.
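The two evaluation tools mentioned above, the confusion matrix and lift, can be computed as in the following sketch. The labels, scores and the 0.5 cut-off are hypothetical and are not output from SAS Enterprise Miner.

```python
def confusion_counts(actual, predicted):
    """Count the four confusion-matrix cells for 1 = fraud, 0 = legal."""
    counts = {"hits": 0, "false_alarms": 0, "misses": 0, "correct_rejections": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["hits"] += 1
        elif p:
            counts["false_alarms"] += 1
        elif a:
            counts["misses"] += 1
        else:
            counts["correct_rejections"] += 1
    return counts


def lift_at(actual, scores, fraction):
    """Fraud rate among the top-scored fraction divided by the overall fraud rate."""
    ranked = [a for _, a in sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)]
    k = max(1, round(len(ranked) * fraction))
    return (sum(ranked[:k]) / k) / (sum(actual) / len(actual))


# Hypothetical test set of ten claims with model scores.
actual = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.1, 0.05, 0.4, 0.2]
predicted = [1 if s >= 0.5 else 0 for s in scores]

cells = confusion_counts(actual, predicted)
lift_top_20 = lift_at(actual, scores, fraction=0.2)
```

Plotting `lift_at` for increasing fractions gives the lift chart; a value above 1 means the model concentrates fraudulent claims at the top of the ranking.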
3.5 Error Prediction in Processing
The administrative costs of insurance companies have been on the rise, and the costs are being passed on to customers as higher premiums. Payment errors while processing claims generally result in re-processing of these claims, which adds to the administrative costs. Mohit Kumar, Rayid Ghani and Zhu-Song Mei, in their paper "Data Mining to Predict and Prevent Errors in Health Insurance Claims Processing", concentrate on errors in processing claims arising in the health insurance sector, but we believe that the approach can be generalized and used for insurance claims in all sectors. An estimate for an insurance company with 600 million clients showed that the company had $400 million in overpayments. The industry uses two methods to identify payment errors: random audits and hypothesis-based queries. In random audits, only 2%-5% of the claims are actually reworked, so roughly 95% of the effort and cost associated with the auditing is wasted. [20]
Machine learning techniques can also be used effectively to identify errors in claims processing. From a data mining perspective, the business problem is treated as a classification problem. To identify which claims need rework, the authors suggest using binary classification to predict whether or not a claim will require rework in the future. Rework of claims can then be prioritized on the basis of the confidence score associated with each claim. A support vector machine (SVM) was used to classify each claim as requiring rework or not. SVM is a supervised learning technique that analyzes labelled data to find patterns and can be used for both classification and regression. The model was tested on two US health insurance firms and yielded savings of $15-20 million annually.
3.5.1 Recommendation
The only instance of detecting errors in claims processing in the insurance industry that we found was the one discussed by the authors of the paper mentioned above. They implemented their model on health insurance data and found that the company could save $15-20 million annually. As the fundamental data mining problem behind errors in claims processing is prediction, we believe that classification techniques can be widely used in the insurance industry to predict errors in the processing of claims. Since re-processing adds to the costs of any insurance company, it is very important to identify errors at an early stage to minimize the loss. In addition to the SVM technique used by Mohit Kumar, Rayid Ghani and Zhu-Song Mei, we believe that traditional techniques like CART, CHAID, statistical regression and fuzzy modeling could be used to reduce the administrative costs of insurance companies. Predicting errors in processing seems to be unexplored territory for insurance companies and has wide scope for implementation.
4. Conclusion
As we have seen, there is a wide variety of business problems in the insurance industry (in marketing, product development, cross-selling, etc.) that can be solved better using data mining techniques. We have seen how data mining can save a company costs, especially by helping with fraudulent claim detection, where investigating each claim in depth is very expensive. However, there are still segments of the business which could make more use of data mining tools, such as internal operations, including employee management and predicting errors (as discussed above). One of the most important areas of application in the future could be the investments made by an insurance company: along with predicting claims, data mining techniques can be used to select better investments to match the expected claims. There is also scope for increased application of data mining techniques in policy design and policy selection, to verify whether consumers actually use insurance products for the objectives the company designed them for.