DATA MINING SHOOTOUT 2007
Presented by: ANKIT SAKLECHA, KUNAL PAREKH, ROHIT JAISWAL, VINAY KANOJIA
Copyright © 2007, SAS Institute Inc. All rights reserved.
Business Problem
M.K Nurich offered a variety of magazines and periodicals to its customers
Large customer base with diverse interests
Blanket marketing approach not feasible
Adopt a more targeted marketing approach
Attract more potential high-value customers
Business Understanding
Target new prospects by predicting each customer's lifetime revenue (over 5 years)
Build a model to rank customers by expected revenue
Predict the revenue of customers who were not solicited earlier
Use data on current customers to expand into the untapped market
Data Understanding
Modeling dataset – 10,669 observations
  177 modeling variables
  1 continuous target variable
  OBS_ID – unique identifier
Scoring dataset – 7,054 observations
Another scoring dataset – unsolicited customers who had become members
Data Preparation
Explored the distributions and measurement levels of the variables
Partitioned the data 70:30 into training and validation sets
Used several variable selection techniques, such as Variable Selection, DM Regression, and DM Neural
Added variables one by one, reran the model, and observed the change in RMSE
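The partition and the one-by-one variable check can be sketched in Python as below. This is a minimal illustration, not the Enterprise Miner workflow itself: the random shuffle, the seed, the greedy "best variable first" rule, and the `validation_rmse` callable are all assumptions made for the example.

```python
import math
import random


def partition(rows, train_fraction=0.7, seed=42):
    """Shuffle rows and split them into (training, validation) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def forward_select(candidates, validation_rmse):
    """Greedily add the variable that lowers validation RMSE the most.

    `validation_rmse` is a hypothetical callable: it takes a list of
    variable names, fits a model, and returns its validation RMSE.
    """
    selected = []
    best = float("inf")
    remaining = list(candidates)
    while remaining:
        scores = {name: validation_rmse(selected + [name]) for name in remaining}
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break  # no remaining variable improves the error
        selected.append(name)
        remaining.remove(name)
        best = scores[name]
    return selected, best


# 10,669 is the size of the modeling dataset reported in the slides.
train, valid = partition(range(10669))
print(len(train), len(valid))
```

In practice `validation_rmse` would fit the model on the training partition and call `rmse` on the validation partition, so that each added variable is judged on data the model has not seen.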
Data Preparation
Rejected 8 variables initially that had about 74% missing values:
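A rejection rule of this kind can be sketched as follows. The 0.7 threshold, the use of `None` for a missing value, and the column names are assumptions for illustration; the slides state only that the rejected variables had about 74% missing values.

```python
def missing_fraction(values):
    """Share of entries that are missing (represented here as None)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def reject_sparse(columns, threshold=0.7):
    """Split column names into (kept, rejected) by missing fraction."""
    kept, rejected = [], []
    for name, values in columns.items():
        if missing_fraction(values) > threshold:
            rejected.append(name)
        else:
            kept.append(name)
    return kept, rejected


# Toy data with hypothetical column names.
columns = {
    "age": [34, None, 51, 47],
    "hobby_code": [None, None, None, 2],
}
kept, rejected = reject_sparse(columns)
print(kept, rejected)
```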
Data Preparation
Variables Selected (22):
Data Preparation
Transformation:
  Transformed the variables to normalize their distributions and reduce skewness
  The Max Normal transformation produced the best results
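The idea of trying several transformations and keeping the most normal-looking one can be sketched as below. This is an illustration only: it uses absolute sample skewness as a stand-in criterion and a small hand-picked candidate set, which is not necessarily how Enterprise Miner's Max Normal option chooses.

```python
import math


def skewness(values):
    """Sample skewness (third standardized moment, population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


# Candidate transformations (an assumed set); all accept non-negative inputs.
CANDIDATES = {
    "identity": lambda v: v,
    "sqrt": math.sqrt,
    "log1p": math.log1p,
    "square": lambda v: v * v,
}


def least_skewed(values):
    """Return (name, transformed values) with the smallest |skewness|."""
    results = {name: [f(v) for v in values] for name, f in CANDIDATES.items()}
    best = min(results, key=lambda name: abs(skewness(results[name])))
    return best, results[best]


# Right-skewed toy data (hypothetical spending amounts).
spend = [1, 2, 2, 3, 4, 6, 9, 15, 40, 120]
name, transformed = least_skewed(spend)
print(name, round(skewness(spend), 2), round(skewness(transformed), 2))
```

Because the untransformed variable is itself a candidate, the chosen result is never more skewed than the input.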
Data Preparation
Before Transformation:
Data Preparation
After Transformation:
Data Preparation
Imputation:
  Imputed replacement values for missing data
  22 variables had missing values
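The slides do not say which imputation method was used. As one common possibility, median imputation can be sketched as below; the choice of the median, the `None` marker for missing values, and the column values are assumptions for illustration.

```python
import statistics


def fit_median(train_values):
    """Median of the observed (non-None) training values."""
    observed = [v for v in train_values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    return statistics.median(observed)


def fill_missing(values, fill):
    """Replace None entries with the given fill value."""
    return [fill if v is None else v for v in values]


# Hypothetical column; None marks a missing value.
train_income = [42.0, None, 55.0, 61.0, None]
fill = fit_median(train_income)
print(fill_missing(train_income, fill))
print(fill_missing([None, 70.0], fill))  # reuse the training median when scoring
```

Computing the fill value on the training partition and reusing it for the validation and scoring data keeps the three datasets consistent.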
Data Preparation
Modeling
Ran several models, including Neural Network, AutoNeural, Regression, DM Neural, and Dmine Regression
Experimented with the properties of the models
Used an Ensemble model to combine predictions from multiple models
Identified the best model through a model comparison
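Combining and comparing models can be sketched as below. The sketch assumes the ensemble simply averages the member predictions and that models are compared on validation RMSE; the model names and numbers are toy values, not results from the competition data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def ensemble_average(*prediction_lists):
    """Average the predictions of several models, row by row."""
    return [sum(row) / len(row) for row in zip(*prediction_lists)]


def best_model(actual, predictions_by_model):
    """Return (name of the lowest-RMSE model, RMSE of every model)."""
    scores = {
        name: rmse(actual, preds) for name, preds in predictions_by_model.items()
    }
    return min(scores, key=scores.get), scores


# Toy validation data (hypothetical revenues and predictions).
actual = [100.0, 200.0, 300.0]
predictions = {
    "neural": [105.0, 195.0, 305.0],
    "regression": [80.0, 230.0, 280.0],
}
predictions["ensemble"] = ensemble_average(
    predictions["neural"], predictions["regression"]
)
winner, scores = best_model(actual, predictions)
print(winner, {name: round(score, 2) for name, score in scores.items()})
```

An averaged ensemble does not always win: when one member is much weaker than the other, as in this toy data, the average can score worse than the stronger member alone.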
Evaluation
Results of Model Comparison:
The best model was a neural network with 8 neurons
It had the lowest RMSE, 158.94
Scoring
Used our best model to score the datasets
Exported the P_rev_all column along with OBS_ID
P_rev_all contains the predicted revenue for each customer
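The export step can be sketched as below: pair each OBS_ID with its predicted revenue, sort so the highest-value prospects come first, and write the two columns out. The CSV layout, the two-decimal formatting, and the sample values are assumptions for illustration.

```python
import csv
import io


def export_scores(obs_ids, predictions):
    """Return CSV text with OBS_ID and P_rev_all, highest revenue first."""
    rows = sorted(zip(obs_ids, predictions), key=lambda r: r[1], reverse=True)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["OBS_ID", "P_rev_all"])
    for obs_id, revenue in rows:
        writer.writerow([obs_id, f"{revenue:.2f}"])
    return buffer.getvalue()


# Hypothetical IDs and predictions, not values from the competition data.
scored = export_scores([101, 102, 103], [250.0, 410.5, 99.9])
print(scored)
```

The first data row of the output is then the customer with the highest predicted revenue.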
Scoring
For the 1st scoring dataset, the customer with OBS_ID 13656 has the highest predicted revenue, $936.76
Scoring
For the 2nd scoring dataset, the customer with OBS_ID 1200 has the highest predicted revenue, $323.74
Conclusion
Determine predicted revenues based on each customer's lifetime value
Target the customers with the highest predicted revenues
Maximize profit and net revenue
THANK YOU