Business Intelligence with SAS

Program : PGPBA
Class of : 2009
Semester :
Sessions : 33
Credit : 3
Course Code : IT661
Course Objective
• To provide concepts and techniques of Data Mining Analysis Tools (DMAT), which are different from various statistical techniques.
• To equip the students with skills to perform data analysis and draw conclusions independently, with special focus on Data Mining (DM) applications with SAS Enterprise Miner.

REFERENCE BOOKS
TITLE - AUTHOR / PUBLICATION
Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management - Berry and Linoff, Wiley Computer Publishing, 2nd Ed., 2004
Applied Multivariate Techniques - Sharma Subhash, John Wiley & Sons, 1996
Using Multivariate Statistics - Tabachnick B.G. and Fidell L.S., Allyn & Bacon, 1996
Multivariate Data Analysis - Hair J.F., Anderson R.E., Tatham R.L., Black W.C., Pearson Education, 2003
Detailed Syllabus

Introduction to Data Mining Applications in Marketing and Customer Relationship Management: A Statistical Perspective of Data Mining - Analytic Customer Relationship Management - Tasks Performed with Data Mining - Virtuous Cycle of Data Mining - Applications of Data Mining - Concept of Learning, Knowledge Discovery, Analytical Intelligence, Enterprise Intelligence

Examining Data Using SAS: Testing the assumptions of multivariate analysis - BLUE - Assessing individual variables vs. the variate - Normality - Heteroscedasticity, autocorrelation and multicollinearity: identification and solutions - Absence of correlated errors - Important issues in data screening - Complete examples of data screening - Outlier analysis - Residual analysis - Generalized linear models - Types of sums of squares (Type I, II, III, IV) - Concept of AIC and SBC - Data partition into training, validation and testing sets

Receiver Operating Characteristics (ROC): Concept, construction and inferences - Area under the curve (AUC) - Multiclass problems - Volume under the curve (VUC)

Logistic Regression: Concept of odds ratio - Wald's confidence interval and construction - Concept of concordant, discordant and tied pairs - Basic concepts of logistic regression - Logistic regression with only one categorical variable - Logistic regression and contingency table analysis - Logistic regression for combination of categorical and continuous independent variables - Stepwise backward and forward regression methods in multivariate logistic regression - Comparison of logistic regression and discriminant analysis

Decision Trees: Introduction - Growing a decision tree - Concept of logworth - Algorithms: CHAID and CART - Importance of variable selection - Test for choosing the best split - Pruning - Extracting rules from trees - Alternate representations for decision trees - Decision trees in practice

Artificial Neural Networks: History - Real Estate Appraisal - Concept of a link function - Neural Networks for Directed Data Mining - Neural Net - The Unit of a Neural Network - Feed-Forward Neural Networks - Back Propagation - Heuristics for using Feed-Forward Back Propagation Networks - Choosing the Training Set - Coverage of Values for All Features - Number of Features - Size of Training Set - Number of Outputs - Preparing the Data - Features with Continuous Values - Features with Ordered, Discrete (Integer) Values - Features with Categorical Values - Other Types of Features - Interpreting the Results - Neural Networks for Time Series - Example: Finding Clusters - Lessons Learned

Time Series Forecasting: Stationarity and non-stationarity - Exponential Smoothing - ARIMA Models - AR Process - Moving Average Process - ARMA Process - Box-Jenkins Methodology - Time series construction of rare events

SAS Programming: Basics of programming - Data input through programming - Data steps and Proc steps - Simple SAS programming - Construction of charts and plots using SAS steps

Market Basket Analysis and Association Rules: Concept of support, confidence, lift and gain - Defining Market Basket Analysis - Three Levels of Market Basket Data - Order Characteristic - Item Popularity - Tracking Marketing Inventories - Clustering Product Usage - Association Rules - Actionable Rules - Trivial Rules - Inexplicable Rules - Building Association Rules - Choosing the Right Set of Items - Product Hierarchies Help to Generalize Items - Virtual Items Go Beyond the Product Hierarchies - Data Quality - Anonymous Versus Identified - Generating Rules From All the Data - Calculating Confidence - Calculating Lift - The Negative Rule - The Problem of Big Data

Suggested Schedule of Sessions

Topic                                                 No. of Sessions
Introduction                                          1
Examining Data (Enterprise Guide)                     3
Exercises on Examining Data                           2
Logistic Regression (LR) (Enterprise Guide)           3
Exercises on LR                                       2
Decision Trees (DT) (Enterprise Miner)                3
Exercise on DT                                        2
Artificial Neural Networks (ANN) (Enterprise Miner)   4
Exercise on ANN                                       2
Time Series Forecasting (TSFS)                        3
Exercise on TSFS                                      1
SAS Programming (BASE SAS)                            3
Market Basket Analysis (MBA) (Enterprise Guide)       3
Exercise on MBA                                       1
Total Sessions                                        33
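The association-rule measures named in the syllabus (support, confidence and lift) can be illustrated with a small sketch. It is written in Python rather than SAS purely for illustration. The basket data and the function name `rule_metrics` are hypothetical and are not taken from the course material.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return support, confidence and lift for the rule antecedent -> consequent.

    support    = share of baskets containing both sides of the rule
    confidence = share of antecedent baskets that also contain the consequent
    lift       = confidence divided by the overall share of consequent baskets
    """
    baskets = [set(b) for b in baskets]
    if not baskets:
        raise ValueError("at least one basket is required")
    a, c = set(antecedent), set(consequent)
    n = len(baskets)
    n_a = sum(1 for b in baskets if a <= b)          # baskets with the antecedent
    n_c = sum(1 for b in baskets if c <= b)          # baskets with the consequent
    n_ac = sum(1 for b in baskets if (a | c) <= b)   # baskets with both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy transactions, invented for this example only.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

metrics = rule_metrics(BASKETS, {"bread"}, {"milk"})
print(metrics)
```

In this toy data the rule bread -> milk has a lift below 1, meaning milk is slightly less likely in baskets that contain bread than in baskets overall.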
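The syllabus topics on concordant, discordant and tied pairs and on the area under the ROC curve are closely linked: the c statistic, (concordant + 0.5 x tied) / total pairs, equals the AUC when every event is compared with every non-event. The following is a minimal Python sketch of that counting, not the SAS procedure itself; the function names and the sample scores are assumptions made for illustration, and statistical software may group predicted probabilities before counting, which can change the result slightly.

```python
def pair_counts(y_true, scores):
    """Count concordant, discordant and tied (event, non-event) pairs.

    A pair is concordant when the event has the higher predicted score,
    discordant when it has the lower score, and tied when scores are equal.
    """
    events = [s for y, s in zip(y_true, scores) if y == 1]
    nonevents = [s for y, s in zip(y_true, scores) if y == 0]
    concordant = discordant = tied = 0
    for e in events:
        for ne in nonevents:
            if e > ne:
                concordant += 1
            elif e < ne:
                discordant += 1
            else:
                tied += 1
    return concordant, discordant, tied


def c_statistic(y_true, scores):
    """c = (concordant + 0.5 * tied) / total pairs, equal to the ROC AUC."""
    concordant, discordant, tied = pair_counts(y_true, scores)
    total = concordant + discordant + tied
    if total == 0:
        raise ValueError("need at least one event and one non-event")
    return (concordant + 0.5 * tied) / total


# Hypothetical outcomes (1 = event) and predicted probabilities.
Y = [1, 1, 1, 0, 0, 0]
P = [0.9, 0.7, 0.4, 0.7, 0.3, 0.2]

counts = pair_counts(Y, P)
c_value = c_statistic(Y, P)
print(counts, round(c_value, 4))
```

Counting pairs directly is quadratic in the number of observations, which is fine for a classroom example but not for large data sets.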
PG Program in Business Administration, Class of 2008