NETWORK ECONOMIES
Peer-to-Peer Lending: Quantitative Analysis Review
Pattern Recognition, MIT Media Lab Class
January 16th, 2009
Contact: Ray Garcia
[email protected]
Copyright 2009
OVERVIEW
Synopsis of Research Results
• Predictive Accuracy
  - Loan conversion and defaults are predictable with ~80% accuracy
  - A borrower's financial health can be detected from the payment record
• Social Factors
  - Increase the odds of getting a loan when financial features are similar
  - Evidence of preferential attachment beyond a threshold number of bids
• Probable lender biases
  - Textual information influences whether a loan is funded
  - Images posted by the borrower matter
OVERVIEW
Business Implications
• Tools may need to be provided
  - For borrowers who need help to increase their odds of getting a loan
  - For lenders to detect pending defaults
  - For the P2P vendor to manage risk
• Existing social networks should be exploited
  - As loan volume increases, similar financial profiles become more likely, so the social aspects of assessing borrower quality become more important
  - Users may be reluctant to build a social network on a P2P site when they have already done so elsewhere
  - Social networks may have a natural affinity for lending, which would increase loan volumes
OVERVIEW
Recommendations
• Continue academic research with peer-reviewed publication for validation
• Complete a statistical profile of the full set of P2P data (not a sample)
• Use existing text, social data, and images to determine the qualitative voice of the customer
• Analysis tools are needed before creating a secondary market of bundled loans
• Develop a game model of P2P lending using information-economics theory
• Simulate the impact of social interaction networks on the P2P lending model
• Test the simulated predictive model against similar data samples
INTRODUCTION
Pattern Recognition Class
• MIT Media Lab Professor Roz Picard, PhD, teaches the course
• Teaching Assistant: Dawei Shen (PhD candidate in Viral Communications)
• CFB Events invites Prosper.com and Virgin Money to present at MIT
• MIT faculty and students take an interest in peer-to-peer lending
• CFB (Ray Garcia) presents P2P lending to the Pattern Recognition class
• 7 researchers (students in the class) are interested in peer-to-peer lending
• 2 teams research P2P lending using pattern recognition
INTRODUCTION
Research Teams analyzing P2P Lending
• Lending Activity Team
  - Coco Krumme
  - Charlie DeTar
  - Matt Aldrich
  - Ernesto Martinez-Villalpando
• Social Capital Impact Team
  - Aithne Sheng-Ying Pao
  - Sergio Herrero
  - Rahul Bhattacharyya
INTRODUCTION
Outline of Analysis
• Research Inquiry
• Review of prior P2P analysis using descriptive statistics
• Analytical results: classification, feature selection, neural networks
• Suggested tools / applications for borrowers, lenders, P2P vendor
• Conclusions
RESEARCH INQUIRY
Analysis of P2P Lending Activity
Analysis from several perspectives
• Borrowers: how to improve the chances of getting a loan?
• Lenders: how to maximize returns by choosing the best loans?
• Social Interaction: what are the implications of the network relationships?
• P2P Vendor:
  - Increase loan conversion
  - Create tools to help borrowers and lenders
  - Identify loans before default
• P2P stakeholders & competitors:
  - What borrower profiles are best served by P2P versus a traditional lender?
P2P RESEARCH
Analysis of P2P Lending Activity
Data Used in the Analysis
P2P RESEARCH
Analysis of P2P Lending Activity
Descriptive Statistics
• Data from past 3 years: 340K listings, 29K loans
• Distribution of credit scores, loan status
P2P RESEARCH
Analysis of P2P Lending Activity
Descriptive Statistics
• Geographical distribution of members (map: darker green indicates more members)
P2P RESEARCH
Analysis of P2P Lending Activity
Previous Research Findings by Stanford GSB
• Research used regression analysis of financial and social factors
• Group membership and endorsement increase loan funding significantly
• Credit Score and Verified Bank Account are the financial factors most correlated with a high funding rate
• Reference: http://www.prosper.com/Downloads/Research/Prosper_Regression_Project-Fundability_Study.pdf
The current analysis is multi-factor, uses advanced feature selection, and considers unequal prior probabilities and a variety of data models.
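To illustrate why unequal prior probabilities matter, the sketch below takes the counts quoted in the descriptive statistics (340K listings, 29K loans), computes the base rate of conversion, and applies Bayes' rule. The likelihood values are hypothetical and chosen only for illustration; they are not figures from the study.

```python
# Base rate of loan conversion from the counts quoted in the deck.
LISTINGS = 340_000
LOANS = 29_000

prior_funded = LOANS / LISTINGS        # roughly 8.5% of listings become loans
prior_unfunded = 1.0 - prior_funded


def posterior_funded(lik_funded, lik_unfunded):
    """Bayes' rule: P(funded | x) from the class-conditional likelihoods of x."""
    numerator = lik_funded * prior_funded
    evidence = numerator + lik_unfunded * prior_unfunded
    return numerator / evidence


# Hypothetical likelihoods (assumed): the observed feature value is three
# times as likely among funded listings as among unfunded ones.
p = posterior_funded(lik_funded=0.6, lik_unfunded=0.2)
print(f"prior P(funded) = {prior_funded:.3f}")
print(f"posterior P(funded | x) = {p:.3f}")
```

Under these assumed likelihoods the posterior stays below 0.5 even though the evidence favors funding, so a classifier that ignores the priors would over-predict funded loans.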
ANALYTICAL RESULTS
Analysis of P2P Lending Activity
Data reveals patterns
ANALYTICAL RESULTS
Analysis of P2P Lending Activity
Loan conversion/default predicted with ~80% accuracy
• Predicts loan conversion and default with ~80% accuracy (neural net not shown)
ANALYTICAL RESULTS
Analysis of P2P Lending Activity
Optimal feature set to predict conversion/default
• Identified 96 features (including text, social metrics)
• Ranked using floating feature selection
Top 8 features:
- Amount Delinquent
- Open Credit Lines
- Amount Requested
- Borrower’s Max Rate
- Credit Grade
- Debt to Income Ratio
- Funding Option
- Endorsement
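Floating feature selection can be sketched as sequential forward floating selection: add the feature that helps most, then drop features again while doing so beats the best smaller subset seen so far. The sketch below is a minimal illustration, not the teams' implementation. The feature names `A` to `D` and the integer scores in `WEIGHT` and `PAIR` are made up; in practice the score would be something like the cross-validated accuracy of a classifier.

```python
from itertools import combinations

# Hypothetical per-feature scores and pairwise interactions (assumed values).
WEIGHT = {"A": 50, "B": 40, "C": 40, "D": 5}
PAIR = {frozenset("AB"): -30, frozenset("AC"): -30, frozenset("BC"): 30}


def toy_score(subset):
    """Toy criterion: individual weights plus pairwise interaction terms."""
    total = sum(WEIGHT[f] for f in subset)
    total += sum(PAIR.get(frozenset(p), 0) for p in combinations(subset, 2))
    return total


def sffs(features, score, target_size):
    """Sequential forward floating selection; assumes target_size <= len(features)."""
    selected = []
    best = {}  # best[k] = (subset, score) of the best subset of size k seen so far
    while len(selected) < target_size:
        # Forward step: add the single feature that raises the score most.
        candidates = [f for f in features if f not in selected]
        add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        k, s = len(selected), score(selected)
        if k not in best or s > best[k][1]:
            best[k] = (tuple(selected), s)
        # Floating step: remove features while that strictly improves on
        # the best subset already recorded at the smaller size.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            s_reduced = score(reduced)
            if s_reduced > best[len(reduced)][1]:
                selected = reduced
                best[len(reduced)] = (tuple(reduced), s_reduced)
            else:
                break
    return best[target_size]


subset, value = sffs(["A", "B", "C", "D"], toy_score, target_size=3)
print(subset, value)
```

With these toy scores, plain greedy forward selection would keep `A` because it is the best single feature, while the floating step drops it once `B` and `C` are both present. This is the kind of interaction a floating search is designed to catch.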
ANALYTICAL RESULTS
Analysis of P2P Lending Activity
The crucial 20%: human judgment & text analysis
• Human judgment is key in loan funding: factors such as description, image
• The number of prior bids counts (75% threshold): lenders want to fund already-supported listings
TOOLS
Analysis of P2P Lending Activity
Decision Tree Analysis to help borrowers get loans (increase loan conversion)
• Decision tree predicts loan conversions and defaults
• Borrower can control requested amount and interest rate
• Interactive tool to help borrower set optimal amount, rate
TOOLS
Analysis of P2P Lending Activity
Decision Tree to identify probability of loan defaulting (before lending)
• Before lending, use decision trees to identify risky borrowers
• Tool for default insurance, securitization of loans
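A decision tree is built from repeated threshold splits. The sketch below shows a single split (a decision stump) chosen by Gini impurity, as a minimal illustration of how a tree might separate risky borrowers. The data are invented for the example: `dti_percent` is a hypothetical debt-to-income ratio and `defaulted` a hypothetical outcome label, not Prosper data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # no split yet: impurity of the whole sample
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Invented toy data (assumption): debt-to-income ratio in percent, 1 = defaulted.
dti_percent = [10, 15, 20, 25, 40, 45, 50, 60]
defaulted = [0, 0, 0, 0, 1, 1, 0, 1]

threshold, impurity = best_split(dti_percent, defaulted)
print(f"split at DTI <= {threshold}%, weighted Gini = {impurity:.4f}")
```

A full tree would apply `best_split` recursively to each side and across all features. The share of defaults in each leaf then serves as the estimated default probability for a new listing.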
TOOLS
Analysis of P2P Lending Activity
Identify loans pre-default using Hidden Markov Model
• 3-state “financial health” model
• Prosper could offer support to borrowers before default
• Prediction error decreases with longer observation series
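A 3-state model of this kind can be sketched with the forward algorithm, which updates a belief over hidden "financial health" states after each observed payment. The sketch below is an assumption-laden illustration: the state names, observation symbols, and all probabilities are hypothetical and are not the parameters fitted in the study.

```python
# Hypothetical 3-state "financial health" HMM (all values assumed).
STATES = ("healthy", "strained", "distressed")

START = {"healthy": 0.80, "strained": 0.15, "distressed": 0.05}
TRANS = {
    "healthy":    {"healthy": 0.90, "strained": 0.08, "distressed": 0.02},
    "strained":   {"healthy": 0.20, "strained": 0.60, "distressed": 0.20},
    "distressed": {"healthy": 0.05, "strained": 0.15, "distressed": 0.80},
}
EMIT = {
    "healthy":    {"on_time": 0.95, "late": 0.05},
    "strained":   {"on_time": 0.60, "late": 0.40},
    "distressed": {"on_time": 0.20, "late": 0.80},
}


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def filter_states(observations):
    """Forward algorithm: P(state at last step | all observations so far)."""
    belief = normalize({s: START[s] * EMIT[s][observations[0]] for s in STATES})
    for obs in observations[1:]:
        # Predict one step ahead, then weight by the new observation.
        predicted = {s: sum(belief[r] * TRANS[r][s] for r in STATES)
                     for s in STATES}
        belief = normalize({s: predicted[s] * EMIT[s][obs] for s in STATES})
    return belief


good_record = filter_states(["on_time", "on_time", "on_time", "on_time"])
late_record = filter_states(["on_time", "late", "late", "late"])
for name, belief in (("good", good_record), ("late", late_record)):
    likely = max(belief, key=belief.get)
    print(name, likely, {s: round(p, 3) for s, p in belief.items()})
```

Each extra payment observation sharpens the belief, which is in line with the slide's note that prediction error decreases with longer observation series. A lender-facing tool could flag a loan once the "distressed" probability crosses a chosen threshold.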
Analysis of P2P Social Impact
Research Inquiry on Social Capital
• How does social capital influence peer-to-peer lending?
• “Friends”: direct relationships, intended to motivate other lenders to bid on second-degree friends based on indirect trust
• “Endorsements”: feedback on previous transactions with other users
• “Groups”: users are allowed to form communities. Group members help each other, and the group rating depends on their performance (peer pressure)
FEATURE SELECTION
Analysis of P2P Social Impact
Social versus Financial Components
Social Profile
• Group Leader Reward Rate
• Endorsement Number
• First Degree Friend Number
• Second Degree Friend Number
• Group Rating
• Group Size
Financial Profile
• Borrower Maximum Rate
• Credit Grade
• Debt To Income Ratio
• Amount Requested
• Is Borrower Homeowner
Analysis of P2P Social Impact
FEATURE SELECTION
Importance of Social Features
• Most important features:
  - Credit Grade
  - 3 bidding forces
• 3 bidding forces involving “social interaction” behavior:
  - Bids from first-degree friends
  - Bids from second-degree friends
  - Bids from group members
• Least important Social Capital features:
  - Group Leader Reward Rate
  - Group Size
  - Number of first-degree friends
Analysis of P2P Social Impact
Cluster Analysis of Bids
Analysis of P2P Social Impact
Cluster Analysis of Group Rating
Analysis of P2P Lending Activity
Conclusions from Lending Activity Analysis
1. Data is separable and has identifiable patterns
2. Non-obvious features play a significant role
3. Factoring in human judgment is an important consideration
Recommendation: Create useful tools for borrowers and lenders to help foster P2P activity. These tools should be based on decision tree and HMM models. Development of risk models should be explored using these techniques.
Team: Coco Krumme, Charlie DeTar, Matt Aldrich, Ernesto Martinez-Villalpando
Analysis of P2P Social Impact
Conclusion from Social Impact Analysis
• “Social features” do not replace “financial features”, but they are the best complement for differentiation when comparing similar financial profiles.
• Users do not have time to maintain many social profiles.
Recommendation: P2P lending sites should use existing social networks as a foundation instead of building their own.
Team: Aithne Sheng-Ying Pao, Sergio Herrero, Rahul Bhattacharyya
Analysis of P2P Lending Activity
P2P RESEARCH: loan or no loan? default or pay?
[Diagram: Analysis Method. Introduction; descriptive statistics; experimental methods; feature selection (greedy, floating); separation methods (Neural Nets, Linear Discriminant Analysis, Principal Component Analysis, Support Vector Machine); Hidden Markov Model (loan performance); Mechanical Turk; graphical models (decision trees, Bayesian networks); group performance; suggested tools for peer-to-peer lending]
Analysis of P2P Social Impact
Summary of Pattern Recognition Models Applied
Conclusions remain consistent across different models:
• Descriptive Statistics
• Linear Regression
• Principal Component Analysis
• Support Vector Machine
• Artificial Neural Network
• Linear Discriminant Analysis
• K-Nearest Neighbor
• Fisher Algorithm
• Pudil’s Algorithm
• Bayesian Nets
• Decision Trees
• Hidden Markov Model
• Human Qualitative Classification
References for Further Study
• MIT Pattern Recognition course information:
  - http://courses.media.mit.edu/2008fall/mas622j/
• Complete MIT Pattern Recognition study:
  - http://courses.media.mit.edu/2008fall/mas622j/Projects/CharlieCocoErnestoMatt/#contents
  - http://courses.media.mit.edu/2008fall/mas622j/Projects/SergioAithneRahul/SocialInteractionsInP2PLending.pdf
• Prior research:
  - Prosper.com for information on P2P lending
  - Stanford Business School regression analysis of Prosper.com data: http://www.prosper.com/Downloads/Research/Prosper_Regression_Project-Fundability_Study.pdf
  - Stanford podcast by Chris Larsen, CEO of Prosper.com: http://ecdev.stanford.edu/authorMaterialInfo.html?mid=1576
• Books:
  - The Complete Guide To Prosper.com by Sean Bauer
  - Happy About People-to-People Lending With Prosper.com by Roger Steciak
• Competitor list:
  - VirginMoneyUS.com, Zopa.com, LendingClub.com, Loanio.com, Circlelending.com, FundMyNotes.com