
A Dive Into The Lending Club Data - ClementM
http://100mdeep.com/
======
radmuzom
I build statistical models for banks which help assess the risk of a loan.
Effectively, my models will get converted into the grades (A, B, C, D, etc.)
mentioned in the article. The strategies (second chance, family guy, safe
haven) are generally consistent with experiences from the portfolios of most
financial institutions.

However, I am skeptical (prove me wrong) of the statement in the article -
"Lenders get a return on their investment that is typically much better than
traditional Certificate of Deposit or Saving Accounts". In finance terms, I
would be surprised if they had a higher RAROC [1] than large banks.
If they really do, then congratulations (you will put banks out of business in
a few years).

[1] [https://en.wikipedia.org/wiki/Risk-adjusted_return_on_capital](https://en.wikipedia.org/wiki/Risk-adjusted_return_on_capital)
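
To make the RAROC comparison concrete, here is a minimal sketch of one common simplified form of the ratio: risk-adjusted net income divided by economic capital. The function name `raroc` and every figure in the example are made up for illustration; they are not taken from the article or from any bank's books.

```python
def raroc(revenue, costs, expected_loss, economic_capital):
    """Simplified RAROC: (revenue - costs - expected loss) / economic capital."""
    if economic_capital <= 0:
        raise ValueError("economic_capital must be positive")
    return (revenue - costs - expected_loss) / economic_capital


# Illustrative, invented figures for a small portfolio of notes.
example = raroc(
    revenue=1400.0,           # interest collected over the period
    costs=100.0,              # servicing fees and other expenses
    expected_loss=500.0,      # expected charge-offs
    economic_capital=2000.0,  # capital held against unexpected losses
)
print(f"RAROC: {example:.1%}")
```

Comparing a P2P portfolio with a bank on this basis needs an estimate of economic capital for both, which is the hard part and is not attempted here.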

~~~
techcreditcard
Your background sounds interesting! I'm working on bringing a credit card to
the subprime market. Would be great to connect with you.

~~~
ClementM
Hey, I'd love to have a chat. PM me at clement@100mdeep.com and let's talk.

------
Amorymeltzer
The employment length is really bugging me. I've always selected people with a
few years at their current job, leaning towards higher, because it feels safe,
but this says that <2 years of experience is better than longer! I wonder if
they're "newer" so more likely to stay around and not be pushed out, or if the
rates are much higher compared to a marginal increase in risk. I'm leaning
toward the latter. It looks like income has the same effect for home loans;
<50k has a much higher return simply because they get a huge rating hit.

My other big hit is 3 versus 5-year terms. Anyone here care to comment? I like
the 36 months because it feels more liquid and when I started I wasn't sure
LendingClub was going to be around for a decade or more. Beginning to think I
should reconsider that stance.

~~~
ClementM
Also, it's pretty instructive to look at the Lending Club grading algorithm in
detail.

They made it public at some point. Now they are a little less transparent
about it.

But some details can be found in their SEC prospectus.

I can link that up as well if you guys want.

~~~
jlittel
I'd love to see any info.

Their current offering document is here[1], but I don't see much mention of
specifics. There's some detail on mapping to grades on p42 of the Aug 22 doc,
as well as interest rates charged for each risk category.

[1][https://www.lendingclub.com/info/prospectus.action](https://www.lendingclub.com/info/prospectus.action)

~~~
ClementM
I have some details somewhere on my drive. PM me and we can talk

------
Charlesmigli
Nice post. I see that you used dc.js; could you share some code?

~~~
ClementM
The code that processes the Lending Club data is written in Python. I got the
full history of all payments made on the platform and pre-processed it into
something that's light enough to be explored with a good user experience.

For the viz, yes, I used dc.js. I can open source the .js if you guys want it.

~~~
edelans
Hi Clement, very neat dataviz! How did you get the full history of all
payments? Is it available in the open? What is 100mdeep btw? Just a
blogging site you are bootstrapping, or any plans to make a living out of it?
(We met a few months ago @ ToucanToco...)

~~~
ClementM
100m is the depth a human can free dive to. Without any aid. Without any fins.

[https://en.wikipedia.org/wiki/Constant_weight_apnea](https://en.wikipedia.org/wiki/Constant_weight_apnea)

Tribute to people who go deep into things ;).

~~~
edelans
Nice image! I'm a big fan of Guillaume Nery, the French free diver (4 world
records). Actually, constant weight divers are allowed to use fins (monofins
most of the time), although they use them only to get to ~30m deep. After that
point, the pressure is such that the volume of the body decreases and
Archimedes' thrust is no longer sufficient to compensate for the weight, so
the diver can descend without any movement. It's very impressive to see
([http://www.liveleak.com/view?i=d4a_1244462128](http://www.liveleak.com/view?i=d4a_1244462128)).

Funny to read about this a few days after Guillaume Nery's accident: a line
was set to the wrong depth, causing the diver to dive to -139m/-456ft instead
of the -129m/-423ft that he had announced the night before...

------
fbenezit
Thanks Clement for this beautifully simple dc.js dataviz. How long did you
play with it before finding the pearl? Do you think there are yet other pearls
to find in your tool?

~~~
ClementM
There are many pearls to find. It depends on your 'set of preferences'. Do you
want return? Do you want low risk? Do you care to deploy a lot of money, or not
that much?

But I'd say that for starters, anything in the 8%-9% range is a very good deal
these days in this environment.

------
rgn216
Very useful tool. Gives valuable insight on how to select filters in portfolio
construction. If "the Pearl" was an existing product, I would definitely invest
in it.

~~~
ClementM
Look for the Pearl! You can definitely get 8% interest over the long run with
good liquidity on the cash you deploy. To me, it's totally worth it for money
that you don't need in the really short term.

------
rocketcity
This is great. I have been using LendingRobot lately. I will have to rethink
some of my strategies based on what I see in this article.

~~~
existencebox
For my own personal interest, can you speak at all on your experience using
LendingRobot? (successes/misses, use duration, gains, etc)

~~~
gtremper
I've been using it for about a month and it's one of the better LendingClub
auto-investors out there. By far the best interface, but the fee is a little
steep, though it only takes effect after investing 10k through them. I've
rolled my own auto-investor in the past and used the one on NSRPlatform (free
for accounts <20k). I'm planning to switch over to LendingRobot 100% when I
can't use NSRPlatform for free anymore.

Here's my shameless referral for lending robot
[https://www.lendingrobot.com/ref/YehMn307/](https://www.lendingrobot.com/ref/YehMn307/)
(you get 10k free, I get 5k more free)

~~~
existencebox
Thanks! Any recommendations for someone looking to get into this space,
especially in terms of things you wish you had known starting off? (I'm looking
at this as a "long shot" diversification of a portion of my savings, allocating
a comparatively very small amount and going entirely hands off.)

~~~
gtremper
I have all my LendingClub funds in a Roth IRA so I don't have to deal with any
of the tax loss/gains stuff. I've heard the tax accounting can get tricky in
normal accounts, so I highly recommend putting your LendingClub funds in a
tax-advantaged account.

As for filtering and stuff, you can do your own underwriting (loan risk
analysis) with the data LendingClub provides [1], or use
[https://www.nsrplatform.com](https://www.nsrplatform.com), which has a nice
GUI tool to explore the data with your own filters. LendingClub has a JSON
API, so you can write an order executor for yourself. (Here are the remnants of
the one I was working on:
[https://github.com/gtremper/LoanInvestor](https://github.com/gtremper/LoanInvestor).
P2P-Picks was a 3rd party underwriter that isn't available anymore.) I've
noticed that the D and E loans tend to be the best balance of risk and return.

Also, be aware that you'll need to continuously buy new notes as payments come
in to your account, otherwise you'll build up cash rather quickly. That's why
these auto-investing services are so useful. It's best to buy only $25 (the
minimum) per loan so you can spread your risk among as many notes as possible.

[1] [https://www.lendingclub.com/info/download-data.action](https://www.lendingclub.com/info/download-data.action)
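
The advice about reinvesting incoming payments in $25 notes can be sketched as a tiny planning step: given the idle cash in the account and a list of loans that already passed your filters, buy one minimum-size note per distinct loan until the cash runs out. The function name `plan_purchases` and the loan identifiers are hypothetical, and nothing here calls the LendingClub API.

```python
MIN_NOTE = 25  # minimum investment per loan, as stated in the comment above


def plan_purchases(cash, candidate_loan_ids, note_size=MIN_NOTE):
    """Return (orders, leftover_cash), buying one note per distinct loan."""
    affordable = int(cash // note_size)       # how many notes the cash covers
    chosen = candidate_loan_ids[:affordable]  # never more than one note per loan
    orders = [(loan_id, note_size) for loan_id in chosen]
    leftover = cash - note_size * len(orders)
    return orders, leftover


# Hypothetical loan identifiers; in practice these come from your own filters.
orders, leftover = plan_purchases(
    110.0, ["loan-a", "loan-b", "loan-c", "loan-d", "loan-e"]
)
print(orders, leftover)
```

Running this each time payments arrive keeps idle cash below one note's worth, as long as enough candidate loans are available.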

------
cryoshon
Hm, this has sparked my interest in lending via Lending Club. I'll check this
out for sure.

------
akg_67
I operate an online crowd-lending analytics and automation platform, PeerCube
(https://www.peercube.com). I have been analyzing both Lending Club and Prosper
data for my institutional clients for almost 4 years now. While OP made a good
first attempt at analyzing the data, the analysis suffers from two major
shortcomings that I normally see from people getting started with data
analysis.

1\. Domain Knowledge: Novice analysts tend to put the data in a blender and see
what comes out first instead of building some preliminary knowledge and
intuition about the domain. This is quite evident in OP's analysis and finding
about annual income. A person familiar with the domain will ask the question
"Why would a borrower with a high annual income borrow a small amount at a
high interest rate?" This right away will raise flags about the risks of
lending to such borrowers. OP will benefit from reading some of the
publications (books, research) on credit scoring and modeling before deep
diving into analyzing Lending Club data.

2\. Data Exploration: Not spending enough time exploring the data can lead to
erroneous conclusions like the second chance strategy. When Lending Club
started issuing loans to borrowers with delinquencies and public records has a
big impact on returns as newer loans are not aged enough to have sufficient
defaults.

> Watch for your average return (expected return), consistency of returns
> through time (risk), while making sure there is enough supply (liquidity) on
> the platform to deploy your strategy.

Time is not Risk. You need to find a proper measure for risk. Also consider
negative kurtosis and the nature of the return distribution: frequent low
positive returns but a few high negative returns.
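
As a rough illustration of looking at the shape of the return distribution rather than only its average, the sketch below computes volatility, skewness and excess kurtosis from population moments using only the standard library. The sample returns are invented (many small gains, a few large losses) and the function name `moments` is hypothetical.

```python
from statistics import fmean


def moments(returns):
    """Return (mean, volatility, skewness, excess kurtosis) of a return series."""
    n = len(returns)
    mean = fmean(returns)
    m2 = sum((r - mean) ** 2 for r in returns) / n  # population variance
    m3 = sum((r - mean) ** 3 for r in returns) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return mean, m2 ** 0.5, skew, excess_kurt


# Invented per-note returns: 18 notes pay as agreed, 2 default with big losses.
sample = [0.08] * 18 + [-0.60] * 2
mean, vol, skew, kurt = moments(sample)
print(f"mean={mean:.3f} vol={vol:.3f} skew={skew:.2f} excess_kurtosis={kurt:.2f}")
```

Two portfolios with the same mean and volatility can differ a lot on the last two numbers, which is why a volatility-only view of risk can mislead.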

> I considered that investors deploy and re-invest their money continuously on
> the platform and therefore own a portfolio with different ‘vintages’ of
> loans. The ROI that are computed reflect this, as they are average returns
> across vintages.

Re-consider this argument of "average return across vintages" being
representative of investor returns. Tip: look at loan volume across vintages
as well as typical re-investment pattern of a typical investor.
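
The difference between a simple average across vintages and one weighted by loan volume can be shown with a few invented numbers. The vintage returns and volumes below are made up for illustration; they are not Lending Club figures.

```python
def equal_weighted(vintages):
    """Simple average of vintage returns, each vintage counting the same."""
    return sum(v["ret"] for v in vintages) / len(vintages)


def volume_weighted(vintages):
    """Average of vintage returns weighted by the volume issued in each vintage."""
    total = sum(v["volume"] for v in vintages)
    return sum(v["ret"] * v["volume"] for v in vintages) / total


# Invented data: a tiny early vintage with poor returns, large recent vintages.
vintages = [
    {"year": 2008, "ret": -0.02, "volume": 10},
    {"year": 2011, "ret": 0.06, "volume": 100},
    {"year": 2013, "ret": 0.08, "volume": 890},
]
print(equal_weighted(vintages), volume_weighted(vintages))
```

With these numbers the two averages come out far apart, so which one better represents an investor depends on when and how much that investor actually deployed.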

> Please also note than due to the low issuance volume in the early days of
> the platform, the returns computed for the pre-2010 period are much less
> reliable than the post-2010 returns.

Please don't do this. The data between 2006 and 2010 is the most valuable due
to the business cycle we were in at that time. The data since 2010 tells
nothing about how loans might perform in the future when the business cycle is
not as good as it has been in the last few years.

OP will really benefit from re-evaluating his findings with a critical eye. I
would suggest gaining some domain knowledge and spending a lot of time just
exploring the data before drawing definite conclusions, focusing on
distributions, correlations and statistical significance.

~~~
ClementM
A bit more courtesy would have been welcome. You sound very condescending. And
I hope you talk to your clients in a different way!

Let me address your methodology comments nonetheless, which are for the most
part unfounded.

* I don't have any finding about annual income. I don't think it is mentioned anywhere in my conclusions.

* "delinquencies and public records has a big impact on returns as newer loans are not aged enough": because I average across vintages, and because I don't average based on volume on the platform, I account for the aging bias.

* "Time is not risk [..] kurtosis etc.": I don't say that time is risk. I suggest that the reader look at the return series through time, essentially to look at the volatility of the returns (without using the word volatility, to keep the content accessible to a novice reader). I essentially encourage the reader to visually assess the Sharpe ratio, which is a good universal risk measure.

* "reconsider average across vintage": averaging across vintages is a first approximation. I acknowledge that a better methodology would be to take a weighted average that matches the amortization profile of a loan.

* I maintain that any statistic you compute for 2006, 2007 or 2008 is less reliable (statistically). Yes, it is an important period to have because of the crisis, and this is why I put it on the chart. However, you can't compute very reliable returns when you have a dozen loans to average across.
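
The small-sample point in the last bullet can be illustrated with the standard error of the mean, which shrinks with the square root of the number of loans. The returns below are invented; the only assumption is that both samples have the same mix of outcomes and differ in size.

```python
from math import sqrt
from statistics import stdev


def standard_error(returns):
    """Standard error of the mean return: sample stdev / sqrt(n)."""
    return stdev(returns) / sqrt(len(returns))


# Invented outcomes: 10 loans returning +10% for every 2 losing 50%.
dozen_loans = [0.10] * 10 + [-0.50] * 2
many_loans = dozen_loans * 100  # same mix, 1200 loans

se_small = standard_error(dozen_loans)
se_large = standard_error(many_loans)
print(f"12 loans: +/-{se_small:.3f}   1200 loans: +/-{se_large:.4f}")
```

An average return computed over a dozen loans carries an uncertainty of several percentage points in this example, which is the same order of magnitude as the returns being compared.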

Anyway, I am happy to exchange with you in PM on methodology if you would like
to continue the discussion.

~~~
akg_67
Sorry for coming across as condescending. This was not my intention. I was just
trying to guide you in the right direction, as you came across as someone who
is just getting started with data analysis.

Once again, I will stress, you need to reconsider your methodology if you want
to learn.

~~~
ClementM
I have been doing quantitative data analysis in the investment space for about
10 years.

Happy to hear/talk about methodology if you want to write/talk about it.

------
mikeskim
Can you fully automate data-driven investment on these platforms?

~~~
ClementM
You can fully automate investment 'filters', i.e. filter the notes you're
willing to invest in or not.
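
A minimal sketch of what such an automated filter could look like is given below. The field names (`grade`, `term_months`, `int_rate`) and the sample notes are assumptions made for illustration; they are not the platform's actual schema, and no platform API is called.

```python
def make_filter(grades, max_term_months, min_rate):
    """Build a predicate that accepts notes matching the given criteria."""
    allowed = set(grades)

    def accept(note):
        return (
            note["grade"] in allowed
            and note["term_months"] <= max_term_months
            and note["int_rate"] >= min_rate
        )

    return accept


# Hypothetical listing; a real one would come from the platform's data feed.
listed_notes = [
    {"id": "n1", "grade": "D", "term_months": 36, "int_rate": 0.17},
    {"id": "n2", "grade": "A", "term_months": 36, "int_rate": 0.07},
    {"id": "n3", "grade": "E", "term_months": 60, "int_rate": 0.21},
]
accept = make_filter(grades={"D", "E"}, max_term_months=36, min_rate=0.15)
selected = [note["id"] for note in listed_notes if accept(note)]
print(selected)
```

Keeping the criteria in a single predicate makes it easy to backtest the same filter on historical data before letting it place orders.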

