

Show HN: I self-published a book - BlackJack

Hey folks,

I wanted to share my latest project, which is a book that I wrote and self-published. The book is called Behind the Ivy Curtain: A Data Driven Guide to Elite College Admissions, and the link is http://www.amazon.com/gp/product/B013YFIQ30.

Let me give you some backstory. I went to a fun high school, but it was really easy and I was a pretty big slacker. Around sophomore spring I decided I wanted to leave Florida, and the best way to do that was to get a scholarship to a good school. I had no idea how to do that, so I spent a bunch of time reading online forums like College Confidential (http://talk.collegeconfidential.com) to figure out what works. I then acted on the advice, became the first kid from my school to go to an Ivy, and gave presentations/tutoring sessions at school to help other kids. I'm still close with my principal and try to be involved in the community there.

Anyways, I really wanted to consolidate my knowledge in book form, but there are already other books on college admissions. So what I did was programmatically scrape the forums to gather admission results for nearly 5,000 kids over several years to figure out what works and what doesn't.

Most of you have graduated college already, but I really want to use the book to share this info broadly. Specifically, college admissions consulting is a multi-million dollar industry for no reason: it's all built on information asymmetry. I want to get this stuff into the hands of every interested kid, and I wanted to use real data/outcomes instead of "oh well this one kid a few years ago did X", so I put in the effort and wrote this book.

The book is pretty cheap ($4.99) so buy it if you want, but I think what's more interesting to the HN community is the process I took. I have hit the character limit, so the process is in a comment on the post.
======
OafTobark
Some questions:

1\. Is the data you scraped focused primarily on graduating high school
seniors, or is there a distinction between incoming freshmen and transfer
students?

2\. Is it safe to assume it's not actually focused on the Ivy League only but
also includes other top tier schools such as Stanford, MIT, Caltech, etc?

3\. Is there any distinction for success rates outside of grades and test
scores?

4\. I assume primarily focused on undergrad and not masters or PhD?

~~~
BlackJack
1\. Graduating seniors. 2\. Ivy League + MIT + Stanford, so 10 in total. 3\. A
couple of things => race/ethnicity obv. Income too - poor people and rich
people did better in this high achiever cohort - being middle/upper-middle
class helped the least. For EC's, fewer activities + longer time per activity
was better. 4\. Yup

------
jasondecastro
Clickable:
[http://www.amazon.com/gp/product/B013YFIQ30](http://www.amazon.com/gp/product/B013YFIQ30)

------
BlackJack
So this is how I did it:

1\. Read a bunch about self-publishing. Traditional publishers pay 10-15%
royalties, and this book really isn't built for paperback, so I decided to go
digital. Amazon gives you 35% or 70% royalty depending on what you price it
at. Write Publish Repeat ([http://www.amazon.com/Publish-Repeat-No-Luck-
Required-Self-P...](http://www.amazon.com/Publish-Repeat-No-Luck-Required-
Self-Publishing-Success-ebook/dp/B00H26IFJS)) is an excellent book on what it
takes to be an independent author, and I highly recommend it if you want more
info there.

2\. Started looking at the College Confidential forums and writing my
scrapers. It wasn't too much coding, about 1.5k lines of Python to do all the
scraping, clean up, and storage (flat files + SQLAlchemy). It took some time
to figure out what fields to use and what would be relevant, and the other
hard part was actually parsing the data. For example, some people report "SAT:
2370", while others do "SAT: 800/770/800", while others do "800 CR, 770 M, 800
W". All three represent the same score (2370), but each carries a different
level of detail, so it just took some regex fanciness + manual verification to
get good parsers in place.

3\. Started writing the text. I've been dreaming of writing this since I was
in 10th grade so it wasn't hard to get the outline ready. I finished up the
text on what the application process is like (after you submit), and then I
started generating tables from my data and analyzing it.
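
Going back to the score parsing in step 2, here is a minimal sketch of how the three formats could be normalized. It is not the book's actual code: the function name `parse_sat`, the returned field names, and the assumption that slash-separated scores are ordered CR/M/W are all illustrative guesses.

```python
import re

# Hypothetical patterns for the three formats mentioned above.
SLASHED = re.compile(r"\b(\d{3})\s*/\s*(\d{3})\s*/\s*(\d{3})\b")
LABELED = re.compile(r"\b(\d{3})\s*(CR|M|W)\b", re.IGNORECASE)
TOTAL = re.compile(r"SAT:?\s*(\d{4})\b", re.IGNORECASE)


def _valid_section(score):
    # Sections on the 2400-point SAT ranged from 200 to 800.
    return 200 <= score <= 800


def parse_sat(text):
    """Return a dict with total and section scores, or None if nothing parses."""
    m = SLASHED.search(text)
    if m:
        # Assumption: slash-separated scores are listed as CR/M/W.
        cr, math, writing = (int(g) for g in m.groups())
        if all(_valid_section(s) for s in (cr, math, writing)):
            return {"total": cr + math + writing, "cr": cr, "m": math, "w": writing}

    labeled = {label.lower(): int(score) for score, label in LABELED.findall(text)}
    if set(labeled) == {"cr", "m", "w"} and all(map(_valid_section, labeled.values())):
        return {"total": sum(labeled.values()), **labeled}

    m = TOTAL.search(text)
    if m and 600 <= int(m.group(1)) <= 2400:
        # Only the composite is known; section scores stay unknown.
        return {"total": int(m.group(1)), "cr": None, "m": None, "w": None}

    return None


for post in ("SAT: 2370", "SAT: 800/770/800", "800 CR, 770 M, 800 W"):
    print(post, "->", parse_sat(post))
```

Anything that returns `None`, or a total that looks off, is what would still need the manual verification pass.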

I used Google Docs, but that was a mistake for two reasons. For one, once you
hit 40-50 pages on Docs, the whole thing becomes slow. It was annoying to
scroll through to different pages, and after a while I just exported to ODT
and edited in LibreOffice.

The other reason it was a bad call is that it is really difficult to export
to something that Amazon will like. You see, tables are kinda hard to do in
Kindle, and a lot of the formatting gets messed up because you have to go
Docs -> ODT -> HTML, which Amazon then turns into MOBI or AZW. Eventually
I had to download Office 365 and export as .docx, then edit in Word, export
that to HTML, and upload that. Some guess and check got the tables to print
mostly ok, but the lists were all messed up because Microsoft uses some random
HTML stuff. So I went through all the pages and then just used <ul> and <ol>
to do what I needed to with lists.
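
The list cleanup was done by hand, but it could probably be scripted. The sketch below is an assumption-heavy illustration, not what was actually run: it assumes Word marks list items as `<p class=MsoListParagraph...>` paragraphs with the bullet glyph wrapped in `<![if !supportLists]>...<![endif]>`, and it turns every such run into a `<ul>` (it does not try to tell ordered lists from unordered ones). The function name `word_lists_to_ul` is made up for the example.

```python
import re

# Assumed shape of Word's exported list paragraphs (check against real output).
LIST_PARA = re.compile(
    r"<p\s+class=\"?MsoListParagraph\w*\"?[^>]*>(.*?)</p>",
    re.IGNORECASE | re.DOTALL,
)
# Assumed wrapper around the bullet/number glyph that Word emits.
BULLET_JUNK = re.compile(r"<!\[if !supportLists\]>.*?<!\[endif\]>", re.DOTALL)


def word_lists_to_ul(html):
    """Rewrite consecutive Word list paragraphs as <ul><li>...</li></ul>."""
    out = []
    pos = 0
    in_list = False
    for m in LIST_PARA.finditer(html):
        between = html[pos:m.start()]
        # Non-whitespace content between two list paragraphs ends the list.
        if in_list and between.strip():
            out.append("</ul>")
            in_list = False
        out.append(between)
        if not in_list:
            out.append("<ul>")
            in_list = True
        item = BULLET_JUNK.sub("", m.group(1)).strip()
        out.append(f"<li>{item}</li>")
        pos = m.end()
    if in_list:
        out.append("</ul>")
    out.append(html[pos:])
    return "".join(out)


sample = (
    "<p>Intro</p>\n"
    "<p class=MsoListParagraphCxSpFirst><![if !supportLists]><span>&middot;</span>"
    "<![endif]>One</p>\n"
    "<p class=MsoListParagraphCxSpLast><![if !supportLists]><span>&middot;</span>"
    "<![endif]>Two</p>\n"
    "<p>After</p>"
)
print(word_lists_to_ul(sample))
```

Regex-based HTML rewriting is fragile, so the output would still need a visual check in the Kindle previewer.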

And that worked well, after a few more tweaks and some dealing with Amazon's
system (once you send something to 'Publish', you can't cancel it. They approve
it quickly, but it takes ~6 hours to actually publish, at which point I would
republish a final version and wait 6 more hours =\\)

Overall, a really fun process. I do want to write more books and experiment
with different tools like Scrivener. Happy to discuss college admissions or
publishing or anything else. Thanks for your support, and eager for your
feedback!!

P.S.: My last few Show HNs
([https://news.ycombinator.com/item?id=8821393](https://news.ycombinator.com/item?id=8821393),
[https://news.ycombinator.com/item?id=8006940](https://news.ycombinator.com/item?id=8006940))
are all around admission/college application tips. Didn't realize there was a
theme haha.

~~~
i0nutzb
Nice insights. Couple of questions:

1\. Did you also consider Leanpub? (I think they have a smaller fee.) If yes,
why did you pick Amazon over Leanpub?

2\. Why didn't you use an easy-to-convert format? I mean, instead of Word-like
processors (be it Docs, Libre, Office or whatever) you could use Markdown,
LaTeX or something similar, so you could convert into... anything: html, epub
(so you could publish on iBooks as well), mobi, azw, pdf, word etc.

~~~
BlackJack
1\. No I didn't. I have heard it is mainly for tech books, and the
blogs/forums I visited all said that Amazon is the biggest marketplace.

2\. Ignorance :( I thought everybody used Word/Docs, so I started there too,
since I thought converting would be the easy part. Turns out that it's tough.
Next time I'll be sure to use Markdown/LaTeX.

Do you write books too?

~~~
i0nutzb
Let's say that I'm that guy you see in movies who has wanted to write a book
for ages and always postpones it... -_-

