Show HN: I wrote a book about using data science to solve “everyday” problems (andrewnc.github.io)
632 points by andrewnc 45 days ago | 96 comments

I've always wanted to write a book. I have helped write 3 different deeply technical books (and one solutions manual), but I wanted something fun, interesting, and valuable.

So I wrote "Everyday Data Science" which is a collection of stories, tutorials, jokes, math, and code all written to inspire people to analyze their personal data.

In general, I was also inspired by the challenge to "make $100 online" which I have done in the past month since launching. It was daunting, and I felt quite vulnerable, but overall I'm pleased with what I've made.

I wrote up this quick post to give you an idea of the process I followed to write the book, and some of the content.

I'd love to know your thoughts and am open to (nice) feedback :)

Hey Andrew, it's Matt from our graduate research lab. This is exciting, congrats!

You always did make me thirst for more understanding of ML... guess I'll have to buy this book. Do you make more margin on digital or print copies?

By the way -- HNers -- if your company needs talent in data science, Andrew is easily one of your best candidates. His intuitive understanding and teaching of data and ML inspired me to be a better scientist. Andrew thinks critically and is also a great person to work with.

I have higher margin on digital copies :) Amazon loves to take their share.

Those are kind words coming from maybe the single greatest programmer I know.

Thanks for your support and for the recommendation :D

I've read "Algorithms to live by" and liked it. This looks like a data science variant of that same idea, very cool.

I think books like these can be a great eye opener. We all remember thinking "what am I gonna use this for??" in high school maths, physics etc, and I think this is a fun, approachable and interesting way to see real life impact of maybe otherwise dry and abstract stuff.


This book resonates with me.

This reminds me of the data science book "Data Smart", from the Mailchimp CDO, which talks about how to keep orange juice tasting the same all year round (using a seasonal fruit) or how to calculate the likelihood of a consumer being pregnant - all within Excel.

I recommend “Data Smart” to anyone who wants to break into data science.

Is there also a drm-free version available?

EDIT: Never mind, it's in the post and I was confused: https://gumroad.com/l/everydaydata

Hi Andrew, thank you for sharing this. I am inspired by the book and the reasons why you wrote it. It will definitely help, not only as a form of self-help, but also as a reminder that when you put your mind to something, you can achieve it. This is something I have been struggling with for the last year, and for some reason your book and this comment have been a motivating factor. Thank you.

Thank you for your kind words :)

Motivation is such a fickle beast. For whatever reason, I felt good during the entire writing process. Going back and forth with the editor was definitely more challenging.

One thing that helped, actually, was I tweeted an artificial deadline when I started writing. That was immensely motivating for me. I ended up missing the deadline (again because of editing woes) but that was key in helping me push this over the finish line.

Again, I'm glad this was motivating for you and I wish you the best. Feel free to message me on twitter if you ever want to chat more about projects you're working on.

Just bought paperback based on this description. You are a good writer! Excited to see if any of your research experience gave any insights into the book. (I don't know since I only read this description)

By any chance is there a non Amazon hard copy store you distribute through as well?

I'm using Kindle Direct Publishing. This service means the books are printed on demand so I don't have to worry about inventory and the like.

So I guess that's a long way of saying no, there isn't another option for now.

Did you make the $100 entirely through book sales?

Yes. I officially launched the book at the end of January. I delayed doing a HN launch until I was sure people enjoyed the book and I knew there would be some interest.

The unfortunate truth is that book writing is a low margin game and you shouldn't do it for the money. Development or actual data science is far more profitable.

But there are many other intangibles :)


The book looks very intriguing! Before buying it, I wish I could read a bit more about what one could expect from the book. Let me explain :)

> This book is for people as untechnical as my Mom or as technical as my Applied Scientist friends working in big tech.

> ODEs On A Diet

Do you present an intuitive explanation of ODEs? Would both your mom and an Applied Scientist find it informative? If yes, I would love to read! A few pages from this chapter would really help in making a purchasing decision.

> a solution to Multi-Armed Bandit problems which are significantly more efficient and only takes a few lines of extra code.

Does the book include code samples? As someone wanting to learn about, e.g., ODEs through code, I would love it if code samples were present in the book.

I understand that $7.99 is not a lot in the US, but it is not the case in a few other parts of the world. Given that there are no refunds, having more information about the book would be valuable for international customers.

BUT... the book looks great, and big congratulations on publishing it!

Send me an email and I can get you a PDF free to read before you choose to buy.

I added the book to Goodreads and forgot to add the cover! Ugh, and you can't edit a book unless you have "librarian status", whatever that is. But here's the link if anyone's interested: https://www.goodreads.com/book/show/57197718-everyday-data-s....

whoa cool! Thank you for doing this. It looks like the cover has been added by some kind Librarian.

I've submitted a petition to "claim" the book as an author.

Awesome! I just added it to my TBR!

Nice! Real-world problems like these are what spurred my interest in statistics/data science, and I agree with your sentiment that intro stats books are a terrible way to cultivate enthusiasm and curiosity about ways to solve problems using data.

In my experience, the most difficult parts of the process are (1) translating from qualitative problems “in the field” to a formalized technical problem, and (2) all the wrangling necessary to implement the formal problem using field-collected data.

I find that I spend most of my time as a data scientist working on these parts of the process, and often don’t have the bandwidth or even requirement for more advanced methods. Not that fancy techniques are the goal, per se, but I do notice I rarely have the opportunity to use them.

This is kind of why I like fitness trackers (minus intrusive user tracking) and would like to see them expand into automated tracking of users' physiology, along with simple, seamless logging of experience (like hitting a button to log a headache, or some other minor symptom that might be nothing, but where months or years of trends might reveal an issue).

It reminds me of the problem of a car noise: you bring it to a mechanic and it doesn't reproduce on the spot, so diagnosis is extremely hard. In fact, I've had a heart issue of this sort recently where I experience symptoms intermittently, sometimes weeks or months apart. My cardiologist has brought me in multiple times for an EKG, but since it's not while I'm having the symptoms, we have no idea if the normal EKG readings are telling an accurate story.

Essentially I want to see personal health devices work like the black box on an airplane. Take blood pressure, weight, and resting heart rate once a week? It goes to the black box. Log a physical symptom or mood? Goes to black box. Sleep and activity patterns, an integrated smart-watch EKG? Goes to black box.

I think the advent of truly seamless UI and UX with unobtrusive, comfortable devices that could do this would provide a massive leap forward in preventative health.

We just need to get there in a way that doesn't make it one of the most massive land-grabs ever of personal data by device vendors.

I wonder if you'd like this community: https://github.com/woop/awesome-quantified-self

It's similar to what you're describing and aligns well with the idea of using your personal data in a more open way.

Thanks for the referral, I'll check it out!

Do you sell an epub version? I’d rather read this on my kobo but PDFs don’t do well on it, and I’d rather not feed Amazon for the hard copy.

Unfortunately not :/

I tried for a while to get the epub / mobi version but the PDF was always mangled by every conversion technique I tried.

I did just get a nice tip from a HN user about a piece of software that might work. I'm going to give it a try and maybe I'll have an epub one down the road.

But as of now, there isn't an epub version.

Have you looked at pandoc?

I've been working on converting an old AI book into cleaner markdown for a while, and the biggest issue there has been the lack of a clean source. I'd be surprised if there isn't a way to manage this conversion.

If you have the LaTeX, you can more easily convert that to EPUB.
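For anyone wanting to try the LaTeX route, here is a rough sketch of how one might script it from Python. It assumes pandoc is installed and on your PATH; the function names, file names, and title are placeholders I made up, not anything from the author's setup. No promises it copes with a Tufte-style margin layout, which by the sound of this thread is exactly where converters struggle.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(source: Path, target: Path, title: str) -> list[str]:
    """Build the pandoc argument list for a LaTeX -> EPUB 3 conversion."""
    return [
        "pandoc",
        str(source),
        "--from=latex",        # read the LaTeX source directly, not the PDF
        "--to=epub3",          # write EPUB 3
        "--toc",               # generate a table of contents
        "-M", f"title={title}",  # EPUB metadata; the title is a placeholder
        "-o", str(target),
    ]


def convert(source: Path, target: Path, title: str) -> None:
    """Run pandoc, failing early with a clear message if it is not installed."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(build_pandoc_command(source, target, title), check=True)
```

Calling `convert(Path("book.tex"), Path("book.epub"), "My Book")` would then attempt the conversion (hypothetical file names). Custom macros and template-specific environments may still need manual cleanup afterwards.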

That's a shame. No epub version was the very first thing I noticed, and searched this HN page on before even reading the comments.

epub on the way!

I'd buy this in the blink of an eye if there was an epub version. Also a Kobo user who'd rather not use PDF or feed the Amazon :)

I just submitted a kindle version to Amazon for review. I'll also be adding the epub version to gumroad for those who want to read it in that format.

It's not as nice looking (personal opinion) but it is functional at least.

what are the limitations of the epub version you have noticed so far?

It took me too long to find the purchase link. I eventually clicked on the picture of the book and that worked. I was looking for another buy link for too long though (really, only 20 seconds maybe, but that's probably too long).

Thanks for mentioning this. The post was designed with the idea that people would read through the whole thing before deciding to buy (there are two links towards the bottom).

I always get annoyed when I navigate to a page and there are HUGE buttons telling me to spend money. So I tried to not do that.

Although, I may have pushed it too far in the opposite direction.

Your book explains A/B testing.

Behold, option A!

This looks great but.... I’m not going to buy the paperback and I hate reading extended stuff on tablet/phone/laptop - I read books on kindle, kind to my eyes, distraction free, lightweight and comfortable.

Any plans that address this?

I tried for a number of hours to get a kindle version. But since I wrote the whole book in LaTeX and every converter from PDF -> Mobi/Epub mangled the book, I wasn't able to get a kindle version.

I settled with the PDF as a soft copy instead (and I put PDFs on my kindle, even though that's obviously not desirable).

If there is a nice way to get PDFs to read natively on kindle, I would jump on that, but I wasn't able to get it to work.

Ah that's a shame to hear — the book sounds really cool (as a layman with an interest) but I too was hoping for a Kindle edition. May pick up the paperback if I can make some room! — now operating a strict one in, one out policy for physical books :)

Can you convert LaTeX to Word (or ODT, etc.), perhaps with Pandoc, and then from that to epub (e.g. Google Docs allows you to download a Doc as an epub)?

Book looks interesting, btw!

Thank you!

Yes, in theory.. but in practice it didn't work when I tried that route. Because I use the Tufte-Book template, there is a constant margin to the side of the main content that holds figures, equations, and such.

This margin gets brutalized by every program (even the proprietary ones Amazon built).

Lessons learned. I need to be more mindful of the various conversion processes in the future.

My conclusion with a self-published book a number of years ago was that, once you get beyond flowing text, creating a Kindle version gets a lot harder. It can clearly be done--e.g. there are Kindle format guidebooks that work well. But when I did another self-published book a few years later, I was giving it away anyway so I just did a PDF version for tablets. (My last one is through a publisher but it's also mostly text.)

> I use the Tufte-Book template



I've often been disappointed at the type of html it's possible to force out of (La)TeX. And epubs are pretty much html, css and images.

I generally think that some kind of markdown with the help of pandoc is the happy path for pleasant writing/editing and good output for html/epub and print/pdf.

I did find this, which has some hints on how to get xml, and then xhtml with support for equations - but it looks cumbersome:


See also: https://github.com/duzyn/tufte-markdown And in particular (beautiful!): https://edwardtufte.github.io/tufte-css/

You know, the book looks interesting - but not in pdf, and not as a physical book (I try to live without physical books - won't I be a sad case when the singularity pushes us into post-apocalypse?)

Did you consider selling access to the source (eg: private github repo, suitable license)?

For all you know you are already in a simulation created by Roko's Basilisk. And that comment made you lose points.

Not sure I get the reference. Apparently it's easy to end up in a position like the author, with a large (La)TeX manuscript that easily produces good pdf/PS, but mediocre to awful html/epub.

One option for getting help would be to open up everything to everyone - selling access to the source might be a way to retain some income while enlisting some help in conversion from volunteers (yes, we would first pay full price for the content, then volunteer to try conversion options to produce passable epub output).

Ok, that’s a shame. I wrote my thesis in LaTeX so am familiar with its ... issues.

The only alternative I can think of is to copy the paragraphs into html p tags; any equations written in LaTeX could then be converted to images with LatexIt! (if that’s still a thing)
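The paragraph half of that idea is easy enough to sketch. This is just a minimal illustration, assuming the prose has already been extracted as plain text with blank lines between paragraphs; the function name is made up, and it does nothing about equations, which would still need to be rendered to images separately.

```python
import html
import re


def paragraphs_to_html(text: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> tags, escaping HTML characters."""
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Collapse hard line breaks so each paragraph becomes a single line.
        flat = " ".join(block.split())
        if flat:
            paragraphs.append(f"<p>{html.escape(flat)}</p>")
    return "\n".join(paragraphs)
```

The escaping matters because characters like `&` and `<` show up in technical prose and would otherwise break the resulting xhtml inside an epub.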

That all assumes it’s enough of a deal for you. I understand if it’s not, and thanks for the reply.

Ok, I spent much of yesterday working on this issue and I have a Kindle version under review with Amazon. It should be available in the next few days :)

I'll update here and on twitter when it's ready.

Amazing!!!!! Great job - thank you

Kindle version: https://www.amazon.com/dp/B08XKDHDXT/ref=sr_1_5?dchild=1&key...

I also sell the epub directly on gumroad bundled with the PDF https://gum.co/everydaydata

Might be just me but seems that Paypal option is not currently working in Gumroad

Bought :-)

"It is easier to put on a pair of shoes than to wrap the earth in leather."

I'd love to see a few sample pages from the book (in addition to the snippets that were on the page you linked to).

Sure! Here are two pages in context.

https://andrewnc.github.io/preview/page_20.png https://andrewnc.github.io/preview/page_94.png

That should give you an idea of the style of writing etc :)

That’s my personal opinion, but I prefer when line spacing isn’t so massive as to stretch the page count. I’d rather have smaller line spacing with fewer pages overall than think I’m getting a 120-page book only to find out each page only has ten lines.

Yeah agreed. That spacing looks like a college essay, ie, basically unreadable.

The book looks like exactly something I'd like to read... except that line height is a huge turnoff. It runs counter to most readability guidelines, and makes it feel like the book size was artificially inflated.

(That being said, I love the idea, and hand-drawn visuals + colorized equations!)

Exactly what I feel. I hope an epub version comes because then I can adjust the line spacing and margins.

Ok, I spent much of yesterday working on this issue. I reduced the line spacing a decent amount and I have a Kindle version under review with Amazon. It should be available in the next few days :)

Thanks for the constructive feedback.

I'll update here and on twitter when it's ready.

Thank you! I will definitely get it once that change is deployed.

Congratulations on the book but the formatting is very poor for a technical book. 10 words per line is good for a small paperback sized novel not for a technical publication. The excessive white space, the line spacing are also inappropriate. The most important element (arguably) - the equation also is the hardest to read due to the small size and how compactly it was formatted. If you've paid for the formatting as a service I'd try to get a refund and give it to someone more competent.

Hahaha I loved the humor displayed on page 20. Looks like you're sticking well to your idea of trying to make the book enjoyable and easy to read!

Purchased a copy. Congrats on the book! Hoping to be able to use some of this to get some ideas on how to track some things to improve my health and possibly in some roundabout way for game design.

I've been looking for a book that balances technical rigor with real-world application, and more importantly, is fun to read. This looks promising - adding it to my reading list!

I just bought the pdf book.

My first two immediate impressions: 1) The handdrawn illustration thing is cutesy and cloy. 2) I've no idea why the line spacing is so massive. The word density per page feels like half a normal book.

Of course those are just superficial remarks, and I have barely read anything yet, but they both immediately annoyed me.

*the line spacing thing is annoying because it makes me feel like i'm constantly spinning the scroll wheel on my mouse more than actually reading the text.

Yeah.. a few other folks have mentioned the line spacing issue. I made that choice because I wanted the book to feel "open" and "approachable". Not sure it was the right choice in the end, but it's the one I made.

As far as the drawings, I'm not sure what you mean by cloy?

Since I wrote this comment, I have continued reading the book, and indeed, I really dislike the line-spacing. It itself is distracting somehow. Of course maybe I'm fixated on it now, since I complained about it...

As far as the handdrawn graphics, I think it's kind of a schtick that XKCD created and it works for them, but I find it really tedious when others imitate it. The example chart, something about birds drawn on top of derivative symbols, didn't make much (any?) sense to me.

I appreciate that the book has brevity, and I want to motor through the text. I guess if typesetting is the main problem, that's a good thing. Probably among the easiest things to change.

Ah that makes sense.

I sent out an updated version on gumroad with new line spacing, I hope that is easier to read :)

On amazon it's marked as sold out, at least for me. Is that a fact or is that some market area thing?

If sold out, will it be available again soon?

Edit: refreshed and it's there again!

Congrats on this project. Reminds me of when I first read both 'Algorithms to Live by' and 'What If', books applying maths/science to everyday topics. Adding to my reading list!

Bought a copy on Amazon (it was $22 for the paperback version though), link for the lazy https://www.amazon.com/Everyday-Data-Science-Optimize-Your/d...

I'm probably going to buy the PDF. I think the paperback is too expensive for a book that is only 114 pages.

> I think the paperback is too expensive for a book that is only 114 pages.

I find that a quite strange way of thinking...

I don't judge a restaurant's "value for money" based on their dollar per calorie. I don't choose a laptop based on how many components it has. I don't choose to watch a movie based on its bitrate.

I kinda get that page count is an "easy metric", but it seems super shortsighted to choose to not buy a book based purely on page count.

And even at $18, that's, what, a movie with a candy bar trip? A sandwich and a couple of coffees at a cafe? Cover charge to see a band at a bar? There seems very little chance that most people who read this site wouldn't get a couple of hours entertainment from a book like this, with at least as high a likelihood as enjoying a movie/band/cafe recommended here.

Having said that, I'll buy the PDF too, but only because it's my second preference after a Kindle Edition, which doesn't seem to be available...

Thanks for writing this out. It's priced so high on Amazon to cover printing, shipping, color pages, and international markets. My goal was a $12 book, but that was infeasible (their rules). I've priced it essentially as low as possible.

I also agree that Kindle would have been awesome :/

I figured it was due to issues beyond your control, and not because you're marking it up or anything. Just voicing my opinion that I think it's too expensive. I hope you don't take it personally -- I'm sure a lot of your time has gone into this book, and I'm not trying to be insulting. If you sell out the print run anyways, then my opinion doesn't matter at all.

Oh no offense taken. I definitely hear you loud and clear. :) I appreciate you taking the time to chime in here.

>I kinda get that page count is an "easy metric", but it seems super shortsighted to choose to not buy a book based purely on page count.

It's the best metric currently available for a book that has no reviews, and is from an author I haven't read anything else from.

Congrats on completing a book, on an interesting topic, and with a reduced line spacing option.

It looks interesting. Since you say the book is for folks 'as untechnical as my Mom', I'll get a copy for my young niece who is trying to get into programming.

You have a minor typo on the page - 'enthrawled' should be 'enthralled'

It's funny, I was just thinking that I'd be happy if I never saw any variation on "as untechnical as my Mom" ever again. To me it perpetuates the idea that by default, you can assume that women/mothers are nontechnical. I don't think most people who use it intend to say that; it's just part of the effect, and to me that's unfortunate.

This is really astute, thank you for bringing it up. In this case my Mom actually asked to be included in the marketing materials because she enjoyed the book so much.

She's a brilliant woman whom I respect very much. I didn't intend for my phrasing to be denigrating and I apologize for how it is phrased.

It's funny, for me it is the opposite. It used to be I couldn't leave my Dad alone in a room with a computer for more than about 5 minutes without loud swearing reverberating around the house. Whereas my Mom was always a complete gadget freak and loved to tinker with things.

Nice catch, thank you!

I hope your niece likes it, I would honestly love to know how she handles it. I bet she'll get a lot of the ideas, and might have to skip over some of the nitty-gritty details.

> You have a minor typo on the page - 'enthrawled' should be 'enthralled'

Sure that it's not "entrawled in data", like someone entangled in a swift web of information? ;)

I wonder why the epub version on gumroad is named after the (only?) ebook reader that can't handle epub


Really nice work, I just bought it and put it on my reading list for today

Persperation needs to be perspiration - don't sweat, it's in your blog post not the book.

If you want the book reading for errors, I would be happy to do so.

Thanks for the catch!! I definitely didn't pay a professional editor for the blog post. Although, even with an editor looking at the book, I already spotted a few unfortunate typos.. ah well, can't win them all.

You are welcome.

Can you update the book on the fly?

Offer stands. If you want it proofreading, I will do it for a credit in the intro page or similar.

Those colorized equation explanations are amazing! I’d love to be able to point a tool at an equation and get an explanation like that.

Congratulations!!! great book.

Page 17 of the book:-

Fit: The process of showing data to a model to the model get better at explaining or predicting something.

seems like a typo

I will buy if you put up a kindle version on Amazon.

I just submitted a kindle version for review. It should be available in a few days, and I'll update here when that happens.

Amazing idea, just bought a copy.
