Scikit-learn user guide (2017) [pdf] (scikit-learn.org)
140 points by ghosthamlet 8 months ago

If you want to learn machine learning, there is much better material to start with. For example, this: http://www-bcf.usc.edu/~gareth/ISL/

I agree that this text is great for machine learning theory, but non-students often learn better by picking up tools while solving real problems than by focusing on theory. Scikit-learn is an amazing tool for problem-based machine learning. I don’t feel the same way about R unless the student already has a statistics degree...

The course actually teaches statistics. R is only used to illustrate the concepts.

so much value in one free python library. inspiring.

also ironic that it is free, yet adapting it to real-world business problems can be as expensive as you want (consultants, coders)

What is it exactly?

sklearn (scikit-learn) is a machine learning library for Python. In practice, it's useful for bringing a whole zoo of machine learning models under one simple API. It covers a range of tasks (supervised, unsupervised) and varying strategies within these domains (e.g., decision-tree-based models, regression, neural networks), and the following three lines of Python code do 80% of the ML work for you.

>>> model = [insert sklearn class here]
>>> model.fit(trainX, trainY)
>>> model.predict(testX)
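To make the placeholder concrete, here is a minimal sketch of those three lines filled in. The choice of LogisticRegression, the built-in iris dataset, the 75/25 split, and the random_state value are all assumptions made for illustration, not something the comment above prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset (assumed here purely for illustration).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows so predictions are made on unseen data.
trainX, testX, trainY, testY = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three lines from the comment, with a concrete estimator plugged in.
model = LogisticRegression(max_iter=1000)
model.fit(trainX, trainY)
predictions = model.predict(testX)

# Classifiers also expose score(), which reports mean accuracy.
accuracy = model.score(testX, testY)
print(f"accuracy on held-out data: {accuracy:.2f}")
```

Swapping in a different estimator, for example DecisionTreeClassifier from sklearn.tree, only changes the line that constructs the model; the fit and predict calls stay the same.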

The entire scikit-learn Documentation in a single PDF.

We've updated the title from the submitted “One book to rule Machine Learning: glorious sklearn doc over 2000 pages” which is... not the original (like the guidelines ask).


I have to admit that the sales pitch of the document is on point. However, they could leave the revision history out, as I don't see much value added there.

You can bring it up as an issue on GitHub. They can probably move it down to the bottom.

The change log has A LOT of value if you are trying to figure out what version you should upgrade to. If you are stuck on an older version, then you need to understand what value you gain in exchange for potentially introducing new bugs into your product.

They also credit everybody who has worked on the project, which is super important.

You have a good point, thanks. Maybe my reflection was more about the scope of the PDF: there is no point in bundling things that are not relevant to learning the library and that probably need to be easier to discover than a PDF allows anyway.

It is quite navigable. You can click anywhere in the table of contents and it will take you right there. There is also a sidebar with bookmarks that work. You can basically skip 70 pages. Considering it is already 2000 pages, you have to use the table of contents anyway.

The scope of the PDF is to be the full documentation. If you were used to doing something a certain way, you need to know that things changed.

