
Fastcore: A library that extends Python with new features - thenipper
https://fastpages.fast.ai/fastcore/
======
typon
This library reminds me a lot of the code I've seen from junior developers
trying to make things "cool" by removing a few characters of typing in
exchange for library abstractions that create headaches down the line. I think
features like this take years to naturally evolve in a codebase, on an
as-needed basis. And once that codebase has evolved into something
resembling non-idiomatic Python, it can be quite hard to integrate new devs
into the team. Such transitions into custom DSLs or nifty tools are inevitable
for large codebases, but I try to minimize them as much as possible, even when
it hurts productivity. Long-term maintainability is more important to me.

~~~
slaymaker1907
I know what you are talking about in general, but most of the stuff in this
library seems to either add clarity or help prevent errors. The problem of
kwargs where you end up with a bunch of confusion on documentation is
definitely a real thing.

Multiple dispatch is a great thing as well. You don’t realize you need it
until you try implementing something like a unification algorithm.

Compose is nice for reducing the mental overhead of reading nested
expressions. However, I do wish they took a line from lodash and implemented
flow instead (like compose but reversed so the first argument is invoked
first, then the output of that is fed into the second, and so on).
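
Both can be sketched in a few lines (hypothetical helpers, not fastcore's
API):

```python
from functools import reduce

def flow(*fns):
    """lodash-style flow: the leftmost function runs first."""
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)

def compose(*fns):
    """classic compose: the rightmost function runs first."""
    return flow(*reversed(fns))

inc = lambda x: x + 1
dbl = lambda x: x * 2
print(flow(inc, dbl)(3))     # (3 + 1) * 2 = 8
print(compose(inc, dbl)(3))  # (3 * 2) + 1 = 7
```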

Syntactic sugar is not bad if it increases signal to noise when reading code
or if it prevents erroneous usage.

~~~
marmaduke
Ok but what’s with the pre post meta init thing? Let’s see a junior run a
debugger through that! Dropping one line of code costs several stack frames in
your debug session.

~~~
jph00
It's a feature from the standard library. However, it's normally only
available in dataclasses; fastcore simply brings the same functionality to
other classes too.

~~~
dragonwriter
dataclasses have __post_init__ as a mechanism for doing what you'd normally do
in __init__ beyond what dataclasses do in the generated __init__. Because
__init__ is generated, you can't put that logic there yourself, and otherwise
you'd have to override (and replicate the function of) the generated __init__.
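
For reference, the stdlib mechanism looks like this:

```python
from dataclasses import dataclass, field

@dataclass
class Line:
    start: float
    end: float
    length: float = field(init=False)

    def __post_init__(self):
        # runs after the generated __init__ has assigned start and end
        self.length = self.end - self.start

print(Line(1.0, 4.0).length)  # 3.0
```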

It is not equivalent to PrePostMetaInit.

------
orf
I dislike some of these. Some are cool, like 'Self', but giant "swiss army
knife" libraries feel odd.

> Avoid boilerplate when setting instance attributes

dataclasses[1]

> Type Dispatch

functools.singledispatch works with type annotations[2]
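
For example (Python 3.7+, where register infers the dispatch type from the
annotation):

```python
from functools import singledispatch

@singledispatch
def describe(x):
    return "something else"

@describe.register          # dispatch type inferred from the annotation
def _(x: int):
    return "an int"

@describe.register
def _(x: list):
    return "a list"

print(describe(3), describe([]), describe("hi"))
```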

> A more useful __repr__

dataclasses[1]

> A better pathlib.Path

path_object.read_text()[3]. The pickle stuff is interesting, but I don't see
the value over just pickle.load(path_object). Monkeypatching Path to add this
feels odd and unpythonic.

1\.
[https://docs.python.org/3/library/dataclasses.html](https://docs.python.org/3/library/dataclasses.html)

2\.
[https://docs.python.org/3/library/functools.html#functools.s...](https://docs.python.org/3/library/functools.html#functools.singledispatch)

3\.
[https://docs.python.org/3/library/pathlib.html#pathlib.Path....](https://docs.python.org/3/library/pathlib.html#pathlib.Path.read_text)

~~~
formerly_proven
Using pickle for "Path.save/load" feels like a shortcut to me that is
inappropriate for many circumstances, since loading a pickle in Python
literally means running code from that pickle. It's basically
eval(Path.read()). It's not the kind of trust relationship that I would
typically expect from a load/save pair.
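
The problem is easy to demonstrate: merely unpickling data runs an
attacker-chosen callable.

```python
import pickle

class Innocent:
    def __reduce__(self):
        # On unpickling, Python calls eval("6 * 7") -- arbitrary code runs.
        return (eval, ("6 * 7",))

payload = pickle.dumps(Innocent())
print(pickle.loads(payload))  # 42 -- not an Innocent at all
```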

> Wait! What's going on here? We just imported pathlib.Path - why are we
> getting this new functionality?

If you are writing a library and this is part of your exposition, think long
and hard.

> Path.ls

Highly gimmicky, doesn't support globs. Arguments are "n_max, file_type and
file_exts", in that order. n_max seems largely pointless for real programs and
like a workaround for a poor REPL for interactive use. The difference between
the last two arguments is that the first one also accepts mimetypes (does this
call "file" on every file it sees?).

Arguably the signature of a "Path.ls" should rather look like
"Path.ls(glob='*', *, files=True, dirs=True)" and it should probably
not return a bunch of strings.
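
A sketch of that proposed signature (all names hypothetical) might be:

```python
import tempfile
from pathlib import Path

def ls(path, pattern="*", *, files=True, dirs=True):
    """Glob-filter first, then optionally keep only files and/or dirs."""
    return sorted(p for p in Path(path).glob(pattern)
                  if (files and p.is_file()) or (dirs and p.is_dir()))

# quick demonstration on a throwaway directory
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "a.txt").write_text("hi")
    (Path(d) / "sub").mkdir()
    print([p.name for p in ls(d, dirs=False)])   # ['a.txt']
    print([p.name for p in ls(d, files=False)])  # ['sub']
```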

~~~
jph00
Look closer. It doesn't return strings.

~~~
formerly_proven
Distinction without difference, since Path is just sugar around a string.
Directory listing functions should generally return something that mimics
dirent/WIN32_FIND_DATA to avoid double work. For example, filtering for file
type can typically be done without an extra stat call per file.

------
miguendes
> Wait! What's going on here? We just imported pathlib.Path - why are we
> getting this new functionality? That's because we imported the
> fastcore.foundation module, which patches this module via the @patch
> decorator discussed earlier.

This is a recipe for disaster. Too much magic that makes it harder to debug.
Especially for junior developers or people less experienced in python.

I've seen fast.ai code and in terms of good practices it violates a bunch for
the sake of convenience. A good example is the `from module import *`.

~~~
jph00
That's not a good example at all.

Every fast.ai module defines __all__, and is carefully designed to allow
"import *" to be used safely. I'm not aware of any other library that goes to
this trouble.

~~~
kyran_adept
Not having __all__ is just part of the problem. With "import *" you also
unknowingly load and potentially overwrite symbols. If you specify each symbol
you use, it's very simple for even the most basic linter to figure out when
you are doing something wrong.

~~~
nuclear_eclipse
Even beyond that, if you've imported `from foo import bar` and then `from
something import *`, it might work just fine. Then you upgrade the `something`
module, and it happens to now include an item named `bar`, and you're suddenly
breaking your existing usage without ever having changed your code.
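
You can simulate the failure mode in one file (using a synthetic module
registered in sys.modules as a stand-in for the upgraded dependency):

```python
import sys, types

# stand-in for a third-party module whose new release adds a public `bar`
something = types.ModuleType("something")
something.bar = "theirs"
something.__all__ = ["bar"]
sys.modules["something"] = something

bar = "mine"             # stands in for: from foo import bar
from something import *  # silently rebinds bar -- no error, no warning
print(bar)               # theirs
```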

Star imports are convenient, but dangerous in so many subtle ways. There's a
very good reason that every major Python linter will call this out as bad
behavior.

------
linkdd
Monkeypatching anywhere except unit tests is a no-no for me. It adds black
magic you need to know about before reading the code.

Am I reading the standard pathlib.Path or the monkeypatched one? If I have a
bug with it, does it come from the standard library or from the silently
introduced patch?

Remember the zen of Python:

> Explicit is better than implicit

And I would personally prefer a unification library (
[https://github.com/mrocklin/unification/](https://github.com/mrocklin/unification/)
) over the type dispatch proposed here. It's just more powerful and closer to
pattern matching if that's really what you want.

~~~
Alex3917
> Monkeypatching anywhere except unittests is a nono for me.

What do you think about adding pprint to builtins so that you don't need to
import it each time when debugging? (Or, alternatively, so that you don't need
to add it to the top of every single file?) It seems kind of wrong, but the
alternatives are bad also. I'm not sure why Python didn't do this themselves
when they added breakpoint in 3.7.
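
For what it's worth, the usual trick is a one-time assignment in the
application's entry point (fine for an app, questionable for a library):

```python
import builtins
from pprint import pprint

builtins.pprint = pprint  # every module can now call pprint() unimported
```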

~~~
lacker
Monkeypatching in one application is maybe okay. You really want pprint in
builtins, fine, who is it really hurting?

Monkeypatching in a library like fastcore that many other people use is really
bad. All sorts of bugs and incompatibilities will happen down the road, as
software becomes dependent on your monkeypatch without anyone realizing it.

------
bonoboTP
Too clever, too much magic. I remember I used to create such things in my
projects in my earlier days as a dev and was quite proud of how compact things
were and how much I understand my way around the language that I was able to
come up with it. But at the end of the day, these things have a much higher
price than it seems at first. Sparing a few lines here and there is not worth
confusing your code's readers or yourself in a year. Standard established
"pythonic" solutions are preferable.

Some are nice functions, I guess all of us have our collection of utils that
we carry around projects and it's certainly nice to have some of them
collected in an installable library. But I'd prefer to have them separately in
small, separately installable modules, similar to "more_itertools".

delegates: Maybe it's useful for auto-completion in notebooks.

store_attr(): Too confusing.

Avoiding subclassing boilerplate: don't, it's too confusing. Type dispatch:
perhaps, but I think isinstance is okay too.

A better version of functools.partial: Ok, docstring is kept. Could be even
changed in the standard library.

Composition of functions: Okay, could be an addition to functools in the
standard library.

A more useful __repr__: Good.

Monkey Patching With A Decorator: Please don't.

A better pathlib.Path: I don't think loading pickle should be a method of Path
objects. I already dislike pathlib putting read_text as a method of Path. Why
not Path.read_jpeg_image()? Path.save_mp4_video() etc. Also don't monkey patch
Path please.

Self: too confusing, little upside to it. It makes things look like the
function is being called, while actually only a lambda is produced. Too much
magic.

in_notebook(), in_colab(), in_ipython: Useful.

The L list: the numpy style indexing is nice, but you may as well create a
numpy array.

------
kristjansson
Fastai/Fastcore are fascinating in their emphasis on interactive computing in
notebooks. I think that starting from the assumption that users will be

(a) primarily using the abstractions in this library interactively

(b) to do analysis, or experiment with many different implementations of
similar functions

(c) in long-lived sessions where re-running to the same point may be expensive
in time or compute, or impossible due to dependence on cell execution order, etc.

makes their design choices make more sense to me.

Lots of the functionality identified here is sort of nice if you're in a
traditional environment, but imperative if you're in (or writing for users in)
a notebook environment. For example, the emphasis on preserving and
propagating signature information in function objects themselves makes sense
if you're relying on `inspect` to provide IDE-like autocompletion. The
emphasis on easy monkey-patching and pseudo-generics makes sense if you're
interactively working with classes in modules you don't control, or can't
cleanly reimport.

Not only that, fastai/fastcore themselves are written in Jupyter notebooks via
their literate programming environment nbdev [1]!

I don't know that there's a point here beyond a bit of begrudging admiration
for how far they've pushed the notebook platform. It makes for a library (and
a Python) that looks and feels just a few degrees off-axis from much of the
rest of the ecosystem...

[1]: [https://nbdev.fast.ai](https://nbdev.fast.ai)

------
lacker
This library reminds me of fast.ai itself. It’s cool how you can do so many
things with a few lines of code, and I was impressed the first couple times I
used it. Over time, I started to run into problems when I wanted to do
something beyond the initial demo. Many parameters are undocumented, functions
do something automatically that’s often nice but there’s no clear way to undo
it, and side effects are scattered about. I would prefer code that was
slightly longer, but simpler and more pythonic.

------
Galanwe
Not sure if this is an elaborate prank or a real attempt at making Python
more Java-like.

I stopped reading when the author suggests using

    class NewParent(ParentClass, metaclass=PrePostInitMeta):
        def __pre_init__(self, *args, **kwargs):
            super().__init__()

------
pryelluw
Longtime pythonista and co-organizer of PyATL here. I appreciate the goals of
the library and understand the points being made in the code. There are
certain things that python has acquired over time that are not as nice as they
could be. Or maybe we can say that momentum to improve them reduces over time.

One of these things is the __init__ method. It is not named in a clear manner
and can be verbose. The alternative proposed with store_attr is close, but
still fails to be descriptive. Ideally, I'd just adopt the keyword
"constructor" or "initialize" and have it be syntactic sugar for __init__. I
do like the fact that it provides a shorter way to initialize things. That
could be adopted with no drawbacks as long as we keep named parameters.

I congratulate and welcome such ideas. I'd invite the library author(s) to
join the Python core team and discuss the ideas with them. It can take time to
get new things into a project like Python, but we can't grow a language by
doing the same thing over and over. Languages need to evolve (to a certain
extent) over time.

~~~
jph00
Thanks for the message!

The challenge here is that fastcore is a deliberate attempt to study the
design axis of "what would happen if we focused on Python's dynamic features,
and gave up on static analysis entirely".

Using it effectively really requires a live coding environment, such as
Jupyter Notebooks (which we do all our coding in, using nbdev). The majority
of Python programmers are not currently using such an environment.

In addition, some of the design choices are literally working around Python
design choices which we disagree with, such as the recent change in Python
(PEP 479) where a StopIteration raised inside a generator is converted to a
RuntimeError, which broke all of our composition of generators.

There are some things, however, which might fit into Python, such as:

\- Our ProcessPoolExecutor takes `0` as max_workers, to mean "run in serial",
which we find very helpful for debugging

\- Something like our `bind` (but less hacky) would be nice

\- Multiple dispatch would be great (although our version is customized for
data processing - a generic python one would be a little different I think)

\- Would be nice if `partial`'s repr showed the original docstring

\- Our `delegates` decorator might be worth stealing
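
The max_workers=0 convention, for instance, could be sketched as a small
wrapper (a hypothetical helper, not fastcore's actual code):

```python
from concurrent.futures import ProcessPoolExecutor

def parallel_map(fn, items, max_workers=0):
    """max_workers=0 means 'run serially', which keeps tracebacks and pdb
    usable while debugging; any other value uses a real process pool."""
    if max_workers == 0:
        return [fn(x) for x in items]
    with ProcessPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(fn, items))

print(parallel_map(str, [1, 2, 3]))  # ['1', '2', '3'] via the serial path
```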

~~~
pryelluw
Appreciate the response. Would it be OK if I reach out through email? I'd like
to extend a friendly professional relationship between fast.ai and PyATL.

My email is in my profile :)

------
hansvm
Some of this looks excellent. E.g. I occasionally add features like the kwargs
delegate in my own code, and there are plenty of places in scipy where that
would be an improvement.

Other bits of it aren't drastically better than the standard library. Their
typeddispatch is undeniably useful, but functools.singledispatch covers most
use cases, and it'd be a shame to pull in all of fastcore for a single feature
that's probably covered by another library (e.g. the multipledispatch
library).

Some of this is horrifying. There's nothing wrong with the PrePostInitMeta per
se....but I guarantee that's going to be a footgun down the line, and IMO if
somebody isn't able to comfortably write that metaclass themselves they
probably shouldn't be using it.

~~~
jph00
PrePostInitMeta is just adding what's already in the stdlib dataclasses, but
making it available to other classes too.

~~~
nemetroid
Dataclasses have __post_init__, but they don't have a __pre_init__, do they?
From what I can tell, PrePostInitMeta adds a completely new dunder method,
__pre_init__.

------
armitron
I didn't find a single thing that I liked here.

Rather, I thought these syntactic layers made things worse rather than better,
in terms of clarity, understandability and ease of writing simple,
straightforward code.

------
iandanforth
Quick warning: Jeremy Howard, the primary author of fastcore, doesn't give two
s__ts about PEP8 or other Pythonic conventions. He's entirely pragmatic and
focused on rapid prototyping and experimentation pipelines. If you like that
then fast* libraries will probably appeal. If however you see `from foo import
*` and cringe then you're going to have a lot of mental speedbumps to get past
with this library.

------
codegladiator
> Whenever I see a function that has the argument **kwargs, I cringe a little

I cringe a little when someone says a straightforward thing should be
replaced by magic middleware.

------
rzimmerman
I actually think the “delegates” syntax for kwargs or something like it is
worth adding to the standard library. The rest feels like it’s either covered
by dataclasses or not something I personally want in python. Except pre_init
and post_init instead of calling super().__init__. But that’s up there with if
__name__ == “__main__” for being an annoying idiosyncrasy that’s too hard to
change.

~~~
ayush--s
There is functools.wraps in the standard library for similar "delegation" in
decorators.
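
For example, wraps copies metadata like __name__ and __doc__ onto the wrapper:

```python
import functools

def logged(fn):
    @functools.wraps(fn)  # copies __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

@logged
def greet(name):
    "Say hello."
    return f"hello {name}"

print(greet.__name__, "-", greet.__doc__)  # greet - Say hello.
```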

------
vvladymyrov
I hope that fast.ai courses do not and will not use this library - it would
increase required effort to understand the code in lessons.

~~~
tmabraham
This is the fastai library used:
[https://docs.fast.ai/](https://docs.fast.ai/)

You don't directly use fastcore for most things, but fastai underneath uses
fastcore. If you want to extend fastai for custom and advanced tasks, fastcore
is actually very helpful to use.

------
karlicoss
Nice work! It's often hard to communicate complexity and justify the
abstractions, but nevertheless it's a cool demonstration of what's possible in
Python. It's worth thinking about simplifying code and DSLs, as long as you
don't go too crazy.

My personal concern would be the lack of typing annotations (IMO very
important to have in a 'standard library'), and even if they were present I
doubt something like `store_attr()` would be typeable without a mypy plugin.

Personally, I sometimes trade boilerplate and tolerate code duplication for
the sake of the code being transparent to static analysis.
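
A minimal store_attr-like helper (a sketch, not fastcore's implementation)
shows why static analysis struggles: it works by frame introspection.

```python
import inspect

def store_attr():
    """Copy the calling __init__'s arguments onto self -- frame magic that
    no type checker can follow without a plugin."""
    frame = inspect.currentframe().f_back
    info = inspect.getargvalues(frame)
    self = info.locals[info.args[0]]
    for name in info.args[1:]:
        setattr(self, name, info.locals[name])

class Point:
    def __init__(self, x, y):
        store_attr()  # replaces self.x = x; self.y = y

p = Point(1, 2)
print(p.x, p.y)  # 1 2
```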

------
heavyset_go
I'm only really a fan of @typeddispatch and the @patch decorator. They don't
do anything weird and follow the standard Python decorator pattern.

------
fpgaminer
Love the fast.ai ecosystem.

Most don't appreciate how critical it is for an ML researcher to be able to
play. The best research in fields like that requires a tight feedback loop,
since we lack an intuition or logical framework from which to reason about
these systems. Instead we have to throw shit at the wall, inspect what sticks,
and repeat 1000x. The vast majority of evolutionary leaps that I've seen in ML
and fields like it have been from "stupid" stuff no classic researcher in
their right mind would try. Instead it arises out of a sense of curiosity and
playing around. "I wonder what would happen if we used only attention layers?
Wouldn't that be silly?" ...

Tooling is a huge part of that. The cost of iterating needs to be low. If you
have to rebuild your entire training loop to try a stupid idea, you're not
going to try that idea. That's what makes ecosystems like fast.ai so valuable.
The whole thing is designed for fast, cheap, highly malleable iteration.

The design decisions required to achieve that have resulted in a library with
its fair share of sharp corners. It's a bit of a holographic codebase, where
functionality is splintered all across the surface of it. The focus on
Notebooks and code jamming leave engineering traditionalists wanting. Etc.

Don't get me wrong. When I write most other code, I put my engineering
glasses on. I want code to be clear and transparent, maintainable, idiomatic,
etc. It's just that in ML play is more critical.

So it's important to look at fastai's codebase through that lens. I disagree
with commenters here criticizing the codebase as being akin to what an
inexperienced engineer would write. Instead it's clear to me that the fastai
developers have made numerous very conscious decisions to break convention in
support of their goals. They've done that well. And I think it's all in
support of their "fast" ideal.

Honestly if I were to lob real criticism at fastai, it would be to say that it
needs more documentation. I know, that's a weird criticism given their use of
literate programming and the courses available. But ... for example, I have
such a hard time with the DataBlock API. I didn't come out of the courses with
a good understanding of it, and the docs are IMO too lacking to sufficiently teach
an intuition about it. I had a model where the "y" variable is generated by
the model itself. Was never able to figure out the idiomatic way to express
that in fastai. Had to fight the DataBlock API and the training loop system
tooth and nail to fit it in. I'm sure there's a way to do it. It just wasn't
apparent to me.

Of course, that's a _criticism_, not a complaint. The work of Jeremy and co is
just an incredible gift to the world. I don't think it speaks ill to say the
docs are lacking. That's true of the vast majority of open source software.
Documentation is plain _hard_. And the courses, while they touch on the
library a good amount, are primarily focused on teaching ML concepts ... as
they should be.

Hell, I bet they are aware of that and their switch to literate programming is
probably driven by it. It's probably just early in that cycle and things will
continue to get better.

I hope it's clear this comment is more about love of the library than a
complaint about what is otherwise an amazing thing.

~~~
mloncode
[Author of the blog here]: Thanks for that. Indeed, there is an opportunity to
document the library better. That's why I decided to roll up my sleeves and
spend an extended period of time documenting fastcore. It was an amazing
educational experience for me. I didn't know that Python was this extensible,
and I previously wasn't aware of concepts like multiple dispatch. I think
there are other contributors currently working on other portions of the
library as well. The literate programming environment, which made its serious
debut in version 2 (just released), definitely encourages this and makes the
process really easy.

------
LukeB42
All libraries extend their language with new features.

A lot of the proposed value-add seems like the authors aren't aware the built-
in `type()` can dynamically modify class definitions.

Getting rid of `hex(id(self))` in `__repr__` is regressive UX.

@typedispatch seems like both a slow way of using 'isinstance()' and a FOMO
C++ virtual function bloat.

~~~
qayxc
What does typed dispatch have to do with virtual functions in C++? This type
of dispatch happens at compile-time in C++ and has nothing to do with "bloat"
or even virtual functions.

~~~
LukeB42
What's it offering that's more idiomatic than the `isinstance()` it breaks
down to other than resembling virtual functions in C++?

What's costlier, the new stack frames plus conditional for a decorator or the
`elif isinstance()`s in one function definition?

How does a programmer reference the correct function definition by name if
they're using @typedispatch instead of `isinstance()`?

~~~
qayxc
I get the feeling that you don't know what virtual functions in C++ are.

I'd suggest you stop comparing typed dispatch to virtual functions in C++,
because they have nothing in common. If you want to draw parallels, make sure
to use a suitable analog.

~~~
LukeB42
I'm thinking of templates, if that helps.

