Advanced.9: "Can use Python's built-in functions like map(), filter(), reduce()."
...and knows why you should probably use comprehensions instead.
Experts.4: "Can use Python's C API to extend Python with C/C++ code."
I've written Python professionally for 20+ years. I've extended Python with C for funsies, but have never, not once, needed to do it at work. I'm glad other people have used C (and now Rust) to speed up the Python modules I want to use. I'm equally glad I've been able to use those modules, not have to enhance them with C.
Experts.7: "Understands and uses Python's garbage collection system."
99% of what you need to know is "don't write circular references".
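For anyone who hasn't seen why cycles matter, here is a minimal sketch, assuming CPython (reference counting plus a separate cycle collector). `Node` and `make_cycle` are hypothetical names made up for the illustration.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a  # a -> b -> a: a reference cycle
    # A weak reference lets us observe whether `a` is still alive
    # without keeping it alive ourselves.
    return weakref.ref(a)


gc.disable()  # switch off automatic cycle collection so the effect is visible
probe = make_cycle()
# The local names are gone, but each object still holds the other,
# so reference counting alone never frees them.
alive_before = probe() is not None
gc.collect()  # the cycle collector finds and frees the unreachable pair
alive_after = probe() is not None
gc.enable()
```

Without the cycle, both objects would be freed the moment `make_cycle` returned; with it, they wait for the collector.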
Experts.10: "Have a good understanding of Python's internals, such as bytecode, the Python interpreter's execution model, and how Python's data types are implemented at the C level."
See Experts.4. I've had a nice career writing code in Python, but haven't had to hack on the CPython codebase. I bet there are lots of people who are experts in C who've never written a line in the GCC or Clang codebases.
I find it interesting that you use reduce so often. Most python functions that handle two inputs handle it generically by handling a list of inputs, so passing in a full list exploded as args is sufficient and reduce isn't needed.
All of Python's built-ins at least work this way, like sum(), print(), etc. As such, I have not ever reached for reduce. Sometimes a comprehension inside a function call, though.
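A rough sketch of that point, with made-up sample data: the common folds already exist as built-ins or standard library functions that take a whole iterable (or the list unpacked as arguments), so reduce rarely comes up.

```python
import math
import operator
from functools import reduce

nums = [3, 1, 4, 1, 5]  # sample data for illustration

# Addition: reduce versus the built-in that takes the whole iterable.
total_reduce = reduce(operator.add, nums)
total_builtin = sum(nums)

# Maximum: max() accepts either an iterable or the values unpacked as args.
largest_reduce = reduce(lambda a, b: a if a >= b else b, nums)
largest_builtin = max(nums)
largest_unpacked = max(*nums)

# Product: math.prod is available from Python 3.8 onward.
product_reduce = reduce(operator.mul, nums)
product_builtin = math.prod(nums)
```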
It's used to replace all looping. A combination of map and reduce is an alternative to loops.
Also you don't need map. You can use list comprehension to replace map. But list comprehensions can't replace reduce.
Technically reduce can also replace map. The only primitive you need to do all your iterations is technically just reduce.
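A minimal sketch of what that looks like. `map_via_reduce` and `filter_via_reduce` are hypothetical helper names, and this is for illustration only: building the list with `acc + [...]` copies it on every step, so it is quadratic.

```python
from functools import reduce


def map_via_reduce(func, items):
    # Fold over the items, appending func(x) to a fresh accumulator list.
    return reduce(lambda acc, x: acc + [func(x)], items, [])


def filter_via_reduce(pred, items):
    # Keep x only when pred(x) is true; otherwise pass the accumulator through.
    return reduce(lambda acc, x: acc + [x] if pred(x) else acc, items, [])


squares = map_via_reduce(lambda x: x * x, [1, 2, 3, 4])
evens = filter_via_reduce(lambda x: x % 2 == 0, range(6))
```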
If you want to get more elegant, then you don't even need reduce. You can eliminate all redundant primitives and use functions themselves for iteration. It's just recursion.
> ...and knows why you should probably use comprehensions instead.
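As a sketch of the recursion idea (`fold` is a hypothetical name): note that CPython does not eliminate tail calls and has a default recursion limit of around 1000 frames, so this is more of a curiosity than something to use on long inputs.

```python
def fold(func, items, acc):
    """Left fold written with plain recursion instead of a loop or reduce."""
    if not items:
        return acc
    head, *tail = items
    # Combine the accumulator with the first item, then recurse on the rest.
    return fold(func, tail, func(acc, head))


total = fold(lambda a, x: a + x, [1, 2, 3, 4], 0)
doubled = fold(lambda acc, x: acc + [x * 2], [1, 2, 3], [])
```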
This. The author does not seem to know that (1) reduce is not a built-in function anymore, and (2) map and filter were also planned to be removed from Python 3 [1]:
> About 12 years ago, Python aquired lambda, reduce(), filter() and map(), courtesy of (I believe) a Lisp hacker who missed them and submitted working patches. But, despite of the PR value, I think these features should be cut from Python 3000.
> Update: lambda, filter and map will stay (the latter two with small changes, returning iterators instead of lists). Only reduce will be removed from the 3.0 standard library. You can import it from functools.
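In practice, on Python 3 that update looks roughly like this (sample values are made up):

```python
from functools import reduce  # no longer a built-in in Python 3

nums = [1, 2, 3, 4]

doubled = map(lambda x: x * 2, nums)
is_lazy = not isinstance(doubled, list)  # map() now returns an iterator

first_pass = list(doubled)   # consumes the iterator
second_pass = list(doubled)  # nothing left: the iterator is exhausted

product = reduce(lambda a, b: a * b, nums)
```

The single-use iterator is the "small change" that tends to surprise people coming from Python 2, where map() and filter() returned lists.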
I have only been writing python since Guido wrote that comment, but I have to say I don’t agree with him. I find the map/filter syntax to be very readable and intuitive. List comprehension is ok but to me becomes nearly incomprehensible (see what I did there?) where nesting is involved.
It’s easy to chain map and filter statements without losing readability. I think I might be in the minority though.
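For comparison, a small sketch of both styles on made-up data, including a nested case where the comprehension's clause order is the part people tend to trip over:

```python
from itertools import chain

words = ["apple", "Bob", "cherry", "Dave", "eel"]

# Chained filter/map, with each stage named.
lowercase_only = filter(str.islower, words)
lengths_chained = list(map(len, lowercase_only))

# The same thing as a comprehension.
lengths_comp = [len(w) for w in words if w.islower()]

# Nested case: flatten a matrix and keep the even values.
matrix = [[1, 2, 3], [4, 5, 6]]
flat_evens_comp = [x for row in matrix for x in row if x % 2 == 0]
flat_evens_chain = list(
    filter(lambda x: x % 2 == 0, chain.from_iterable(matrix))
)
```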
There's nothing wrong with using them, but what you are going to find across the community are comprehensions. They concisely bring the data container and the iteration together with no fuss.
If you need to map and reduce and filter and want to build a chain and/or have reusable predicate functions then go for it.
I always encourage my teams to reach for the ecosystem usage before doing things because another language does it differently (map/filter/reduce being the best option in JS 90% of the time).
But maybe you work at a functional shop and everyone Gets It(tm) and then it is the norm.
They both seem self-evident, though I prefer the functional syntax. I don't write enough python that I don't have to look stuff up each time it comes up. C++ and Rust are my normal tools, and they are both fine having a couple ways to do something. Is the "one way to do something" mantra still a guiding principle for Python? Having multiple ways means when I guess at syntax I have more chances to be right.
There also seem to be really strong opinions about what is idiomatic. I find writing python without lots of typehints and a good IDE to be infuriating, but I've been told type hints aren't pythonic.
To me, the concept of “pythonic” has lost credibility for all the reasons you mention. It used to be that “duck typing” was a point of pride, something everybody learned. Now if you’re not type hinting, I think most people consider it bad practice.
There’s a bajillion ways to do everything. We use print() now, but not raise(). f-strings or .format()? Maybe c-style string formatting? How about dict formatting? Is adding strings ever permissible? Didn’t we used to use single quotes? I thought that was pythonic too.
Pythonic doesn’t exist. What’s popular is all there is. I say let’s bring map, filter and reduce into fashion.
> I've extended Python with C for funsies, but have never, not once, needed to do it at work.
I used to do it much more in my earlier days of python 2 in around the mid-00s. It was more often the case that there was some C library I wanted to interface back then.
It’s still useful in performance situations (high iteration counts, library writing, etc). But I find myself doing it much less often now.
> I've written Python professionally for 20+ years. I've extended Python with C for funsies, but have never, not once, needed to do it at work.
Well, different folks have different experiences. One of the first Python-related things I had to do for my job was implementing a Python API for a custom C library that controlled some ancient hardware.
Sure! That's totally a thing, and lots of people do that.
But I've done tons of stuff in Flask. I wouldn't say you need to "Have a good understanding of Blueprints" to be a Python expert. Or "proficiency subclassing yaml.YAMLObject". Or "encyclopedic knowledge of Requests verbs". Or "strong opinions on Poetry vs Pipenv". Or "can use Ansible to configure a fleet of servers".
Those are all common Python subjects in the areas I poke around a lot. You can be an expert data engineer without knowing any of those things. Just like you can be an expert backend engineer without having used Pandas a lot.
> You can be an expert data engineer without knowing any of those things.
Yes, but then you're exactly that, an expert data engineer, not an expert python programmer. Extensibility, in particular C interoperability, is a core aspect of the language and it's to a non-trivial extent designed around it. Much of Python's utility rests on the fact that libraries exist that make use of this feature, especially in data science!
You can be an expert software architect in Lisp and write many programs without resorting to macros, but if you want to call yourself an expert Lisper you'll need to know your macros.
I get what you're saying, but I disagree. You can write an awful lot of expert-level Python code without learning C. After all, the result of that interoperability is something that looks and acts like a Python object. It's not like users of that code have to know about C. They just write Python like usual, and it's magically (from their POV) faster than if it were also written in Python.
As far as the data engineer bit, that's in the context of this article, which asserts that you need to know libraries predominately used by data engineers in order to be an expert programmer. I disagree, because Python is no more "about" data science than it's "about" web services or platform development or infrastructure management.
> Python is no more "about" data science than it's "about" web services or platform development or infrastructure management.
Amen, brother. It's incredibly annoying how every new niche that Python accreted over the years, is full of people who firmly believe Python exists only for them. At one point it was LISP refugees, then markup junkies (like me), then sysadmins, then 3d artists (!), then webapp jockeys, and now it's datascience and machine learning. Every kid thinks the world is built for him, but pushing such limited worldviews can only box the community into a corner - just look at Ruby to see what happens then.
Cool and overall sane, so lemme just pitch in from my perch on the critical side:
*Advanced*
> Can use advanced Python libraries like numpy, pandas, matplotlib.
You're a data scientist and this is not "Python".
> Can use regular expressions for pattern matching in strings.
Not Python really again.
> Understands and uses Python's memory management and optimization techniques.
Wait, like thinking about what a `list()` looks like from the (false) C-level perspective? Can you fix my CPU transistors while you're at it? (yes, /s)
*Experts*
> Can use Python's C API to extend Python with C/C++ code.
Do you mean whatever FFI Python has? Otherwise this is like asking your mechanic to also make it a plane, and a submarine. After all these are all just vehicles.
> Understands and uses Python's garbage collection system.
Ref counting? Wait what?
> Have a good understanding of Python's internals, such as bytecode, the Python interpreter's execution model, and how Python's data types are implemented at the C level.
Submarine :D
My perch also lets me see here the aspects I usually don't care about. Thank you!
The entry about numpy, pandas, and matplotlib also seems ranked way highly, if you ask me. I was using those to great effect as a young physics student almost a decade ago, long before I knew what a decorator or context-manager was. Yes I was probably googling a lot, but then again I still am.
> > Can use advanced Python libraries like numpy, pandas, matplotlib.
> You're a data scientist and this is not "Python".
Pandas and matplotlib are a weaker case, maybe, but numpy is a fairly broadly used library (math being a core part of computing) outside of just data science.
And I have learnt the basics (and some of the Advanced, as per the article) of Python before numpy even existed. Mixing external libraries, however useful!, with the language itself is not good. "Batteries included" would be very fine here - it comes with the language and that was one of the original (selling? nah, _utility_) points of Python (and now with the external stuff that has morphed... to its life-line really; so never underestimate the ecosystem!).
I'm probably guilty of (poorly) re-implementing bits of Numpy in various places. If I'm whipping out a little tool to show some timings, I'm likely to write a simple standard deviation function instead of dragging along Numpy for just that one thing.
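For the record, the kind of thing I mean is roughly this. `sample_stddev` is a hypothetical name and the timings are made up; the standard library's `statistics.stdev` computes the same sample standard deviation, so even the hand-rolled version is often unnecessary.

```python
import math
import statistics


def sample_stddev(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance)


timings_ms = [12.1, 11.8, 12.5, 13.0, 11.9]  # made-up sample timings
hand_rolled = sample_stddev(timings_ms)
from_stdlib = statistics.stdev(timings_ms)
```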
Lots of backend/platform engineering uses math, but not always enough to justify adding such a huge dependency.
I feel similar. No matter how good at Python I get, I doubt I'll ever write a C interface, I just don't need that. I've used pandas and some of these randomly, but TBH I'd rather write a dataclass with static typing rather than re-construct data in pandas. Does that bump me down to a lesser skilled programmer?
I guess it's easy to pick on any rubric, but I found the implication of this one to be unhelpful. I like the training materials that focus on continual development instead -- I don't recall any off the top of my head, but they provide a contextual learning path, like if you've mastered list comprehension now take a look at decorators.
“Can use advanced Python libraries like numpy, pandas, matplotlib.”
Interestingly in my 15 years of Python I’ve used numpy sparingly for image processing and haven’t used the others ever. I appreciate this isn’t meant to be an exhaustive checklist but it reminds me how two advanced coders could live in two entirely different drawers of the tool chest.
Same - I have been using Python since 1998 and I failed the LinkedIn skills quiz on Python because it was over 50% numpy questions and things about matrix math.
I'm basically allergic to pandas. My old job used a shared "utility" library written by someone who imported pandas in order to...write Markdown into log files.
In most circumstances, I feel like using metaclasses is a red flag. How many people are there on earth who understand and regularly use meta classes when they're appropriate? I always pegged metaclasses as "you should know this exists in case you ever get unlucky enough to need it but otherwise stay the hell away from it"
"Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why)." — Tim Peters
I once used metaclasses because I was writing a code generator for a DSL. But that was the only time.
Conceptually they're not too difficult but more likely than not YAGNI.
The one python pattern I used to use metaclasses for (registry types that keep an index of all their subclasses, very useful for DSL type tasks) is easier with __init_subclass__ and is somewhat more readable. So I haven't used metaclass in a minute.
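A minimal sketch of that registry pattern, assuming Python 3.6+ where `__init_subclass__` is available. `Command`, `Build`, and `Deploy` are hypothetical names for illustration.

```python
class Command:
    """Hypothetical base class that keeps an index of its subclasses."""

    registry = {}

    def __init_subclass__(cls, name=None, **kwargs):
        # Runs once for every subclass, at class-definition time.
        super().__init_subclass__(**kwargs)
        Command.registry[name or cls.__name__.lower()] = cls


class Build(Command):
    pass


class Deploy(Command, name="ship"):  # keyword is forwarded to __init_subclass__
    pass


# Look up a subclass by its registered name, as a DSL dispatcher might.
handler = Command.registry["ship"]()
```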
The worst thing about metaclass is that 95% of python programmers look at it and just go "whuzzat?" and require a seminar to catch a clue. It's actively hostile to whoever maintains the code after you.
I'm with you completely about __init_subclass__. It covers almost all the uses where I'd otherwise use metaclasses. Junior engineers see that and say "oh, clever!", rather than running off screaming into the night.
I want to work with folks who know the difference between a module and a package. Who follow PEP 8 for most things.
People that know what a namespace package is and how __init__.py works.
I care more about operating the toolchain efficiently and writing understandable code and avoiding "advanced" patterns until forced to use them for business impacting constraints (performance).
Use simple data classes or vanilla classes without a lot of methods. Functions that take objects in and don't mutate them.
Readable and understandable by folks new to the org and the team. And juniors.
People who care about the impact of maintaining code and put the effort into effective documentation and code comments.
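A small sketch of the "simple data classes, functions that don't mutate" style, using a frozen dataclass. `Order` and `apply_discount` are hypothetical names for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    """Hypothetical record type; frozen=True blocks attribute assignment."""

    order_id: int
    total_cents: int


def apply_discount(order: Order, percent: int) -> Order:
    """Return a new Order with the discount applied; the input is untouched."""
    discounted = order.total_cents * (100 - percent) // 100
    return replace(order, total_cents=discounted)


original = Order(order_id=1, total_cents=2000)
discounted = apply_discount(original, 10)
```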
Self-assessment is so messy that lists like this are just misleading.
Among other issues, many of the criteria for "expert" sound more like the criteria for "library developer". Many conventionally-expert application developers and data scientists would smartly not use those features and may self-assess themselves out of the category for lack of active practice. Meanwhile, too many conventionally-junior developers would sophomorically insist that they can and should use them and self-assess themselves as experts.
I needed this, I write python all the time but I know I am not a pro and the language features always change. I just never needed asyncio for example. Maybe it's my C experience in the past speaking but for most use cases I prefer to use simple features anyone who is invested in python can read easily so that others in the future can maintain it. So, I don't go out of my way to look for special and advanced ways to speed up python or make it more efficient. If performance was that important to me, I would use something like Go, which at least as of 2yrs ago was somehow even simpler than Python and faster.
There are a lot of kind of questionable/obscure skills in tiers 2-4, but I think that's fine actually. It's not a software development skill assessment; it purports to be only about "python expertise." There are definitely people out there with strong knowledge of every topic in item 3 and I'd definitely be persuadable that such a person is a "Python expert."
I just worry, when I see a list of things tagged "expert" like this, that people will take it as a learning path, which... Could be good, but there are much straighter paths to learning to be effective as a developer.
I think the goal of the author was to explain how he rates candidates for the positions he is offering, rather than answering properly the question "what makes a beginner / advanced / expert Python developer?". This is made evident by the fact that most contributors to this thread strongly disagree with the author's choice. I personally find it very weird that sets are listed as an advanced feature of the language. And understanding Python's dynamic typing system is somehow an expert-level achievement?
Given he's selling the book at the top for the weird theoretical-but-maybe-not-reality-based "interview process", I think some of the things lack practical feet-on-the-ground expertise levels.
Personally I think a few things actual experts should know are:
- python packages: pip vs package manager vs conda
Python 2 seems like a weird one at this point. I am sure there are plenty of code bases in the wild that are still running Python 2, but what version? You cannot assume 2.7 and there are probably a lot of gotchas I no longer recall from say 2.3 to 2.7.
An expert will relearn the differences if/when it needs to be done.
From the comments section, I understand that the author did not do a good job. Can someone please share other resources that contain an exhaustive list of things to learn in Python?
Maybe generators? I could kind of see an argument for some tricks like using tuples or slots to save memory. Anything in functools?
Those are all under the heading of conserving memory, nothing to do with explicit memory management. If I needed that tight of control, I would not be writing Python.
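To illustrate what "conserving memory" means here, a rough sketch assuming CPython (exact byte counts vary by version, so none are quoted). `PointDict` and `PointSlots` are hypothetical names.

```python
import sys


class PointDict:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """Same data, but __slots__ removes the per-instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


dict_point_has_dict = hasattr(PointDict(1, 2), "__dict__")
slots_point_has_dict = hasattr(PointSlots(1, 2), "__dict__")

# A list materializes every element up front; a generator produces
# them one at a time and stays small regardless of the range size.
squares_list = [n * n for n in range(100_000)]
squares_gen = (n * n for n in range(100_000))
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
```

Neither technique is explicit memory management: the interpreter still decides when anything is allocated or freed.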
Also it needs to be said that one’s initial shock at the state of numpy/pandas will probably increase with their ‘general’ programming proficiency.
The degree to which the ‘Python data science’ scene has their own separate set of conventions, and the degree to which there’s a lot of weird meta-programming going on in these packages, is quite jarring.