Unit testing to me seemed akin to drinking 8 glasses of water every day. A lot of people talk about how important it is for your health, but it really tends to get in the way, and it doesn't seem to really be necessary. Too frequently, code would change and mocks would need to change with it, removing a good chunk of the benefit of having the code under test.
Then I started writing integration testing while working on converting a bunch of code recently, and it has been eye-opening. Instead of testing individual models and functions, I was testing the API response and DB changes, and who really cares what the code in the middle does and how it interfaces with other internal code? So long as the API and DB are in the expected state, you can go muck about with the guts of your code all you want, while having the assurance that callers of your code are getting exactly out of it what you promise.
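A minimal sketch of that style of test (illustrative names, assuming Flask and sqlite3 rather than whatever stack the parent actually used): the only assertions are about the API response and the resulting DB rows, never about the code in the middle.

```python
import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE users (name TEXT)")

@app.post("/users")
def create_user():
    name = request.get_json()["name"]
    db.execute("INSERT INTO users (name) VALUES (?)", (name,))
    return jsonify({"name": name}), 201

def test_create_user():
    # Only the externally visible contract is asserted on; the internals
    # can be refactored freely without touching this test.
    response = app.test_client().post("/users", json={"name": "Ada"})
    assert response.status_code == 201
    assert db.execute("SELECT name FROM users").fetchall() == [("Ada",)]
```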
Unit test suites would break all the time for silly reasons, like someone optimizing a function would mean a spy wouldn't get called with the same intermediary data, and you'd have to stop and go fix the test code that was now broken, even though the actual code worked as intended.
Integration tests (mainly) only break when the code itself is broken and incorrect results are getting spit out. This has prevented all kinds of issues from actually reaching customers during the conversion process, and isn't nearly so brittle as our unit tests were.
When your TDD approach revolves around integration tests, you have complete freedom to add, remove and shift around internal components. Having the flexibility to keep moving around the guts of a system to bring it closer to its intended behavior is what software engineering is all about.
This is also how evolution works; the guts and organs of every living creature were never independently tested by nature.
Nature only cares about very high-level functionality; can this specimen survive long enough to reproduce? Yes. Ok, then it works! It doesn't matter that this creature has ended up with intestines which are 1.5 meters long; that's an implementation detail. The specimen is good because it works within all imposed external/environmental constraints.
That's why there are so many different species in the world; when a system has well defined external requirements, it's possible to find many solutions (of varying complexity) which can perfectly meet those requirements.
Unit tests on the other hand are useless at identifying complex issues like race conditions; they're only good at detecting issues that are already obvious to a skilled developer.
* I restart the system from scratch between each test case, but if that's too time-consuming (i.e. more than a few milliseconds), I just clear the entire system state between each test case. If the system is truly massive and clearing the mock state between each test case takes too long, then I break it up into smaller microservices, each with their own integration tests.
I may have misunderstood your integration tests. I understand them as something like checking against the end-user contract. If so, then without TDD it won't be possible to track down a bug as easily as with TDD.
Mock based testing however is the only thing I've ever encountered that forces me to think very, very clearly about what my code is doing. It makes me inspect the code and think about what dependencies are being called, how they're being called, and why they're being called.
I have found that this process is extremely valuable for creating code that is more elegant and more correct. I value mock tests not for the tests that I end up with at the end, but for the better production code that I wrote because of them.
Or they don't. If the tests just continue to work, then you ask yourself whether the change should have broken them. If not, you go on your way with a fair amount of confidence. But if the change should have broken the tests, then you have to look at why it didn't...
While yes, unit tests do have a maintenance burden, they are often reproducible, less flaky, give you targeted debug information, and run extremely fast.
There are heavy costs to integration and e2e testing that often get dismissed by developers who have not experienced the fast feedback loop a good, fat unit suite gives you.
E2E UI tests are definitely hard to debug though, in most cases. They still have value, but I’m a fan of the “testing pyramid” here. A small number of E2E UI tests, a decently large number of integration tests that each try to test a single contract between 2 services, and a tonne of unit tests to cover sad paths, detailed behaviour, etc.
A few months ago, I did a contracting job and the team I worked with used the "mock everything" approach, without even one integration test, which to me seemed crazy (especially for a component which was called the "integration layer").
I tried hard to find the advantages in this approach, studying what the rationale and the best practices were, and questioned my previous assumptions. In the end, I had to confirm those assumptions: even if there were hundreds of tests and they all passed, many logical errors weren't caught.
Even worse, they gave a false sense of confidence to the team, and made refactoring super-slow. But the team leader was super-convinced it was the best idea since sliced bread.
Additionally, people with large state machines with too-complex sets of possible states (games, big frontends without a top level state management system) tend to only unit test because it’s frequently too much of a pain to set up an integration runtime environment. Places with lots and lots of manual QA testers.
A lot of people overuse mocks when testing. In fact, using mocks enforces coupling between different methods, because a lot of people use them to assert that a method with a specific name was called with specific parameters, or they create one that asserts a method with a certain name returns a certain value. So when one wants to refactor, they not only have to change the code; they also need to update all the mocks that reference it.
I've found that a better way to structure code is to take the result of an external dependency and pass it in as a parameter to a method that will process it. Then when I unit test that method, I just pass in what I expect from that dependency and assert on the return value of that method. I don't try to unit test the outer method that calls the dependency by creating a mock call for it.
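A small sketch of that structure (illustrative names): the pure processing function takes the dependency's result as an argument, so its unit test needs no mock at all.

```python
def fetch_exchange_rate(currency: str) -> float:
    """Stand-in for the real external call (HTTP, database, ...)."""
    raise NotImplementedError

def apply_rate(amount: float, rate: float) -> float:
    """Pure logic: trivially unit-testable."""
    return round(amount * rate, 2)

def convert(amount: float, currency: str) -> float:
    # Thin outer method; not unit tested via mocks.
    return apply_rate(amount, fetch_exchange_rate(currency))

def test_apply_rate():
    # Pass in what we expect from the dependency and assert on the result.
    assert apply_rate(10.0, 1.25) == 12.5
```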
> Then I started writing integration testing while working on converting a bunch of code recently, and it has been eye-opening. Instead of testing individual models and functions, I was testing the API response and DB changes, and who really cares what the code in the middle does and how it interfaces with other internal code?
It makes it easier to isolate the cause of the error rather than having to search the entire call chain to find it (especially if it's a logic error that doesn't result in an exception). Plus, integration test suites take a lot longer to run and can have timing issues due to caching or other reasons, which can result in sporadic failures.
I have been a fan of Gary Bernhardt ever since viewing this talk.
For GUIs, the proper approach is to unit-test the functionality underneath, not the GUI itself.
After all, any program out there does something. If you know what it's doing, you know what it should do, and what it shouldn't. That means you can test it.
After all, writing the test needs to be possible, to start with.
So adapters, views, commands and what have you need to exist only to fulfill that purpose, and even then, their interactions with the GUI layer don't get tested.
So one is creating them, without knowing if they are the right elements for the GUI layout.
Hence why I'm 100% behind testing, but not so much behind TDD.
This isn't to say that integration tests aren't great, too. They all tell different stories. I like unit tests, as they force me to think about modules as units. If a unit is hard to test or brittle, it may need more attention to its design.
As the codebase gets larger and more complex (and interesting!), I want unit tests to fail because of small changes. That's actually useful feedback, whereas the simple, brutal failure of an integration test is just not granular enough to quickly help me understand the details of the change.
For future tests and projects, unit tests do have a place and I'll make better use of them.
A good learning experience, none the less.
They serve two different purposes.
Unit tests scale a lot better. That's why most generally use a pyramid structure: lots of unit tests, a moderate amount of integration tests, and a few end-to-end tests.
Mentally, something like Clojure's ring framework makes this easier to grasp: the abstraction it provides is dictionary-in, dictionary-out. Once you have something like this, there's no need to spin up a web server to do integration testing: you just shove a bunch of dictionaries in and make sure the output dictionaries are what you expected.
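The same idea sketched in Python rather than Clojure (purely illustrative): once the handler is a plain dict-in, dict-out function, the "integration" test is just a function call.

```python
def handler(request: dict) -> dict:
    # The whole "web layer" is just data in, data out.
    if request.get("path") == "/ping":
        return {"status": 200, "body": "pong"}
    return {"status": 404, "body": "not found"}

def test_ping():
    assert handler({"path": "/ping"}) == {"status": 200, "body": "pong"}

def test_unknown_path():
    assert handler({"path": "/nope"})["status"] == 404
```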
A good approach is to use each technique where you get the most bang-for-buck: use unit tests only for pure functions, and decouple your system in a way that integration tests can be reduced to simple data-in data-out (which also makes them very fast)
If your integration tests can reasonably be set up with a couple of containers, great, but not every system is that flexible. And not every data store is that simple to provision.
Knowing about that bug is really valuable knowledge and not adding a test for it is basically like throwing away that knowledge instead of sharing it for future people working on the project.
Can you or others speak more about this? I was taught that verifying function calls for spies/mocks was good practice. But, I encountered this problem just the other day when I refactored some Java code for a personal project. Everything still worked perfectly, but, exactly as you said, the intermediate function calls changed so the tests would fail due to spies/mocks calling different "unexpected" functions.
I'm an intermediate programmer so can someone with more experience fill me in with what's best practice here and why? Do I update the test code to reflect the new intermediate function calls? But this whole approach now seems silly since a refactoring that doesn't affect the ultimate behavior of the function that is under test will break the test and that seems wrong. So do I instead not verify function calls when using spies/mocks? In that case, what is the use case for verifying spies/mocks?
If you have a function that fetches data, you shouldn't test that it's hitting the data layer, only that the correct data is returned. This way, when you improve the function to not hit the data layer at all under some conditions, your tests will keep passing.
On the other hand, if that function is supposed to log metrics or details about its execution, you should test that, as it is part of its contract and can't be inferred from the return value.
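A rough sketch of both halves of that advice (illustrative names, using unittest.mock): assert on the returned data, not on how it was fetched, but do assert on a side effect like logging when it is part of the contract.

```python
from unittest.mock import Mock

def get_user_name(repo, logger, user_id):
    logger.info("looking up user %s", user_id)
    user = repo.find(user_id)
    return user["name"] if user else None

def test_returns_the_right_data():
    repo = Mock()
    repo.find.return_value = {"name": "Ada"}
    # No assertion about whether or how repo.find was called: only the result matters.
    assert get_user_name(repo, Mock(), 1) == "Ada"

def test_logs_the_lookup():
    repo, logger = Mock(), Mock()
    repo.find.return_value = None
    get_user_name(repo, logger, 1)
    # This side effect is part of the contract, so it is asserted explicitly.
    logger.info.assert_called_once()
```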
The important part is to remember that not every function is a component, and not even every class by itself. Where to draw the boundary between components is itself an explicit design decision, and should be made consciously, not mechanically.
Instead of mocks, some people prefer to build fakes / stubs which are versions of a dependency which are "fully operative" in a sense but with a simplified internal implementation. For example a repository that keeps entities in memory. (Not the same as an in-memory database! The fake repository wouldn't use SQL at all.)
Tests would check the final state of the fakes after the interactions, or simply verify that the values returned by the tested component are correct.
The hope is that fakes, while possibly more laborious to set up, allow a style of testing that focuses less on the exact interactions between components, and is therefore less brittle.
- Mocks aren't stubs https://martinfowler.com/articles/mocksArentStubs.html#Class...
- From interaction-based to state-based testing http://blog.ploeh.dk/2019/02/18/from-interaction-based-to-st...
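A minimal sketch of such a fake (illustrative names): fully working, but with a trivial in-memory implementation, so tests assert on final state or return values rather than on interactions.

```python
class FakeUserRepository:
    """In-memory stand-in for a real repository; no SQL involved."""

    def __init__(self):
        self._users = {}

    def add(self, user_id, name):
        self._users[user_id] = name

    def get(self, user_id):
        return self._users.get(user_id)

def register_user(repo, user_id, name):
    if repo.get(user_id) is None:
        repo.add(user_id, name)

def test_register_user_stores_the_user():
    repo = FakeUserRepository()
    register_user(repo, 1, "Ada")
    assert repo.get(1) == "Ada"   # check the final state, not call sequences
```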
Also, many of the assertions that are often put into unit tests (especially at stub/mock boundaries) are better formulated as invariants, checked in the code itself. Design-by-contract-style pre/post-conditions are a sound and practical way of doing this. When this is done well, you get the localization benefit of low-level unit testing even when running high-level tests. Plus much better coverage, since these things are always checked (even in prod), not just in a couple of unit tests. And it is more natural when refactoring internal functions to update pre/post-conditions, since they are right there in the code. When a function disappears, they do too.
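A small sketch of what that can look like in plain Python (illustrative example): the pre/post-conditions live in the function itself, so they are exercised by every high-level test that passes through it.

```python
def allocate(items: list[int], capacity: int) -> list[int]:
    # Precondition: checked every time the function runs, not just in one unit test.
    assert capacity >= 0, "capacity must be non-negative"
    taken, used = [], 0
    for item in items:
        if used + item <= capacity:
            taken.append(item)
            used += item
    # Postcondition: refactoring the loop cannot silently break this promise.
    assert used <= capacity, "allocation must never exceed capacity"
    return taken
```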
I don't like the term "integration" tests though, as they hint at interactions between systems being the important thing to test. Integration between services / subsystems are just as much a detail as internal function calls. If using the real system during test is too complicated or slow, maybe it should be simplified or made faster? Only when that is not feasible do I build a mock.
The trick for me is focusing on what the unit does from a consumer's perspective. Avoid testing implementation details (unless they are important side effects), and test the behavior that does not change. If you do that, then refactoring becomes easier, because tests will only break when the contract of the unit changes.
We have a lot of data that explicitly shows that automated unit-testing doesn’t work. One good example is one of our ESDH systems, which changed supplier in a bidding war, partly because we wanted higher code-quality.
It’s a two million line project, we paid for 5000 hours worth of refactoring and let the new supplier spend two years getting familiar with it and setting up their advanced testing platforms.
So far we’ve seen some nice performance increases thanks to the refactoring. It has more problems than it did before, though, even though everything is tested by unit tests now and it wasn’t before.
We have a lot of stories like this, by the way. I don’t think we have a single story where unit-testing actually made the product better, but we do have well-tested systems where we couldn’t say, because they have always had unit-testing.
Ironically we still do TDD ourselves on larger projects. Not because we can prove it works, but because everyone expects it.
At best you have data that shows that a poor unit testing implementation failed to deliver. A bad experience doesn't prove a whole tech strategy doesn't work when the whole world shows otherwise.
Because your users should not be your testers and you must catch bugs before deploying them in production.
You may be able to deploy in 3 minutes but it takes way more to debug and fix your bugs.
Testing helps the project to catch bugs prior to deploying them and also provides the infrastructure to avoid regressions.
Write tests. Not too many. Mostly integration.
Use system/integration/functional/unit tests wisely.
For precise stateless stuff, like making sure your custom input format parser/regexp covers all edge cases, I prefer unit tests - no need to init/rollback database state.
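For example, a purely stateless parser like the following (illustrative code) only needs plain unit tests, with no database setup or rollback involved:

```python
import re

DURATION = re.compile(r"^(\d+)(ms|s)$")

def parse_duration_ms(text: str) -> int:
    """Parse '250ms' or '2s' into milliseconds."""
    match = DURATION.match(text)
    if not match:
        raise ValueError(f"bad duration: {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value if unit == "ms" else value * 1000

def test_parse_duration_edge_cases():
    assert parse_duration_ms("250ms") == 250
    assert parse_duration_ms("2s") == 2000
    for bad in ("", "2", "2m", " 2s"):
        try:
            parse_duration_ms(bad)
            assert False, f"expected ValueError for {bad!r}"
        except ValueError:
            pass
```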
If a unit test fails when the code under test hasn't functionally changed then it means that your unit tests are flaky.
The biggest cause lies in inter-service communication. You push transaction boundaries outside the database between services. At the same time, you lose whatever promise your language offers that compilation will "usually" fail if interdependent endpoints get changed.
Another big issue is the service explosion itself. Keeping 30 backend applications up to date and playing nice with each other is a full time job. CI pipelines for all of them, failover, backups.
The last was lack of promised benefits. Velocity was great until we got big enough that all the services needed to talk to each other. Then everything ground to a halt. Most of our work was eventually just keeping everything speaking the same language. It's also extremely hard to design something that works when "anything" can fail. When you have just a few services, it's easy to reason about and handle failures of one of them. When you have a monolith, it's really unlikely that some database calls will fail and others succeed. Unlikely enough that you can ignore it in practice. When you have 30+ services it becomes very likely that you will have calls randomly fail. The state explosion from dealing with this is real and deadly
I'm not an expert by any means, but I'm pretty sure that statement indicates a problem.
And the easiest way to do that is to not build a microservices architecture; instead (and I hope I'm preaching to the choir here) build a monolith (or "a regular application") and only if you have good numbers and actual problems with scaling and the like do you start considering splitting off a section of your application. If you iterate on that long enough, MAYBE you'll end up with a microservices architecture.
What saved us before, was our forest of code could depend on the database to maintain some sanity. And we leaned on it heavily. Hold a transaction open while 10,000 lines of code and a few N+1 queries do their business? Eh, okay, I guess.
Maybe we didn't have the discipline to make microservices work. But IMO our engineering team was pretty good compared to others I've seen. All our "traditional" apps chugged along fine during the same period.
Not even the army has perfect discipline, even with hard training. They have cross-checks, piles of processes and move slowly for the most part
(Software development shouldn't aim to be like the army though)
When the services have their own datastores, well, now they need to talk to each other.
However a much larger problem was overall bad tooling. Specifically the data storage requirements for an event stream eclipsed our wildest projections. We're talking many terabytes just on our local test nodes.
We tried to remedy this by "compressing" past events into snapshots but the tooling for this doesn't really exist. It was far too common for a few bad events to get into the stream and cause massive chaos. We couldn't find a reasonable solution to rewind and fix past events, and replays took far too long without reliable snapshots.
In the end I was convinced that the whole event driven approach was just a way of building your own "projection" databases on top of a "commit log" which was the event stream.
Keeping a record of past events also wasn't nearly as useful as we originally believed. We couldn't think of a single worthwhile use for our past event data that we couldn't just duplicate with an "audit" table and some triggers for the data we cared about in a traditional db.
Ironically we ended up tailing the commit log of a traditional db to build our projections. Around that time we all decided it was time to go back to normal RPC between services.
Now I'm seriously considering a somewhat hybrid approach: Collect all of my domain data in one giant normalized operational data store (using a fairly traditional ETL approach for this piece), and then having separate schemas for my services. The service schemas would have denormalized objects that are designed for the functional needs of the service, and would be implemented either as materialized views built off the upstream data store, or possibly with an additional "data pump" approach where activity in the upstream data store would trigger some sort of asynchronous process to copy the data into the service schemas. That way my services would be logically decoupled in the sense that if I wanted I could separate the entire schema for a given service into its own separate database later if needed. But by keeping it all in one database for now, it should make reconciliation and data quality checks easier. Note that I don't have a huge amount of data to worry about (~1-2TB) which could make this feasible.
I'm going against the Martin Fowler grain hard here, but Event Sourcing in practice is largely a failure. It's bad tooling mostly as I mentioned, but please stay away. It's so bad.
Doesn't that imply that each service then has to store any data it receives in these events, potentially leading to a lot of duplication and all of the problems that can come with that (e.g. data stores getting out of sync)?
Honestly, it's a bad name for the architecture.
This isn't normal.
You should just have a single CI pipeline, failover and backup approach that is parameterised for each microservice.
The overhead was huge compared to "traditional" apps. Just updating a docker base image was a weeks long process.
We were a Django shop. Switching to SOA didn’t mean switching languages and frameworks.
These are attributes that 80% of projects and teams lack so when they decide to jump onto the microservices bandwagon the shit hits the fan pretty quickly.
I discovered the value of compile time type checks when I worked on large codebases in dynamic languages where every change was stressful. In comparison having the compiler tell you that you missed a spot was life changing.
I discovered the value of immutable objects when I worked on my first codebase with lots of threading. Being able to know that this value most definitely didn't change out from under me made debugging certain problems much easier.
I discovered the value of lazy evaluation the first time I had to work with files that wouldn't fit in entirely in memory. Stream processing was the only way you could reasonably solve that problem.
Pretty much every paradigm shift or opinion change I've had was caused by encountering a problem I hadn't yet run into and finding a tool that made solving that problem practical instead of impractical.
That in itself can turn into a learning experience if you stick around for the aftermath.
Sounds like me in reverse: I discovered that value when I had to do work in a dynamic language after working only in C and C++. It's like that old saying about not knowing the value of something you have until you lose it.
I now get way more annoyed by the in-house build system than the huge C codebase.
Now `unique_ptr` is worth missing. And destructors are good too -- you couldn't have unique_ptr without them. But in its own way, C has both: it's just that you have to remember to call the destructor yourself, every single time.
You start using `shared_ptr` because you are too confused about the code to know which of the two "contexts" is supposed to own the thing. So shared_ptr (might) fix your memory freeing problem, but it maintains (and sometimes worsens) the problems caused by having two different contexts that might or might not be alive at any given time.
With two different threads, however, such problems are often unremovable, which is why shared_ptr is the right mitigation.
The C language itself is _beautiful_ but I am missing a beautiful standard library! Things which are trivial one-liners in other languages are sometimes 10-20 lines of brittle boilerplate code in C. If the standard library had a bit more batteries included, it would make trivial tasks easier and I could concentrate on actually getting work done. Opening a file and doing some text processing usually takes a few lines in Python/PHP, but in C you have 3 screens of funky code which will explode if something unforeseen happens.
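For comparison, the Python side of that example (illustrative, and including the error handling) really is just a few lines:

```python
# Count the words in a file; error handling is a single except clause.
try:
    with open("input.txt") as f:
        print(sum(len(line.split()) for line in f))
except OSError as err:
    print(f"could not read file: {err}")
```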
And working with additional libraries is also a nightmare compared to composer/cargo. Adding a new library (and keeping all your libraries up to date) is dead-simple in basically any language besides C/C++.
tldr: I love the language itself but the tooling around it sucks.
The library has a dependency on another library, and when you go look at the dependency, it tells you that it needs to use a specific build system. So you have to build the library and its dependency, and then you might be able to link it.
But then, turns out, the dependency also has dependencies. And you have to build them too. Each dependency comes with its own unique fussiness about the build environment, leading to extensive time spent on configuration.
Hours to days later, you have wrangled a way of making it work. But you have another library to add, and then the same thing happens.
In comparison, dependencies in most any language with a notion of modules are a matter of "installation and import" - once the compiler is aware of where the module is, it can do the rest. When the dependencies require C code, as they often do, you may still have to build or borrow binaries, but the promise of Rust, D, Zig et al. is that this is only going to become less troublesome.
To be honest, the per-language package management seems wasteful and chaotic. I must have a dozen (mutually compatible) copies of numpy scattered around. And why is pip responsible for building FORTRAN anyway?
When things go sideways, I find troubleshooting pip a little trickier. Some of this might be the tooling, but there's a cultural part too: C++ libraries seem fairly self-contained, with fewer dependencies, whereas a lot of python code pulls in everything under the sun.
I’m willing to put up with a lot more using a built in. It’s built in, a junior dev can be expected to cope with that. Adding dependencies has a cost. Usually the cost is bigger than realized at the time.
I've used it for all sorts of stuff earlier, all less complex than this. Scripts, devops, ETL... But then I got into a company that is using it for some quite serious stuff, a large codebase. Holy smokes, this thing does not scale well (in terms of development efficiency and quality). I swear at least 70% of our bugs are because of the language, and half of the abstractions are there just to lessen the chance of some stupid human mistake.
Sorry, but I will not pick it ever again for even a side project.
I got a somewhat direct comparison when I mypy-ified a small program where I used a lot of async and await (basically implementing my own event loops and schedulers; it was interfacing with very custom hardware that handled very different but interacting streams at once).
So basically I did it because I was tired of not noticing mixing up "foo()" and "await foo()", but then the static typing continued to catch a myriad of other, unrelated problems that would have ruined my day way late (and often obscurely) at runtime.
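A small sketch of that mix-up (illustrative code): forgetting `await` leaves a coroutine object where an `int` was expected, and with annotations mypy flags it immediately instead of it surfacing obscurely at runtime.

```python
import asyncio

async def read_sensor() -> int:
    await asyncio.sleep(0.01)   # stand-in for talking to the hardware
    return 42

async def main() -> None:
    value: int = read_sensor()          # mypy error: a Coroutine is not an int
    correct: int = await read_sensor()  # OK
    print(value, correct)
```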
For small ("scripty") to moderate sized things, mypy absolutely recovered my faith in it.
I also completely switched to python 3 after being in python 2 unicode hell just once. There are very good reasons for python 3.
I agree that typing saves time, but I am struggling to produce evidence for that.
Just think about this not uncommon scenario: "The program crashed two hours into testing because apparently, we somewhere set this element in this deep structure to an integer instead of a list, and are now trying to iterate over that integer. But we cannot easily fix it, because we don't know yet where and why we set it to an integer."
The compiler would have immediately given you an error instead.
So, collect all the errors that are, e.g.:
- Addressing the wrong element in a tuple,
- any problems arising from changing the type of something; not just the fundamental highest level type ("list" or "integer"), but small changes in its deeper structure as well, e.g. changing an integer deep in a complex structure to be another tuple instead,
- "missed spots" when changing what parameters a function accepts; this overlaps with the former point if it still accepts the same arguments on the surface, but their types change (in obvious, or subtle "deep" ways),
- any problems arising from nesting of promises and non-promise values,
and many, many other problems where you can trivially conclude that the compiler would have spit out an error immediately, and explain to management how various multi-hour debugging session could have been resolved before even running your thing.
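A sketch of the scenario above in annotated Python (illustrative names): the accidental int-for-list assignment is flagged by mypy at check time rather than two hours into a test run.

```python
from typing import Dict, List

Schedule = Dict[str, List[int]]

def add_slot(schedule: Schedule, day: str, slot: int) -> None:
    schedule.setdefault(day, []).append(slot)

schedule: Schedule = {"monday": [9, 10]}
add_slot(schedule, "monday", 11)

# The mistake from the scenario above: an int where a list belongs.
schedule["tuesday"] = 3   # mypy: Incompatible types in assignment (int vs List[int])
```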
mypy and types are only "hoops" in comparison to non-annotated Python (in terms of added syntax / coding effort). Compared to explicitly typed languages where you have to declare types, it's just the standard thing you have to do, so there is no extra effort relative to those other languages (and then the judgment that it only works "half as well" is controversial (Edit: or at least needs qualification)).
And what I mean by hoops is that Python is not designed to have static types. Thus, any type system added is tacked on by definition and will lead to "hoops". In something like F#, at no point will the existence of its type system and inference be a surprise. The language is built around and with the type system.
A proper numeric tower. Python has complex numbers, but they don't seem well integrated (why is math.cos(1+2j) a TypeError?). Fractions are frequently very useful, too, and Python has them, in a library, but "import fractions; fractions.Fraction(1,2)" is so much more verbose than "1/2" that nobody seems to ever use them.
Conditions! Lisp's debugger capabilities are amazing. And JWZ was right: sometimes you want to signal without throwing. Once you've used conditions, you'll have trouble going back to exceptions. They feel restrictive.
(I've come to accept that in a language with immutable data types, like Clojure, exceptions make sense. Exceptions feel out of place, though, in a language with mutability.)
Other big wins: keywords, multiple return values, FORMAT (printf on steroids), compile-time evaluation, a native compiler (and disassembler) with optional type declarations.
Lisp is unique among the languages I've used in that it has lots of features that seem designed to make writing large programs easier, and the features are all (for the most part) incredibly cohesive.
If you have multiple dispatch, then building/supporting a numeric tower is natural.
In your other reply you pointed out macros. They are a mixed blessing, easily misused. Other languages have them but use them more sparingly and make it harder to overlook their special status, which leads to better "code smell" in my opinion.
Do take a look at Julia. It has learned deeply from CL and innovated further.
Macros are easily misused, true, but so is any language feature. I can go on r/programminghorror and see misuses of if-statements. It's the classic argument against macros, and I hear it a lot, but I can't say I've seen it happen.
25 years ago, conventional wisdom said that closures were too complex for the average programmer, and today they're a necessary part of virtually every language. Could we be reaching the point where syntactic abstraction is simply a concept that every programmer needs to be able to understand?
I think "macros are easy to misuse" comes from viewing macros as an island. In some languages (like C), they are: they don't really integrate with any other part of the language. In Lisp, they're a relatively small feature that builds on the overall design of the language. Omitting macros would be like a periodic table with one box missing. It'd mean we get language expressions as native data types, and can manipulate them, and control the time of evaluation, but we just can't manipulate them at compile time.
Also, macros, which are what (the view layer of) CLOS is built with. In most other object oriented languages, the language isn’t powerful enough to implement its own object system.
Having the full power of function calls in all cases (named or anonymous) is incredibly helpful. I know the Python party line is “just use def” but adding a line here and a line there adds up fast. A syntax for function objects is also great.
I’ll also call out the ability to use dynamic or lexical scoping on a per-variable basis. That has saved me hundreds of lines of work, and made an O(1) change actually an O(1) patch.
That being said Julia's type system is definitely the way of the future in my opinion.
See here: https://wiki.python.org/moin/Why%20is%20Python%20a%20dynamic...
"objects can change their shape anytime" is a function of dynamic typing and is orthogonal to strong or weak typing.
Everything can be used as a bool. This was often used to check for None, but had some issues when used with, for example, datetimes, which evaluated midnight as false. In part this was due to the fact that the integer 0 evaluates to false.
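A small illustration of that trap (illustrative code): a truthiness check treats perfectly valid falsy values as missing, which is why the explicit `is None` check matters.

```python
def set_timeout(timeout=None):
    if not timeout:            # bug: an explicit 0 (or a falsy midnight) falls through
        timeout = 30
    return timeout

def set_timeout_fixed(timeout=None):
    if timeout is None:        # only a genuinely missing value gets the default
        timeout = 30
    return timeout

assert set_timeout(0) == 30          # surprising
assert set_timeout_fixed(0) == 0     # intended
```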
I'll grant you python is stronger than some other dynamic languages, but it is still at least half an order of magnitude weaker than Julia, which is strong in ways approaching Haskell.
Unfortunately Typescript's type system is largely driven by the need to represent anything you can do in JS, for library compatibility reasons.
The main designer also wrote C# and Delphi. In a lot of ways, C#'s type system is better (less complex), but Typescript has the huge advantage of working with existing JS
Fortunately(?), I never worked for a company that used it for a large codebase so I never found out if my assumption was correct or not.
In a non-trivial system with solid engineering, QA concerns quickly go beyond details like which programming language is used. Like how to QA entire systems, sub-system interactions, ensure low time from bug discovered in production to fix deployed, eliminating recurring sources of issues etc.
But if the culture that caused the choice of a dynamic language is "oh its just so much easier and productive to not have to write types all the time!", then you are going to be in for some serious mess and pain in a larger system. That is not a technical problem with dynamic typing/language though :)
My initial thought when working with Python wasn't from a bugs/QA point of view but merely looking at the productivity of a developer working on a large code-base. Things like accurate auto-complete, code discovery, architectural understanding, knowing what 'kind' of object a function returns and so on become more important once the codebase and the amount of engineers working on it increases.
Source: Have been coding in Java since 1.0.
The user friendliness comes both from how uncomplicated it is to write them and from how easy it is to process them in parallel (a nightmare in Python).
Also if you're fixing broken Python code, you're using Python, so no that doesn't really track.
What fundamental changes do you see since 2.2? If you're talking about the object model; objects in python were garbage before and after 2.2, and as a paradigm, it's mostly useless bureaucracy. Bleeding edge 90s ideas.
I'm guessing, based on your incredulity, that I'm about 15 or 20 years older than you.
Python isn't new to me: it's old, and it's crap, and its "evolution" is towards a dead end. Stuff like nodejs will eventually supplant it if it hasn't already, and with good reason (not that I am a huge fan). Python was a novel design and a great choice ... back in the 90s. I mean, use it if you like it; use Forth or Lua or whatever you like. I think it's terrible and should be abandoned wherever possible.
Python should be new to you, or at least newer than when you first encountered it. The fact that it's not means you have no clue what changed, so you have no clue if it's any better.
The only thing that needs to be abandoned is this fake idea that older people carry knowledge that can't be expressed except in the form of trust. If you've got reasons, let's hear 'em, but "I'm old" isn't a reason to do anything.
Mind boggling really.
I understand if you inherited or are maintaining a legacy application you may not have had a choice about 2.7. But if it's a significantly large project and you are not making any attempt to use Python 3 (and the many improvements that come with it, including optional typing as one commenter mentioned), don't blame the language, unless you have a solution that is magically going to fix all the problems of a language that is pretty much in "maintenance mode, please use the new version" mode.
When CPUs ran at 1 MHz, I wasn't so sure about FORTH with its RPN. But it ran a lot faster than interpreted BASIC, and was a lot faster to write than assembler. Once the world discovers the source of all its woes and goes back to wide-open, 8-bit systems, I'll go back to FORTH.
I hear this a lot and I think it's a misunderstood statement. Python does not care if you do not have a space in assignments or arithmetic or between commas or parentheses.
What Python does care about is the indentation of the source code. The indentation is what guides the structure - which is already what we are doing with most languages that don't care about indentation!
What I really mean to say is there are plenty of valid complaints with Python, but white space just is not one of them. If you are writing good code in a language with C-syntax you are doing just as much indentation.
No that's not what the other languages are doing. They have explicit structure defined in the code (with Lisps being at the extreme end), which allows the development environment to automatically present the code in a way that's easy to read. This frees the developer from the job of manually formatting their code like some sort of caveman.
As someone with 8+ years of experience programming in Python for a job, I've seen countless bugs spawned by incorrectly indented code, which is incredibly difficult to spot. I've seen people far more experienced than me make these bugs because they didn't notice something being misindented. To me, the fact that we have to deal with this is laughable. Especially considering it's so easy to fix Python the language so that indentation becomes unambiguous: add an "end" statement for each deindent (aka closing brace).
Is this a claim that people do not indent in other languages and leave it to the dev environment? Yes, those languages don't rely on indentation but people do still manually indent or rely on their environment to indent it for them to make the code remotely readable. I for one cannot read Java without it also being indented correctly.
> This frees the developer from the job of manually formatting their code like some sort of caveman.
I honestly don't understand this part. What tools are you using that don't do indentation for you? Emacs and vim extensions, vscode, Pycharm, atom...all of them have very intuitive indentation for when you type. The most you have to do is hit backspace after finishing a block.
> Especially considering it's so easy to fix Python the language so that indentation becomes unambiguous: add an "end" statement for each deindent (aka closing brace).
As someone with a lot of Ruby experience, the "end" is absolutely not more clear than indentation. There's a reason environments highlight do-end pairs together: because it's hard to know which ones match which.
Nobody manually indents their code. Almost any language other than Python is unambiguous to indent, so the computer does it for you.
>I for one cannot read Java without it also being indented correctly.
That's easy, just copy paste Java code into an editor and press a button to indent everything. Voila! Good luck if you're dealing with Python code which got misindented somehow (e.g. copying from some social network website which uses markup that doesn't preserve whitespace, which is most of them).
>What tools are you using that don't do indentation for you?
Python code cannot be re-indented unambiguously. So if you copy paste a chunk of Python code from one place to another, you can't just press a button to reindent everything. You have to painstakingly move it to the right level and hope that it still works. In Common Lisp I just press Ctrl-Alt-\ and everything becomes indented correctly.
>The most you have to do is hit backspace after finishing a block.
That works if you only write new code and never have to change existing code.
>There's a reason environments highlight do-end pairs together: because it's hard to know which ones match which.
No, the open/close brackets allow the IDE to highlight them so that the programmer can clearly see the scope of the code block. This is a useful feature of the language. In Python it's almost impossible to see which deindent matches what if the function is long enough/deep enough.
This position is hard to maintain after you've spent an hour trying to debug a nonsensical error just to realize you opened the python file in an editor that used a different tab/space setting than the file was created in. Significant whitespace is one of the biggest misfeatures in programming history.
I've mixed spaces and tabs before on a handful of occasions, and it's always told me straight away what the problem is.
Here is an example of mixed spaces and tabs for indentation (a minimal illustrative snippet; the first indented line uses spaces, the second a tab):
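```python
def greet():
    name = "world"   # this line is indented with four spaces
	print(name)      # this line is indented with a tab character
```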
You'll see that it raises an error:
TabError: inconsistent use of tabs and spaces in indentation
IndentationError: unindent does not match any outer indentation level
When something tells you "indentation error", where's the first place you would look, if not the indentation? When you know you have a problem with the indentation, what do you think you have to change, if not the whitespace? There isn't a lot of opportunity to go in the wrong direction here.
Don't get me wrong, Python definitely has unexpected, difficult to debug behaviour for newbies (e.g. mutable default arguments). But this in particular isn't one of them. This is 2+2=4 level stuff.
>what do you think you have to change, if not the whitespace?
Note that the same error shows up when you have an inconsistent number of spaces to indent a block.
> Note that the same error shows up when you have an inconsistent number of spaces to indent a block.
And what would the solution to that be, if not changing the whitespace?
Every single piece of information available is strongly pointing in the same direction here.
And what about one-liners, a la Perl? Stack Overflow answers seem to imply either embedding newlines in your string, or using semi-colons (as a Python newbie... you can use semi-colons?!)
On the other hand, if your function needs a flow structure that's more complicated than a single line, should you really be inlining it?
And I think any such use case that you really want to inline can probably be accomplished with list comprehensions
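For instance, something that would tempt you into a multi-line lambda elsewhere is usually just a comprehension (illustrative):

```python
orders = [{"total": 30}, {"total": 5}, {"total": 12}]
# Filter and transform inline, no multi-line anonymous function needed.
large_with_tax = [round(o["total"] * 1.2, 2) for o in orders if o["total"] > 10]
assert large_with_tax == [36.0, 14.4]
```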
I did the same thing. In fact, this is why I came to comment on this thread: my initial reaction was to turn my nose up at the significant indentation. Then I gave it a try, and I got over it in about five minutes. It just wasn't the big deal I thought it was. I've heard a lot of other people say the same thing.
The only time this should ever be an issue is when you're copy/pasting code from a web page or other source that doesn't preserve white space when you copy.
Otherwise, I'll never understand why this is so hard for people. No matter what language you're using, you should be properly indenting code blocks 100% of the time, and if you're doing that, Python's white-space-as-syntax will never be a problem.
Now I see OO as something I have to deal with, like a tiger in my living room. Thankfully, so many new languages have come out recently; Go, Rust, and Elixir being the ones that I use regularly, that have called out OO for what it is and have gone in more compelling directions.
Hopefully one day they will teach OO alongside other schools of thought, as a relatively small faction of programming paradigms.
Now I can compare Erlang to Java, and it is really baffling how the heck Java took over the world. To do Erlang I just need an editor with some plugins, an ssh connection to a Linux box with OTP installed, and of course rebar3. To do Java I need 4GB of RAM just to run an IDE with a gazillion plugins, maven to cater for thousands of dependencies for the simplest app, and I need to know Spring, Hibernate, AOP, MVC and quite a chunk of the other 26^3 three-letter abbreviations. No thanks.
> Build encapsulated components that manage their own state, then compose them to make complex UIs.
Components managing their own state is a textbook definition of OOP. They even use inheritance in their example on the main page:
> class HelloMessage extends React.Component
For the React that I write, class components are used only when there's some trivial local state that I don't want to put into Redux (e.g. button hovering that can't be done in CSS), or when I need to use component lifecycle methods.
And yes, class components do inherit from React.Component, but they specifically discourage creating your own component base classes.
I don't do web development but I've read react API docs and user guides.
Objects calling other objects is optional for OOP; I never saw a definition that requires them to. OOP is about code and data organization.
Objects and methods are everywhere in react. Some are very complex.
Just because it uses a few lambdas doesn't mean it's not OOP.
For reference, here's how a non-OOP GUI library may look: http://behindthepixels.io/IMGUI/ As you can see, not only is it hard to use, it doesn't scale.
Like it or not, OOP is the only way to deal with complex state invented so far. Even in functional languages: https://medium.com/@gaperton/let-me-start-from-the-less-obvi... And modern rich GUIs have very complex state.
Smalltalk, which is the prototypical OO language, does the exact opposite of everything you said (all computations happen by message passing and all members are public).
No it doesn’t.
> all computations happen by message passing
I did not say message passing is required to be not present, I said it’s optional.
> and all members are public
I did not say anything about encapsulation. I said OO is about organization of code and data. If you have classes with properties and methods, it’s OOP.
Have a look at functional reactive programming.
Re "The model is a data structure and the view is a series of functions on it."
This is exactly how I see my OO-based GUI programs.
The object I define is firstly a data structure. Then, if I want some operations/views on it, I define methods on it.
I used to think Node.js was the greatest thing ever. I won't bother explaining the benefits but suffice it to say I much prefer writing a server in Java compared to node.
I think it takes getting burned at least once for new developers to understand why a lot of seasoned developers like types.
Over time I've realized that there's a simple principle that applies to a lot of stuff in software and engineering in general:
"The bigger something is, the more structure it needs"
Writing a quick script or small application? Sure use Python, use Node, it doesn't matter, but as size increases, structure needs to increase to stop complexity from exploding.
It doesn't just apply to typing either. The bigger a project is, the more you'll want frameworks, abstractions, tests, etc.
If you look around this principle applies to a lot of things in life too, for example the bigger a company is, the more structure is added (stuff like HR, job roles, etc...).
As a corollary, the inverse of the principle is:
"Don't add too much structure to a small thing"
I used to think: come on, you better-than-thou hipsters... this shit looks ridiculous, and it isn't intuitive... There's no way it's worth it to learn. It's just the new fad.
Oh, how I was so so wrong.
Ruby is kind of nice in that there's not an easy way to iterate over a list without functional code. You start mapping and reducing pretty regularly, and then discover the power of higher-order array functions, and how it lends itself nicely to functional programming.
React/Redux is nice in that it pretty much forces you to wrap your head around the way functional programming works.
React/Redux was definitely a step up from Spaghetti-jQuery for me, but I'd stop short of calling it an enjoyable experience. It wasn't until I started playing around with Elixir that I really fell in love with functional code.
In a lot of ways, Elixir is really similar to Ruby, which makes it pretty easy to dive in (for a Ruby-ist). But in subtle ways, its functional nature really shines. The |> is perhaps my favorite programming concept I've come across. It's so simple, but it forces you to think about -- and at the same time makes it natural to comprehend -- the way data flows through your application.
Don't get me wrong, Elixir is still very much a functional language. Its allure, that it looks like an imperative language and has a lot of similarity to Ruby, is misleading.
The learning curve might not be as steep as, say, Lisp's, but it's still quite steep. And I think it'd take around the same time to be meaningfully proficient in either.
Glad I got my Elixir/BEAM rave out for the day.
But what really blew me away were doctests. Basically, I ended up writing my unit tests where my code was. That was my documentation for the code, so there was no need to maintain unit tests and documentation separately.
: https://elixir-lang.org & https://hexdocs.pm/elixir/Kernel.html
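Python has the same idea in its standard doctest module; a minimal sketch (not Elixir, but the workflow described above is identical): the example in the docstring is both the documentation and the test.

```python
import re

def slugify(title: str) -> str:
    """Turn a title into a URL slug.

    >>> slugify("Hello, World!")
    'hello-world'
    """
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

if __name__ == "__main__":
    import doctest
    doctest.testmod()   # runs the examples embedded in the docstrings
```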
I wrote a rinky-dink Web application while teaching myself Ruby and Ruby on Rails. It turned out most of the pages were constructed from a database query returning a batch of results which then needed filtering, sorting and various kinds of massaging. I ended up really enjoying doing this work by applying a series of higher order functions like for() and map() to the data set. This got me started on thinking functionally.
Years later, I decided I wanted to do more with Lisp while continuing to work with the Java ecosystem. Clojure is a Lisp that runs on the JVM and is rather solidly functional. It's possible but very awkward to do mutation. If you don't want to run with shackles, you need to embrace immutability and FP. I found myself fighting FP until eventually I saw the light and was able to productively embrace it.
Sure thing, the data buffers are objects being re-used with mutation rather than being built from scratch per new message. Immutable objects, or a kind of "FP light" would have made this particular problem impossible.
In languages like JS or Ruby though, you might need to compromise. Generally I start with the immutable approach and refactor if performance becomes an issue.
I found recursion pretty unintuitive and didn't find the way it was taught in that course worked for me. At the time it mostly seemed like the approach was to point at the recursive implementation and say "look how intuitive it is!" while I completely failed to get it.
Many years later, after extensive experience with C++ in the games industry, I discovered F#. By then, with an appreciation of some of the problems caused by mutable state, particularly for parallel and concurrent code, I was better prepared to appreciate the advantages of an ML-style language. Years of dealing with large, complex and often verbose production code also made me really appreciate the terseness and lack of ceremony of F#, and experience with both statically typed C++ code and dynamically typed Python made me appreciate F#'s combination of type safety with type inference, giving the benefits of static typing without the verbosity (C++ has since got better about this).
I still struggle to think recursively and my brain naturally goes to non recursive solutions first but I can appreciate the elegance of recursive solutions now.
I've written primarily in a functional language (OCaml) for a long time now, and it's very rare I write a recursive function. Definitely less than once a month.
In most domains, almost every high-level programming task involves operating on a collection. In the same way that you generally don't do that with a while loop in an imperative language, you generally don't do it with recursion in a functional one, because it's an overly-powerful and overly-clunky tool for the problem.
For me the real key to starting to be comfortable and productive working in a functional language was realizing that they do actually all have for-each loops all over the place: the "fold" function.
(Although actually it turns out you don't end up writing "fold"s all that often either, because most functional languages have lots of helper functions for common cases--map, filter, partition, etc. If you're solving a simpler problem, the simpler tool is more concise and easier to think about.)
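The same shape in Python terms (a loose sketch, not OCaml): `functools.reduce` is the general-purpose fold, and the more specific helpers cover most day-to-day cases.

```python
from functools import reduce

orders = [12, 5, 30]

total = reduce(lambda acc, x: acc + x, orders, 0)   # the general-purpose "fold"
large = list(filter(lambda x: x > 10, orders))      # the more specific helpers
doubled = list(map(lambda x: x * 2, orders))

assert (total, large, doubled) == (47, [12, 30], [24, 10, 60])
```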
The solution to this seems to be the object capability (key) security paradigm, where you combine authority and designation (say, the right to open a specific file, a path combined with an access right). There are only immutable globals. Sandboxing thus becomes only a matter of supplying the absolutely needed keys. This also enables the receiver to keep permissions apart like variables, thus preventing the https://en.wikipedia.org/wiki/Confused_deputy_problem (no ambient authority).
Even with languages that have security policies (Java, Tcl, ?), control is not fine grained, and other modes of interference are still possible: resource exhaustion for example. Most VMs/languages do not keep track of execution time, number of instructions or memory use. Those that do enable fascinating use cases: https://stackless.readthedocs.io/en/latest/library/stackless...
All of this seems to become extremely relevant, because sufficient security is a fundamental requirement for cooperation in a distributed world.
Functional pearl: http://www.cse.chalmers.se/~russo/publications_files/pearl-r...
Among the available customizations in a safe interpreter are: restriction of use of individual commands, ability to set a time limit for execution of code, a limit to number of commands executed, and depth of recursion.
Memory usage can't be directly limited, but Tcl has trace features that allow examination of a variable value when set, so one could write custom code that prevents total contents of variables from exceeding a specified limit.
I'm watching Zircon with great interest.
I think Hickey’s comparison to git is apt: we don’t stand for that in version control for our code, why should we find that acceptable for user data?
You should opt-in to immutability when the state calculations are very complex and very expensive to get wrong.
I do wish mainstream languages had better tools for safe, opt-in immutability. Something like a "Pure" attribute you assign to a function. It can only call other pure functions, and the compiler can verify that it has no state changes in its own code.
Now I feel I could race a team of programmers in those languages and be far in front.
The old list of official aha's is: Monads, applicative functors, lenses. But really it's about spending time learning them well enough to use normally.