The Linux 2.5, Ruby 1.9 and Python 3 release management anti-pattern (lucas-nussbaum.net)
157 points by BuuQu9hu on Dec 26, 2016 | 136 comments

Arguably one of the most successful companies in the world is Microsoft. The reasons for their success are many, but at least one is the focus on backwards compatibility. Hook users, and never give them a reason to leave.

As developers we know it is painful to maintain backwards compatibility, but it's incredibly valuable. It builds user confidence in your products, and most importantly it builds user trust that upgrading is safe and thus encourages adoption of newer versions of your software/libraries. The only mistake Microsoft made was not doing rolling releases. Major updates are slow and scary, so even though upgrading to a new version of Windows didn't mean loss of compatibility, it was never "easy" and always jarring.

Backwards compatibility, automated testing, rolling releases. That's the magic bullet to keeping everyone happy and on the latest updates. If your users have to think about version numbers, you're doing it wrong.

* Yes, there are instances of Microsoft breaking compatibility, more so post-Windows 7. The point is that they tend to focus on it, and they get it right 97% of the time, which is a technological feat.

See the Rust programming language for a great example of this recipe.

> Microsoft breaking compatibility, more-so post Windows 7

IIRC, there have been several instances of this between releases, less so from the user application perspective, but rather from the driver model.

Windows 3.1 to Windows 95 (16- to 32-bit), Windows ME to Windows 2000 (NT kernel), Windows XP to Vista (UAC etc.) each saw various issues with device drivers.

The extent of user application backwards-compatibility is impressive, with only Windows 10 removing 16-bit application support (boo). For example, even now it's possible to double-click on the application icon at the left-hand of the window title bar to close the application - à la Win 3.1.

> The only mistake Microsoft made was not doing rolling releases

I think they're 'fixing' that with Windows 10.

Windows 95 still booted through DOS, pretty much any application that could run on 3.1/3.11 could run on 95, and you could run Windows 3.11 from within 95 if you wanted to.

Windows ME did not evolve into Windows 2000; Windows ME was a consumer/home operating system based on the same branch as Windows 98.

Windows 2000 was a continuation of Windows NT(4) for the enterprise market.

Windows XP was the first OS that unified consumer and enterprise versions on the same branch and moved the consumer version to the NT kernel.

Windows 10 didn't drop support for 16-bit applications; it is still supported on the 32-bit version, as others have said. Windows XP 64-bit couldn't run 16-bit applications either.

This has to do with how 64-bit CPUs work and the lack of 16-bit real mode (and, IIRC, even 16-bit protected mode) when running in 64-bit mode.

As for general compatibility, you can install Windows 3.11 on a machine and upgrade it all the way to Windows 10 and keep your data, as long as you go through 95/98 and Windows XP in the middle.

There are videos on YouTube where people install 3.1, 3.11, 95, 98, 98SE, ME, XP, Vista, 7 and 10 and keep their data and some settings (computer name, user, My Documents, and even quite a few Windows settings ;)) and that is actually quite impressive.

IMHO, the single biggest break of backward compatibility was between Windows XP and Vista 64.

What exactly did break? I can run 20 year old software on Windows 10 without resorting to emulation.

> The extent of user application backwards-compatibility is impressive, with only Windows 10 removing 16-bit application support (boo)

It's still supported on the 32-bit versions of Win10! It has never been supported on a 64-bit version of Windows, though.

That raises several questions for my next team meeting...

Well, it does. I was told one thing and reality is different. I'm supposed to trust other members of my team rather than fact-check their input.

> Windows 10 removing 16-bit application support (boo).

CPUs in 64-bit mode don't support running code in 16-bit mode -- Windows 10 (actually 7) didn't remove support, it never existed to begin with. This is also why 32-bit versions of Windows can and do still run 16-bit applications.

Not exactly true.

CPUs in long mode (aka 64-bit) do not have the virtual 8086 mode, so they cannot natively execute real-mode code. However, the code segment selector can be configured in such a way as to allow 16-bit protected mode code to be executed, the same way 32-bit code can be executed.

There's also the option to transition to 32-bit protected mode and then go to virtual 8086 mode from there. Or to simply emulate the entire CPU. It was probably just a cost-benefit trade-off in the end.


Wasting silicon to emulate some 30+ year old architecture makes no sense.

Intel/AMD could think about doing a "pure" x64/x32 processor that starts in 32-bit mode and leaves everything else to an emulation layer (this could be present in the UEFI only for those customers that are interested in it).

So why is 16-bit execution supported on Windows 10 32-bit but not on Windows 10 64-bit, or even Windows XP 64-bit?

It seems to be this:


"The primary reason is that handles have 32 significant bits on 64-bit Windows. Therefore, handles cannot be truncated and passed to 16-bit applications without loss of data."

I would argue the most relevant example here would be Visual Fred (aka, Visual Basic .NET), the saga of Microsoft throwing people who had relied on Visual Basic 6 under the bus; for more on this topic as it relates to Python 3 and Ruby 1.9, I wrote a comment a month ago that contrasts.


> the saga of Microsoft throwing people who had relied on Visual Basic 6 under the bus

Even though the IDE isn't supported, Microsoft still supports VB6 applications, and will throughout the lifetime of Windows 10 [0]. As an ex VB6 developer, I find this crazy. Just goes to show how much legacy VB6 must still be out there.

VB.NET got me to move to .NET, and then to C#. And I'm a better dev because of it. So you could say it fulfilled its role at least once. And let's not kid ourselves here, VB6 was always limited. E.g. subclassing was almost mandatory for advanced functionality, but would cause the debugger and the program itself to crash when it hit a breakpoint. 64-bit compatibility was never going to happen; my understanding is the IDE includes some 16-bit programs, which is why it's no longer supported. At some point, you've got to cut your losses. If anything, IMO, Microsoft's continued support of VB6 is sending the wrong message to managers and developers.

[0] https://msdn.microsoft.com/en-us/vstudio/ms788708.aspx

The real reason why Microsoft had to drop VB6 is because its object model was fundamentally incompatible with .NET. VB6 was a very thin wrapper around COM, after all. That C# happens to be technically superior to VB6 in most (but not all) respects just provided Microsoft with a convenient excuse for deflecting criticism.

That being said, I don't mourn the loss of VB6. It was a truly horrible language, and the fact a whole generation of programmers preferred it over superior contemporary alternatives (e.g. Delphi) doesn't speak well of them. The only good thing about VB6 was its form editor, and that isn't even a language feature.

It wasn't even just COM, it was specifically OLE Automation.

VB6 Variants were OLE VARIANTs. VB6 strings were BSTRs. VB6 arrays were SAFEARRAYs. Date, Decimal, Currency - you name it, it's all OLE.

And this went both ways, too - e.g. the reason why BSTR is called BSTR is because, well, it's a BASIC STRING... And if you want to add two OLE variants in a C++ Win32 app, with exact same semantics as in VB6, you can (to this day!) do this via VarAdd: https://msdn.microsoft.com/en-us/library/windows/desktop/ms2.... Ditto VarSub, VarCat etc. You can see that this family of functions is very VB-centric actually, because it has VarAdd and VarCat separately (corresponding to + and & in VB); it conflates binary and boolean operators just as VB did; and it has VarEqv and VarImp, corresponding to VB operators EQV and IMP. And yet, it lives in oleaut32.dll...

> 64-bit compatibility was never going to happen,

But it did, more or less. VBA and VB6 are essentially the same thing, and VBA 7 (still used in MS Office) added 64-bit support:


Given the macro engine in Excel is VBA (basically VB6) I suspect VB6 support will be around for a VERY long time.

I think another example was from joelonsoftware: when MS moved to Win 95, they actually ran popular software looking for bugs, and found that SimCity would use memory after freeing it (which, before 95, wouldn't crash), so they modified their OS to leave that memory intact even after free.

They do this kind of stuff all the time[0], and it provides enormous value to users.

[0] https://blogs.msdn.microsoft.com/oldnewthing/
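The SimCity workaround described above amounts to a compatibility shim in the allocator. A toy sketch of the idea (all class and method names here are illustrative, not Windows APIs): when a known-buggy client is detected, freed blocks are quarantined instead of recycled, so a use-after-free read still returns the old contents.

```python
# Sketch of an allocator-level compatibility shim, in the spirit of
# the Windows 95 SimCity story. Everything here is illustrative.
class CompatHeap:
    def __init__(self, buggy_client: bool):
        self.buggy_client = buggy_client
        self.blocks = {}       # live allocations: handle -> data
        self.quarantine = {}   # freed-but-preserved blocks
        self.next_id = 0

    def alloc(self, data):
        self.next_id += 1
        self.blocks[self.next_id] = data
        return self.next_id

    def free(self, handle):
        data = self.blocks.pop(handle)
        if self.buggy_client:
            # SimCity-style workaround: keep the contents readable
            self.quarantine[handle] = data
        # a strict heap would recycle the block here

    def read(self, handle):
        if handle in self.blocks:
            return self.blocks[handle]
        return self.quarantine.get(handle)  # use-after-free "works"

heap = CompatHeap(buggy_client=True)
h = heap.alloc("city data")
heap.free(h)
assert heap.read(h) == "city data"  # the buggy read still succeeds
```

With `buggy_client=False` the same read returns `None`, which is the crash-or-garbage case the shim papers over.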

They do it to this day. The Application Compatibility Assistant, or whatever it's called now, runs on every single machine and hot-patches loaded programs. Malware authors love it.


And now we're stuck with decades-old software and API stacks while Metro/Modern UI/UWP has struggled to take off for about 5 years.

The day MS deprecates WinAPI, their control of the desktop is over, as every company that develops huge codebases will switch to an environment where their programs don't have to be rewritten at MS's discretion.

Is there any platform an enterprise could switch to that would guarantee a slower rate-of-change than Windows? "One major rewrite-requiring breaking change in 20 years" sounds like—if not a good deal—the best one I could foresee getting anywhere.

IBM (Oracle, too, I would guess).

http://www.longpelaexpertise.com.au/ezine/IBMBackwardCompati... has a very good observation, though:

"The reality is that backward compatibility is only possible if there are enough users paying […] for it."

It is the same for Microsoft. If, for example, several big customers wanted 16-bit support on x64, I'm sure they would provide it.

Once you rewrite once, the red line is broken. I think joelonsoftware has an article about that.

Sure, they will rather change to a platform in the hands of Oracle, Apple, or Google.

What platform is that?

The issue is that this is only of concern to application developers who don't want to use old APIs. Systems owners, managers, and users won't care if it gets the job done predictably and cost-effectively. And if the vendor is willing to keep that stack around and support it...

> stuck with decades old software

"reap the rewards of mature software"

There was a great talk recently by Rich Hickey[0] about how change itself is breakage. Breaking early and often is catastrophically worse than python 2.7 in my opinion.

[0]: https://www.youtube.com/watch?v=oyLBGkS5ICk

Agreed. Hickey, in that talk, claimed if you really did break backward compatibility, you should just name it something else. So in real terms Python3 should have been renamed to Pythonidae or something like that. Eventually the last folks working in the 2.7.* branch would notice that nobody else was in the room and hopefully figure out where they went.

I disagree. Python 2->Python 3 was not really a huge change. It seems that the Python 2 userbase is finally starting to dissolve but IMO the holdouts were mostly doing it for philosophical purposes. Renaming the project for Python 3 would've resulted in a massive decrease in useful searchability, history, etc, when really only a small handful of backward incompatible changes were at stake.

It seems the better lesson is a) keep compatibility for a LONG time; there are finally some old Windows programs that Windows 10 no longer runs; and b) when you do break compatibility, don't make a big deal out of it.

Python 3 probably would've been better as a feature-heavy incremental release that didn't break compatibility. 3.2 probably should've introduced a change that discouraged but still supported the old code style (maybe print a warning any time the file didn't contain a shebang-style header like #!allow_old_style, kind of a simpler reverse of the __future__ module), and then maybe 3.8 could finally remove compatibility after the old method had been deprecated, difficult, and annoying for 10 years or so. Making a hard delineation between compatible and incompatible versions, while appearing useful on the surface, seems to have a much stronger negative effect (encouraging users to stick with "their" version out of principle) than the planned positive effect of giving users plenty of time and notice about compatibility breakages.

This seems gradual enough that most users won't notice and there won't be a hubbub, nor an insurrection among users to shun your new iteration for making them do extra work.

Dealing with people effectively, including end users, is very difficult. It takes a lot of patience.
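The opt-in header idea proposed above could be sketched roughly like this (the `#!allow_old_style` marker and the function name are the commenter's hypothetical, not anything Python ever shipped):

```python
import warnings

def check_legacy_optin(source: str) -> bool:
    """Hypothetical sketch: old-style code keeps working only if the
    file opts in with a marker first line; otherwise a deprecation
    warning nudges the author toward the new style."""
    lines = source.splitlines()
    if lines and lines[0].strip() == "#!allow_old_style":
        return True  # opted in, stay quiet
    warnings.warn("old-style syntax is deprecated; add #!allow_old_style "
                  "to keep using it", DeprecationWarning, stacklevel=2)
    return False

assert check_legacy_optin("#!allow_old_style\nprint 'hi'\n") is True
```

The point of the sketch is that the cost of the old style becomes a nag rather than a hard break, which is what makes the transition gradual.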

If you look at long-running, successful projects, they don't do this.

One example is the original V8 dev team that went off to create Dart (AIUI). Nobody followed them and V8 got replacement developers. (But perhaps that was a particularly egregious case.)

I feel like the only responsible thing to do in such cases—if you truly believe that the thing you're abandoning is just a plain-old failure of engineering, and it's effectively self-flagellation to be subjected to maintaining it—is to poison the well/salt the earth when you leave, so that people are at least forced to switch to something else, if not your replacement project.

Badly-engineered buildings get condemned, and we make it illegal to use them, even if it might benefit someone to use the building in exchange for the (likely not-very-well-considered) risk of the roof falling on them.

What sort of political/social infrastructure would it take to allow someone to condemn a codebase?

Google has begun building "time bombs" into Chrome that will start inconveniencing users once the binaries reach a certain age without being updated (I believe this is specifically related to certificate validation, but I'm not going to look up details because I don't want to be put on a watchlist for googling "chrome time bomb" :P ).
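The age-check mechanism described amounts to comparing the build date against a threshold at runtime; a minimal sketch (the dates and threshold are made up, not Chrome's actual values):

```python
import datetime

BUILD_DATE = datetime.date(2016, 6, 1)   # hypothetical build timestamp
MAX_AGE_DAYS = 70                        # hypothetical staleness threshold

def is_stale(today: datetime.date) -> bool:
    """Sketch of an age-based 'time bomb': start inconveniencing the
    user once the binary is older than a threshold."""
    return (today - BUILD_DATE).days > MAX_AGE_DAYS

assert is_stale(datetime.date(2016, 12, 26))
assert not is_stale(datetime.date(2016, 6, 15))
```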

Is the watchlist for googling it worse than the watchlist for writing it?

Nah, I was just on my phone and needed an excuse for my laziness. :P

It's hard to collect money for a rolling upgrade. This was the pre-cloud era and you sold products. With Windows 10 that's changed and it looks like they found a way to monetize user data also.

Here's an example of Microsoft Office killing off backwards compatibility for some older document formats:


They've dropped support for 15-16 year old file formats with a workaround in place....

The old versions in question are Word 2 for Windows, Word 4 for Mac, and Excel 4....

The workaround isn't complicated, it's long, because they gave you 3 methods of getting the functionality back (the simplest is to download 1 to 3 self-installing hotfixes, depending on what files you still need to access; I'm pretty sure they just didn't want to pay royalties anymore for supporting some of the non-MSFT formats, nor did they really want to support nearly 20-year-old binary formats). That document was revised as late as 2011, and MSFT dropped support for Office 2003 in 2014, 11 years after release.

Yes, MSFT does break backward compatibility once in a while, but it's usually quite warranted; they offer supported workarounds or exemptions, and they tend to break it once in 1-2 decades rather than once every 2 years.

Sorry, but MSFT dropping support in 2008 for Word 2 or Excel 4 file formats that were probably designed in the late 80's isn't exactly a good example of a company breaking backwards compatibility.

If your organization circa 2008 was still dependent on two-decades-old formats, don't blame the vendor; you should have migrated that data. It's like complaining that Verbatim stopped making floppy disks. Oh noes, what would you do?!

Even assuming conversion works flawlessly, I don't think full migration is possible.

I bet archive.org has zillions of files that now aren't easily readable anymore. Do you think it is realistic to expect them to fetch every single copy from storage and convert it? I bet they cannot even find _all_ of them, given that documents may be embedded in mailbox files of various formats, in disk images, in usenet posts, in archives using various obsolete compression methods, embedded using OLE in other files, etc.

Even if they can find them, conversion in-place may not be possible. A file inside some archive file may grow in size, making the archive file larger than the archive file format allows.

For example, a few years after they declared conversion done, somebody could find an old CD and upload an old brochure or logo. A converted version may already be available, but how would that somebody know that that file looks the same? All they have are the files, and they are different.

Except for scale, archive.org isn't exceptional here. Large corporations face the same issue. They likely will bite the bullet, though, keep around some way to convert files for a few years after they declare to be 'done', and accept that some data will be lost after that.

That's true if you do the conversion after 20 years, but again, what exactly are you advocating here? That we should support 20-year-old binary formats?

>Arguably one of the most successful companies in the world is Microsoft. The reasons for their success are many, but at least one is the focus on backwards compatibility.

Well, Apple is another very successful company, and follows the exact opposite model.

So the empirical data supports both approaches (if the goal is being a successful company).

Does this mean most OS X applications or drivers break when OS X is updated, and have to be recompiled, have their source code updated, or be rewritten?

No, it just means that Apple breaks backwards compatibility much more often (the PowerPC to Intel transition, 32 to 64 bits, dropping Carbon, changing APIs, etc.). Though what you describe also happens to some apps using certain APIs that need code changes to run on a newer OS X.

We weren't talking about MS satisfying that basic level of compatibility (that Apple usually satisfies as well), but much much increased levels (which Apple doesn't).

Backwards compatibility can tie you to things that you need to get rid of, though. At some point you have to decide to drop support for stuff and move on; otherwise something else that has moved on will come in and eat your lunch.

I feel like you could make this argument about Apple just as well. It's got a larger market cap, so even more arguably successful. And they break compatibility (judiciously) in order to create a better product.

This isn't really true. If Microsoft of 2016 has a focus on backwards compatibility it is only because of their myriad mistakes in the past. I know people have this false impression because of fluff pieces that have been written about Microsoft having a focus on backward compatibility but anyone who has lived through the last 30 years of computing history knows this isn't really true. They have broken backward compatibility numerous times.

Many, many DOS programs would not run under Windows 3.1, necessitating users to boot into or exit out into DOS to run those programs, and of course Windows programs could not run under DOS.

Windows 95 generally did better at running DOS programs but it was far from flawless. Many programs required all kinds of workarounds and even patches.

Windows NT was originally a complete departure from DOS and Windows 3.1, 95, etc. only later on did Microsoft realize that they needed a compatibility layer for NT to ever go mainstream (along with a rebranding.)

Windows Vista broke tons of stuff too. Including everybody's printer drivers. When Vista was released you could buy a Vista computer along with a printer which had no drivers for Vista and which HP would tell you there's no Vista drivers ever coming.

Windows 8 may not have broken execution of Windows 7 apps but the entire UI was a complete shift from what users were accustomed to. Whether you liked it or not, that kind of major change is a big mistake when so many of your customers are tech novices who are not going to have an easy or enjoyable time learning how to use their computer from scratch.

This is just scratching the surface. My point is let's tell the truth here, Microsoft is not some hero of backward compatibility who has never made a mistake and always kept their users at heart.

While it would be hard to call any kiloperson organization a hero, it is remarkable how people on the Windows team keep old software working even if it does not deserve to.

The most accessible documentation of this is Raymond Chen's blog. For example here's a piece written in 2007 about Windows 95 and DOS: https://blogs.msdn.microsoft.com/oldnewthing/20071224-00/?p=...

While the herculean task of billion-device compatibility has not always gone off without a hitch, the idea that the past was all gaffes and trial and error at Microsoft does not ring true if you read the above blog from one of its more prolific public writers.

Maintaining hardware compatibility with any PC hardware ever created is indeed nigh-impossible, so you'll note I didn't even mention that. I only talked about maintaining compatibility with applications written for their own operating system. Something that Microsoft does have pretty much universal control over.

The only mention I made about devices was regarding the printer driver issue with Vista, which had nothing to do with the devices and everything to do with Microsoft totally redoing their printer driver implementation on the OS level. It necessitated all drivers be redone and some manufacturers (like HP) took the not very admirable path of telling customers they were up a creek.

Windows 95 came with a way to configure shortcuts so that they would reboot back into DOS, just to run the program, and resume after you quit.

I call that going the extra mile.

I remember the shuffling you had to do to make enough extended memory or expanded memory (depending on the program) available. It was no small feat to be backwards compatible with programs interacting more or less directly with the BIOS and with custom routines for individual makes of add-in cards. Successfully shifting that whole ecosystem to a fully protected mode, preemptive environment over the course of 6 years or so (from 95's first release to XP) was probably the most impressive software migration we've ever seen.

Not really arguing with your points, but it seems Microsoft has a lot of lessons learnt from maintaining backwards compatibility.

It'd be wonderful if their architects collected and codified these somewhere, i.e. "we did X because of Y; it didn't work, because Z ends up happening".

Mostly as a lesson for the rest of us not to fruitlessly repeat.

It's probably all there in their issue trackers. That's partly why software teams track issues, because somewhere there's a decision to be made and a story to be told, and that story is written in the issue before it's closed.

Software development is really all about finishing stories.

In addition, I believe there is a large amount of Windows-native software that runs better under WINE than it does under Windows 10.

Edit: am I wrong? See https://en.wikipedia.org/wiki/Wine_(software)#Backward_compa...

I would be careful to trust wikipedia about this, especially if there are no citations.

The "Third Way" is to avoid both problems (diverging stable and unstable forks vs. a single line with frequent minor breakage) by making heroic efforts to maintain backwards compatibility even in the face of major changes. Someone already mentioned Windows as an example of this. I'd point to C++ as another example: the C++ committee's longstanding commitment to maintaining backward compatibility at all costs has always been one of C++'s greatest strengths, and is certainly one of the main reasons why it's still in widespread use despite its age and many flaws.

Imo C++ is terrible because of trying too hard to maintain backwards compatibility. Sure, keep ABI and linking compatibility all you want, but why should I be able to compile code written for C++98 in C++17 mode? No corporate policy would allow you to just flip the switch without thorough review and regression testing anyway. Removing features is also a feature. It would be better if you could compile each library in a different mode but still allow them to link to each other; this would let you gradually write new libraries without the old crap while keeping proven old code untouched.

The biggest mistake Python 3 made was the print statement change. There was really no compelling reason to do it; it broke basically every 2.x script ever written, for zero practical gain.

I'm not so sure. On one hand, yeah sure, print vs print() may not have been something users were clamoring for, or worth adding an additional layer of confusion/weirdness over. However, it's relatively straightforward to port such code or instruct someone "ah, you don't do print x, you do print(x)".

From what I've seen, the unicode/bytes string split is a much bigger factor. Ironically this change made my largest Python app easier in Python 3 (easier to embed unicode literals from various languages), so perhaps my opinion isn't so valid here...

Sure it's easy to fix, but it made a ton of scripts stop working despite being otherwise unaffected by the version change, all for the sake of a small syntax change. Unicode strings are a much bigger change and required some effort to fix, but it improved the language semantics while the print change rendered lots of scripts incompatible for no good reason. Just deprecating the old syntax and giving tons of warnings for a while before dropping it would have been much better.

I totally get you and I genuinely agree, it probably wasn't something so important to risk pissing people off over. As the sister comment to yours mentioned it could be seen as a bit of a "fuck you" to maintenance devs, but if we're honest with ourselves the reason that people haven't moved to Python 3.x after 7+ years isn't really due to print().

edit: sorry I hit "flag" instead of "parent" when trying to navigate back, hope that didn't affect anything (I've since unflagged)

Ultimately, I think Python 3 had exactly the wrong amount of breakage. They either should have broken way less (e.g., let 2.x code run largely unchanged if it didn't deal closely with unicode) or should have broken more, to give more of carrot... like if some of the new async stuff could have been available in 3.0, that would have given some people a reason to move right away.

I kind of feel like that's why print changed. It's going to be used in almost every Python script at some point in time, so it forces a breaking change without causing a seriously major break by itself. It's enough to make your script need to be updated, but not enough to make you re-write it from scratch (possibly in another language).

The unicode change was more impactful, but the print was more like a little fuck you to maintenance devs everywhere.

If they wanted print-as-function, fine, but they didn't have to eliminate the print statement at the same time.

Some warts you're better off not scratching at.

>> There was really no compelling reason to do it

> the print was more like a little fuck you to maintenance devs everywhere

> If they wanted print-as-function, fine, but they didn't have to eliminate the print statement at the same time.

Complete and utter hyperbole. You could read PEP-3105 [0] and easily see it isn't true. In addition to this, quoting from both the PEP and the docs [0][1]:

> "Luckily, as it is a statement in Python 2, print can be detected and replaced reliably and non-ambiguously by an automated tool, so there should be no major porting problems (provided someone writes the mentioned tool)."

> "When using the 2to3 source-to-source conversion tool, all print statements are automatically converted to print() function calls, so this is mostly a non-issue for larger projects."

Enough with the bullshit.

[0] https://www.python.org/dev/peps/pep-3105/ [1] https://docs.python.org/3/whatsnew/3.0.html

As much as I want Python 3 to replace Python 2, you can't recommend 2to3 while saying "Enough with the bullshit". 2to3 is abandonware that never worked.

Now you can use python-future instead, a third-party solution, but the 2to3 part of the PEP was a false promise that gave ammo to Python 2 fans.

> the 2to3 part of the PEP was a false promise that gave ammo to Python 2 fans

The PEP (3105) doesn't actually mention 2to3 anywhere, it just says that the fact that print is a statement (ie handled explicitly in the language syntax) means it is easy to write a tool to do the conversion automatically. Which is a perfectly valid point, even if 2to3 fails to do that for some reason.
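To make the "easy to write a tool" point concrete, here is a deliberately naive sketch of such a converter. Real tools like 2to3 work on the parse tree; this regex handles only the simplest `print expr` case and is purely illustrative:

```python
import re

def convert_print(line: str) -> str:
    """Naive sketch of automated print-statement conversion.
    Handles only the simple `print expr` form; a real tool
    (e.g. 2to3) operates on the parse tree instead."""
    m = re.match(r'^(\s*)print\s+(.+)$', line)
    if m and not line.lstrip().startswith("print("):
        return f"{m.group(1)}print({m.group(2)})"
    return line  # already a call, or not a print at all

assert convert_print('print "hello"') == 'print("hello")'
assert convert_print('print(x)') == 'print(x)'
```

The reason this is even tractable is exactly the PEP's point: because `print` was a statement, every occurrence is syntactically unambiguous.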

At the end of the day, if enough people believe print requiring brackets is an issue, it becomes an issue. I believe it, and a few other people in this thread believe it too.

There are no added benefits to having done that. They are just not listening. Now Python 2/3 are suffering despite being one of the best languages out there.

> There are no added benefits to having done that.

For god's sake, it is subtle, but not that hard to understand. There are added benefits. For one thing, consistency. Why is `sys.stdout.write()` a function but `print` isn't? What happens if you want to replace `print` with your own function, e.g. for logging? With a print function, you can just reassign that function; with a statement, you can't. What's the effect of this switch on the bytecode/parsing code? All the arguments against it seem to come down to "I don't like it".

> being one of the best languages out there.

Why is Python one of the best languages out there? Ease of use through consistency. Surely that's just an argument for `print()`.

Python 3 allowed the devs to fix some of the mistakes or sins of the past; similarly `input`/`raw_input`. In the short term it's maybe a bit of a hassle; in the long run, not at all. 2to3 handles this mostly, but you can use `from __future__ import print_function` to test it in Python 2. Nobody is refusing to switch to Python 3 because of `print` alone.
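The "just reassign it" argument is easy to demonstrate in Python 3, where print is an ordinary callable (the `log` replacement below is illustrative):

```python
import io

# Because print() is a function, its output can be redirected...
buf = io.StringIO()
print("hello", file=buf)
assert buf.getvalue() == "hello\n"

# ...and the whole function can be swapped out, e.g. for logging.
captured = []
def log(*args, **kwargs):
    captured.append(" ".join(map(str, args)))

emit = log            # any code calling emit(...) is none the wiser
emit("queued", 3)
assert captured == ["queued 3"]
```

Neither of these is possible with a statement, which is baked into the grammar rather than the namespace.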

No one is against making print a function; we are just disappointed about the mandatory bracket thing. That's exactly what's wrong: you are not understanding because you don't want to listen. I agree with all the arguments you are presenting, but why can't you see why having to add mandatory brackets can be annoying?

In addition, the hack is already there:

  Python 3.5.2 (default, Dec 2015, 13:05:11)
  [GCC 4.8.2] on linux
     print "hello"
  Traceback (most recent call last):
    File "python", line 1
      print "hello"
  SyntaxError: Missing parentheses in call to 'print'
Why not just call print? We can do it right now, and it requires no changes to the 3.x parser.

Finally, the print stuff was one of the main reasons I picked Ruby over Python for my startup 5 years ago. I may be alone, but I don't think so. And I think I made the right decision. Look at some of the latest Ruby vs Python benchmarks: Python is now lagging strongly behind when it was ahead 5 years ago with 2.7. The 3.x branch keeps making weird decisions.

> we are just disappointed against the mandatory bracket thing. It's exactly what's wrong, you are not understanding because you don't want to listen.

It's not hard. This isn't some arbitrary bracket hate or love; it's how the language syntax works at its core: statements (e.g. if, else, in) never have brackets, function invocations always do. Simple!

So what am I not understanding? That you are proposing some other syntax, e.g. function invocation without brackets? That you don't like breaking changes? That you'd keep everything as it was for sake of habit? Python 3 breaks Python 2 in many ways, but okay. Any other coherent arguments why `print` should remain a statement?

> Finally, the print stuff was one of the main reasons I picked Ruby over Python for my startup 5 years ago

There are enough things to not like about Python 3. But do not get me started on Ruby. A language that constantly breaks stuff between minor version numbers (1.8 -> 1.9, 1.9 -> 2.0, etc)? Way to undermine your own argument.

Edit: Appending this infamous article about a Ruby 1.8.7 to 1.9.3 migration (http://www.darkridge.com/~jpr5/2012/10/03/ruby-1.8.7-1.9.3-m...), lest I be accused of Ruby bashing. Of course, the original article we're discussing mentions Ruby in the same breath as Python 3. Oh, the irony of trying to put down Python 3 by citing Ruby.

Ruby allowed you to write code that worked in every version from 1.8.6 to 2.4, a huge benefit when writing libraries during the transition phase.

Things like the parentheses in print was why the 2.7 branch was created -- allow the old to exist and make available some of the new features. Probably the biggest mistake there was not to make deprecation warnings on by default, to accentuate that it was a transitionary release. Then again, it's the same problem PHP has, with not wanting to warn that the old way of doing things is going away for various reasons.

In python, functions are called using parentheses, so "make print a function" implies "require parentheses in calls to print". It would be weird to create a special case for the "functions are called using parentheses" rule just for print.

So make it a function, but the only function with optional parentheses? And only in some cases?

The point you're missing is that it was another change, for change's sake.

You're right, a print() function is better than a bare print statement, but it was an unnecessary change, made 'for the best' without sufficient justification.

    Python3 allowed the devs to fix some of the mistakes or sins of the past
Unfortunately, historically, it seems that it would have been a far, far better choice to roll those fixes out incrementally, even if it meant multiple breaking changes.

It was too many changes, all at once.

I think the lesson here is that time and time again, 'big bang' breaking changes have been failures.

2to3 should have made this a trivial change; but that only holds if 2to3 actually worked for every other simultaneous breaking change (it didn't).

A far, far more mature and better approach would have been the 3.x line being a continual stream of breaking changes with a flawless 2to3 that converted 100% of every python2 program to working python3, and a series of 3.xto3.y migrations that flawlessly upgraded python3 code until all breaking changes had been worked out.

Imagine if you had a series of transformations you could apply to any python2 project and you could incrementally migrate it automatically to 3.5. That is what we should have had. Screw print statements; invest in meaningful stuff when you make breaking changes.

I do scientific programming / ml, and python3 is nothing but hassle. The resistance in my community, imo, is we got a medium pain (print vs print() isn't a hassle / dict.iteritems -> dict.items is a tiny pain, but dealing with years of split libraries and parallel installs and all that was a giant pita) with no benefits at all. Years later in 3.4 or 3.5 we finally got a matrix mult operator, but that's about it.
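For reference, the matrix-multiplication operator mentioned above is `@` from PEP 465 (Python 3.5+), dispatched through `__matmul__`. A toy sketch with a hypothetical pure-Python `Mat` class, since in practice it's numpy that implements the operator:

```python
# PEP 465 added a dedicated @ operator for matrix multiplication.
# Mat is a made-up illustration class; real code would use numpy arrays.
class Mat:
    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # Standard row-by-column dot products.
        cols = list(zip(*other.rows))
        return Mat([[sum(a * b for a, b in zip(row, col)) for col in cols]
                    for row in self.rows])

a = Mat([[1, 2], [3, 4]])
b = Mat([[5, 6], [7, 8]])
print((a @ b).rows)   # [[19, 22], [43, 50]]
```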

> It was too many changes, all at once.

Yes, this is indeed a problem. You could argue the Macbook Pros also suffer from this. But at some point, you've got to make the changes. Since Python 3 was already a breaking change, is it better to suffer a bit more in the short term and lay the foundations for the future?

In a way, this trade-off of short-term vs long-term is what responsible people do every day. Brushing your teeth, going to college, etc. Sometimes the easy path is the wrong one. Most religions are fairly clear on this, too ;) But it does rely on a (perceived) gain in the future. For some people, Python 3 just isn't a great gain. And to be honest, it's only in later 3.* versions we've seen the really useful things. Even then, for some people Python 2 was enough, just like for some people bash is enough, or 640k of memory.

It's still a hard question, one with no definitive answer. Economically speaking, does one major port cost more than multiple breaking changes over time? You might think spreading the cost out is better, but this only works if reputation doesn't suffer. Which is, again, very hard to quantify.

But we're kind of past all of this. Python 2 is legacy, Python 3 is still actively developed. It isn't a discussion worth having, because it isn't a discussion anymore. Maybe next time (Python 4 anybody?), we can look back and learn something.

I just want people to know why it was changed, that there was some thought behind it, and that it wasn't arbitrary. Python Enhancement Proposals are easily my favourite "feature" of Python.

The concepts of "text" and "array of bytes" should have been separated from the beginning, as they're distinct enough concepts. It's good that they separated them in python 3 but unfortunate that it causes problems because code didn't explicitly distinguish these two concepts.

I'm not sure if there's a third way for software that is as omnipresent as Python.

Usually you have libraries, etc. which might be developed or not. As technology progresses, you can't achieve 100% backwards compatibility... I mean, it's an accident waiting to happen; it's not a case of if, just a case of when and how.

Well no one wants to hear it, but PHP devs do a good job keeping BC breaks minimal.

PHP4 to PHP5 was almost as big of a deal as Python2 to Python3. In fact, while I have PHP5 code that can be easily ported to PHP7 -- I still have PHP4 code that will be forever on PHP4.

Because of this, PHP developers strive to avoid this kind of large breakage. Instead it's a smooth stream of deprecations and removals/changes, and really large changes (like native unicode strings) have been non-starters.

I might be misguided, but the last time I used PHP was over ten years ago, using PHP3, and we had to name the files my_file.php3 in order to use PHP3. That's the opposite of minimal breakage. On the other hand, my simple PHP3 scripts continue to work to this day, php3 extension and all!

That's just how your server was/is configured.

As far as languages go, it broke a lot.

Not that it shouldn't have (register_globals) but it was obvious they were learning on the job.

You can't run most PHP3 code nowadays without work.

It's more like after PHP 5 the community had learned that lesson, while python didn't have the chance yet.

Given that, I think the PHP community handled the split much better, with the FIG pushing hard for PHP 5.

I think the difference is that most of PHP install base was managed by third parties (hosting providers) and once you lobbied those you basically forced everyone to upgrade. With python most of the install base is proprietary, and thus more fragmented, which makes it much more complicated to lobby for an upgrade.

I think as a counterexample, one could look to projects like FreeBSD, which have had a long-running development branch and branches for release for a very long time. Breaking changes happen in -CURRENT, but within a stable branch -- 10.x, 11.x, etc, you even get things like ABI compatibility, so a module written for kernel 10.1 will work for kernel 10.2.

The example projects show more a lack of development discipline than a problem with the big break model. A version number change from 1.8.7 to 1.9.0 doesn't suggest big changes that one should be cautious of, causing problems. Linux Kernel 2.5 had the problem of a lack of developers willing to write more inclusive tests and a lack of a pre-determined roadmap. Python had the problem of a lack of a transition version where deprecation warnings were on by default. Successful breaks can be made when they are declared in advance so people can have reasonable feedback. The projects in question show communication problems, not problems with the idea.

> Python had the problem of a lack of a transition version where deprecation warnings were on by default.

I'm really not sure this is necessary, though perhaps I'm wrong. I expect that fewer people would have upgraded to 2.6 or 2.7 if they were full of warning printouts.

> Successful breaks can be made when they are declared in advance so people can have reasonable feedback

Python has an enormous amount of advance warning. Python3 was released at the same time as 2.6, which came with a warnings mode for breaking changes in python 3, then 2.7 allowed people to backport many of the newer features. There is an automated tool to do many of the conversions and you've been able to test your changes for the last seven years against a fully supported alternative, and 2.7 will be supported until 2020. That's a total of about 11 years from the launch of a working upgrade, it was being discussed a lot for at least a couple of years before the 3.0 launch and Guido says he first brought up the idea of py3k in 2000. I'm also sure you can write code that works in both 2 and 3, so there's even a fairly clear migration path.

If developers aren't going to change things given such a runway, would adding default deprecation warnings really make such a difference?

I'm sure I recall at the time that the transition was intended to take 10 years, and so is the problem simply that people are using a different measure of success than python did themselves?
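The "code that works in both 2 and 3" point above is real: a minimal single-source sketch using the `__future__` machinery, which gives 2.x the 3.x semantics for these features:

```python
# Runs unchanged on Python 2.6+ and Python 3.x.
from __future__ import print_function, division, unicode_literals

print("1/2 =", 1 / 2)        # true division on both: 0.5
print(type("abc").__name__)  # the text type on both (unicode on 2, str on 3)
```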

>If developers aren't going to change things given such a runway, would adding default deprecation warnings really make such a difference?

I think many developers don't pay attention to these things. If it compiles - they've turned off the warnings - then it's fine. What's needed is the use of a "--use-deprecated-features" flag in order for the code to run/compile successfully. People pay attention when they need to change the build script in order to keep doing it wrong. Sure, they'll keep doing it wrong, but now they'll know it's serious.

But who is that targeting? Python programmers that haven't heard of python 3.X in the last 7 years?

I hazard that there's no easy answer to this.

If you 'fork' and start off on a new development trail, it's hard to retain compatibility.

If you keep with the legacy branch, you can end up with a frankenstein monstrosity.

Most of the examples used are of very highly coupled pieces of monolithic code, i.e. kernels, language runtimes, and this probably also extends to databases. It's hard to take these apart without losing performance in some aspect or another...

I suspect most 'user-land' developers have moved to the microservices/SOA type of development methodology where APIs dominate, so this is less of a concern.

I think it's a lot more subtle than that. Users are always ready to jump ship, even if you don't break compatibility.

Merely introducing new functionality that competes with the old ways and shifting dev focus to it forces users to put an effort into the new ways, which irritates them, so some might not do that and might start looking around instead, depending on how deeply invested they are and how fed up they are. Others have not tried even the old ways yet and are ok with the focus on the new ways. And only few are those benefiting from the new ways. A simple strategy just can't work for everyone. There has to be a separate strategy for separate kinds of users, and things that introduce competition and fragmentation must be carefully weighed.

Given all that, breaking compatibility early and often as a proposed solution seems like even worse strategy as it not just demands effort, but demands it immediately and irritates users even more, which means more users are probably going to get fed up a lot sooner.

With respect to Python 3, I think it /is/ in fact a similar but different language.

Primarily my focus is not on small, annoying, easily covered things (a print() function is better than a print keyword), but on more systemic changes like its decision to make a unicode str the default string type for the language.

I would like to see a new version of Python (maybe 4?) that centers around basic, commonly atomic, types and arrays of those types. This would prevent the issue of Python 3 either mangling input strings or choking as it tries to digest legacy input that isn't in the anticipated encoding (at the cost of not making the mess larger). It would not modify data that is being passed through the language UNLESS requested as an operation or as an output filter (implicit operation). Traditional UNIX-like 8-bit-clean handling is why UTF-8 is a superset of ASCII (code points above 0x7f undefined/locally specific); UTF-8 was able to define an in-situ character sync method and was still compatible with existing files, and with programs that did not enforce a specific encoding on the data stream.
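For what it's worth, Python 3 does have an opt-in version of that 8-bit-clean pass-through: the surrogateescape error handler (PEP 383). A sketch:

```python
# surrogateescape maps undecodable bytes to lone surrogates on decode,
# and back to the original bytes on encode -- a lossless round trip
# for data that is not valid in the assumed encoding.
raw = b"caf\xe9 \xff"   # latin-1-ish bytes, not valid UTF-8
text = raw.decode("utf-8", errors="surrogateescape")
back = text.encode("utf-8", errors="surrogateescape")
assert back == raw      # nothing was mangled in transit
print(text[:3])         # the valid ASCII prefix: caf
```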

Py3k's problem is a lack of incentives. If Py3k were 30% faster in execution speed, people would migrate in no time. But currently everything is doable in py2.7, and maybe even faster.

A successful example is PHP7.

I suspect one reason Python 3 might 'be slower' would be precisely how expensive it is for everything to be a /validated/ (is it UTF-16 internally? Some other crazy Windows-native format, IIRC) string, including internal object elements like names and dictionary (map) keys.

UTF-8/UTF-16 are encodings, used to encode strings for transport/exchange. Internally, strings are stored in UCS-2 or UCS-4 representation.

AFAIK, this hasn't changed between Python2 and Python3, after all, Python2 also supports unicode. It has more to do with what a "string literal" in source code means. I don't want to go into it too much, but in Py3, `"a"` is a unicode string literal equivalent to `u"a"` in Py2. Py3 `b"a"` is roughly the same as Py2 `"a"`. But note the ambiguity in Py2 - could be a sequence of bytes, or an ASCII encoded string. This is, of course, an oversimplification btw. But not too far off if you think how e.g. `from __future__ import unicode_literals` works in Py2.

Is parsing UTF-8 literals in a file slower? Maybe negligibly. Does it affect runtime? I have no idea. Probably not, you can still do string comparisons byte for byte once you've converted to UCS-2/4. It might use more memory though.
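The literal distinction described above, condensed into a sketch (Python 3 semantics shown; the Py2 equivalences are as in the comment):

```python
# Python 3: "..." is text (str, a sequence of code points) and b"..."
# is bytes; conversion always goes through an explicit codec.
s = "h\u00e9llo"             # text: héllo
b = s.encode("utf-8")        # bytes: the UTF-8 encoding of that text
assert isinstance(s, str) and isinstance(b, bytes)
assert b == b"h\xc3\xa9llo"  # é is two bytes in UTF-8
assert b.decode("utf-8") == s
```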

> UTF-8/UTF-16 are encodings, used to encode strings for transport/exchange. Internally, strings are stored in UCS-2 or UCS-4 representation.

That makes no sense; UCS-2 and UCS-4 are encodings too.

It isn't completely false. It's a simplification (as UCS-* is often used to denote internal encodings), because oh god, Unicode. This does a great job at explaining some intricacies: http://lucumr.pocoo.org/2014/1/9/ucs-vs-utf8/

Also, in Python 3, strings can be UCS-1 (latin1?), UCS-2 (UTF-16?), UCS-4 (UTF-32?) or other: https://www.python.org/dev/peps/pep-0393/

It's complicated. Really complicated. Sorry.
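The PEP 393 "flexible" representation can actually be observed from CPython (CPython-specific; absolute sizes depend on the build, so only the ordering is meaningful):

```python
# In CPython 3.3+, a str picks the narrowest fixed width that fits its
# widest code point: 1, 2 or 4 bytes per character (PEP 393).
import sys

ascii_s  = "a" * 100           # fits in 1 byte per char
bmp_s    = "\u0100" * 100      # needs 2 bytes per char
astral_s = "\U0001F600" * 100  # needs 4 bytes per char

assert sys.getsizeof(ascii_s) < sys.getsizeof(bmp_s) < sys.getsizeof(astral_s)
```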

I'd lean against referring to them as any Unicode-based terminology, as all of that is ultimately designed to represent all scalar values (e.g., even UTF-16 code units compose in such a way as to give some way to represent scalar values outside of the BMP, though they obviously cannot directly). The other notable thing about Python 3.3+'s representation is their ability to represent surrogates, which UTFs cannot.

Realistically, simply referring to it as "flexibly sized codepoint sequences" is probably about as good as we can get, IMO, because that's what they fundamentally are. (And there's very little terminology for a sequence of codepoints!)

I doubt that, as Ruby 1.9 to 2.x has done a lot of encoding changes to now default to utf8 while at the same time being a lot faster.

If you think Python 3 doesn't offer anything compelling, you haven't read about what's in Python 3.

I suspect for a lot of people the carrots aren't really that relevant compared to the cost of migrating. Async is one of the bigger ones and for us folks doing Django it's not on the menu.

There are certainly quality of life improvements but nothing too ground breaking apart from async.

The article seems to believe that Debian packaging practices can be used to illustrate how the users responded to a project's actions. I find that funny, but it makes it hard to accept the article's conclusions.

(I'm not saying anything about Debian, really; I would have the same complaint if the article pointed to Fedora, Slackware, Ubuntu, etc. as proof about whether a change confused users).

(I'm the author)

Well my main free software affiliation is with Debian -- that's why I used it to illustrate.

Still, I think it's relevant. One of the difficult choices that distributions have to make is deciding which branch of projects should be shipped to users (or, if multiple branches need to be shipped, which one should be the default).

Of course Debian is often on the conservative side of things, and also made those decisions based on its own ability to introduce patches to fix breakages.

It would be more interesting to have data about which python version is supported by how many modules, over time (and which ruby version is supported by how many ruby gems, over time). But I don't think that this data is available?

Python 3 was to some extent a reaction to the constant breakage that people were complaining about at the time. It was decreed that backwards incompatible stuff would go into 3 and 2 would remain relatively static. Eventually it was announced that 2 would remain entirely static.

That created an odd situation where the version that people were expected to transition from became a much better place to do new development in that it was a stable target. Heck, in a sense it is still better to target Python 2.7 for new development as the people running the reference implementation (cpython) have promised to stop changing things as a matter of policy. Your 2.7 based app could run basically forever with no changes.

So you are damned if you do and damned if you don't. There is no definite answer here.

Are there examples of widely used, mature projects that have gone the opposite way?

OpenBSD, llvm, and go all have six month cadences.

And do the same sorts of breaking changes get made, just distributed in smaller chunks through those six month releases?

I don't know about OpenBSD and LLVM, but Go has made good on the Go 1 compatibility promise, with only a few minor changes being disputed as to whether they break it or not.

Go 2.0 will probably be a different story, assuming it ever comes out, but I have a hard time believing the devs will release anything that won't be fixable using the "go fix" tool.


I can only comment on the specific llvm case, but I cannot say "it works" without noting the fact that I have to keep ~5-6 versions of llvm around _exactly_ for that reason.

You cannot compare an OS, or in general any product which is "complete by itself" (browsers), with a compiler which is then used to _make_ the end product. Your source depends on it, and as it grows it just becomes increasingly hard to keep up.

Rolling releases are /also/ a tradeoff. I'm not particularly pleased with llvm in this case. I genuinely prefer semantic releases, and yes, even the python2->3 strategy in this context.

And ubuntu and gnome and fedora and...

web browsers?

I'm thinking particularly of Firefox, with all these extensions that do not work with the latest versions.

Firefox used to do the infrequent large change thing, then moved to frequent small breakage.

Specific to add-ons: every release has broken some, and the XUL deprecation is to some extent meant to deal with that problem better.

Note that users hate both options proposed in the article passionately, it just seems that break early break often results in fewer users voting with their feet.

That's been coming for years. XUL addons have never worked on mobile Firefox.

Not sure if Scala counts (it's 12 years old) but it continues to have (usually minor) breaking changes on each release.

Do you have an example of what broke between 2.11.7 and 2.11.8?

Sorry, by release I meant e.g. 2.11 -> 2.12 . I don't recall any breakages in recent memory on smaller versions.

Here's a presentation from Theo de Raadt on how OpenBSD does its releases, https://www.youtube.com/watch?v=i7pkyDUX5uM.

Swift ?

I think they aren't a reasonable comparison because they're very young and warned up front that they'd break compatibility several times in the early years.

From this point of view, Swift language is doing right. It is irritating to have to fix a project every few months because Swift language is changing under our feet, but it is still better than having multiple active flavors of Swift. (I still don't love Swift, though.)

Well, Apple can do whatever they want, with any language they want, better or worse, and they can still force developers to change to the latest thing.

If they say that you can develop for iOS 11 only with Swift 4, not Swift 3 or Objective-C, everybody will have to switch to Swift 4, or else they won't be able to develop for iOS 11.

Even though I also think that Swift is better than Objective-C, and getting better with each version, it's not really comparable with Ruby/Python/etc.

Although I haven't had to use it, the Go solution looks really good to me: just run gofix to update your code.

Ugh, looking at the comments it seems that the Rust "nightly vs stable" article a few days ago [1] has sparked a lot of confusion. Either that, or people are trying to spread FUD about Rust.

The original author ought to write a retraction.

[1] https://news.ycombinator.com/item?id=13251729

In retrospect, and looking at what those projects have been doing in recent years, it is probably a better idea to break early, break often, and fix a constant stream of breakages, on a regular basis, even if that means temporarily exposing breakage to users, and spending more time seeking strategies to limit the damage caused by introducing breakage.

It amazes me how the author managed to conclude completely the opposite of what he described. Make the users experience the breakage. On purpose. To me as an engineer, that's utterly unacceptable.

Empathy is still a core engineering value


The solution isn't to expose breakage to users in order to make the developers' lives easier; we are developers precisely because we are there to tackle the difficult problems, so that users can get work done.

As engineers, we convert the science and the theory into working products and tackle difficult problems so that useful work could get done. If we are disrupting the users so that our work and lives would be easier, we don't respect our users. We should never make our work easier for us at the expense of our users.

The solution is to design for backwards compatibility from the outset; version interfaces if you can't predict the future, but always remain backward compatible. This way, new consumers of the interfaces benefit from the new features transparently, and existing consumers continue to function without disruption. Plan ahead; don't just code: planning is 50% (or more) of the work.

(I'm the author)

I did not mean that developers should break things on purpose. Of course, seeking strategies to limit the damage caused by introducing breaking changes, by providing some amount of backward compatibility where possible, is always a good idea.

But I think that what those examples show is that strategies where disruptive changes are introduced on a regular basis (what I call "break early, break often"), requiring some fixes from time to time, work better than strategies where a different branch is used to break the world, and then require a huge transition for users.

Maybe what I failed to articulate in the article is that most compilers, interpreters, libraries introduce minor breaking changes on a regular basis, with most major or semi-major releases. When that amount stays manageable, it's usually not a big problem for users (= developers that use those compilers, interpreters, libraries), that just deal with that.

Also note that real end users are usually protected from this kind of breakage because they use software coming from distributions (that do the work of trying to make sure that everything they ship can work together), or package managers that allow for strict versioned dependencies.

Maybe what I failed to articulate in the article is that most compilers, interpreters, libraries introduce minor breaking changes on a regular basis,

You appear to assume, or even imply, that this is acceptable because it is the norm today.

There are those of us, just like Keith Wesolowski whose reply I cited previously, who vehemently disagree with that implication and that assumption: we represent the engineering ethos of sitting down, thinking things through ("what is it that we're trying to solve?"), and planning ("how can we implement this so that it can be made backward compatible, so we do not break things for users?")

And that's my problem with your essay: that you accept that, because there is so much haphazard hacking without planning going on, it is okay to sometimes (in reality often) break things. I personally do not, can not, accept that things will break because I had been unwilling or even lazy to think things through and plan in advance, or because I had assumed that everyone has the same amount of spare time as I do or did. It's an engineering ethos.

That is not to claim that things will never break, but I can assure you that exceptional effort will be taken on my part to remain backward (and forward, wherever possible) compatible. That's my engineering promise, to all my users.

The core idea of the essay is based on the premise that haphazard implementations are the reality we should simply accept without second thought, and the proposed solution is reactive, not proactive. Tolerating organic growth instead of planning is the root cause.

Also note that real end users are usually protected from this kind of breakage because they use software coming from distributions (that do the work of trying to make sure that everything they ship can work together), or package managers that allow for strict versioned dependencies.

I don't discriminate: users are users to me. Their time is extremely valuable, and I appreciate that all of them use the computer because they want to get something done. They all have certain rights, and deserve to be treated with respect.

Sadly, the computer industry lost a great spirit when Keith Wesolowski decided he had had enough and retired[1], but the engineering ethos is no less valid today than it was when he wrote these words, and lives on in illumos and SmartOS:

In other words, every entry in the test matrix is either identical to the behaviour seen today or superior. The state of the system has advanced; its capabilities have improved in newly-built software, nothing works less well than before, and customers have the opportunity to apply new capabilities to old binaries if they have the kind of supernatural awareness of those binaries that the GNU/Linux model assumes of everyone. Most importantly, the approach is safe: we do not attempt to change the system's interfaces as presented to existing software. Those contracts are honoured unless specifically requested by the customer, with (one hopes) full knowledge of the consequences.

[1] http://dtrace.org/blogs/wesolows/2014/12/29/fin/

This article talks about problems porting patches back to Linux 2.4, and then makes the comparison with Python 2.

Conceptually, the "breaking backwards compatibility" argument works, especially for the dependent ecosystem.

But, it is worth noting that python 2 and 3 build out of the same branch of the same tree. Indeed, you can build the 2.5, 2.6, 2.7, 3.X grammars all off the same code. Sure, there will be some python 3 specific stuff that won't be included when you build 2.X, but any patches are by definition already backported.

merge early. push automated tests to the limit. use feature flags to disable unstable features.

The author is mistaken in adding Linux to this list. There was no "big, disruptive change" in the Linux userspace-facing API (stable syscalls, basically). The Linux developers are quite serious about not breaking that and what Linus said back in 2005 still stands - http://yarchive.net/comp/linux/gcc_vs_kernel_stability.html :

> We care about user-space interfaces to an insane degree. We go to extreme lengths to maintain even badly designed or unintentional interfaces. Breaking user programs simply isn't acceptable. We're _not_ like the gcc developers. We know that people use old binaries for years and years, and that making a new release doesn't mean that you can just throw that out. You can trust us.
