Hacker News
Writing and linting Python at scale (fb.com)
119 points by el_duderino on Nov 22, 2023 | 153 comments



They are talking about this project: https://pypi.org/project/fixit/2.0.0a1/#description

https://libcst.readthedocs.io/en/latest/ powers Fixit 2. I would love to hear whether there are any production users of this new linter here, and how it's working out for them.


There's a lot of neat Python research coming from Facebook.

libcst (which Fixit uses) is super cool - I use it in https://gitlab.com/harford/logzy


Why do you use that instead of flake8, which also finds that?

(Not meant to be a pointed question; genuinely curious.)


AFAIK flake8 only lints and doesn't apply the fixes.

fixit probably does a better job than my tool but it was fun to write. :-)


(It's a podcast, not an article)


The description links to the tech article.

https://engineering.fb.com/2023/08/07/developer-tools/fixit-...


https://engineering.fb.com/2023/08/07/developer-tools/fixit-...

Just a test, for some reason your link isn't clickable for me.


Looks like they unintentionally (or intentionally for some reason I can't parse ATM?) triggered code formatting. If you have a line preceded by two or more blank lines that is indented with two or more spaces, then it'll trigger <pre> formatting.

  Like This


unintentionally, sorry for that one.


Changed to that from https://engineering.fb.com/2023/11/21/production-engineering.... Thanks!

(I fixed the link in your comment btw. More at https://news.ycombinator.com/formatdoc)


I dream of a single tool that does all the formatting and linting and sorting and whatever. I don't care too much about the final style; I just prefer not to think too much about it. Ruff is close, let's see.


The fact that ruff is expanding with VC funding has me cautiously optimistic. They will hopefully be funded and incentivized to capture the existing fragmented ecosystem of Python dev tooling. I'm sure their monetization will be more of the same cloud-hosted perks etc. we see from similar companies. Which is just fine with me, as it keeps their core OSS offering orthogonal to their value prop, which is usually convenience-related rather than feature-related.

I normally hate one-tool-to-rule-them-all ambitions, but when it comes to linting, type hints, auto formatting, import ordering, etc. (basically anything you'd want to put in a pre-commit hook), all of that could have a much more coherent story if brought under one roof. Supplanting pre-commit itself would also be great. The story around the virtual envs these hooks run in is at odds with development best practices (define your dependencies in one and only one place), and the author has made it aggressively clear on multiple occasions that they are uninterested in fixing that.


It looks like the interesting feature of their tool Fixit 2 is that its lint rules know how to auto-apply themselves. I'm fine with an auto code formatter, an auto import organizer, but not sure how much I trust a linter to auto-apply "fixes".


Ruby's Rubocop linter has had this for a long time and it works great. I thought this would be the case for most of the mainstream languages.


As someone who worked on a similar tool (https://github.com/ssbr/refex/tree/main/refex/fix/fixers, I did a bunch of the work to prep this for open-sourcing, though I think all my contributions are hidden behind the "Google-internal" anonymization), having auto-applied or auto-appliable fixers like this is super useful.

They can be auto-applied by post-commit (e.g. a generic `git fixcommit` style command that runs all the relevant lint tools and fixes them in the working copy, letting you review before push), or applied during code review (automatic comments with a "click here to apply fix" interface), both of which are nice.

Plus the same underlying tooling can be used to write more complex one-off fixes that may be used for migrations or cleanups.


In the C#/.NET world this is the standard and it works very well. It definitely increases productivity to not have to double check each and every instance of a lint violation, as you can just have the fix applied to an entire project without having to worry too much.


> increases productivity to not have to double check each and every instance of violating a lint error,

At what cosmic speed should you be pumping out code for this to be a concern?

Also, in C#/.NET, programmers predominantly use MSVS, which is an atrocious editor, with MSBuild, which is an atrocious build system, both hampering productivity...

Also, plenty of linter errors are actual errors that need non-trivial fixing. So, I struggle to understand how that may be automatically fixed (at least in the context of Python). A trivial example: misspelled variable name -- how would the linter know if that's a typo, or that the programmer intended to declare this variable, but forgot to do that?


> Also, plenty of linter errors are actual errors that need non-trivial fixing.

These sound like they're not lints at all.


No, not really. It is perfectly possible to write unintentionally valid code that a linter would catch because the author didn't follow some convention.

A trivial and popular example of such code is assignment instead of comparison inside a condition. Some languages allow this to happen, but it's a known source of mistakes, so the conventions in such languages discourage the use of assignment inside a condition, even though it technically produces valid code.
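In Python, plain `=` inside a condition is a syntax error, but the same "valid yet flagged" category exists. A minimal sketch using the `== None` comparison, which pycodestyle flags as E711 even though the code runs:

```python
class Sneaky:
    def __eq__(self, other):
        return True  # overridden equality makes `== None` lie

def is_unset(value):
    # Valid Python, but linters flag it (pycodestyle E711):
    # `==` goes through __eq__, which classes can override,
    # so the conventional form is `value is None`.
    return value == None

print(is_unset(None))      # True
print(is_unset(Sneaky()))  # True, even though the value is not None
```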


It's very useful for multiple scenarios:

1. Learning new syntax. E.g. the new switch expressions are pretty neat and better than the old switch statements. Changing over and learning how to write them is super easy thanks to this feature.

2. A code base with less-than-good practices is vastly easier to adjust to better standards than in Python. I've done both, and dotnet is mostly applying suggestions and then looking for more advanced problems which were not caught by automatic analysis. It's not perfect, but it reduces useless churn.

This also holds true for Rider and VS with ReSharper.


And VS Code too, with the analyzers that come with the SDK! (or numerous others that are available as extensions, e.g. Roslynator)


Adding to what other commenters said, ESLint also works like this.

However, in an IDE setting it's not exactly "auto"; you have to click the light bulb and accept the fix (idk about VSCode, but in Neovim you can even get a preview of the diff [1]). That's what the Fixit PR I'm working on right now is for.

[1] https://github.com/aznhe21/actions-preview.nvim


You don't have to trust it - just commit your changes then run the linter and inspect the diff.


"black" for python has already done this for quite a while.

Even before that, some flake8 linting rules could automatically apply the fix, but not all of them.


Related and a bit recent:

Fixit 2: Meta’s auto-fixing linter for Python - https://news.ycombinator.com/item?id=37036262 - Aug 2023 (11 comments)


I'm happy with Ruff[0], it's very fast.

[0] -- https://github.com/astral-sh/ruff


Unfortunately ruff is very inconsistent and has lots of differences from the flake8 plugins it tries to emulate. Lots of rules are confused by irrelevant context, so it can miss lots of things it should find when the equivalent flake8 plugin still finds them. Its automatic fixing of issues will happily introduce other issues that it doesn't find until the next run. I've tried pretty hard to use it and gave up; it's just not remotely ready.


If you notice differences vs. the originating plugins, would definitely appreciate it if you could file an issue! We tend to be very responsive especially when it comes to matters of correctness.

Candidly, for Flake8 plugins, my experience is that Ruff tends to be more consistent, more robust in its inference and, at this point, more extensively tested than the original implementations -- both via our own testing and via the significant number of projects and companies that now use Ruff in production. (As compared to Pylint, though, we catch fewer issues, since Pylint does some type inference across files, which Ruff doesn't support yet.)

Ruff is also designed such that we will iteratively lint-and-fix until there are no more fixable issues, so if you've seen the linter introduce _new_ fixable diagnostics, that would be a bug too.

Regardless, thanks for giving it a try :)


This isn’t really accurate in my experience. I’ve been using it on several projects large and small for some time now and can only recall one time autofix introduced an issue of any sort and that was many versions ago. It will also handle cascading fixes just fine.


> It's automatic fixing of issues will happily introduce other issues that it doesn't find until the next run.

I think a few of the "bad" auto fixes have been recently disabled by default; they now separate fixes into "safe" and "unsafe" (for example: it used to always change `x == True` to `x is True`, which broke common numpy selection patterns)
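To illustrate why, here's a minimal stand-in class (numpy's `__eq__` is element-wise, while `is` always compares object identity):

```python
class Arrayish:
    """Minimal stand-in for a numpy-style array: __eq__ is element-wise."""
    def __init__(self, data):
        self.data = list(data)
    def __eq__(self, other):
        # numpy-like behaviour: comparing to a scalar yields a boolean mask
        return [x == other for x in self.data]

arr = Arrayish([True, False, True])
print(arr == True)  # [True, False, True] -- a usable selection mask
print(arr is True)  # False -- identity of the whole object, never a mask
```

So auto-rewriting `== True` to `is True` silently turns a selection mask into a constant `False`.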


We've gone all in on Ruff for several months now, across many projects, and not noticed any of these issues.


If you have a smaller codebase which has had consistent good practices you won't find many things anyway. I have a legacy codebase where ruff finds over 25,000 issues. Many times when I fix all of the issues for a given code (say UP031 just to pick an example) I can then run flake8 and find more valid examples of that same issue. On my desktop running flake8 takes about 7s, so there's literally no reason for me to bother with ruff if it is going to be so much less reliable, even if it is 1000x faster.

Ruff is ok I guess if you barely need it, but I have a medium sized (about 400kloc), quite old, and extremely poorly managed codebase that I support which powers a profitable business. Ruff just can't cope with it, at least as of a few months ago.


Kinda funny, but my experience was quite the opposite. When migrating our codebase from flake8 to ruff, I actually found a number of bugs in flake8 and flake8-bugbear that were resolved by a more correct implementation in ruff.


But but but HN told me that python is only good for small scale and prototypes, the engineers at meta are wrong


Disclosure: Meta employee, but these views are my own, and not all the experiences are from Meta.

Type annotations have made Python much more scalable in terms of engineers and codebase size.

It still has other scale problems, especially if you actually need threads. One project I worked on managed Python worker tasks, and we resorted to subprocesses (within subprocesses!) because what we thought was IO-bound became CPU-bound, and workers started timing out on RPC calls. I also worked on a Python API service that scaled beautifully horizontally, but we had to manage extra logic for spinning up one worker per CPU.
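The one-worker-per-CPU pattern being described looks roughly like this (a minimal sketch with a hypothetical task, not the actual service code):

```python
import os
from concurrent.futures import ProcessPoolExecutor

def cpu_bound_task(n):
    # Hypothetical CPU-bound work standing in for the workers described above.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # One process per CPU sidesteps the GIL for CPU-bound work,
    # at the cost of managing the pool (and its lifecycle) yourself.
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as pool:
        results = list(pool.map(cpu_bound_task, [10_000] * 4))
    print(results[0])
```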

At some point, you actually start caring about performance, but you're more likely to hit other issues before you care about the extra hardware cost.


The engineers at meta wrote a language + runtime on top of PHP to make it work "at scale".

If Facebook wants to make something work they have enough resources to throw at the problem to solve it. Regardless of whether it makes sense or not.


I mean, as far as I'm concerned, everything Facebook has done has done nothing but reconfirm how unwise it is to build large-scale infrastructure on dynamic scripting languages. They have the resources to move heaven and earth to do what amounts to turning their dynamic scripting language back into static languages in everything but name, and they have the hole they've dug themselves into that justifies it. I have neither.

I don't think that Facebook necessarily made a mistake at the time, though. Static languages have come a long way. I can prototype in them very nearly as quickly as I can in dynamic languages now, with the crossover point for where the static language is simply straight-up an advantage being roughly one to two weeks. Circa 2004 I would not even remotely make that claim. By the time static languages reached that state they were already in deep.

But I would consider it a mistake for most cases to start right now, in 2023, with dynamic scripting languages as a base layer.


> dynamic scripting languages.

Why keep repeating this nonsense? "Dynamic" or "scripting" aren't features of languages. When anyone says something like this, it's like talking about square chicken (i.e. a category error). Obviously you had some idea in your mind and wanted to communicate it somehow, but your readers will not know what it was unless you make an effort to analyze what you want to say, make sure it doesn't have internal contradictions, and make sure readers can, within reason, understand what you are trying to say.

You seem to be complaining about something. My guess is that your problem isn't even with the language. Your problem is that some tools for working with your kind of programs are missing or aren't effective.


Because it's a common term used for a category of languages that everybody understands.

If you have a problem with it, you can take it up with the aforementioned everybody. Personally I think it gets perilously close to the error that if you argue definitions enough you can change reality. It doesn't matter what labels we slap on the clearly related languages Python/Perl/JavaScript/Lua/PHP; I consider it a mistake to build a large system with them in 2023 because of their deficiencies, and given the extreme efforts being exerted to fix those very same deficiencies with things like gradual typing and strongly typed languages like TypeScript that compile down into them, efforts which include Facebook's, clearly it is not a terribly heterodox opinion.


> I consider it a mistake to build a large system with them in 2023

A mistake in what sense? What are the risks that you see with using these languages for large scale systems, and how do those risks balance around all the other risks that come with not using one of these languages and using a different one?


I think I pretty much laid out the bulk of the argument. Facebook is moving heaven and earth to mitigate their choice of a dynamic language, even if it was the right choice at the time. You can go look at everything they've done, and everything Typescript has done, and all the gradual typing initiatives. I left plenty of threads to pull there.


Dynamic Typing is absolutely a feature of languages. Static Typing is, as well.

I was not confused by using these terms. I cannot imagine many people on HN were either.


> "Dynamic" or "scripting" aren't features of languages

Surely dynamic typing is a language feature? I can't imagine what else someone would refer to with "dynamic".


One could make the argument it's the opposite of a feature...


> Surely dynamic typing is a language feature?

Languages are defined by their grammar. (Or you can think of a few other ways to define a language, such as a set of strings of some shape, etc.) There's nothing static or dynamic about languages, just like there's nothing static or dynamic about integers or sausages.

In professional literature these words refer to the fact that some claims (or checks) about types of language expressions can be verified only at run time (when the actual value is known), or either at run time or prior to running the program. First is called "dynamic", second is "static".

Any program can be checked prior to execution. When there's an argument about "static" vs "dynamic", it is an argument about how useful static analysis can be. For example, in Unix Shell all types can be trivially inferred before executing a program: everything is a string, there aren't any other types. But this kind of analysis is also worthless because it doesn't help to check interesting properties of a program.

Non-professional users of this terminology tend to abuse it to mean something unrelated and inconsistent. You will meet a lot of claims that such and such language is "static" or "dynamic", even though, if you ask the person making that claim to explain what they actually mean by that, you'll discover that they don't really know what that is. Unfortunately, this "classification" is so common that a lot of people take it on faith, without even trying to examine it.

Here are some typical misuses of this terminology:

* If the language doesn't have type annotations, then it is a dynamic language. Of course, this isn't true because there are plenty of languages where type annotations are optional.

* If there isn't a compiler that performs static checks, then the language is "dynamic". Well, nobody so far wrote such a compiler, but that's not a proof it cannot exist. Also, of course, whether or not there is a compiler with w/e properties isn't relevant to the language itself.

* If the language has a mechanism to (automatically) reinterpret a value as belonging to unrelated types, then such language is "dynamic". Unfortunately, this makes virtually every useful language a "dynamic" language, which makes the distinction worthless.

You can probably think about other cases where this attempt at taxonomy fails.

----

To sum it up.

"Dynamic typing" is kind of like "West India" -- a result of confusion, a misunderstanding of the person using the term. The closest thing is "dynamic type checking". In turn, "dynamic type checking" isn't a property of a language, it's a phase of program analysis, specifically, when all values in that program have known types. Every program in any language can be analyzed at run time, which makes claims about some language being more "dynamic" than others nonsense.


Could you enlighten me as to how a type system is not part of a language?

For example, in your own words, what is the main difference between typescript and javascript?


Out of the box php also scaled pretty damn far for them before they had to do anything significant. Hundreds of millions of users as early as 2009, which seems to be before the original Hiphop existed.


https://discuss.python.org/t/a-fast-free-threading-python/27...

In which Meta promises 3 engineer-years for enhancing python.


Like PHP, where FB has spent many man-years writing AOT compilers, a new VM, and a strongly typed variant of PHP?


Python not only has types but its type system is superior to TypeScript's.

Get this: Python has sum types and exhaustive pattern matching, exactly like Rust or Haskell.

The only problem with Python is that libraries are sometimes written with tricks that make static typing ineffective. Other than that, it is really, really good at scale. Better API than TypeScript IMO, which is really its main competitor.

edit: (the rate limiter is preventing me from replying to everyone... I will respond to everyone.)


> Python not only has types but it's type system is superior to typeScript.

I strongly disagree. It might be better at some things, but it's much worse at others. Many functions can't be accurately typed (try, for example, to make a well-typed function that concatenates two arbitrary fixed-size tuples), and as far as I know generic type transformations can't be implemented (random example "this type takes a Dict[str, Any] and returns an object with each key turned into a function").


For the tuple example:

  from typing import TypeVar
  
  T, U, V, W = TypeVar('T'), TypeVar('U'), TypeVar('V'), TypeVar('W')
  
  def concatenate(a: tuple[T, U], b: tuple[V, W]) -> tuple[T, U, V, W]:
      return a + b
For the generic type transformation example, I'm not sure what you mean:

  from typing import Any, Callable
  
  Transformer = Callable[[dict[str, Any]], dict[Callable, Any]]
This seems to match your question but it's really weird.


Your tuple example only works if both tuples have two elements. I specifically mentioned arbitrary fixed-size tuples (as in, tuples with an arbitrary non-variable length).

Your generic type transformation example also doesn't come close to what Typescript does. The resulting dict will not have known keys based on the keys of the input dict. In Typescript I can write a function that takes an object with known keys, and it returns an object with those same keys having their values mapped to a different type, with the keys still known. I threw together a quick example - just look at the type of resultA/resultB by hovering over the variables[0].

But both are great examples - they are probably the closest you can get in Python, and they are so far removed from the thing I want to represent that they are completely useless.

[0]: https://www.typescriptlang.org/play?#code/MYewdgzgLgBAhjAvDA...


"Arbitrary fixed-size tuples" probably don't have a widely accepted meaning :) I read it as "size known when compiling the function".

The second example is cool. But I can't find a good practical use case for either example.

If you have a collection that's both heterogeneous and whose [size/key set] is statically known, when would it make sense to apply such generic transformations to them? This sounds like you have tuples or dataclasses where all elements have different meanings (since their types are fixed and different) _and_ you want to treat them like a generic collection _and_ you need the type checker to infer the result type.

The main use of tuples or dataclasses or `NamedTuples` is to pass or return values to/from functions without dealing with long lists of arguments. The elements aren't in the same category, it doesn't make sense to process them as one big collection, they mean different things.

(Also I think you made a mistake in your previous message, you wrote "an object with each key turned into a function" but it's the values that change types here.)


> "Arbitrary fixed-size tuples" probably don't have a widely accepted meaning :) I read it as "size known when compiling the function".

That is exactly what I'm talking about - the size of the tuples is known statically.

> The second example is cool. But I can't find a good practical use case for either example.

There are many interesting use cases in libraries, especially for some of the more esoteric features. Everything can be used for additional type safety.

> If you have a collection that's both heterogeneous and whose [size/key set] is statically known, when would it make sense to apply such generic transformations to them? This sounds like you have tuples or dataclasses where all elements have different meanings (since their types are fixed and different) _and_ you want to treat them like a generic collection _and_ you need the type checker to infer the result type.

Simple example: I have functions that return a Rust-like Result type, and I want to transform that into a different tuple-based format using a decorator. The transformation itself is static, but I can't write one function that handles it all, because Python's type system is simply not developed enough. Something that would be incredibly easy in TypeScript.
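A minimal sketch of the kind of decorator being described (hypothetical names and formats; note how the element types collapse to `Any`, which is exactly the precision being lost):

```python
from functools import wraps
from typing import Any, Callable

# Hypothetical Rust-like Result: ("ok", value) or ("err", error).
Result = tuple[str, Any]

def to_legacy(fn: Callable[..., Result]) -> Callable[..., tuple[Any, Any]]:
    """Adapt a Result-returning function to a legacy (value, error) tuple."""
    @wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> tuple[Any, Any]:
        tag, payload = fn(*args, **kwargs)
        return (payload, None) if tag == "ok" else (None, payload)
    return wrapper

@to_legacy
def parse(x: str) -> Result:
    return ("ok", int(x)) if x.isdigit() else ("err", f"not a number: {x}")

print(parse("42"))    # (42, None)
print(parse("nope"))  # (None, 'not a number: nope')
```

The runtime behavior is trivial; the complaint is that the wrapper's return type cannot be derived element-wise from `fn`'s return type, the way a TypeScript mapped type could express it.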

> The main use of tuples or dataclasses or `NamedTuples` is to pass or return values to/from functions without dealing with long lists of arguments. The elements aren't in the same category, it doesn't make sense to process them as one big collection, they mean different things.

But I have a use case for exactly this feature. Why should the language limit me? Why should I implement x functions that take different tuple lengths, with me having to choose the correct one for each use case, when I could write one function that does all?

> (Also I think you made a mistake in your previous message, you wrote "an object with each key turned into a function" but it's the values that change types here.)

Sure, though I could also literally write a Map that turns all object keys into functions.


> But I have a use case for exactly this feature. Why should the language limit me? Why should I implement x functions that take different tuple lengths, with me having to choose the correct one for each use case, when I could write one function that does all?

If the PSF ran a poll for the most-wanted type checking features, I don't think this would come close to first. This sounds very niche. The people working on typing seemed very busy in the last few versions.

> Simple example - I have functions that return a Rust-like Result type, and I want to transform that into a different tuple-based format using a decorator. The transformation itself is static, but I can't write one function that handles it all, because Pythons type system is simply not developed enough. Something that would be incredibly easy in Typescript.

Returning errors as values isn't really how you're supposed to use Python though. And why is there a second tuple-based format that does the same thing?

It also smells slightly off that the rest of the code takes an object that's exactly similar to the first function's result, but with a transformation uniformly applied over the values. Shouldn't the first part's output and the second part's inputs both be clearly declared independently of each other? And then wouldn't it be an extremely niche case that both types are identical except for one transformation applied to all values? Is it worth the language complexity and a dedicated function?


> If the PSF ran a poll for the most-wanted type checking features, I don't think this would come close to first. This sounds very niche. The people working on typing seemed very busy in the last few versions.

Sure, but while it's not possible to type basic functions like ones that concatenate tuples, I can say that Python's typing system is not superior to TypeScript's.

> Returning errors as values isn't really how you're supposed to use Python though.

Okay, so how am I supposed to handle non-exceptional errors? Because using exceptions for that kind of thing absolutely isn't good practice.

> And why is there a second tuple-based format that does the same thing?

Legacy. TypeScript allows me to do refactorings like these step by step and very easily. Python doesn't, because it's inflexible.

> It also smells slightly off that the rest of the code takes an object that's exactly similar to the first function's result, but with a transformation uniformly applied over the values. Shouldn't the first part's output and the second part's inputs both be clearly declared independently of each other? And then wouldn't it be an extremely niche case that both types are identical except for one transformation applied to all values? Is it worth the language complexity and a dedicated function?

Are we taking apart my code now or what? I can't stop the world and focus only on refactoring things to be neat and tidy for months on end. But I can improve individual parts, bit by bit - and in a language with a better typing system, I can do so way easier.


> Sure, but while it's not possible to type basic functions like ones that concatenate tuples, I can say that Pythons typing system is not superior to Typescripts.

That's not really fair. The uses that make sense when considering Python's convention ("Pythonic" code, nebulous but usually well-understood) are supported.

I think what confused me about these examples is that they imply multiple values that have completely different meanings, but all get processed as equals anyway. That was before you talked about refactoring old code though.

> Okay, so how am I supposed to handle non-exceptional errors? Because using exceptions for that kind of thing absolutely isn't good practice.

If an operation on a homogeneous collection can say "nope" for some values and process others, the values would be typed "T | None". If the data isn't really a collection but a structured mapping, in general, attributes would be made optional on a case-by-case basis. If all attributes happen to be optional but the mapping itself is non-optional, that sounds more like an accident of this specific case than something we should complicate a language over. If this happens over a whole codebase, I guess I feel for you. Maybe that's when it makes sense to give up a bit of static typing and treat these values a bit more like data and a bit less like separate arguments, no matter what kind of complicated typing the language can do.

> Legacy. Typescript allows me to do refactoring of things like these step by step and very easily. Python doesn't, because it's inflexible.

> Are we taking apart my code now or what?

Honestly yes, this example seems so unusual that it doesn't make sense to debate it without knowing concretely what's happening in your code that this needs to be supported.

I guess that this specific example would get easier. But I would hardly call one type system superior over that.

Edit: my final opinion on this is "this is something that's technically possible if you follow language and conventions to the letter, but with experience you see that it's a bad idea that won't fit well with the language and you should change the design to avoid it". It happens in all languages IMO.


> That's not really fair. The uses that make sense when considering Python's convention ("Pythonic" code, nebulous but usually well-understood) are supported.

You're focussing on a very specific part of what I wrote. Just because my specific example consists of a Result type being transformed into a Tuple, it doesn't mean that the basic use case of "concatenate two tuple types" is so far out there.

> If an operation on a homogeneous collection can say "nope" for some values and process others, the values would be typed "T | None". If the data isn't really a collection but a structured mapping, in general, attributes would be made optional on a case-by-case basis. If all attributes happen to be optional but the mapping itself is non-optional, that sounds more like an accident of this specific case than something we should complicate a language over. If this happens over a whole codebase, I guess I feel for you. Maybe that's when it makes sense to give up a bit of static typing and treat these values a bit more like data and a bit less like separate arguments, no matter what kind of complicated typing the language can do.

But the static typing is extremely helpful, it prevents many kinds of errors. Not being able to use it for these kinds of things makes Python a worse language, no matter how you cut it (in the sense that it would be a better language if you could).

> Honestly yes, this example seems so unusual that it doesn't make sense to debate it without knowing concretely what's happening in your code that this needs to be supported.

Again, my specific example doesn't matter. The use case is "function takes in two tuple types and returns a concatenated version". That's something a type system should be able to handle.

> I guess that this specific example would get easier. But I would hardly call one type system superior over that.

On what metric besides expressiveness would you rate type systems?


> I specifically mentioned arbitrary fixed-size tuples (as in, tuples with an arbitrary non-variable length).

This is wrong. Again, Arbitrary fixed-size tuples are equivalent to structs with an arbitrary amount of properties. Languages shouldn't do this, it destroys the nature of what a TUPLE is which is essentially just a struct with no names.

The concept you are going for is isomorphically encapsulated by ANOTHER type:

    List[Any]
You should be using the above type to encode what you want conceptually.

That being said, if TypeScript has variadic tuples then it's not a very good type system IMO. It encodes redundant concepts. Why have a tuple with variadic arguments when I have arrays that do the exact same thing?


> This is wrong. Again, Arbitrary fixed-size tuples are equivalent to structs with an arbitrary amount of properties. Languages shouldn't do this, it destroys the nature of what a TUPLE is which is essentially just a struct with no names.

Okay, that might be your personal feelings on the topic. But do you understand the concept of "generic functions"? Sometimes you have to apply generic transforms to data. Being able to correctly express your transformations in a type system isn't "wrong", it's useful.

> List[Any]

Sorry, but I really think you don't understand what I'm talking about. If I write a function that handles tuples of arbitrary length and that function returns a transformed version of that tuple, I keep the information about individual tuple elements. This is thrown away in a list.

> That being said if javascript has variadic tuples then it's not a very good type system imo. It encodes redundant concepts. Why have a tuple with Variadic arguments when I have Arrays that do the exact same thing?

Arrays don't do the same thing, so they are not redundant concepts. Tuples have elements in specified positions with specified types. Arrays have one type (possibly a union type) over many elements.
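A minimal Python sketch of that distinction (variable names are mine):

```python
from typing import List, Tuple

# A tuple type fixes a type for each position, so the length is part of the type.
point: Tuple[float, float, str] = (1.0, 2.0, "label")

# A list type fixes one element type and says nothing about length.
readings: List[float] = [1.0, 2.0, 3.0]
```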


>Okay, that might be your personal feelings on the topic. But do you understand the concept of "generic functions"? Sometimes you have to apply generic transforms to data. Being able to correctly express your transformations in a type system isn't "wrong", it's useful.

There's nothing like this in any type system I've seen. A struct with a generic number of properties? Nonexistent. This isn't personal. This is the definition of a tuple. A tuple is a struct with no names. It is not a personal opinion.

You can have generic functions that operate on generic types, but there's no such thing as a struct with a generic number of properties. The closest thing is a list.

>Sorry, but I really think you don't understand what I'm talking about. If I write a function that handles tuples of arbitrary length and that function returns a transformed version of that tuple, I keep the information about individual tuple elements. This is thrown away in a list.

This isn't an opinion. There's no such thing as tuple types of arbitrary length unless the implementer decides to get hand wavy with the definition of what a tuple is.

What you're talking about is only possible with dependent types. Very very few languages support this but the risk of doing this is it makes type checking undecidable. It's also extremely challenging to program this way.

Does TypeScript support dependent types? Probably, but that's outside the realm of normal programming; it most likely exists as obscure tricks. You're getting into Idris, proof checkers and such.

Imagine this:

    func(x: Array[N], y: Array[M]) -> Array[M + N]
where M and N are the sizes of the arrays. It's called dependent types because types are getting mixed with program-level terms.

This is essentially what you need, but you want this level of type checking with structs/tuples. It's not just a "generic variable". It is much more than that:

   func(x: Tuple[*args1], y: Tuple[*args2]) -> Tuple[*(args1 + args2)]
There's nothing wrong with this, but once you get into it, it's beyond traditional type systems. Practical programming rarely ventures too far into this world, since it's really hard to fully prove even trivial things. You'll see it's bringing the execution of programs into the type-checking level.

Maybe typescript has some shortcut that makes this level of type checking available for tuples, maybe that's what you're getting at. Unlikely that dependent types are supported generically.

If ts Does supports dependent types, this is definitely something I did not know about. It does change the equation, but I suspect it's very much outside normal usage of the language.

>Arrays don't do the same thing, so they are not redundant concepts. Tuples have elements in specified positions with specified types. Arrays have one type (possibly a union type) over many elements.

Arrays are the thing you want for variadic containers. For memory-optimized languages like Rust or C++, arrays are defined with a size. Array[5] is a different type than Array[3].

You can define a "interface" that accepts generic arguments to arrays:

    func(a: Array[], b: Array[]) 
but you can't define the function above where a function creates a new type that's dependent on the internal types of a and b.


> There's nothing like this in any type system I've seen. A struct with a generic amount of properties? Nonexistent. This isn't personal. This is the definition of a tuple. A tuple is a struct with no names. It is not a personal opinion.

I have literally shown you two type systems that have this feature. Why do you ignore them?

> You can have generic functions that operate on generic types but there's no such thing as a struct with generic amount of properties. Closest thing is a list.

Why are you acting like TypeVarTuples don't exist?

> This isn't an opinion. There's no such thing as tuple types of arbitrary length unless the implementer decides to get hand wavy with the definition of what a tuple is.

Again, why are you acting like TypeVarTuples don't exist?

> Arrays are the thing you want for variadic containers. For memory optimized languages like rust or C++ arrays are defined with a size. Array[5] is a different type then Array[3]

No, they are not, and I don't understand how you still don't get that. Python arrays don't carry any information about their length in their type; Python tuples do. They are not the same.

> but you can't define the function above where a function creates a new type that's dependent on the internal types of a and b.

Python literally already supports typing a function that creates a type that's dependent on the internal types of a and b. Why do you keep claiming it doesn't?


Tuples are already fixed size by nature, so adding a redundant "fixed-size" in that description was confusing. I also thought you meant a predefined size like always 2-tuples or always 3-tuples.


> Tuples are already fixed size by nature, so adding a redundant "fixed-size" in that description was confusing.

No, you can type variable-length tuples in Python. A variable int tuple, for example, can be typed as Tuple[int, ...].

You can't concatenate two variable-length tuples, which makes sense - where would the cutoff be? But you should absolutely be able to concatenate two fixed-size tuples, and it's very limiting that you can't.
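As a quick sketch of the Tuple[int, ...] form (the function name is mine):

```python
from typing import Tuple

def total(nums: Tuple[int, ...]) -> int:
    # Tuple[int, ...] matches int tuples of any length: (), (1,), (1, 2, 3), ...
    return sum(nums)

total((1, 2))        # a 2-tuple type-checks
total((1, 2, 3, 4))  # so does a 4-tuple
```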


> A variable int tuple, for example, can be typed as Tuple[int, ...].

That's a type that matches tuples of any length, not a variable-length tuple. The size of a tuple can't be changed. A variable-length tuple doesn't even really make sense, what you'd want there is a list.

> You can't concatenate two variable-length tuples, which makes sense - where would the cutoff be? But you should absolutely be able to concatenate two fixed-size tuples, and it's very limiting that you can't.

This whole statement doesn't make sense. I'm assuming you're still talking about type definitions and not actually tuples.


> That's a type that matches tuples of any length, not a variable-length tuple.

A tuple with a type that matches variable lengths of tuples is a variable-length tuple for that piece of code. You're free to show me some official definitions that proves this wording false, but until then it's useless nitpicking. Though you should probably take that up with Guido, who also calls them variable-length tuples: https://github.com/python/typing/issues/30

> This whole statement doesn't make sense. I'm assuming you're still talking about type definitions and not actually tuples.

The statement makes perfect sense, thank you. If you have trouble understanding my messages without me repeating the whole definition every time, maybe just skip them.


I think they've interpreted "variable length tuple" as "a tuple whose length can change", not "a tuple whose length could be one of multiple options".

The former is of course not possible with tuples being immutable, which is why they're talking about lists.


Yeah, that seems likely. I'm not sure how to express the concept aside from the name I've seen used in the community (and my many subsequent explanations), considering I was explicitly talking about typing.


I don't know either. Since the alternative interpretation is a contradiction in terms, you'd think this name would cover the intended meaning.

Maybe something like "unknown length tuple"? chatgpt suggested "arbitrary-length" or "undetermined-length" as synonyms, that could be a more easily understood expression perhaps.


Those are good suggestions, thanks!


You mean that the tuples' size is statically known at the calling site, while your message could be interpreted as the size being statically known in the callee.

I think this is clearer. The statement "arbitrary fixed-size tuples" sounded a bit like "an immutable mutable variable". It doesn't really say what's arbitrary about the tuples and in what context the size is fixed.


Considering the opposite is called a "variable-length tuple", and I wanted to express that I'm talking about arbitrary tuples with non-variable length, what wording would have made this clearer?


What's missing is in which context the tuple's length is variable, and in which context it is fixed. You can have a tuple size fixed everywhere (because the callee sets it) or a tuple size fixed at each call (and propagated to the callee statically).

"Arbitrary" doesn't really help because it could refer to the elements' values, to their types, or to the tuple's length. Also "arbitrary" and "variable-length" sound like synonyms to me.

Guido might use some expressions in the context of Python steering discussions but that doesn't make them less obscure for the rest of us who read C++ docs every day instead.


> What's missing is in which context the tuple's length is variable, and in which context it is fixed.

Simple example: a function has a parameter whose type is "variable-length tuple of int". You can pass any tuple in that is known to have 0..n elements, all of type int. What would you have me call that, other than the name I've seen used in discussions on this feature?

> "Arbitrary" doesn't really help because it could refer to the elements' values, to their types, or to the tuple's length.

Read it as (arbitary (fixed-size tuples)). It was meant to forgo answers describing functions with known tuple sizes.


> Simple example: a function has a parameter whose type is "variable-length tuple of int". You can pass any tuple in that is known to have 0..n elements, all of type int.

And n is fixed at the calling site, right? I wonder if something like "TypeVar, but for a list of type arguments" could solve your problem.

What's funny is that this is already kind of implemented in `typing.Concatenate`, but only for function parameters [1], not for type hint parameters.

Anyway, I would have written "a well-typed function that concatenates two arbitrary tuples whose size is statically known at the call site". Can't really remove "at the call site" or "statically known" without being ambiguous.

Edit: just found out about `TypeVarTuple`. So really we're only missing `concatenate`.

[1] https://docs.python.org/3/library/typing.html#typing.Concate...


> And n is fixed at the calling site, right? I wonder if something like "TypeVar, but for a list of type arguments" could solve your problem.

Yep, and TypeVarTuple should - all the syntax etc. is in place, there is an Unpack operator for TypeVarTuples, allowing you to e.g. append or prepend individual types to a TypeVarTuple. But you can't unpack more than one TypeVarTuple in an expression, it's specifically disallowed - so I can't properly type my function.


I'd call that function: polymorphic over tuple length.


That kind of works. I guess I keep thinking about it from the perspective of the type itself instead of the function that uses it.


It can also get confusing, because in some languages there is no general notion of being a tuple or n-tuple if you wish. Even if they are casually all called tuples.

For example in Haskell 2-tuple and 3-tuple are simply distinct types, as distinct as Int is from String. You can't speak to the type system about "all n-tuples".


The tuple thing requires variadic generics from my understanding.

I don't think variadic generics are supported in most statically typed languages. The only one I can think of right now that supports this is C++.


Typescript supports it too (quick example[0]) :) and Python actually as well, but currently you can't unpack two TypeVarTuples in the same type expression: https://peps.python.org/pep-0646/

[0] https://www.typescriptlang.org/play?#code/C4TwDgpgBAglC8UDaA...


Wow this is very cool thanks for sharing!


You're welcome! That's why I'm very excited about Typescript, the system is very powerful :)


Python can almost do this with variadic generics from Python 3.11; it's missing an exception to the single-unpacking rule (which exists to prevent ambiguity) to allow unambiguous cases. Then you would have:

  from typing import TypeVarTuple

  Ts = TypeVarTuple("Ts")
  Us = TypeVarTuple("Us")

  def tconcat(
    t1: tuple[*Ts], 
    t2: tuple[*Us]
  ) -> tuple[*Ts, *Us]: ...


No this is actually wrong. It works but it doesn't capture the meaning of the nature of a tuple.

A tuple is typed with a fixed size. That's right, it's like this at the type level. The entire concept of a tuple is a product type, essentially a struct with no names for each parameter.

   Tuple[int, str, float] #correct
Doing what you're doing here is equivalent to creating a Struct with variadic properties.

If you want some container that holds an arbitrary amount of things that is a List

The correct type for what you want is actually this:

   List[Any]


> It works but it doesn't capture the meaning of the nature of a tuple.

That's backwards. It doesn't work (as noted in GP, because Python supports unpacking only a single variadic parameter in a type annotation), but it does capture the nature of a tuple.

It is (or, more precisely, would be, if the syntactic limitation was relaxed to allow it) a generic function that operates on two tuples of arbitrary tuple types and returns a tuple of a third tuple type whose shape is a simple concatenation of the shapes of the input tuples.

> Doing what you're doing here is equivalent to creating a Struct with variadic properties.

Well, a "struct" in Python is a particular low-level byte-mapped datatype, but the generic concept of a Struct differs from a tuple in two ways -- fields identified by name and that Structs are generally mutable.

> If you want some container that holds an arbitrary amount of things that is a List

But... I don't want that, and that's not what this works on.

There's a difference between a function that works on containers of changeable length and a generic function that works on immutable containers (so, fixed length and contents), but is adaptable to any length (or more specifically, shape, including not only length but also the order of element types) of arguments, producing a result of a container type whose shape is strictly determined by the shapes of the inputs, with the shapes statically verifiable.

Python is very close to allowing that, which is very different than:

  def concat(l1: list, l2: list) -> list: ...
which is a fine function, but not at all what the other one is about.
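To make that contrast concrete, a hedged sketch (function names are mine): the list version erases positions, while a pair of specific, known tuple shapes can already be typed exactly today; the missing piece is doing the latter generically.

```python
from typing import List, Tuple

def concat_lists(l1: List[int], l2: List[int]) -> List[int]:
    # Only "list of int" survives; length and positions are not tracked.
    return l1 + l2

def concat_known(t1: Tuple[int, str], t2: Tuple[float]) -> Tuple[int, str, float]:
    # For these two specific shapes, the result shape is fully spelled out.
    return t1 + t2
```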


> a generic function that operates on two tuples of arbitrary tuple types and returns a tuple of a third tuple type whose shape is a simple concatenation of the shapes of the input tuples.

This isn't definable without dependent types or a concat operator at the type level. Dependent types aren't a thing we are talking about here, and languages that support this feature are way out of scope.

You're basically asking for stuff like this:

    func concat(a: Array[int; M], b: Array[int; N]) -> Array[int; N + M]
or

    func concat(a: Tuple[*args1], b: Tuple[*args2]) -> Tuple[*(args1 + args2)]
Which is bringing those programming terms up a level, into types. This doesn't exist in the realm of programming we're talking about. Idris, Lean and Coq are more your game here, and those languages are clearly off topic.

>But... I don't want that, and that's not what this works on.

You want something well out of scope of this conversation and well out of scope of most practical programming languages. You still don't get what a tuple is. Nobody defines a product type with an arbitrary number of properties, any more than they would define a struct with an arbitrary number of properties, or a tuple.

This is NOT the definition of a tuple. People are confusing the Tuple for some Immutable list.

>Python is very close to allowing that, which is very different than:

Python shouldn't allow this, neither should TS. If typescript allows it, it's probably some arbitrary shortcut feature. Dependent types at the type level supported in a very generic way is very very unlikely.

Like, you people are trying to explain to me what a generic is while doing all kinds of programming operations on type-level entities, like concatenating them and combining shapes. Is concat the only way to do it? What about an intersection? These operators only come into play with advanced theoretical languages like Lean or Idris, and it's very off topic.


> This isn't definable without dependent types or a concat operator at the type level.

It takes relaxing one overeager limitation in Python's type annotation syntax, a limitation which is necessary in general to avoid ambiguity, but could be relaxed in specific, enumerable cases that do not produce ambiguity.

> You're basically asking for stuff like this:

I'm not sure exactly what your examples mean because I'm not sure exactly what syntax that is supposed to be, but in actual Python type annotations with a relaxation of the no multiple unpacking rule for variadic generics, what would be allowed is things like the example I posted upthread for tuples:

  def tconcat(t1: tuple[*Ts], t2: tuple[*Us]) -> tuple[*Ts, *Us]: ...
or, for Arrays:

  def product(a1: Array[DType1, *Shape1], a2: Array[DType2, *Shape2]) -> Array[tuple[DType1, DType2], *Shape1, *Shape2]: ...
concat for arrays would take dependent types, but that's not what is being discussed.

> This doesn't exist in the realm of programming we're talking about.

All the concepts necessary for what the upthread poster noted TypeScript already has and was superior to Python because Python lacks, and which I pointed out Python is relaxing one limitation short of also having, already exist in Python since Python 3.11, there is just a syntactic limitation on using the necessary combination to avoid ambiguities (which occur in different circumstances). And, of course, all the necessary concepts also exist in TypeScript. And those two languages are "the realm of programming we're talking about".

> You want something well out of scope of this conversation

No, literally, I'm discussing the thing that is the advantage of TypeScript's type system over Python's that has been the subject of this conversation, which is not, ipso facto, well out of the scope of the conversation.

> and well out of scope of most practical programming languages.

It exists in TypeScript, which is widely used, whether or not you think it is "practical".

> These operators only come into play with advanced theoretical languages like Lean or Idris and it's very off topic.

The discussion was about a comparison between TypeScript's type system (which supports this) and Python's (which does not quite, though it has all the pieces with a syntactic restriction that prevents certain useful combinations). These are not "advanced theoretical languages", and the discussion you jumped into is not at all off-topic (certainly not of itself, despite your jumping in to derail it, and not of the broader discussion of Python and scale, because JS/TS is a similar industrially popular language ecosystem to Python with a dynamically typed base and an optional statically type-checked layer).


I see. So TS arbitrarily supports concatenation of generic parameters via unrolling.

All right. I can admit TS is superior from this angle because of that arbitrary support. And I can admit that the arbitrary support follows from the way both languages use unrolling syntax.


It absolutely isn't wrong, and it's strange that you have some aversion against it. Think of Tuple[int, ...] as a union of Tuples of all lengths containing only ints. It's a specific language feature, and it's not "wrong" in any way.

And I have no idea why you claim that a list is "correct".


You're thinking fixed size arrays. It doesn't make sense to have tuples of arbitrary size. To make it arbitrary size you have to constrain it on a single generic parameter which makes it functionally equivalent to a fixed size array. You essentially just do this with an array or a list.

Read my other reply to you to see the problem of taking functions with a "generic" number of parameters. What are you going to do with each parameter? What are the types of each parameter? You can't do anything other than breaking out of the type system. Basically you're filling those tuples with a bunch of "Any" types. So what's the point of the type system now? There is no point. Just like List[Any].

To get your idea to work you need dependent types.


> You're thinking fixed size arrays. It doesn't make sense to have tuples of arbitrary size.

My god. No, Python doesn't support "fixed size arrays", but it does support tuples of arbitrary size, no matter how often you claim that to be wrong.

> Read my other reply to you to see the problem of taking functions with "generic" number of parameters. What are you going to do with each parameter? What are the types of each parameter? You can't do anything other then breaking out of the type system. Basically you're filling those tuples with a bunch of "Any" types. So what's the point of the type system now? There is No point. Just like List[Any]

I could, for example, do the thing I've been writing dozens of replies asking for: concatenate two arbitrary tuples. It's a perfectly well-defined operation, and without an artificial limitation on unpacking of TypeVarTuples, it would already be possible.

It is already possible to do what you claim impossible with one arbitrary tuple. How can that be? You keep going on and on about how that doesn't work. Why does it?


>My god. No, Python doesn't support "fixed size arrays", but it does support tuples of arbitrary size, no matter how often you claim that to be wrong.

My god I never claimed this. I'm just talking about programming in general. I'm trying to show you that what YOU are asking for is programming operations to happen at the type level. You're not seeing it correctly.

>I could, for example, do the thing I've been writing dozens of replies asking for: concatenate two arbitrary tuples. It's a perfectly well-defined operation, and without an artificial limitation on unpacking of TypeVarTuples, it would already be possible.

You're not getting it. Ok let me spin it to you another way.

Imagine you Do have two tuples that you concat:

    function concat(x: Tuple[*arg1], y: Tuple[*arg2]) -> Tuple[*(arg1 + arg2)]:
Right? You notice the plus operator at the end there, where I concatenate all the generic arguments and unroll them, right? You're saying TS supports this and that it's common for languages to support that operation.

Well here's another common operation. What if I want the INTERSECTION of the two tuples?

   function intersect(x: Tuple[*arg1], y: Tuple[*arg2]) -> Tuple[*(arg1 | arg2)]:
You see what's going on here? You're asking for programming operators like concat and intersect and union INSIDE of types. You are asking for features that DON'T EXIST in practical type based languages. These features exist in academic languages. And those academic languages are completely off topic.

That's what I'm getting at. Read it and understand it.

>I could, for example, do the thing I've been writing dozens of replies asking for: concatenate two arbitrary tuples. It's a perfectly well-defined operation, and without an artificial limitation on unpacking of TypeVarTuples, it would already be possible.

You can do this and define a function that does this in python and likely typescript. But you would be EXITING the type system when you do this. That means using keywords like Any and potentially running into runtime errors.

I want to bold that part above. I'm pretty sure whatever you're doing uses a bunch of Anys, which does NOT preserve the shape fully. This is likely what's going on. It may pass the type check, but it's not doing what you mentioned in your initial post, which is preserving the shape of the type.

I haven't used typescript. But IF what you say about typescript is TRUE and that it type checks the concatenation of shapes then it's likely an arbitrary one off feature. Have a look at those functions again:

       function concat(x: Tuple[*arg1], y: Tuple[*arg2]) -> Tuple[*(arg1 + arg2)]:
       function intersect(x: Tuple[*arg1], y: Tuple[*arg2]) -> Tuple[*(arg1 | arg2)]:
Basically you're saying that you want some language that supports type level programming where you can concat types, intersect types, divide types, likely bring programming terms up into the type level and have propositional terms like

      func add(x: int < M, y: int < N) -> int < M + N - 1
All of this is unlikely to be supported by any practical language. Likely TS has a one off where if you do this:

      func concat(x: Tuple[*args], y: Tuple[*args]) -> Tuple[M]:
           return x.concat(y) 
Where x.concat(y) is just special in that it is the only method that instantiates a typed tuple that is the concatenation of both tuples. Again I repeat it is unlikely for typescript to support what you are asking for in a very generic way where you can program your types.


The operator exists. I can use it today, without exiting the type system, both in Python (limited to only unpacking one tuple type) and Typescript (not limited). Why do you keep claiming this to be impossible, when it already exists right now?

It's also in no way a special case in TS, the type system supports many things you need dependent types for. Maybe read up on it before you claim it can't do these things?


Great. Define the concat and intersection function. Do it in python and ts. That is your claim prove it.

>Maybe read up on it before you claim it can't do these things?

Maybe don't be rude. If I'm wrong I'm wrong. But if you say something like this you need to back up your words. You "read" it. So prove it.


I defined the concat function in TS and Python below: https://news.ycombinator.com/item?id=38391702

Now, obviously Python doesn't support it due to a restriction in the TypeVarTuple PEP, which is the literal entire point of everything I have written. But what more are you asking for? What more do you want as an example?

I am not going to touch the intersection function, as I haven't made any claims in that direction.


> To make it arbitrary size you have to constrain it on a single generic parameter

Your information is out of date since Python 3.11:

https://peps.python.org/pep-0646


Bro, Look at this:

https://peps.python.org/pep-0646/#multiple-type-variable-tup...

Does this not exactly mean: "To make it arbitrary size you have to constrain it on a single generic parameter"


> Bro, Look at this:

I’m not your bro and I've been discussing what that PEP does and does not allow in every post in this subthread.

> Does this not exactly mean: "To make it arbitrary size you have to constrain it on a single generic parameter"

No, it doesn't:

  Ts = TypeVarTuple("Ts")

  def f(cond: Callable[[],bool], t: tuple[*Ts]) -> None | tuple[*Ts]: ... 
The parameter t is an arbitrary-sized (and shaped) tuple, and the return value of f has the same shape as the parameter it is called with.

No multiple unpacking means you can't type things like concatenations of arbitrary-shaped tuples without losing type specificity (and resorting to just tuple), but it doesn't mean you can't type functions using arbitrary sized and shaped tuples outside of that restriction.


>I’m not your bro and I've been discussing what that PEP does and does not allow in every post in this subthread.

It's called an expression. It's a common one in the US and in countries that have English as a first language.

>No, it doesn't:

Lol, yes it does. I said what I said and meant what I meant. I'm in control of that; you aren't. What's going on here is that you either misinterpreted it or you're just wrong.

>No multiple unpacking means you can't type things like concatenations of arbitrary-shaped tuples without losing type specificity (and resorting to just tuple), but it doesn't mean you can't type functions using arbitrary sized and shaped tuples outside of that restriction.

You literally and I mean literally re-explained what I meant. Then Proceeded to imply I meant something else. Read it: to make a parameter of arbitrary type you have to constrain it on a single parameter.

Word for word. Don't twist it.


That's the conclusion I also arrived at. The type system is slowly getting there, but many operations like this one are still not possible.

It's a shame, because it makes features like decorators significantly harder to use with static typing.


I haven't seen a type system that allows variadic types for Tuples. This would be equivalent to creating a struct with variadic amount of properties.

The definition for tuples here is similar to a struct.

They are one and the same, except structs have names for each property while tuples don't. That is literally the main concept of a tuple: just a struct with no names for properties.

The type system for python is already "there", it is in fact superior to many other type systems from other popular languages.


> I haven't seen a type system that allows variadic types for Tuples. This would be equivalent to creating a struct with variadic amount of properties.

No, it wouldn't.

It would be equivalent to creating a generic type where the concrete types that would be compatible with it would be structs of different shapes.

But if you use the same variadic type within a particular context (e.g., a function signature and definition), then any given call to the function, that variadic type must represent the same concrete type each place.


That's confusing, considering Python has them with TypeVarTuples?


Yeah, they're getting flexible with the definitions. The developers are committees of people, many of whom don't know type theory and introduce arbitrary concepts based on misguided intuitions.

It's the same with typescript I'm sure.

You will note that this thing doesn't exist in Haskell for tuples, because Haskell devs tend to be well versed in the concept of what a tuple is.


So you keep going on and on about how this is wrong, and doesn't work, and doesn't exist, and it's impossible, because it isn't implemented in Haskell? Seriously?

This is one of the strangest and worst interactions I've had on this site, because I essentially made a comment that goes "the sky is blue, but it looks better at sunset when red" and you keep going "nu-uh! Impossible! Doesn't work! The sky is blue, nothing else makes sense!"


>So you keep going on and on about how this is wrong, and doesn't work, and doesn't exist, and it's impossible, because it isn't implemented in Haskell? Seriously?

Yeah, seriously. You actually don't know what you're talking about. You're not convinced. I'm failing on that end, but this is real: you're not understanding me. Haskell is one of the languages with the most advanced type systems in practical programming. Algebraic data types. You clearly haven't used it, so of course you remain unconvinced.

>This is one of the strangest and worst interactions I've had on this site, because I essentially made a comment that goes "the sky is blue, but it looks better at sunset when red" and you keep going "nu-uh! Impossible! Doesn't work! The sky is blue, nothing else makes sense!"

It seems this way to you because, frankly, you're not very knowledgeable about what I'm talking about. Read my other replies. I left one more to explain it to you in a bigger way. Hopefully you'll get it and stop violating the rules here with personal insults and calling this the "worst" and "strangest" interaction you've ever had on this site. It's just fucking rude.

This is my last explanation for you:

https://news.ycombinator.com/item?id=38391431

If you still don't get it, well I can't help you.


> Haskell one of the languages with the most advanced type systems in practical programming. Algebraic data types. You clearly haven't used it so of course you remain unconvinced.

Wrong.

> It seems this way to you because frankly, you're not very knowledgeable of what I'm talking about.

This wouldn't come across quite as bad if I didn't literally post an example long ago. If you're so knowledgeable about all this, why can't you acknowledge the examples I've already given? Why are you ignoring them?


I'm not, I acknowledge everything you say. It may seem that I'm not acknowledging stuff. But this is not the case. I read and understand all your posts.

You on the other hand. I doubt you read or understood all my responses to you. You only partially understand.

I also don't think you have used haskell before. This is what I mean. You aren't knowledgeable.


> I'm not, I acknowledge everything you say. It may seem that I'm not acknowledging stuff. But this is not the case. I read and understand all your posts.

Then why have you not yet acknowledged the Typescript concat function I posted here yesterday? You have since written 16 replies, not a single one of which acknowledges this. Why do you keep not acknowledging that I gave you the function you consider impossible?


   from typing import Tuple

   def concat_tuples(tuple1: Tuple[int, str, float], tuple2: Tuple[str, str, str]) -> Tuple[int, str, float, str, str, str]:
       return tuple1 + tuple2

In your second function you just broke out of the type system with Any. Be more exact: are you saying Any is a constrained type variable? The only possibility here is this:

   from typing import Callable, Dict, TypeVar

   T = TypeVar('T')
   def transform(x: Dict[str, T]) -> Dict[str, Callable[[], T]]:
       # bind value as a default argument so each closure keeps its own value
       return {key: lambda value=value: value for key, value in x.items()}
If you want that lambda to do something else you have to constrain T. Constraining T will give me more options in the definition:

   from numbers import Number
   from typing import Callable, Dict

   def transform2(x: Dict[str, Number]) -> Dict[str, Callable[[Number], Number]]:
       return {key: lambda y, value=value: value + y for key, value in x.items()}

   def transform3(x: Dict[str, Number]) -> Dict[str, Callable[[], Number]]:
       return {key: lambda value=value: value + value for key, value in x.items()}
etc...

The thing is, because these type checkers are external programs, anyone can add arbitrary features to them and extend them. Someone could push them to the level of proof checking, eliminating the need for much testing in general. Of course, the syntax has to support it too.


> concat_tuples

I specifically talked about arbitrary tuples. Your function only handles specific tuples.

> Your second function you just broke out of the type system with Any. Give me a more exact, are you saying Any is a constrained type variable?

I'm not sure what you're trying to say here. Yes, you can implement the function, I never claimed otherwise. I'm talking about the _type system_.

> The thing is because these type checkers are external, anyone can add arbitrary features to them and extend it.

The type system is a core feature of Python. Until it supports what I want, I can't really use what I want with Python.


>I'm not sure what you're trying to say here. Yes, you can implement the function, I never claimed otherwise. I'm talking about the _type system_.

The type constrains the definition. Look at the definition, there is ONLY one possible definition when converting this:

    Dict[str, T] -> Dict[str, Callable[[], T]]
Literally. Try to think of another way to define the lambda; you can't. The more open the type, the more constrained the definition; the more constrained the type, the more open the definition.

> I specifically talked about arbitrary tuples. Your function only handles specific tuples.

I remarked on this in other replies. How will you define the return value of the concat and have it be exactly type-correct? It's not possible, bro. You can't do much with tuples of arbitrary types, not unless you have dependent types.
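For what it's worth, the usual workaround in Python today is to enumerate arities with @overload; this is a sketch of mine, not a general solution, and it obviously doesn't scale to arbitrary lengths:

```python
from typing import Any, Tuple, TypeVar, overload

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")
D = TypeVar("D")


@overload
def concat(t1: Tuple[A], t2: Tuple[B]) -> Tuple[A, B]: ...
@overload
def concat(t1: Tuple[A], t2: Tuple[B, C]) -> Tuple[A, B, C]: ...
@overload
def concat(t1: Tuple[A, B], t2: Tuple[C, D]) -> Tuple[A, B, C, D]: ...
def concat(t1: Tuple[Any, ...], t2: Tuple[Any, ...]) -> Tuple[Any, ...]:
    # One runtime implementation covers every enumerated signature.
    return t1 + t2


print(concat((1, "x"), (2.0, True)))  # (1, 'x', 2.0, True)
```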

>The type system is a core feature of Python. Until it supports what I want, I can't really use what I want with Python.

Type checking in Python is an external application. The interpreter doesn't do any type checking; type checking is achieved by running another program. All kinds of additional features and custom interpretations of types can be inserted into these type checkers.


> The type constrains the definition. Look at the definition, there is ONLY one possible definition when converting this:

Yes, and if Python had a better type system, I could write a more concrete definition. Why don't you understand that?

> I remarked on this in other replies. How will you define the return value of the concat and have it exactly type correct? It's not possible bro. You can't do much with tuples of arbitrary types. Not unless you have dependent types.

Of course it's possible, the type system even supports almost everything that's necessary. You can already define what I want with a single fixed-size tuple of arbitrary size, you just can't do it with multiple ones. Why do you have this strange fixation on calling this "wrong" and "impossible"?


>Yes, and if Python had a better type system, I could write a more concrete definition. Why don't you understand that?

I understand everything. It's you who doesn't understand. What I described applies to ALL type systems in general, not just Python's.

>Of course it's possible, the type system even supports almost everything that's necessary. You can already define what I want with a single fixed-size tuple of arbitrary size, you just can't do it with multiple ones. Why do you have this strange fixation on calling this "wrong" and "impossible"?

It's not a fixation. It's because I know what I'm talking about.

Look just prove it to me. Define the type signature for concat function that will concat two tuples with variadic parameters in TS.

Make sure the return value has the exact required shape. Additionally don't resort to using Any. Also define the intersection function as well. Again, Don't use any. Should be trivial if what you say is true.


> I understand everything. It's you who doesn't understand. What I described is for ALL type systems in general. Not python types.

Then why is what you're claiming to be impossible possible in Typescript? How can that be? Is the whole world wrong, aside from you?

> Look just prove it to me. Define the type signature for concat function that will concat two tuples with variadic parameters in TS.

You mean the one I already posted yesterday? https://www.typescriptlang.org/play?#code/C4TwDgpgBAglC8UDaA...

Or, if you prefer Python, run this:

  from typing import Tuple, TypeVar, TypeVarTuple, reveal_type

  T = TypeVar("T")
  Ts = TypeVarTuple("Ts")


  def concat(element: T, tup: Tuple[*Ts]) -> Tuple[T, *Ts]:
      return (element, *tup)

  element = "foo"
  tup = (True, 42)

  reveal_type(concat(element, tup))
You said it's impossible for the return type to be dependent on the input type. Why does Python disagree with you?


Let me address the one in python.

First off we are talking about concatenating tuples.

What you're doing here is called prepending an element to a tuple. Come on, man. Have you not been reading the examples I've been posting?

Your typescript example is relevant. But like I said, it's an arbitrarily narrow thing. It's a special case. Typescript arbitrarily supports only concat, which is what I've been saying: it doesn't do type-level programming.

>Why does Python disagree with you?

There's no need to be a complete asshole. Why does not being an asshole disagree with you? First off, your Python example is completely wrong. Second, there's no point in snarky comments like that.


> Let me address the one in python.

> First off we are talking about concatenating tuples.

> What you're doing here is called preppending an element to a tuple. Like come on man. Have you not been reading the examples I've been posting.

I have no idea how you still do not understand what I am talking about.

My whole point is that Python doesn't support this. It has all the ingredients, it literally supports what you called impossible (having a function return a tuple type that's based on an arbitrary fixed-size input). Yes, the function doesn't support concatenating two tuples while being well-typed. That is what every single one of my messages was about. How do you read all my messages and not understand my basic argument?

And, please explain in great detail: if a typing system can handle prepending/appending an element to a tuple, what is the big difference to concatenating two fixed-size tuples? Why is one obviously possible, and one isn't? What is the fundamental difference?

> Your typescript example is relevant. But like I said it's an arbitrarily narrow thing. It's a special case. Typescript arbitrarily only supports only concat. Which is what I've been saying. It doesn't type level programming.

First off, you're wrong, it's not an arbitrarily narrow thing - Typescript does allow much more complex operations, the type system is literally Turing-complete.

But aside from that: I didn't say that it supports anything else. All I have been saying is: Typescript supports this thing, which Python does not. Why do you keep disagreeing and saying it's "wrong" and "impossible" when it does literally do what I've been telling you?

> There's no need to be a complete ass hole. Why does not being an ass hole disagree with you? First off your python example is completely wrong. Second off no point in saying snarky comments like that.

I have spent enough time trying to explain this to you. You're still fundamentally misunderstanding what I am talking about while acting like you know everything. You're either unable to understand it or a troll - it doesn't matter which it is.


You can't even represent `Json` in Python's type system because it would require a recursive type.

edit: I think this is actually a Python type annotation limitation but it's possible that's mypy, although I think in 99% of cases those are fine to conflate (I have used other Python type systems).


In VSCode (which uses PyRight/PyLance) and Python 3.10:

  JSONObject = None | str | int | bool | list["JSONObject"] | dict[str, "JSONObject"]
  # this type checks
  a: JSONObject = {"a": [1, 2, "7", True, {"false": None}]}
  # this doesn't type check
  b: JSONObject = {"a": [1, 2, "7", True, {"false": object()}]}


Cool, guess they finally fixed this. Must've been in the last ~1 year, give or take. Of course, it relies on quoting your types, which is... a matter of taste, I suppose.


You can also use from __future__ import annotations so the quotes become unnecessary. https://peps.python.org/pep-0563/
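One caveat worth flagging (my reading of PEP 563, not something the parent claimed): the future import defers only annotations; a recursive type alias is ordinary runtime code, so its self-reference still needs quotes. A sketch, using typing.Union for compatibility with older interpreters:

```python
from __future__ import annotations
from typing import Dict, List, Union


# With deferred evaluation, this forward reference needs no quotes
# even though JSONValue is defined further down.
def depth(value: JSONValue) -> int:
    # Nesting depth: scalars are 0, each container level adds 1.
    if isinstance(value, dict):
        return 1 + max(map(depth, value.values()), default=0)
    if isinstance(value, list):
        return 1 + max(map(depth, value), default=0)
    return 0


# The alias assignment runs eagerly, so the self-reference stays quoted.
JSONValue = Union[None, bool, int, float, str, List["JSONValue"], Dict[str, "JSONValue"]]

print(depth({"a": [1, 2]}))  # 2
```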


Does that work with recursive types? I have had mixed results with `from __future__ import annotations` personally, but I haven't written much Python in ~a year or so.


It's worked for many years, but you won't often see it used outside of class definitions because all of the other tools struggle with it (Pylint, Flake8, Pylance, etc. spit out some variation of an undefined variable error).


I don't think it has worked for years. I had used that same exact `from __future__ import annotations`, and my recollection is that recursive types still did not work.


It's worked for years. I've been using it the whole time.


Quoting types that are defined later is a wart but it's not very bad. VSCode's UI will happily handle it as if the quotes weren't there.


You can.

   from typing import Dict, List
   JSON = Dict[str, 'JSON'] | List['JSON'] | float | int | str | None
I would say the only thing ugly here is the string 'JSON' that is used for recursion. From the perspective of an external type checker, the static safety is identical.


Yes but the if you want to do ahead-of-time type checking you need to run a tool like mypy, and none of those tools are as comprehensive or performant as Typescript. Also the ecosystem of libraries with type annotations is much smaller (Typescript has the first mover advantage, after all).


I literally said it supports exhaustive pattern matching, which adds a level of safety and flexibility superior to that of TypeScript.

Many IDEs have real time type checking that highlights the errors so don't even have to run the external checker.

Even if you don't use IDEs running the type checker is measured in seconds. Not far off from linters that most people will also use for TS.

What you say about the libraries is true though. But you can always place a type "shell" around those libraries such that your code is type safe.

The other main problem with the libraries in python is that a lot of people who use python are data scientists who haven't figured out why types are so great. Those guys are the main ones holding the libraries back.


Maybe comprehensive is the wrong word to use. But Typescript is definitely more mature.

I really like to run these checks alongside black, pyupgrade, etc. in my CI pipelines. And under the hood your IDE is just running the same "external" checker and parsing its results to provide its smarts.

As for performance, early mypy versions took literally minutes on a moderately sized FastAPI codebase I maintain. The team (which includes Guido himself!) literally created a new python native compiler (mypyc) to try and improve performance of the tool.

I like types in Python but it's early days compared to a more mature tool like Typescript. IMO it's pretty clear cut if you've spent significant time using both.


>I literally said it supports exhaustive pattern matching which adds a level of safety and flexibility superior to that of type script.

I don't have an opinion on which language is superior here, I've never written typescript or even javascript before, but I think saying that python's type system/checkers is superior because of this one feature is not correct.

I'm also skeptical of the claim that typescript doesn't support exhaustive branching. This very well could be true but it seems hard to believe.

I'm a big fan of exhaustive branching, but I think there are other things that are as or more important.


>I don't have an opinion on which language is superior here, I've never written typescript or even javascript before, but I think saying that python's type system/checkers is superior because of this one feature is not correct.

I meant, it supports basically every common/useful type feature in addition to this one which is very powerful.

>I'm also skeptical of the claim that typescript doesn't support exhaustive branching. This very well could be true but it seems hard to believe.

It is true. Look it up. Very few languages support this feature. Python is able to do it because the type checker is external, so developers can create all kinds of features and move faster than core Python development. Other than Rust, basically no popular language officially has this concept.
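As a concrete illustration (my sketch; typing.assert_never only exists since Python 3.11, so it's defined locally here): with a Union and an assert_never fall-through, mypy/pyright will flag any unhandled variant at the marked line.

```python
from typing import NoReturn, Union


class Circle:
    pass


class Square:
    pass


Shape = Union[Circle, Square]


def assert_never(value: NoReturn) -> NoReturn:
    # Type checkers only allow reaching this with the Never type;
    # at runtime it's just a loud failure.
    raise AssertionError(f"Unhandled case: {value!r}")


def name_of(shape: Shape) -> str:
    if isinstance(shape, Circle):
        return "circle"
    if isinstance(shape, Square):
        return "square"
    assert_never(shape)  # add a variant to Shape and the checker errors here


print(name_of(Circle()))  # circle
```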


> sum types

Even PHP have those now. TypeScript is lagging here. Hopefully, they will include it too one day.


Surprise: all languages have types.

Superior to TypeScript is neither a high bar, nor is this any kind of objective metric. I don't know why sum types are a blessing, also I don't know why pattern matching makes anything better.

I can name a lot of problems with Python, and I'm sure that libraries aren't the only one. For example, for no reason, Python has multiple unrelated mechanisms to manage program state (objects, closures, context managers). Or, here's another one: it has multiple public APIs for dealing with the filesystem (i.e. os.path and pathlib), both of which are all sorts of bad, but just the fact that it has two, where one would do, is bad.

It's not possible to make static typing ineffective. Static typing is a given; it happens by virtue of you having a program. You may argue that it doesn't exist if unobserved, in the same way a tree that falls in the forest doesn't really fall if unnoticed. But, really, it's there, and it's there in every language. You cannot make it ineffective. It's like making centimeters ineffective -- I would struggle to imagine what that could possibly mean.


>I don't know why sum types are a blessing, also I don't know why pattern matching makes anything better.

That's because you're inexperienced and haven't used them before. Try haskell or rust. This level of type safety actually reduces logical branching errors. And the key word is <exhaustive> pattern matching.

Googling isn't going to give you the insight here imo you need the experience (probably a couple months). If you don't plan on getting it I suggest you ask a haskell developer or rust developer about why exhaustive pattern matching is such a great feature.

> Python has multiple unrelated mechanisms to manage program state

This is only slightly bad, for people who have OCD about library organization. Two ways to do the same thing exist everywhere. Do you use looping or TCO recursion? Why do some languages support both? It doesn't matter that much. This doesn't make a language horrible, just a bit bloated. Much worse is stuff like JavaScript's undefined value. Also, TS has tons and tons of libraries that do the same thing. Why isn't that setting off your "bad" red-flag instincts? Is it because they aren't in the std? So redundant commands outside the std are OK, but within the std... bad bad bad? Have you actually hit a real problem related to this, or is it just something that feels bad because of OCD?

>It's not possible to make static typing ineffective.

Categorically false. Python does have patterns and tricks which static type checkers can't catch. You struggle with meaning here because you failed to comprehend what I wrote and you're now arguing against a misinterpretation of my statements. Reread that part again, you definitely misunderstood.


How much experience do you need and what kind? I don't have articles published in IEEE journals, but don't think you can get there in few months. But I can write simple proofs in eg. Coq or TLA+, so, I probably know a thing or two about types.

My work experience is measured in decades at this point. So, maybe you want to reflect on your ideas... it does take time to appreciate both the positive and the negative sides of any given type system. I don't think a few months will be enough, if you start from an absolute blank slate. It's also silly to measure this in time, rather than effort. You probably never worked on complex problems, nor did you work on problems that require research, as opposed to copying from "best practices". This is where your conviction comes from, at least this is what it looks like.


You seem to confuse "real" state management solutions (closures, objects) with syntactic sugar like context managers. Any class implementing __enter__ and __exit__ can be used as a context manager. The protocol doesn't impose any semantics on it. As for the presence of closures in addition to objects: this is a natural consequence of having nested functions and lexical scoping. However, it is quite uncommon to use closures to manage state in Python.


Why is pathlib bad?

Edit: I'm asking because pathlib is as good as a Python lib could be for me. Path manipulations are extremely clear and always safe. What more do you need?


It's broken just as os.path is. Python doesn't work well with file names in principle: it wants everything to be Unicode. That works for many, but if you want reliable code... you just have to throw all of that away.

Also, in case of pathlib, it adds no value on top of os.path of which it is a wrapper. Instead, it made the original library it wraps worse, because now os.path also needs to know about pathlib to be able to handle path fragments represented as pathlib instances.

All in all, it offers very little utility (a handful of shortcuts) vs increasing the size and memory footprint of "standard" library, complicating dispatch and therefore debugging... it's a bad trade.

Just so you don't get confused: it's not an awful trade. It's not like the sky will fall down on you if you use it. It's just mostly worthless, with negligible downsides.


> Python doesn't work well with file names in principle: it wants everything to be Unicode. That works for many, but if you want reliable code... you just have to throw all of that away.

Windows' APIs use UTF-16 and most file name encodings on Linux are UTF-8. How should Python handle this better?

> Also, in case of pathlib, it adds no value on top of os.path of which it is a wrapper.

Completely disagree. os.path is annoying to use. Treating paths as objects with methods and joining them with / makes my life much easier.

> increasing the size and memory footprint of "standard" library

By a ridiculous amount. pathlib is just another pure Python module with a bunch of simple functions and classes. [1]

> complicating dispatch and therefore debugging

You can simply declare and accept `Union[str, os.PathLike]` and convert the paths to whatever you want at the entrypoints, then use that in your own project. Where is the complexity? I've never seen this make debugging harder, it's just an additional type.
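A small sketch of that entrypoint pattern (function names are mine): accept both, normalize once with os.fspath, and use plain strings internally.

```python
import os
from pathlib import PurePosixPath
from typing import Union


def normalize(path: Union[str, "os.PathLike[str]"]) -> str:
    # os.fspath accepts a str or anything implementing __fspath__
    # and returns the str form either way.
    return os.fspath(path)


print(normalize("a/b"))                     # a/b
print(normalize(PurePosixPath("a") / "b"))  # a/b
```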

[1] https://github.com/python/cpython/blob/d9fc15222e96942e30ea8...


> most file name encodings

You just repeated what I said. Just read it again. You already know the answer, you just need to understand it.

> Completely disagree.

And make no worthwhile argument.

> You can simply declare and accept `Union[str, os.PathLike]`

What does this have to do with debugging? Do you know what "debugging" means?


I'm guessing verbosity? It also reads like they don't know why pathlib exists and assume they were created at the same time.

os.path came first, often works by poking the filesystem directly even when it doesn't seem like it needs to (vague memory, not completely certain), and I believe has os-specific quirks (so code won't necessarily work without changes).

pathlib was created later as a pure-Python implementation that allows for manipulating paths without touching the filesystem except when you explicitly tell it to. Because it's also pure Python I don't think it has any os-specific quirks either, but I haven't explored it in depth. Code should work across operating systems without changes.

I think I also remember at one point people talking about completely replacing os.path with pathlib, or at least gutting os.path to the essentials that wouldn't work as part of pathlib.
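For illustration (my sketch), the pure-path classes make the "no filesystem access" point concrete: you can even manipulate paths for a different OS than the one you're running on.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths never touch the filesystem, so foreign paths are fine:
p = PureWindowsPath(r"C:\Users\me\notes.txt")
print(p.suffix)  # .txt
print(p.stem)    # notes

# Re-root the file name onto a POSIX-style path:
print(PurePosixPath("/srv") / "data" / p.name)  # /srv/data/notes.txt
```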


I've been writing in Python for over ten years, in different roles, for wildly different projects (research, infra, Web, testing, education).

I'm yet to find anything Python was good for. On engineering merits alone Python isn't best for anything, nor is it best for combinations of things. It's silly to think that any tool that works with Python does so because Python was the best language for the job, and they only needed that tool to make it even better.

Most stuff written around Python is yet another layer of band aids on top of a huge ball of band aids that's already there.

So, you may wonder, with all those band aids, didn't they all make it better? And, in a sense, yes, that's what they are for. The band aids improve the experience of the user end. But this isn't how I use the word "better". When I use "better" I mean the quality of execution, not the satisfaction it gives to the user.


I've been writing python for a longer amount of time.

What makes Python popular is its ease of use and debugging. It may not have been type-safe, but every time it crashed it was extremely easy to find out where and why. This is not the case for languages like JavaScript, or even for something like Go, which is ironically type-safe. I like to emphasize this here: it is more important for a language to be extremely clear about the origin and nature of a runtime error than it is to be statically type-safe. Look at C++. It has powerful static typing but is marred by segfaults and memory leaks, which are hidden errors that are extremely hard to tease out.

That being said, modern Python now has types. The bandaids on top of Python, by sheer coincidence, turn it into a powerful, scalable language for the web rather than an ugly patchwork.

The main problems with Python right now are performance and the library ecosystem. Performance is of course mainly the GIL, and libraries are often written with tricks that make static typing impossible. It's just old tech debt getting in the way, not the modern language itself. Linters should prevent anyone starting new code from using those patterns again.

Other than that, Python's type system and checkers support even sum types with exhaustive pattern matching, which, combined with the extreme ease of debugging runtime errors, makes Python one of the best languages for big projects. That level of safety isn't even offered by Go or TypeScript. The only alternatives for sum types are Rust or Haskell.

I highly disagree with you. Python is not without old tech debt warts holding it back, but overall it is a superior language.


Python is always the second-best language for the job. Which makes it a great language to know.


Exactly. It’s not about asking if Python is the best tool. It’s about looking at the alternatives. Python comes out ahead of all alternatives, save for 1-3 others. And those others might not be feasible for a host of other reasons. That’s how you end up with Python. And it’s fine!


Your claim is covered by this part of the post you replied to:

> nor is it best for combinations of things.

In other words, you are wrong. No, Python isn't second best, nor is it anywhere in the top-ten. It's worthless as a language, as in it's not worth paying attention to if you are interested in how you can solve programming problems by making better languages. It's worth knowing Python if you want a programming job, and this is where it shines. Anything else about it is either "meh" or just hands down awful.


I’m not really a fan of Python as such, but after a few decades in the industry, I’m beginning to think that being good at being bandaid is “better”. I can’t think of a single tech where we don’t have a bunch of duct tape (as we refer to it), not so much because we want to but because that’s just how things end up in the imperfect world of organisations. I value the techs that fit into this reality more than the ones which don’t, but you’re right, Python isn’t really great for any technology based merits, it’s good because the world is a messy place where being productive with your band aid is often more valuable than using the “better” programming language.


Your mistake is going from quantitative claims to categorical without evaluating the quantitative part first.

You say "everything is a band aid to some degree", and from that you conclude that "everything is equally bad or good". You conveniently forget the "to some degree" part to further your point.

It's similar to saying that all food has some amount of dust in it, so it shouldn't matter whether you just open the vacuum cleaner and eat the stuff collected on the drum, or if you order a meal at a fancy restaurant.

There's no evidence that Python "fits into reality" more than any other language. The thing it has going for it is popularity. Popularity doesn't need to be rooted in technical merits, and in the case of Python it isn't. Finally, Python isn't unique in this sense. Even though the stage for the programming language popularity pageant was set relatively recently, the participants learned to abuse the rules of the competition very quickly. There's a "rule" by a statistician whose name I cannot recall at this point which states that once people know the metric they are measured on, they will learn to game it. When we assess language popularity, we mostly assess how well the language's authors or community were able to game the metric, rather than measuring any meaningful aspect of those languages.


Old proverb that I think applies here: A jack of all trades is a master of none, but oftentimes better than a master of one.


This is very subjective, but I think Python is good at syntax. If I had to teach non-programmers programming, it would easily be my number one choice. The included libraries are also pretty good.



