But you can change an immutable object:
Well, you can get this to work with a copy-on-write implementation. I don't know enough about the Python runtime to decide whether that's feasible or not.
> if you want to append an item to a list of length 8, Python will resize it to 16 slots and add the 9th item
I find it interesting that Python has decided to treat its list resizing algorithm as part of the formal API…
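For what it's worth, the over-allocation is observable from pure Python (as a CPython implementation detail, not a guarantee): `sys.getsizeof` stays flat between reallocations and then jumps.

```python
import sys

# CPython over-allocates list storage so repeated appends stay amortized O(1).
# sys.getsizeof shows the size holding steady between reallocations, then jumping.
lst = []
sizes = []
for i in range(20):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

print(sizes)  # runs of the same size with occasional jumps
```

The exact growth pattern varies between CPython versions, so only the shape (plateaus and jumps) is portable.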
x = []
y = []
y2 = y
y.append(1) # Should also update y2, but not x
Copy-on-write might make a difference in the case of large non-empty lists, but then there's another problem: you would need to trigger the copy even if something nested within the list changed, e.g. if `x = [[1, 2], 3]` and you mutate the inner list. One way around this would be to allow copy-on-write only for lists of immutable things (excluding immutable things that contain mutable things, e.g. `x = [([1, 2], 3), 4]`). This is all a really significant change from the current object model.
x = [[1, 2], 3]
y = [[1, 2], 3]
x[0][0] = 9 # the nested list needs to notify its owner(s)
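To make the trade-off concrete, here is a minimal sketch of what a copy-on-write list wrapper might look like (entirely hypothetical — nothing like `CowList` exists in CPython): the buffer is shared until the first mutation, and the nested-mutation problem from above is exactly what it cannot catch.

```python
class CowList:
    """Hypothetical copy-on-write list: shares its buffer until first write."""

    def __init__(self, items=()):
        self._buf = list(items)
        self._shared = False

    def copy(self):
        # No copying yet: both views point at one buffer, marked as shared.
        other = CowList()
        other._buf = self._buf
        self._shared = other._shared = True
        return other

    def _unshare(self):
        if self._shared:
            self._buf = list(self._buf)  # the actual copy happens here
            self._shared = False

    def append(self, item):
        self._unshare()
        self._buf.append(item)

    def __getitem__(self, i):
        return self._buf[i]

    def __len__(self):
        return len(self._buf)

# The nested-mutation problem remains: if an element of the shared buffer is
# itself mutable, changing it in place bypasses _unshare() entirely.
y = CowList([1, 2])
y2 = y.copy()
y.append(3)  # triggers the copy; y2 is unaffected
```

The flag-based sharing here is also deliberately conservative (a view may copy even after its partner already unshared), which hints at the bookkeeping a real implementation would need.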
> a = ['non','empty','list']
> del a
> b = ['new', 'list']
To your original point (that tuples reallocate based on length): if I delete a tuple of length 3 and then create a tuple of length 5, the id changes immediately. So that's correct.
Lists, on the other hand, seem to keep getting reallocated at the same address in my limited test.
Strings seem to behave like tuples. When I delete a string and create a new one it creates a new object with a new address.... unless the strings are of the same length.
Perhaps this is no real revelation; I'm rather new to Python and spending my time poking around to see how it works. :)
Without the `del a`, it does, because they both have active references. If they were unique objects, we'd see a unique ID for `b` as long as a reference to `a` is active.
> Strings seem to behave like tuples. When I delete a string and create a new one it creates a new object with a new address.... unless the strings are of the same length.
String _objects_ (as opposed to variables referring to them) are immutable in Python. They tend to be allocated anew, but for optimization reasons you can end up with cases where the string objects have the same ID. Like here:
>>> a = 'asdf'
>>> first = id(a)
>>> a = 'qwerty'
>>> a = 'asdf'
>>> id(a) == first
True
'asdf' gets the same ID with no `del` involved because CPython interns short, identifier-like string literals, so the original object is kept alive and reused.
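You can see the interning directly. Literal sharing is a CPython implementation detail; `sys.intern` makes it explicit and guaranteed:

```python
import sys

# Identifier-like string literals compiled in the same module share one
# interned object in CPython (implementation detail -- don't rely on it).
a = 'asdf'
b = 'asdf'
print(a is b)  # True in CPython

# sys.intern makes the sharing explicit, even for strings built at runtime:
s1 = sys.intern('hello world!')
s2 = sys.intern('hello' + ' world!')
assert s1 is s2
```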
Below are some relevant links if you want to knock yourself out (some do), but I write an awful lot of Python, and even for me this is well into the realm of "what happens when..." trivia for after a few beers or a job interview.
More generally, mutable objects encourage unnecessarily side-effecty code.
Regular lists seem to work perfectly for this
I find Python's core data structures really convenient and easy to use, but sometimes the C++ programmer in me wishes I had more options available (e.g. Python has lists and tuples, while the C++ standard library has doubly linked lists, arrays, tuples, vectors, and double-ended queues), the idea being that I can choose something that very closely matches my usage patterns.
Having said that, I’ve never actually found it to be a limitation in real Python code and when memory or performance matters enough that these differences start to show, Python is usually no longer the best tool for the job anyway, so it doesn’t really need these things. Still, sometimes I feel like I want them anyway, probably just because I’ve done enough C++ to be used to having them :)
Of course, an argument could be made that looking just at the language (without third-party libraries) is unrealistic and useless, and it's true. You could also argue that Python libraries are easier to install and use than C++ libraries (which is also true), but when you look at the languages in isolation, Python only has a few basic structures and C++ has multiple ones.
Again, I acknowledge that this is an academic discussion since in real life, I’ve never actually seen it matter and, when it does, libraries like numpy that delegate the work to C exist and are easy to use.
Also, having these specific structures in the standard library would suggest that this kind of tuning is worthwhile: worrying about the overhead of a doubly linked list in a language where a plain variable lookup already goes through a Namespace object backed by a variation of a HashMap.
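That point about name lookup is directly observable: module-level names really do live in a dict (function locals are the exception — CPython compiles those into array slots). A small sketch, assuming it runs at module level:

```python
# Module-level assignment is (conceptually) a dict store: the name lands in
# the module's namespace mapping, which globals() exposes directly.
counter = 41
assert globals()['counter'] == 41

globals()['counter'] = 42  # writing through the dict rebinds the name
assert counter == 42
```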
Seriously: whenever you need a fast solution, Python doesn't stop you from building something in C++. But trying to turn Python into C++ is either an incredibly long shot or a sign of having no idea what Python is actually useful for.
E.g. points on a plane (x,y) and an algorithm that sums them to find some center etc.
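For that example, plain `(x, y)` tuples and a coordinate-wise mean are all you need:

```python
# Points as plain (x, y) tuples; the "center" is the coordinate-wise mean.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]

cx = sum(x for x, _ in points) / len(points)
cy = sum(y for _, y in points) / len(points)
center = (cx, cy)
```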
A type-homogeneous, variable-length sequence (List[int]) should be a list/generator/etc., or in some cases a tuple (for example, if it was declared explicitly). Another great place for a tuple: the default argument for something that takes a Sequence, since defaulting to an empty list can be dangerous.
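The danger with the empty-list default is the classic shared-mutable-default trap, which an empty tuple sidesteps because it cannot be mutated in place:

```python
# The default list is created once, at function definition time, and then
# shared across every call that relies on the default.
def bad_collect(item, acc=[]):
    acc.append(item)
    return acc

# An empty tuple default is safe: it cannot be mutated in place.
def good_collect(item, acc=()):
    return list(acc) + [item]

print(bad_collect(1))   # [1]
print(bad_collect(2))   # [1, 2]  <- surprise: state leaked between calls
print(good_collect(1))  # [1]
print(good_collect(2))  # [2]
```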
However, for immutable, fixed-length, type-heterogeneous sequences (which is where tuples are often used, implicitly as return types etc.), a namedtuple (Python's lightweight record type) is almost always superior (the only question is "is it worth it to define this"), and for mutable, fixed-length, type-heterogeneous sequences (i.e. points you can modify), dataclasses/attrs (standard-library dataclasses in Python 3.7+, attrs in Python 2.7+) are almost certainly preferable.
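Both options in one short sketch — a namedtuple keeps full tuple behavior (unpacking, indexing, comparison) while naming the fields, and a dataclass gives you the mutable equivalent:

```python
from collections import namedtuple
from dataclasses import dataclass

# Immutable, fixed-length, heterogeneous: a namedtuple.
Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)
assert p.x == 1 and p == (1, 2)  # still compares equal to a plain tuple
x, y = p                         # still unpacks like a plain tuple

# Mutable, fixed-length, heterogeneous: a dataclass.
@dataclass
class MutablePoint:
    x: float
    y: float

mp = MutablePoint(1.0, 2.0)
mp.x = 5.0
```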
Basically, when you have a known number of attributes, you can give them names better than `x[0]` and `x[1]`.
And the big problem with namedtuples is that the definition is so heavyweight: it requires a new line of code, a new class object, and picking the scope for that class (top-level means keeping track of a definition far from its use, while locally defined namedtuples often block pickling).
I use argparse.Namespace, but it has some of the memory issues namedtuple was invented to solve.
You can use a pandas Series, which has advantages, especially with many similar objects, but it brings in a new dependency and has some of the same definitional problems.
Python is crying out for a lightweight namedtuple for this purpose.
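The closest thing already in the standard library is `types.SimpleNamespace`: an ad-hoc record with named fields and no class-definition ceremony. It doesn't solve the memory issue, though — like argparse.Namespace, every instance carries a full `__dict__`, which is exactly the overhead namedtuple and `__slots__` classes avoid.

```python
from types import SimpleNamespace

# An ad-hoc record: named fields, no class definition required.
pt = SimpleNamespace(x=1, y=2)
pt.x = 5  # mutable, unlike a namedtuple

# The cost: every instance carries a per-instance __dict__.
assert pt.__dict__ == {'x': 5, 'y': 2}
```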
In : class A:
...:     __slots__ = ('a', 'b')
...:     def __init__(self):
...:         self.a = 1
...:         self.b = 2

In : get_size(A())

In : get_size((1, 1))
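The transcript above doesn't show where `get_size` comes from; a minimal stand-in (assuming a shallow measurement like the standard library's `sys.getsizeof`, which does not count contained objects) makes the same comparison reproducible:

```python
import sys

class A:
    __slots__ = ('a', 'b')  # no per-instance __dict__

    def __init__(self):
        self.a = 1
        self.b = 2

# sys.getsizeof reports shallow sizes only: the contained ints are not counted.
slots_size = sys.getsizeof(A())
tuple_size = sys.getsizeof((1, 1))
dict_size = sys.getsizeof({'a': 1, 'b': 2})

print(slots_size, tuple_size, dict_size)  # exact numbers vary by CPython version
```

The point is the ordering, not the exact bytes: a `__slots__` instance and a small tuple are both far cheaper than a dict with the same two entries.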
But for performance reasons you'd probably want to use a numpy array to store your points if you're going to be doing a lot of operations over them.
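For the points-on-a-plane example, that would look something like this (assuming NumPy is available — it's a third-party dependency):

```python
import numpy as np

# All points in one contiguous float array: one row per point.
pts = np.array([[0.0, 0.0],
                [2.0, 0.0],
                [1.0, 3.0]])

center = pts.mean(axis=0)                      # coordinate-wise mean
dists = np.linalg.norm(pts - center, axis=1)   # each point's distance to it
```

Beyond speed, the array layout means operations like the mean or the distances are single vectorized expressions instead of Python-level loops.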
On the other hand, I scoff when I see PHP used for anything whatsoever. It is a god-awful language with terrible performance in high-usage practice. /rant