b = a[:] 0.039ms
b = list(a) 0.085ms
b = copy(a) 0.187ms
b = deepcopy(a) 10.592ms
The type constructor list will convert any sequence into a list and will preserve order. If you pass it a list, all it does is return the sequence using the slice operator anyway. It is slower because of the type checking, but it is implemented in C. So you can think of list() as just [:] with a type cast - no need to call it again if you know you have a list.
copy and deepcopy are implemented in Python, and are generic functions that attempt to sniff the type of the object to be copied. They will use the __copy__ magic method of the object if it exists, so you can override it in your objects with return self[:]. You need to use these if you have a generator, a list of non-basic types (such as lists of lists, or lists of tuples, or lists of objects). Both functions use a module-level cache and deepcopy will iterate and apply copy
there is very little performance degradation by aliasing copy to deepcopy and using it everywhere, although it could save you time by catching bugs. (Edit: scratch that, I got my benchmark wrong - deepcopy will still be slow even if you pass it a shallow list, see comment below, thanks tedunangst)
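A minimal sketch of the `__copy__` hook mentioned above, written in Python 3 syntax. The `Playlist` class and its `tracks` attribute are made up for illustration; the only real API assumed is that `copy.copy()` calls an object's `__copy__` method when it defines one.

```python
import copy


class Playlist:
    """Hypothetical container, used only to illustrate the __copy__ hook."""

    def __init__(self, tracks):
        self.tracks = list(tracks)

    def __copy__(self):
        # copy.copy() calls this; build a new Playlist with its own track list.
        return Playlist(self.tracks)


original = Playlist(["a", "b"])
clone = copy.copy(original)
clone.tracks.append("c")

print(original.tracks)  # ['a', 'b']
print(clone.tracks)     # ['a', 'b', 'c']
```

Without the hook, `copy.copy()` would produce a new object whose `tracks` attribute is still the same list as the original's.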
Read the source of copy and deepcopy so you can understand them and can implement your own custom version for more advanced types. Find the file:
>>> import copy
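One way to locate that file, as a sketch in Python 3 syntax: the standard `inspect.getsourcefile()` function returns the path of the `.py` file a module was loaded from. The printed path depends on your installation, so no output is shown.

```python
import copy
import inspect

# Path of the Python source file that defines the copy module.
source_path = inspect.getsourcefile(copy)
print(source_path)
```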
 Benchmark times taken from: http://stackoverflow.com/questions/2612802/how-to-clone-a-li... which I had bookmarked as a reference
One way to catch deepcopy bugs might be to create an autocopy function which can detect if it is a 'shallow' object and use copy, or if not use deepcopy.
I am going to try and write an implementation that doesn't slow it down too much. It might be worthwhile since copy bugs are so common in Python projects.
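A rough sketch of what such an autocopy might look like, in Python 3 syntax. This is only one reading of the idea, not a finished implementation: the `_ATOMIC_TYPES` tuple is an assumption about what counts as "shallow", and the scan over the elements has its own cost on long lists.

```python
import copy

# Assumption: elements of exactly these immutable types never need deep copying.
_ATOMIC_TYPES = (int, float, complex, bool, str, bytes, type(None))


def autocopy(obj):
    """Shallow-copy a flat list of atomic values, otherwise fall back to deepcopy."""
    if type(obj) is list and all(type(item) in _ATOMIC_TYPES for item in obj):
        return obj[:]
    return copy.deepcopy(obj)


flat = [1, 2.5, "three", None]
nested = [[1, 2], [3, 4]]

flat_copy = autocopy(flat)
nested_copy = autocopy(nested)

nested[0].append(99)
print(nested_copy[0])  # [1, 2]
```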
~$ python -S -mtimeit -s "a = list(range(10))" "a[:]"
1000000 loops, best of 3: 0.198 usec per loop
~$ python -S -mtimeit -s "a = list(range(10))" "list(a)"
1000000 loops, best of 3: 0.453 usec per loop
~$ python -S -mtimeit -s "a = list(range(100000))" "a[:]"
1000 loops, best of 3: 675 usec per loop
~$ python -S -mtimeit -s "a = list(range(100000))" "list(a)"
1000 loops, best of 3: 664 usec per loop
The only caveat here is if you are copying a list of lists, you have to use copy.deepcopy() if you want the lists inside your lists to actually be copied.
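To make the caveat concrete, here is a small sketch in Python 3 syntax (the variable names are just for illustration). A slice copies the outer list only, so the inner lists are still shared with the original.

```python
import copy

grid = [[1, 2], [3, 4]]
shallow = grid[:]            # new outer list, same inner lists
deep = copy.deepcopy(grid)   # new outer list and new inner lists

grid[0].append(99)

print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]
```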
>>> import timeit
>>> t1 = timeit.Timer('copy.copy(orig)','import copy;import random;orig = [random.randint(0,255) for r in xrange(100000)];')
>>> t2 = timeit.Timer('orig[:]','import copy;import random;orig = [random.randint(0,255) for r in xrange(100000)];')
>>> t3 = timeit.Timer('list(orig)','import copy;import random;orig = [random.randint(0,255) for r in xrange(100000)];')
>>> print t1.timeit(10000)/10000
>>> print t2.timeit(10000)/10000
>>> print t3.timeit(10000)/10000
>>> t1 = timeit.Timer('copy.copy(orig)','import copy;import random;orig = [random.randint(0,255) for r in xrange(10)];')
>>> t2 = timeit.Timer('orig[:]','import copy;import random;orig = [random.randint(0,255) for r in xrange(10)];')
>>> t3 = timeit.Timer('list(orig)','import copy;import random;orig = [random.randint(0,255) for r in xrange(10)];')
>>> print t1.timeit(10000)
>>> print t2.timeit(10000)
>>> print t3.timeit(10000)
This implies more setup cost for copy() and list(), but after that they're faster.
Also, the minor difference between slice and list() for large lists is likely platform-dependent and highly sensitive to the details of branch prediction and cache.
This caught me bad once. I wish it were on the page linked in the post (and others like it).
The irony being that in Perl, a list copy looks like this:
@dest = @src
I don't know many non-Python programmers who like to sit down of an evening, fire up their e-reader, and peruse a few hundred lines of beautiful Python code.
We all learn a natural language and its grammar, and use it daily. It's very important for a machine language to be as cognitively compatible with a natural language as possible (aka readability), because that is extremely important for productivity.
p.s. The above is particularly relevant to Python, because it (justifiably) prides itself on its emphasis on readability. To me, even though Python is 5 to 30 times slower than Java, it holds the keys to the future because of its commitment to readability. In the long run, that is the single most important feature of any language - other fundamentals are necessary, but not sufficient to ensure a language's longevity, and can be fixed more easily.
The most relevant one I recently encountered was working with a team and needing to get some numbers crunched - I had knowledge of scipy and was able to show the code to others who hadn't used Python but could still understand it.
Readability describes how clear, concise, and unambiguous a language's syntax is, and likewise the code written in it. It's not about sharing your code with others who don't know the language; it's about making actual use of, and collaboration in, the language (in a shared project) better.
If you don't know that `[:]` relates to slices then you might start by reading http://docs.python.org/tut
Em, you got it wrong. Readability is not about people "not knowing your language". It's about people knowing your language and having to read your code at a later point.
The problem with a language with poor readability is that it is hard to read even your own code written in it, because the syntax is ambiguous and funky and it involves a large mental overhead.
"Isn’t it better, less cryptic, and more pythonic? a[:] feels a bit too much like Perl. Unlike with the slicing notation, those who don’t know Python will understand that b contains a list."
b = list(a)
Possibly this is because I've known lisp longer than python.
The s[:] gets called faster (builtin syntax dispatches directly) than list(s), which requires a global lookup. Once called, though, they both run the same underlying code and are therefore equally fast when it comes to the actual copying.
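One way to see the lookup difference for yourself, as a sketch assuming CPython and Python 3 syntax. The two helper functions are hypothetical names; `co_names` is the tuple of global and attribute names a compiled function refers to.

```python
def copy_with_slice(s):
    return s[:]


def copy_with_list(s):
    return list(s)


# Names each function has to look up at run time.
slice_names = copy_with_slice.__code__.co_names
list_names = copy_with_list.__code__.co_names

print(slice_names)
print(list_names)
```

The slice version needs no name lookup at all, while the `list(s)` version has to resolve the name `list` on every call.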
In Python 3.3, we're adding list.copy() and list.clear() because so many people were having issues with the [:] notation for copying and clearing.
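For reference, a short sketch of the two methods named above (Python 3.3 or later):

```python
a = [1, 2, 3]
b = a.copy()   # shallow copy, equivalent to a[:]
a.clear()      # empties the list in place, equivalent to del a[:]

print(a)  # []
print(b)  # [1, 2, 3]
```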
from copy import copy
b = copy(a)
I've been working with Python for a year now, and I don't claim to have swum its depths.
l = list(input_data)
# code that uses list-specific stuff and returns a result
Obviously as with anything the choice of list copy method is situation-dependent. Using [:] makes sense if you can guarantee the input is a list and you need maximal speed. Using copy() makes sense if you just want a copy of the input object and don't specifically care that the copy is itself a list. Using list() makes sense if you want to be able to take in all kinds of input values and be assured that the copy is a list. Use what is best for the situation at hand.
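A small sketch of how the three approaches differ when the input is not a list, in Python 3 syntax. The `describe` helper is made up for illustration; it reports the type each approach returns.

```python
import copy


def describe(value):
    """Return the result type names for slicing, copy.copy() and list()."""
    return (
        type(value[:]).__name__,
        type(copy.copy(value)).__name__,
        type(list(value)).__name__,
    )


print(describe([1, 2, 3]))  # ('list', 'list', 'list')
print(describe((1, 2, 3)))  # ('tuple', 'tuple', 'list')
print(describe("abc"))      # ('str', 'str', 'list')
```

Only `list()` guarantees that the result is a list; the other two preserve the input's type.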
Using list(gen) will fully consume the generator, which may be undesirable, but at least it will generate a list with the correct values.
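A quick illustration of that consumption, in Python 3 syntax (the names are arbitrary):

```python
squares = (n * n for n in range(4))

first = list(squares)   # runs the generator to completion
second = list(squares)  # nothing left to yield

print(first)   # [0, 1, 4, 9]
print(second)  # []
```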
So I don't think you gain any readability over the slicing syntax.
There is absolutely nothing wrong with [:]. This kind of judgmental critique of code is completely absurd IMO. If someone doesn't know something as basic in Python as slices, he or she shouldn't be reading Python code. If by all means you want to have people who don't know Python reading and understanding your code somehow, try copy.copy(). It's not like the performance hit is meaningful compared to list() and [:] is the fastest of the three.
a = b[:]
list1 = list2[:]
The problem is, even with a simple knowledge of slices, someone reading your code may not realise that list1 is actually a new list; you are depending upon a side-effect of the language. If you used the copy module, or even the list constructor, it would have been much clearer that list1 is a new list. This increases readability, which is very important if other people, or even yourself in the future, re-read the code. Personally, I prefer readability in really long programs, and it isn't like you're programming C on an embedded device here.
With that said I must also state that I think
a = list(b)
list(a), copy.copy(a) are more explicit.
Unless you use the stride (the third argument to the slice, which defaults to 1). If that is negative, then everything goes the opposite way, as in foo[::-1], the reverse of foo.
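A few stride examples as a sketch, in Python 3 syntax, using the `foo` name from the comment above:

```python
foo = [0, 1, 2, 3, 4, 5]

every_other = foo[::2]     # positive stride walks left to right
backwards = foo[::-1]      # negative stride walks right to left
partial = foo[4:1:-1]      # start at index 4, stop before index 1

print(every_other)  # [0, 2, 4]
print(backwards)    # [5, 4, 3, 2, 1, 0]
print(partial)      # [4, 3, 2]
```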
- a = b <= doesn't copy! it's just a reference
- a = list(b) <= works!
- <other methods if you like>
If you want to be truly explicit, why not use the built-in copy module?
from copy import copy
b = copy(a)  # or deepcopy(), depending on your needs
b = a[:]
I really don't buy the argument about languages being intuitive to people who don't know the language. Languages should be optimized for people who know the language. I think Matz has mentioned it somewhere regarding "principle of least surprise" and Ruby.
Or you could just use list comprehensions :)
b = [x for x in a if somecondition(x)]
b = a.slice()
b = Array.apply(null, a)
The built-in reverse() method will do it in place.
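A sketch of the difference between reversing in place and taking a reversed copy, in Python 3 syntax. It assumes the method meant here is `list.reverse()`.

```python
foo = [1, 2, 3]
alias = foo

result = foo.reverse()   # mutates foo; every alias sees the change
reversed_copy = foo[::-1]  # new list, foo itself is untouched by this line

print(result)         # None
print(alias)          # [3, 2, 1]
print(reversed_copy)  # [1, 2, 3]
```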
What if you wrote `b = b + []`, some languages might just append to b, and not create a new list (Python seems to create a new string). Slices can still be seen as having the same problem. Really you should be using the copy module or the list constructor, which have the implicit guarantee of a new list.
The reason why list(x) is better than x + [] is that list(x) works regardless of what type of iterable x is. x + [] only works on lists.
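A sketch of that difference in Python 3 syntax, taking the concatenation in question to be `x + []`. The two helper names are hypothetical.

```python
def copy_by_concat(x):
    return x + []


def copy_by_constructor(x):
    return list(x)


print(copy_by_constructor((1, 2)))    # [1, 2]
print(copy_by_constructor(range(3)))  # [0, 1, 2]

try:
    copy_by_concat((1, 2))
    concat_failed = False
except TypeError:
    # A tuple cannot be concatenated with a list.
    concat_failed = True

print(concat_failed)  # True
```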
A good analogy is probably assuming that pointer sizes are the same as int sizes in C. This assumption was safe for many years, but broke when 64-bit came along. Slices and adding lists will probably always return new lists in Python, but it is still good not to depend on such behaviour.
"Slices and adding lists will probably always return new lists in Python, but it is still good not to depend on such behaviour."
Not buying this. The behavior is documented and Python has a deprecation cycle for changes in documented behavior.
The reason to write list(x) is because that's the generic way to turn anything into a list. It's not for being future proof or being easy for newbies to understand. It's because that's the right way to do it.
b[:] = a # copy a into b, while b keeps its identity.
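A sketch of what "keeps its identity" means here, in Python 3 syntax. The `alias` name is added for illustration: any other reference to b sees the new contents, because b is still the same object.

```python
a = [1, 2, 3]
b = [9, 9]
alias = b

b[:] = a  # replace the contents of b without rebinding the name

print(alias)       # [1, 2, 3]
print(alias is b)  # True
print(b is a)      # False
```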
all = range(10)
allbut2 = all.remove(2)
Actually removes 2 from all as well. Hence, you have to copy lists a lot if you are doing a lot of list creation or change from a master list.
It follows the convention that methods that modify their object in place should return None. list.pop() is an obvious exception.
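A sketch of that convention in Python 3 syntax. The names `master` and `allbut2` are illustrative (the latter borrowed from the example above).

```python
master = list(range(5))

result = master[:].remove(2)  # remove() mutates its list and returns None
print(result)                 # None

allbut2 = master[:]           # copy first, then mutate the copy
allbut2.remove(2)
print(master)                 # [0, 1, 2, 3, 4]
print(allbut2)                # [0, 1, 3, 4]

popped = allbut2.pop()        # pop() is the exception: it returns the removed item
print(popped)                 # 4
```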
You create a new list via list comprehension instead of copying and then removing:
even = [i for i in L if i % 2 == 0] # remove odd numbers
all = range()
allbut2 = all
For example, in Bravo, there are exactly eight list copies, all in contributed code which I didn't write, and another six in Exocet, which I also didn't write. The ones in Exocet are probably required, but the others are from code that was written without forethought.
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
[1, 2, 3]
[1, 2, 3]
>>> c = [4, 5, 6]
>>> d = c
>>> print c, d
[4, 5, 6] [4, 5, 6]
>>> c = [7, 8]
>>> print c, d
[7, 8] [4, 5, 6]
>>> e = c
>>> print c, d, e
[7, 8] [4, 5, 6] [7, 8]
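The same session as a sketch in Python 3 syntax, extended by one mutation to show the other half of the story: rebinding a name leaves older aliases alone, while mutating a list is visible through every name bound to it.

```python
c = [4, 5, 6]
d = c          # d and c name the same list
c = [7, 8]     # rebinding c does not touch the list d refers to
e = c          # e and c now name the same new list

print(c, d, e)  # [7, 8] [4, 5, 6] [7, 8]

c.append(9)    # mutation is visible through e, but not through d
print(e)        # [7, 8, 9]
print(d)        # [4, 5, 6]
```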