
Why perl has separate arrays and hashes: It's as if they thought it through - mst
http://altreus.blogspot.com/2011/06/its-as-if-they-thought-it-through.html
======
haberman
Arguing that this cannot reasonably be done is unconvincing when well-designed
languages like Lua pull it off.

The only thing about Lua's tables that consistently confuses people is the
behavior of the length operator in the presence of "holes." The previous,
somewhat confusing definition in Lua 5.1
(<http://www.lua.org/manual/5.1/manual.html#2.5.5>) has been replaced with:

    
    
        The length of a table t is only defined if the table is a
        sequence, that is, all its numeric keys comprise the set
        {1..n} for some integer n. In that case, n is its length.
        Note that a table like
    
         {10, 20, nil, 40}
    
        is not a sequence, because it has the key 4 but does
        not have the key 3. (So, there is no n such that the
        set {1..n} is equal to the set of numeric keys of
        that table.) Note, however, that non-numeric keys do
        not interfere with whether a table is a sequence.
    
        A program can modify the behavior of the length
        operator for any value but strings through the __len
        metamethod.
    

(<http://www.lua.org/work/doc/manual.html#3.4.6>)
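To make the quoted rule concrete, here is a rough sketch in JavaScript (not Lua, and not how Lua implements it). It models a table as a Map and checks whether the numeric keys are exactly {1..n}. The helper name `sequenceLength` is made up for this illustration.

```javascript
// Hypothetical helper: models a Lua table as a JS Map and applies the
// quoted definition of "sequence". This is not Lua's implementation.
function sequenceLength(table) {
  // Only numeric keys matter; non-numeric keys do not interfere.
  const numericKeys = [...table.keys()].filter((k) => typeof k === "number");
  const n = numericKeys.length;
  // Map keys are unique, so n distinct integers that all lie in 1..n
  // must cover the whole set {1..n}.
  const isSequence = numericKeys.every(
    (k) => Number.isInteger(k) && k >= 1 && k <= n
  );
  return isSequence ? n : undefined; // undefined: length is not defined
}

const withHole = new Map([[1, 10], [2, 20], [4, 40]]); // like {10, 20, nil, 40}
const proper = new Map([[1, 10], [2, 20], [3, 30], ["name", "x"]]);

console.log(sequenceLength(withHole)); // undefined
console.log(sequenceLength(proper)); // 3
```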

~~~
alayne
Not only does Lua have the length problem, but array offsets are a matter of
convention. Lua chooses 1 as the first element of an array instead of 0. But
that is just convention. Your code could use 0 or -100. A confusing
implementation of length and offsets by convention instead of using nearly
universal array semantics from other languages is not "well-designed". It's a
cheap hack to conflate everything into one table data structure.

~~~
rwmj
Perl 5 has $[, which lets you set the base index for every array; e.g.
$[ = 1; would cause all arrays to be 1-based. Thankfully they've now
deprecated this misfeature ...

<http://perldoc.perl.org/perlvar.html#Deprecated-and-removed-variables>

~~~
lambda_cube
I thought that was deprecated a long time ago. When I learned Perl in 2000 I
remember I read not to use that "feature". Apparently it was just not
recommended, for a very long time.

~~~
rwmj
Certainly the advice has been "don't use it" for a very long time, at least
since the 90s when I learned Perl. However, it wasn't actually deprecated until
last year (Perl 5.12). There was a sequence of steps leading up to full
deprecation, which you can read about in the link I provided above. Maybe some
people were actually using it?!?

------
birken
First, it is pretty funny for this poster to use this as a reason why perl is
a well thought out language. It, like all languages, has pros and cons, and
picking one thing about perl that isn't even unique to perl isn't a good
example of how perl is so well thought out.

Second, I agree with the premise that having two separate structures is
better, but a lot of the issues brought up in this post just don't come up in
the real world. For example with PHP once you understand that the array()
construct is just a hash map with a linked list that stores the ordering, a
lot of these complications are much less complicated. Just don't think of
the keys as indexes in an array, think of them as keys in a hash map that have
nothing to do with ordering. If you want to change the order, you need to
change the linked list, which initially is set by insert order but can be
changed to something else by using various sort functions. Again, I'm not
saying this is the greatest design in the world, but I am saying that living
with this design in practical application isn't a big deal.
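As a rough analogy, and only that, a JavaScript Map behaves much like the "hash map plus ordering list" described above: iteration follows insertion order no matter what the keys are, and sorting changes the order without touching the key-to-value mapping (similar in spirit to PHP's asort()). The variable names here are made up for the sketch.

```javascript
// Assumption: a JS Map stands in for PHP's array(). Both iterate in
// insertion order, regardless of what the keys look like.
const items = new Map();
items.set(5, "pear");
items.set(2, "apple");
items.set("x", "fig");

// Keys are not positions: 5 comes first because it was inserted first.
const insertionOrder = [...items.keys()];

// Sorting by value only changes the ordering; each key keeps its value.
const sortedByValue = new Map(
  [...items.entries()].sort(([, a], [, b]) => a.localeCompare(b))
);
const sortedOrder = [...sortedByValue.keys()];

console.log(insertionOrder); // [5, 2, "x"]
console.log(sortedOrder); // [2, "x", 5]
console.log(sortedByValue.get(5)); // "pear"
```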

In fact, the biggest annoyance has nothing to do with any coding stuff, it is
just that you pay a huge memory penalty if you just want a simple array of
items. Luckily PHP added a few newer data structures
(<http://www.php.net/manual/en/spl.datastructures.php>) like SplFixedArray
which are much more memory efficient. Obviously in standard usage the memory
doesn't really matter, but in some cases where you want very large data
structures these new objects come in handy.

~~~
SwellJoe
_It, like all languages, has pros and cons, and picking one thing about perl
that isn't even unique to perl isn't a good example of how perl is so well
thought out._

I believe you've misread the intent of the title. I read it as an answer to
the question: "Why does Perl have arrays and hashes, even though the language
I'm used to does not?" And, this article takes a decent stab at explaining why
Perl has arrays and hashes. It does not, as far as I can tell, make any
attempt to explain why this proves that Perl is superior to all other
languages or is more thought-out than other languages (except perhaps those
languages that don't have separate array and hash datatypes).

The way I read it, it's saying in a lot more words, "You'll like it once you
get used to it."

------
sjs
Do so many languages conflate arrays and hashes?

I know of PHP and JavaScript. PHP was not designed at all so no point
discussing it. JavaScript was initially written in a week so we forgive
Brendan Eich.

~~~
mnutt
Javascript doesn't really conflate arrays and hashes in the way PHP does. It
uses the same bracket syntax for both, but you either instantiate an array or
a hash.

I don't think that allowing you to iterate over a hash as if it were an array
is all that unforgivable, though it's often not the right thing to do and you
can't expect the values in any particular order.

~~~
masklinn
> I don't think that allowing you to iterate over a hash as if it were an
> array is all that unforgivable

It's not even doable in JavaScript: bare objects (~hashes) are iterated with
for...in, but using that with arrays gives inconsistent results. The array's
keys will be iterated as if they were (string) properties, there are _no
guarantees_ they'll be iterated in numerical order, and any enumerable
property added to the array itself or any of its ancestors will be iterated
over as well.

Friends don't let friends iterate over arrays with for...in.
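A small sketch of the problem, assuming a plain array with one extra enumerable property stuck on it (the `label` property is made up for the example):

```javascript
const list = ["a", "b", "c"];
list.label = "extra"; // an enumerable property added to the array itself

// for...in walks property names: they are strings, and "label" shows up.
const forInKeys = [];
for (const key in list) {
  forInKeys.push(key);
}

// forEach only visits the array elements and passes numeric indexes.
const forEachIndexes = [];
list.forEach((value, index) => forEachIndexes.push(index));

console.log(forInKeys); // string keys, including "label"
console.log(forEachIndexes); // [0, 1, 2]
```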

~~~
__david__
> Friends don't let friends iterate over arrays with for...in.

Which is too bad because it looks so much better.

~~~
masklinn
I just use Array.prototype.[forEach|map|filter] instead, or the equivalent
Underscore.js function (which aliases to native when it can) if IE-
compatibility is needed.

If you need performance, nothing will beat C-style access anyway. And FWIW,
using for...in on an array is slower than using Array.prototype.forEach.

edit: FWIW, even hand-rolling an each() function (in the style of
underscore's) will be faster than for...in.

------
ianterrell
That this post has so many upvotes is evidence that a lot of nonprogrammers
read HN.

~~~
andrewflnr
I don't understand this comment. I'm a programmer, designing a language in my
spare time, and I found it interesting, since I considered going the Lua/PHP
route with my collections. I thought it was nice to find an article about
actual programming on HN again, and I don't know why non-programmers would
upvote it.

Please explain. I'm really curious.

~~~
ianterrell
My comment was reflecting my intuition that most programmers would not find
the blog post useful or insightful.

The first two sentences of the blog post are, "I wonder why we have separate
arrays and hashes in Perl. Other languages don't do it." However, as evidenced
by only 4 contrary languages listed in the comments here, the _vast majority_
of programming languages have separate array and hash structures.

Much of the post is absolutely _trivial._ The post is a combination of
circular reasoning (arrays and hashes must be different because they... are
different) and some sort of straw man gedanken (well it would be easy to do if
we did this but then that doesn't work, so obviously this way's "thought
out").

It's a whole post to point out that arrays and hashes are different and are
used differently. Hurray? Let's upvote it?

~~~
andrewflnr
I see what you mean, in that it could definitely have been a lot shorter and
to the point. But it was satire, especially those first two sentences, to
point out how silly he thinks it is to combine them. I think you may be taking
it a tad too seriously.

------
alexbell
It's almost as if the majority of programming languages are "thought
through". I mean, really, who designs a programming language and doesn't think
it through? It's a nontrivial undertaking typically undertaken by people that
are smarter than the average bear.

------
skimbrel
This was a fun post, but I found the other post on this blog more interesting:

<http://altreus.blogspot.com/2011/06/anatomy-of-types.html>

Not terribly useful to anyone who's been around the block in Perl at least
once, but it's a great explanation of the three Perl variable types. Saving it
for the next time I need to teach someone Perl.

------
antihero
How is this different from Python's lists/dictionaries?

Also IIRC PHP does internally differentiate between array(1,2,3) and
array('blah' => 1, 'bloop' => 2); just the syntax is not as clear.

~~~
danudey
The syntax is unclear, but also the behaviour is unclear.

$var[] = "entry1"; $var['something'] = "entry2"; $var[] = "entry3";

$var[0] is now "entry1", and $var["something"] is "entry2", but what's the
index for entry3? And if you iterate over the array, will you get entry2
before entry3?

What if you create $var[$somestring], but $somestring is "29"? Now it will be
at $var[29], and automatic indexing will reset to 30, which you may not
expect.

There's a lot of argument to "just don't write code like that", but with the
inconsistencies in PHP library functions (and PHP's careless type conversion)
it's easy to come across cases where this sort of "do something unexpected
instead of throwing an error" behaviour can bite people in the ass down the
road, through no fault of their own.

~~~
antihero
Oh yes, I don't deny that it's horribly done. I use my own classes "Dict" and
"Li" instead of arrays whenever I need them to be specific, generally.

------
Goladus
Array and hash types in perl likely evolved bottom-up from the lower-level
data structures of the same name.

Other languages use terms like "List" and "Associative Array" for their
abstract data types to differentiate from the implementation techniques. And
yes, lists and associative arrays have slightly different properties, as do
sets, stacks, and queues.

------
drdaeman
tl;dr: arrays, dictionaries, ordered dictionaries (lists, sets and so on) are
all different classes of data structures with different properties. To be
efficient one should choose the best one, not rely on some "all included"
generic datatype.

------
wvenable
The PHP array is a wonderful data structure with an awesome set of properties.
This blog post contains a lot of "problems" that are somewhat trivial and have
been reasonably solved in PHP. This comment will be downvoted to oblivion
without discussion.

~~~
pavel_lishin
> This comment will be downvoted to oblivion without discussion.

If you can see the future, I recommend playing the lottery. If you're a snarky
ass, I recommend not commenting.

I've been working with PHP for five+ years now, and only started working with
Python in the last year or so. You're right - working with PHP's arrayhash
isn't too terribly complicated, but having actual arrays and hashes as
different types is _super_ convenient. At least once a day, I think fondly of
Python and wish I could use it - list comprehensions and negative indices
could eliminate the need for half of PHP's array functions and make for
cleaner and more readable code.

~~~
wvenable
> If you can see the future, I recommend playing the lottery.

I can't see the future, but I can edit the past. It's just a little
disappointing to be downvoted without anyone actually commenting on the
subject.

> but having actual arrays and hashes as different types is super convenient.

I agree especially if, in the language, arrays and hashes are objects with
methods.

I didn't claim PHP's arrays were the greatest data structure on earth, but an
ordered hash that special-cases integer indexes is a pretty nice
procedural-style data structure. PHP doesn't really have arrays; it's just
that the hash type is flexible enough to do double-duty.

Apparently, on Hacker News, having such opinions is both downvote worthy and
punch-in-the-face worthy.

~~~
pavel_lishin
> It's just a little disappointing to be downvoted without anyone actually
> commenting on the subject.

When I saw your comment, it didn't look downvoted to me. You clearly realize
that people can vote on comments - has it occurred to you that comments can
bounce back from a negative score? That is, unless you append some passive
aggressive horse-shit to the end.

Honestly, I don't know if we gain anything by PHP having arrays pull double
duty in this way. The only advantage is, what, having one less keyword to
remember?

------
pkulak
Using the same object for arrays and hashes seems like a terrible idea long
before thinking about it this much. JavaScript does it, but JS goes for this
super-simple, everything-is-a-function-hash-thing simplicity.

~~~
mikeryan
As stated above, JavaScript does not do this; it has separate Array and Object
types. JavaScript developers might.

~~~
voyou
No it doesn't. Arrays are a type of Object, they're not distinct.
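A quick sketch of what that looks like in practice (the `note` property is just an example name):

```javascript
const arr = [1, 2, 3];

const typeName = typeof arr; // "object": there is no separate "array" type
const isObject = arr instanceof Object; // true
const isArray = Array.isArray(arr); // true
const fakeIsArray = Array.isArray({ 0: 1, length: 1 }); // false

// Because an array is an object, it accepts arbitrary string keys too,
// and they do not count toward length.
arr.note = "hello";

console.log(typeName, isObject, isArray, fakeIsArray, arr.length);
```

So both sides have a point: arrays are objects, but the language can still tell them apart from plain objects.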

