

Java is pass-by-value, Dammit (and so are Ruby and Python, Dammit) - raganwald
http://javadude.com/articles/passbyvalue.htm?repost

======
va_coder
The user refractedthought at reddit had the best explanation IMO:

Ok, I think the real losers in this discussion are the ones who fail to see
both sides of the issue. We can cherry pick "reference" definitions if we want,
but really the nuts and bolts come down to this: Passing by value means that a
copy of the parameter is made when the function is called and any changes to
that variable will not be reflected outside of the function. Passing by
reference means you pass a pointer to your data (and nothing more) so the
function can dereference the parameter and affect data that exists outside of
the function.

All non-primitives in Java are implicit pointers, so when you pass one of
these into a function, you are passing a pointer (or reference) to data that
exists outside of the function. However, the reference itself, as it exists
inside the function, can be changed to point toward something else.
Essentially, the implicit pointer is passed by value, and that value can be
altered without affecting anything outside the function.
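
Python treats object arguments the same way, so the two cases can be sketched there. This is only a minimal illustration: the `Account` class and both function names are made up for the example. Mutating through the copied reference is visible to the caller, while reassigning the parameter is not.

```python
class Account:
    """Hypothetical mutable object, used only for illustration."""

    def __init__(self, owner):
        self.owner = owner


def rename(account):
    # Follows the copied reference and mutates the shared object.
    account.owner = "changed"


def replace(account):
    # Rebinds only the local name; the caller's variable is untouched.
    account = Account("replacement")
    return account


acct = Account("original")
rename(acct)
owner_after_rename = acct.owner    # "changed"

replace(acct)
owner_after_replace = acct.owner   # still "changed", not "replacement"
print(owner_after_rename, owner_after_replace)
```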

So technically yes, you could get away with calling this pass-by-value.
However, the spirit of pass-by-reference is maintained for any part of the
data that most people care about. The syntax is different from a pass-by-
reference in C++, and there's a silly little trick you can do by changing the
reference pointer (or silly trick you can't do by changing the pointer outside
the function) but for all practical purposes you are doing the same thing.

In my opinion, if you want to insist that Java strictly follows pass-by-value,
then I think it has ceased to be a relevant distinction.

~~~
mpakes
I strongly disagree. Proper terminology is _incredibly_ relevant.

As stated in the original article, pass-by-value and pass-by-reference have
well-known definitions within the computer science community. There is no
ambiguity here.

When we excuse the lazy use of such terms incorrectly, we hamper the ability
to communicate clearly and effectively. Please stop.

The core misunderstanding seems to be a conflation of 'pass a reference by
value' and 'pass-by-reference'.

Passing a reference by value != pass-by-reference. The End. Q.E.D.

Pass-by-reference means that the compiler takes a reference to (that is, gets a
pointer to) the argument passed in to the function. This is not the same as
passing an existing reference (pointer) to the function (by value). Java only
allows the latter (primitive types aside).

A side issue is that some developers mistakenly believe that pass-by-value
roughly implies a deep copy, and thus that the 'spirit' of pass-by-reference
is the avoidance of a deep copy.

I would posit that many C++ functions that operate on object parameters do so
via a pointer (a.k.a. reference) passed by value, just as Java does:

    
    
      // The pointer itself is copied (passed by value); the pointee is shared.
      void modify_my_object(MyClass *myObject) {
          myObject->setName("New Name");
      }
    
      int main(void) {
          MyClass *myObject = new MyClass("A Name");
          modify_my_object(myObject);
          // myObject->name is now "New Name"
      }
    

This is pass-by-value, and there is no deep-copy happening there.

Just because many developers don't take advantage of (or don't understand the
semantics of) pass-by-reference does not excuse ignorance of the well-known
definitions of those terms.

------
dschobel
When I was interviewing Java devs of all levels, few (fewer than 10%, I'd say)
could flat out say "it's pass by value for everything; for non-primitives
you're passing references by value," but a lot more (roughly half) could trace
the following code correctly.

    
    
      Object x = null;
      makeString (x);
      System.out.println (x);
    
      void makeString (Object y)
      {
        y = "This is a string";
      }
    

Maybe it's just a disconnect between theory and practice?

~~~
moss
This.

Java sets up an environment where the parameter passing behavior just isn't
surprising. If you've been programming in Java for a little while, it's very
easy to have an intuitive feel for how parameters will act, without having to
go back to theoretical principles.

Knowing about the theory behind pointers, pass-by-value, and pass-by-reference
will still make you a better programmer. But Java's a stronger language for
not requiring that knowledge.

~~~
raganwald
This!? Was I asleep when "this" meme swept through the internet? Help me out:
what do you mean when you say "This."???

~~~
iron_ball
"The comment to which I am replying is correct."

~~~
raganwald
Oh dear, I was using the little up arrow for that :-)

------
jganetsk
_Can you write a traditional swap(a,b) method/function in the language?_

What about languages that are immutable by default, like Haskell, OCaml, and
Erlang? You can't write a swap function. Therefore, are they pass-by-value?

Actually, in OCaml, you can write the swap function.

    
    
       let swap a b =
         let temp = !a in
         a := !b;
         b := temp
    
       let () =
         let x = ref 3 in
         let y = ref 8 in
         swap x y
         (* now !x = 8 and !y = 3 *)
    

But now, you are passing references by value, as Java does. So maybe they are
pass-by-value. But, when your language is immutable, pass-by-reference
(whatever that would mean) doesn't affect the language semantics. They are
pass-by-(it's-moot).
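
The same trick can be sketched in Python, assuming a hypothetical one-slot `Ref` class standing in for OCaml's `ref` (Python has no built-in equivalent). The function receives copies of the references to the two cells and only exchanges their contents.

```python
class Ref:
    """Hypothetical one-slot mutable cell, loosely modeled on OCaml's `ref`."""

    def __init__(self, value):
        self.value = value


def swap(a, b):
    # a and b are copies of the caller's references to the two cells;
    # only the cells' contents are exchanged.
    a.value, b.value = b.value, a.value


x = Ref(3)
y = Ref(8)
swap(x, y)
print(x.value, y.value)  # 8 3
```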

~~~
eru
Haskell is something like pass-by-name. At least, stuff only gets evaluated at
most once. But it's definitely not pass-by-value, since not everything gets
evaluated down to a value.

~~~
jganetsk
Your argument would get ensnared by that of the original post. If you take the
description of Java's passing semantics, and s/object/thunk/ and
s/primitive/unboxed value/, you get a perfect description of Haskell's passing
semantics.
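
To make the thunk analogy concrete, here is a rough Python sketch of "evaluated at most once" semantics. The `Thunk` class and the helper functions are invented for illustration, and this is not a claim about how any Haskell compiler actually implements laziness. The callee receives a reference to a delayed computation, which runs only if and when it is forced.

```python
class Thunk:
    """Hypothetical delayed computation that is evaluated at most once."""

    _UNSET = object()

    def __init__(self, compute):
        self._compute = compute
        self._value = Thunk._UNSET

    def force(self):
        if self._value is Thunk._UNSET:
            self._value = self._compute()
            self._compute = None  # drop the closure once the result is cached
        return self._value


calls = []


def expensive():
    calls.append("evaluated")
    return 6 * 7


def use_twice(t):
    # The callee receives a reference to the thunk, not a computed value.
    return t.force() + t.force()


def ignore(t):
    # Never forces the thunk, so the computation never runs.
    return "unused"


unused = Thunk(expensive)
ignore(unused)
calls_after_ignore = len(calls)   # 0

shared = Thunk(expensive)
total = use_twice(shared)         # 84
calls_after_use = len(calls)      # 1
```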

------
araneae
I've heard that Java is both pass-by-value and pass-by-reference in one class, and
then heard that it was pass-by-value only in another. The difference was that
the first was at a university that primarily used Java, and the second was at
a university that primarily taught in C++.

The fact is the usefulness of these concepts depends on what language you're
writing in. In Java, where pointers aren't explicit, you need to know when
you're assigning a variable a memory location versus a value. In C++, where
pointer assignment is explicit, that's not a useful concept, so pass-by-
reference has a different meaning.

I honestly think we should come up with a different name for pass-by-reference
the way it is used in C++. It's not like people actually write code using
pass-by-reference anyway :D.

~~~
omnigoat
I do believe this is why my university teaches Java for one year, then C for
the second, before you even get to glance at (C++|Python|Perl|C#|other). I'm a
compiler-writer, and we _don't_ need different names - they have _well
defined_ meanings. What we need is people to stop misusing them. You've done
it yourself, because C _doesn't have_ pass-by-reference. C++ does (ie,
references...), but C does _not_. Stop doing it! Stop misusing those terms!
You just read an article that actually spelled it out, and still misused them!

~~~
raganwald
Redefining terms that have precise meanings into terms that have imprecise
meanings based on vulgar precedent is a symptom of a pop culture. Cue Alan Kay
quote in 3, 2, 1, ...

~~~
ftl
"In the last 25 years or so, we actually got something like a pop culture,
similar to what happened when television came on the scene and some of its
inventors thought it would be a way of getting Shakespeare to the masses. But
they forgot that you have to be more sophisticated and have more perspective
to understand Shakespeare. What television was able to do was to capture
people as they were.

So I think the lack of a real computer science today, and the lack of real
software engineering today, is partly due to this pop culture."

Full conversation: <http://queue.acm.org/detail.cfm?id=1039523>

~~~
halo
Wow, that's one of the most pretentious paragraphs I've ever read. I mean,
let's use the exact same argument, swapping television for books:

"In the last 25 years or so, we actually got something like a pop culture,
similar to what happened when widespread literacy came on the scene and some
of its perpetrators thought it would be a way of getting the Bible to the
masses. But they forgot that you have to be more sophisticated and have more
perspective to understand the Bible. What books were able to do was capture
people as they were."

~~~
philwelch
Except, historically that's absolutely wrong. Since the advent of printing,
the Bible has been the most widely published book in the world because the
masses, as they are, have historically been religious people.

~~~
kleevr
OT: I did find it very interesting to learn that the second most popular book
(up to the beginning of the twentieth century) was Euclid's _The Elements_
after the printers were done pressing bibles.

------
cturner
One set of English-language terms is being used by two communities.

1) Those of us who are language users care about use of the language from the
perspective of the question, "If this thing gets passed in and I change it,
does that change flow back to the calling method." Our perspective of pass-by-
reference vs pass by value may originate from experiences learning C. We're
probably wrong.

2) This guy is from the community of compiler people. He cares about
implementation, and is probably correct.

We need a spoken language abstraction that allows people to discuss the things
they care about, and then be able to distinguish between them. The people in
the position to cry 'get off my porch' are in the best position to point the
rest of us in the right direction.

~~~
dschobel
It's not an academic point and he's not _probably_ correct.

If you don't understand that it's pass by value of references for non-
primitives then you won't understand why your variable x is null after the
method call:

    
    
      Object x = null;
      fooMe(x);
      //is x null or a Foo?
      [...]
    
      void fooMe(Object y){
        y = new Foo();
      }
    

That's fundamental knowledge.

~~~
ewjordan
_If you don't understand that it's pass by value of references for non-
primitives then_...

Right: pass by value, _of references_. That's the complete answer, and it's
really misleading to say anything else.

I don't think any Java programmers actually have any misunderstanding here;
even the worst of them know how this works. What's the problem?

Also, the article tries to use code like this:

    
    
      void swap(SomeType& arg1, SomeType& arg2) {
        SomeType temp = arg1;
        arg1 = arg2;
        arg2 = temp;
      }
    

to prove how different references in C++ are from Java.

But the reason something like that might work in C++ is not that the semantics
of parameter passing are different (references cannot be reseated in C++ any
more than in Java), but that C++ uses the assignment operator to overwrite the
contents of the lhs, and invokes the copy constructor automatically. The true
equivalent code in Java works just fine, assuming the copy constructor and set
method do the right things (which is a potential pitfall in C++, as well):

    
    
      static public void swap(SomeType arg1, SomeType arg2) {
        SomeType temp = new SomeType(arg1);
        arg1.set(arg2);
        arg2.set(temp);
      }

~~~
tedunangst
SomeType could be a primitive type. Your Java code won't work for int or float
or Object * (if you could have such a thing).

~~~
ewjordan
Right, but that's more a statement about Java's quirky treatment of primitives
than anything else - those are certainly passed by value, but nobody has ever
argued anything else. The only confusion/argument is over whether Object types
are passed by value or by reference, and it's massively misleading to say
they're passed by value, since if you don't know the rest of the story, you'll
write code that doesn't run correctly.

The only reason that code won't work for Object is that you can't add a copy
constructor or "set" method to Object, not because of any difference in
reference semantics. In a real-world implementation we might try to simulate
duck-typing by accepting an Object and using reflection to check for the
appropriate methods on the object, and though it would be a bit slow it would
work for any object type that had the methods defined.

My main point was that for any type SomeType that lets the Java code compile,
it will do exactly the same thing as the analogous code does in C++, assuming
you've defined the class methods appropriately. Which makes it a very poor
illustration of the difference between reference semantics in Java and C++.
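
The same content-copying swap can be sketched in Python, which shares Java's passing semantics. `Point`, its copy-constructor-style initializer, and its `set` method are hypothetical stand-ins for SomeType, not anything from the article.

```python
class Point:
    """Hypothetical type with a copy constructor and a `set` method."""

    def __init__(self, other=None, x=0, y=0):
        if other is not None:
            x, y = other.x, other.y  # copy-construct from another Point
        self.x = x
        self.y = y

    def set(self, other):
        # Overwrite this object's contents in place.
        self.x, self.y = other.x, other.y


def swap(arg1, arg2):
    temp = Point(arg1)  # copy of arg1's contents
    arg1.set(arg2)
    arg2.set(temp)


p = Point(x=1, y=2)
q = Point(x=3, y=4)
p_id, q_id = id(p), id(q)
swap(p, q)
print((p.x, p.y), (q.x, q.y))  # (3, 4) (1, 2)
```

The caller's names still refer to the same two objects afterwards; only their contents were exchanged.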

------
kentosi
I found this article rather surprising since I come from a primarily Java
background.

Do any other languages besides C++ allow this definition of pass-by-reference?
The reason I ask is that it seems silly for me to change the way I talk just
to satisfy one language that I don't even code in.

And if so, when would it actually be practical to use this? The idea of
reassigning a parameter in a function and then having that reassignment affect
variables OUTSIDE the function sounds very strange to me (and possibly even
dangerous).

~~~
enjo
I disagree with the sentiment that you should simply ignore proper semantics
because it's inconvenient. As the article points out, this stuff really does
matter.

There are classes of problems where this type of true reference semantics can
be quite helpful. In C/C++ the technique is largely used in contexts where you
want to return multiple values from a function, as there is no native support
for dynamic tuples. Such is the way of statically typed languages.

There are also cases where this type of reference-to-a-pointer magic can
really improve the efficiency of the algorithm.

True reference passing is a tool. For those living in dynamic languages (as I
do now), not a particularly relevant one. But that doesn't make the
distinction any less important.
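
For contrast, here is how the multiple-return-value case tends to look in a language with tuples, next to a simulated out-parameter. Both function names are hypothetical, and the second form is only an approximation, since the reference to the list is itself still passed by value.

```python
def divmod_tuple(n, d):
    # Return both results directly as a tuple.
    return n // d, n % d


def divmod_out(n, d, out):
    # Simulated out-parameter: the caller passes a mutable list that the
    # function fills in. The reference to the list is still passed by value.
    out[:] = [n // d, n % d]


quotient, remainder = divmod_tuple(17, 5)

result = []
divmod_out(17, 5, result)
print(quotient, remainder, result)  # 3 2 [3, 2]
```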

------
diN0bot

      def swap(a, b):
        t = a
        a = b
        b = t
    
      def swap2(a, b):
        t = a['v']
        a['v'] = b['v']
        b['v'] = t
    
      x = 3
      y = 8
      xx = {'v': x}
      yy = {'v': y}
    
      swap(x, y) # changes nothing
      swap(xx, yy) # changes nothing
      swap2(xx, yy) # changes 'v' values in xx and yy, but not x or y

~~~
dustmop
Hence the word "traditional" in the article.

------
0wned
I think in C++... if you want to be sure the method/function does not alter
the data globally and speed is not a concern, then pass by copy. If you want
the method to alter the data globally, then pass by reference. If you want
speed, but don't want alteration then pass by const reference.

------
lg

      a, b = b, a
    

:)

~~~
swolchok
That has nothing to do with the semantics of passing parameters to functions.

    
    
        def swap(a,b):
          a,b = b,a
    

does not work.

------
jancona
I think I first saw this debated in comp.lang.java in 1996:
<http://bit.ly/5XOSey> I don't think the arguments have changed much.

~~~
eru
Long URL:
[http://groups.google.com/group/comp.lang.java/browse_thread/...](http://groups.google.com/group/comp.lang.java/browse_thread/thread/8b6792bc5e5f87af/71f7ae8dbd7701fa?pli=1)

------
jp_sc
Actually, Python is pass-by-reference with lists and dictionaries.

~~~
pwmanagerdied
Lists and dictionaries aren't special, and they aren't pass-by-reference in the
sense referred to in the article.

    
    
        def foo(my_list):
          my_list = [ "my", "new", "list" ]
    

Executing this function will not change the value of the list passed to it, it
will create a new list and reference it locally. If calling foo(old_list)
caused old_list to now contain [ "my", "new", "list" ], then it would be pass-
by-reference in the sense this article refers to.

EDIT: To clarify, you can mutate mutable arguments in-place, but you can't
change what is being referred to by the identifiers used to refer to these
arguments outside of the context of the function, which is what is meant by
pass-by-reference in the article.

~~~
yason
You are wrong.

You can modify the original list in your example by having foo() call
my_list.append("pwnd!"), for example. You can even use id() to verify that it
really is the exactly same object that is visible to the caller and the
callee.

Python doesn't pass by reference (in the aliasing variables C++ sense) and it
doesn't pass by value (in the functional sense) but rather, it passes
references.

Everything is an object anyway but objects aren't copied around. They're just
bound or rebound to new variables.

In the above example Python creates new local variables for all function
arguments and binds them to the objects passed in as arguments.

If those local variables are assigned to, they're just rebound to point to the
new local object while the original variables in the caller code still remain
bound to the original object.
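
A small sketch of both behaviors, using `id()` as suggested above. The function names are made up for the example.

```python
def append_item(my_list):
    # Same object as the caller's: the mutation is visible outside.
    my_list.append("pwnd!")
    return id(my_list)


def rebind(my_list):
    # Rebinds the local name to a new list; the caller's list is untouched.
    my_list = ["my", "new", "list"]
    return id(my_list)


old_list = ["a", "b"]
same_id = append_item(old_list) == id(old_list)   # True
other_id = rebind(old_list) != id(old_list)       # True
print(old_list)  # ['a', 'b', 'pwnd!']
```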

~~~
andrewcooke
you understand python fine; you haven't understood the article. read the
article again - it's interesting, self consistent, and disagrees with the
comp.lang.python consensus.

------
dustmop
The reason I think this discussion is ever a problem is because in C++ "pass-by-
value" has an extra meaning: that a deep copy occurs. There's no precise term
for this, so people started (incorrectly) using "pass-by-reference" to talk
about parameters that don't get deep copied. After all, in languages that do
it right (Java, Ruby, Python, CL) there's only one parameter passing
technique, and therefore no reason to make the distinction.

~~~
bff
There is a precise term for that and you said it - pass-by-value. Recall that
somewhere each language turns into assembly and that in order to pass a value
to a subroutine you must either place values into registers or push the data
of the object you are passing onto the stack. Passing by value is desirable if
you want to make sure that your local value is not modified. This gives the
called function the freedom to modify it without worrying about side effects
and without worrying about explicitly calling a copy constructor. Pass-by-
reference comes from C++'s pass by reference, which is really just some sugar
around passing a pointer by value and referencing it with the * operator in
the function body (hence reference passing).

I disagree that there is a right or wrong to this - C++ was made to be a
generic programming language and thus gives many options to the programmer. If
by "do it right" you meant "hide all options except the one most commonly
used" okay. Otherwise I strongly disagree with that sentence especially
because I use different kinds of "pass-by" whenever I program in C++ and will
be using rvalue references from C++0x. Different languages suit different
needs.

~~~
dustmop
Well, the term is not very precise if the way that you're using it and the way
the article is using it are completely different. The article makes it clear
that the term "pass-by-value" refers to the fact that the caller's references
cannot be reseated by the function call; it does not mean that deep copies are
made, as by a copy constructor, which is why it applies to Java. Allow me
to illustrate:

      vector<int> numbers;
      void f(vector<int> args);

Due to C++'s copying, args[0] has a different address than numbers[0], because
a deep copy of the vector was made. As the article states, this is not what
pass-by-value is referring to, otherwise it would be false that Java / Python
/ Ruby et al are pass-by-value. So I disagree that the terms are clear, but
rather that there is confusion around these terms. As for wanting
immutability, it's almost always better for C++ code to write functions not
like f but instead like this:

      void g(const vector<int>& args);

And apologies for the editorializing, but I do consider this a language design
flaw, one inherited from the backwards compatibility for C, which already
allowed structs to be passed (non-pointer) as args and then copied to keep
the caller untouchable. It makes C++ code more verbose and more error prone,
and is quite difficult for new programmers to grasp.

~~~
bff
The value being passed in java is the value of the pointer, where the original
pointer value cannot be changed by the function. This is pass-by-reference.
Think of pass-by-reference as a special case of pass by value where the value
of the pointer to an object is passed rather than the object itself.

If I wanted to write a function that accepted a vector and returned a slightly
altered version of that vector I would make a function that does a deep copy
and would return that - without ever needing to call the copy constructor
myself. I think that it is nice that C++ gives me that option. I don't think
that any language is flawless and it's clear that one programmer's feature is,
in this case, another programmer's flaw. Keep in mind that this applies to
your favorite languages as well.

Edit:

It occurs to me that the distinction is important in multithreaded programs as
well. Beyond just worrying about whether or not the underlying data structure
is thread safe, passing something by reference might mean sharing memory
between two CPUs which can easily lead to a slowdown as the CPUs need to
repeatedly flush and resynchronize what's in their cache. Thus unless I have
specifically built a thread-safe class whose data has a real need to be shared
between threads I would send data from one thread to another by value
rather than by reference.

~~~
enjo
You just described why I believe Python's 'explicit is better than implicit'
is the most important language design fundamental ever conceived of.

The problem with the C++ approach is that its default (pass by copy-
constructor, really) is rather implicit. You're potentially doing a ton of work
with a very innocent function call, with possible side effects. At least
'pass a reference by value' behaves the same way EVERY time.

For example, in C++:

      myobj a;
      somefunction(a);

That copy constructor might trigger a database hit to create a new object if
myobj was some sort of ORM. It might call 23 other constructors to fully
complete its deep copy. Who knows, as it all happens rather implicitly. In
Python, that would look like:

      import copy

      def somefunction(p):
        p = copy.deepcopy(p)

      a = myobj()
      somefunction(a)

In that version I'm now explicitly providing that copy. In your multi-threaded
example this is still better, as I get the best of both worlds. Ya, I might
lose a bit of convenience... but that explicitness is worth the trade-off IMHO.
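
A runnable sketch of that explicit-copy style, with a made-up `normalize` function and a plain dict standing in for myobj. The copy happens only because the callee asks for it, so the caller's data stays untouched.

```python
import copy


def normalize(scores):
    # The callee asks for the copy explicitly, so the cost is visible here.
    scores = copy.deepcopy(scores)
    scores["values"].sort()
    return scores


original = {"values": [3, 1, 2]}
result = normalize(original)
print(original["values"], result["values"])  # [3, 1, 2] [1, 2, 3]
```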

