The miracle of Smalltalk’s become: (2009) (gbracha.blogspot.com)
89 points by nathell on Nov 23, 2022 | 29 comments



> Take a minute to internalize this; you might misunderstand it as something trivial. This is not about swapping two variables - it is literally about one object becoming another. I am not aware of any other language that has this feature. It is a feature of enormous power - and danger.

I do think implementing or understanding this is trivial. 'Become' can be implemented by 1. swapping all references to 'a' and 'b', or 2. swapping the data at memory locations of 'a' and 'b'. I can certainly see its interesting uses but don't see any value in describing it as a miracle or something that's hard to comprehend. Using the word 'become' to explain the language feature 'become' also isn't great.
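For illustration, option 2 (swapping the data rather than the references) can be roughly approximated in JavaScript for plain objects. This is only a sketch: `swapContents` is a hypothetical helper, it assumes every own property is configurable, and it does not handle arrays, private fields, or other objects with internal state.

```javascript
// Hypothetical helper: exchange the prototype and own properties of two
// plain objects, so every existing reference to `a` now sees b's old state.
function swapContents(a, b) {
  const protoA = Object.getPrototypeOf(a);
  const protoB = Object.getPrototypeOf(b);
  const propsA = Object.getOwnPropertyDescriptors(a);
  const propsB = Object.getOwnPropertyDescriptors(b);

  // Clear both objects (assumes all own properties are configurable).
  for (const key of Reflect.ownKeys(a)) delete a[key];
  for (const key of Reflect.ownKeys(b)) delete b[key];

  Object.setPrototypeOf(a, protoB);
  Object.defineProperties(a, propsB);
  Object.setPrototypeOf(b, protoA);
  Object.defineProperties(b, propsA);
}

class Point { constructor(x, y) { this.x = x; this.y = y; } }
class Label { constructor(text) { this.text = text; } }

const p = new Point(1, 2);
const l = new Label("hi");
const alias = p; // a second reference to the same object

swapContents(p, l);
console.log(alias instanceof Label, alias.text); // true hi
console.log(l instanceof Point, l.x, l.y);       // true 1 2
```

Object identity is untouched here; only the contents move, which is why every holder of the old reference sees the change.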

At one company I used ObjectStore, which does 'pointer swizzling' to lazy load deserialized objects; there, a segfault takes the place of `doesNotUnderstand`.


I usually use it as an example when people say Python is too dynamic to have a JIT, when Smalltalk was one of the precursors of JIT for dynamic languages with features like become: (SELF then took it even further).


If I'm not mistaken, this is also an amusing retort to the idea that only statically typed languages can have common refactoring tools.


Especially since the first IDEs with such features were for Smalltalk and Lisp.

Several Xerox papers on Mesa (XDE), and later Mesa/Cedar, reference how those environments served as inspiration for their IDE-like features, such as a REPL, typo correction, debugging, code reloading, ...


Exactly. Too many folks take "static typing" as the only form of static analysis that can be done. Sometimes the only analysis that can be done on a codebase.


It's really unnerving when people repeat that "too dynamic" remark when there are widely known counterexamples. Fortunately, it seems people have now stopped denying the problem, with the Faster CPython project. IIRC, the roadmap for 3.12 or 3.13 already has some form of lightweight JIT.


Yes, apparently it took Microsoft's persuasion to make it happen, hiring Guido exactly for that purpose.


"become" is a standard approach in Erlang, Akka and other actor model systems (because it is included in the original actor model axioms by Carl Hewitt, "[a message can] designate the behavior to be used for the next message it receives"). We write state machines with that (on some events, you just 'become' another state with its own reactions to messages).


It helps a lot that there are few ways to inspect an actor that are not mediated by the actor itself, so for starters you wouldn't bother, and even if you did, the "replacement" actor can just reply the way you'd expect.
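A minimal sketch of the actor-style "become" described above, written in plain JavaScript rather than Erlang or Akka. The `Actor` class and the door states are hypothetical and do not correspond to any real actor library's API; each handler simply returns the behavior to use for the next message.

```javascript
// A minimal actor sketch: `behavior` is the function that handles the next
// message; a handler may return the behavior to use for the message after it.
class Actor {
  constructor(initialBehavior) {
    this.behavior = initialBehavior;
    this.log = [];
  }
  send(message) {
    // "become": a returned function replaces the current behavior;
    // returning nothing keeps the current one.
    this.behavior = this.behavior(message, this.log) ?? this.behavior;
  }
}

// Two states of a hypothetical door, each with its own reactions.
function closed(message, log) {
  if (message === "open") { log.push("opening"); return opened; }
  log.push(`ignored ${message} while closed`);
}

function opened(message, log) {
  if (message === "close") { log.push("closing"); return closed; }
  log.push(`ignored ${message} while open`);
}

const door = new Actor(closed);
["close", "open", "open", "close"].forEach((m) => door.send(m));
console.log(door.log);
```

Senders only ever hold `door`, so they cannot tell that the behavior behind it has been replaced.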


In fact in most cases dynamic subclassing is enough for such things. In JavaScript that's achieved by changing __proto__ reference on an object.

So "obj instanceof A" becomes "obj instanceof B".
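A small sketch of that dynamic subclassing, using the standard `Object.setPrototypeOf` instead of assigning `__proto__` directly. The classes `A` and `B` are placeholders.

```javascript
class A { describe() { return "I am an A"; } }
class B { describe() { return "I am a B"; } }

const obj = new A();
const other = obj; // a second reference to the same object

// Standard equivalent of assigning obj.__proto__ = B.prototype.
Object.setPrototypeOf(obj, B.prototype);

console.log(obj instanceof A);   // false
console.log(other instanceof B); // true
console.log(other.describe());   // I am a B
```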

As for the persistence case, I've solved it at the JavaScript internal implementation level. In Sciter there is a built-in JSON-ish data persistence module, close to MongoDB in feature set (modulo sharding).

Storage loads objects as half-baked proxies that contain only db references. Only when code tries to access props/methods of the loaded object does it get fetched from disk, its __proto__ set to the particular class, etc.
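The general idea can be sketched in userland JavaScript with a `Proxy`. This is not how Sciter does it (there it happens inside the engine); `fakeDisk`, `lazyRef`, and `User` are hypothetical names used only for illustration.

```javascript
// Hypothetical backing store standing in for the on-disk database.
const fakeDisk = new Map([
  ["user:1", { name: "Ada", visits: 3 }],
]);
let diskReads = 0;

class User {
  greet() { return `Hello, ${this.name}`; }
}

// Returns a proxy around a husk that only knows its db reference. On first
// property access the husk is filled in place and given its class prototype.
function lazyRef(dbRef, Class) {
  const husk = {};
  let loaded = false;
  const hydrate = () => {
    if (loaded) return;
    diskReads += 1;
    Object.assign(husk, fakeDisk.get(dbRef));
    Object.setPrototypeOf(husk, Class.prototype);
    loaded = true;
  };
  return new Proxy(husk, {
    get(target, key, receiver) { hydrate(); return Reflect.get(target, key, receiver); },
    has(target, key) { hydrate(); return Reflect.has(target, key); },
    ownKeys(target) { hydrate(); return Reflect.ownKeys(target); },
  });
}

const user = lazyRef("user:1", User);
console.log(diskReads);                       // 0
console.log(user.greet());                    // Hello, Ada
console.log(user instanceof User, diskReads); // true 1
```

Unlike the engine-level approach, this version keeps paying for the proxy traps after the object has been loaded.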

More on this architecture: https://gitlab.com/sciter-engine/sciter-js-sdk/-/blob/main/d...

By the way, dynamic subclassing is not a prerogative of only dynamic languages, here is how I did that in C++: https://stackoverflow.com/questions/21212379/changing-vtbl-o...

Patched QuickJS with storage support is here: https://gitlab.com/c-smile/quickjspp - it uses DyBase of Konstantin Knizhnik as a storage.


Not really, because the goal is that obj is replaced by a different instance, not to change its type.

Become can swap two objects that are exactly of the same type, where the prototype change would be a null operation, achieving nothing.


"obj is replaced by a different instance"

Could you provide a real-life scenario where you would need that?

I mean to change all references to the object in the heap at runtime ...


Live code editing in the Smalltalk debugger: after making changes in the class browser, you redo the statement that caused the break into the debugger, and all live instances in the image are updated.


This would've been very convenient for lazy loading in a recent Python project. I ended up making the husk turn into a transparent wrapper for the loaded object, unfortunately adding a layer of indirection to every subsequent call. Does anyone here know a better way to lazy load objects in Python?


If all the types are the same shape (at the C level) you can swap the class and all attributes of the husk.


Use ChainMap in __dict__


That works? Wow, I need to try this.


I got carried away: you can use ChainMap, but the __getitem__ is not going to get called: https://bugs.python.org/issue1475692

So your only valid solution is to use __getattr__, which is guaranteed to work. But either you load all missing values on first access, so you can only lazy load once, or you pay the price of a method call for attribute access every single time. And in either case, you won't have type hints doing the work for you.

A hybrid strategy would be to always have __dict__ point to an empty dict that you lazy load later, and use @property for attributes that are only for the local object. Then you pay the price of the method calls only for non-lazy attributes.

Depending on your workload and data shape, one solution will be much better than the other, but neither is as elegant as the ChainMap.


The magic here comes from Smalltalk's object memory system, which makes this operation easy. (Well... okay... not completely easy, but simple.)

Also... I know that in Digitalk you couldn't do a become: on a SmallInteger because of the way their object memory worked, but I don't remember if this was also a limitation of Smalltalk-80 or Squeak.

Which is to say... when you have time, if you haven't done it already, do a search on "smalltalk object memory" or "Loom" or "bluebook object memory." There are some great references from the dawn of time.


The SmallInteger limitation is probably because as an optimisation they stored them as immediate values rather than generic objects, and Squeak shares this limitation (the newer Spur format extends this to one or more additional types depending on image bitness).

Related to references and the above, Eliot Miranda's blog may be informative as it provides both details and sometimes code for many of the changes that have happened in the Squeak (and now OpenSmalltalk) VM, e.g. http://www.mirandabanda.org/cogblog/2013/09/13/lazy-become-a...


Squeak implements it with horrible performance, IIRC


I hear you. The last time I used Squeak as a prototyping tool was around 2002 and yeah, the performance was pretty craptastic. I bumped into David Ungar at a conference a few years back. He mentioned there has been a fair amount of work back-ported to Squeak from some of the later Self work and maybe even from Lively Kernel for all I know.

But lest someone read this thread and use it as evidence that Squeak is completely crappy, I would encourage them to do a bit of testing with a recent version. Also... I don't think people are going to use Smalltalk for its raw performance, but for its ability to model problems with pleasant "pure" OOP-ness.


There have been very extensive changes since 2002, yep, such as a much improved interpreter, a JIT, a replacement of the original object memory format (which incidentally supports faster become) and, most recently, a new bytecode set.


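    // Approximates become: by copying target's prototype and own enumerable
    // properties onto `this`. Object identity is unchanged.
    Object.prototype.become = function(target) {

        // Adopt the target's prototype.
        const newProto = Object.getPrototypeOf(target);
        Object.setPrototypeOf(this, newProto);

        // Drop our own enumerable properties...
        for ( let key of Object.keys(this) ) {
            delete this[key];
        }

        // ...and shallow-copy the target's in their place.
        Object.assign(this, target);

    }

Here's as close as I could get in JS. This fails Object.is(a,b) though, it just makes object A's data the same as B's.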

The two objects will desynchronize, unless all their properties are Objects (eg. Arrays) and no new ones are added.

As far as I know there's no way to update all references to an object to point to a new object. Though you could take the article's advice and add a bunch of indirection with getters and setters, referencing an internal "true" object which could be swapped out trivially.


Getters and setters will also desynchronize for similar reasons (new properties being added/removed).

In JS you're better served by proxies or to set things up ahead of time so you're not passing a direct object reference, but a mirror instead.

<https://bracha.org/mirrors.pdf>

Once you've got all relevant consumers using a mirror or a proxy and all access is therefore mediated, that's when you can freely perform these kinds of swaps.

As noted by the commenters, however, this is quite a heavyweight solution.
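A rough sketch of the mediated-access approach, assuming all consumers are handed the proxy and never the underlying object. `makeSwappable` is a hypothetical helper, and only a handful of traps are forwarded here; a fuller version would forward the rest as well.

```javascript
// Hypothetical helper: hand out a forwarding proxy instead of the object
// itself, and keep a private `become` function that retargets it.
function makeSwappable(initial) {
  let current = initial;
  const proxy = new Proxy({}, {
    get: (_t, key) => {
      const value = Reflect.get(current, key);
      // Bind methods so `this` is the real object, not the proxy.
      return typeof value === "function" ? value.bind(current) : value;
    },
    set: (_t, key, value) => Reflect.set(current, key, value),
    has: (_t, key) => Reflect.has(current, key),
    deleteProperty: (_t, key) => Reflect.deleteProperty(current, key),
    getPrototypeOf: () => Reflect.getPrototypeOf(current),
  });
  return { proxy, become: (next) => { current = next; } };
}

const a = { name: "a", hello() { return `hello from ${this.name}`; } };
const b = { name: "b", hello() { return `greetings from ${this.name}`; } };

const { proxy, become } = makeSwappable(a);
const held = [proxy, proxy]; // consumers holding the mediated reference

console.log(held[0].hello());       // hello from a
become(b);                          // every holder now sees b
console.log(held[1].hello());       // greetings from b
proxy.extra = 1;                    // writes land on the current target
console.log(b.extra, "extra" in a); // 1 false
```

Properties added or removed after the swap stay in sync because nothing is copied; the cost is a trap invocation on every access.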

Related:

> Inside the engine, a clever trick from Smalltalk called `becomes` is used to swap a newborn Proxy and an existing object that has arbitrarily many live references. Thus an object requiring no behavioral intercession can avoid the overhead of traps until it escapes from a same-origin or same-thread context, and only if it does escape through a barrier will it become a trapping Proxy whose handler accesses the original object after performing access control checks or mutual exclusion.

> The local jargon for such object/Proxy swapping is “brain transplants”.

<https://brendaneich.com/2010/11/proxy-inception/>

<https://bugzilla.mozilla.org/show_bug.cgi?id=580128>


> In the absence of an object table, become: traverses the heap in a manner similar to a garbage collector. The more memory you have, the more expensive become: becomes.

Well, yes, that should be "the larger the reachable object graph you have, the more expensive become: becomes". It's not necessary to do the substitution in areas of the heap that are garbage.

I would rather do it without any sort of object table, because then you can do things like make some string object become the fixnum 42 everywhere. :)

If you're doing something with object persistence, you can't be doing individual become operations object-by-object. There has to be a batch API for doing a mass become where you pass a dictionary of what is to become what, and the objects are traversed once to do all the rewrites.
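A sketch of what such a batch API might look like, in JavaScript for illustration only. `becomeAll` is a hypothetical function; it walks just the own enumerable properties of objects and arrays reachable from a given root, so closures, `Map`/`Set` contents, and anything not reachable from that root are out of scope.

```javascript
// Hypothetical batch "become": walk every object reachable from `root` once
// and rewrite each reference found in `mapping` (old object -> replacement).
function becomeAll(root, mapping) {
  const resolve = (v) => (mapping.has(v) ? mapping.get(v) : v);
  const start = resolve(root);
  const seen = new Set();
  const stack = [start];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (obj === null || typeof obj !== "object" || seen.has(obj)) continue;
    seen.add(obj);
    for (const key of Object.keys(obj)) {
      obj[key] = resolve(obj[key]); // rewrite the slot if it is mapped
      stack.push(obj[key]);         // then keep traversing the new value
    }
  }
  return start;
}

const stubA = { stub: "a" };
const stubB = { stub: "b" };
const realA = { name: "real A" };
const heap = { first: stubA, list: [stubA, stubB], nested: { again: stubB } };
heap.self = heap; // cycles are fine thanks to the `seen` set

becomeAll(heap, new Map([[stubA, realA], [stubB, 42]]));

console.log(heap.first === realA, heap.list[0] === realA); // true true
console.log(heap.list[1], heap.nested.again);              // 42 42
```

Because there is no object table, a replacement can be a non-object such as the number 42, and the replaced objects are never visited once they become unreachable.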


The closest equivalent that I can think of is the change-class function in CL. That one also happens to be generic, so you can control the finer details of what it means to change A to B.


Combining ‚become:‘ with the ability to serialize objects was fun.


True become false



