NSCopyObject, the griefer that keeps on griefing (wadetregaskis.com)



Historical fun fact:

In the original version of Objective-C and NextStep (1988-1994), the common base class (Object) provided an implementation of `copyFromZone:` that did an exact memcpy of the object, a la NSCopyObject. In other words, NSCopyObject was the default behavior for all Obj-C objects.

It was still up to each subclass to ensure that copyFromZone: worked correctly with its own data (not all classes supported it).

AppKit's `Cell` class provided this implementation:

    - copyFromZone:(NXZone *)zone
    {
        Cell *retval;
        retval = [super copyFromZone:zone];
        if (cFlags1.freeText && contents) 
            retval->contents = NXCopyStringBufferFromZone(contents, zone);
        return retval;
    }
Here it needs to make a copy of its `contents` string, using NXCopyStringBufferFromZone, because the copied Cell is expecting to free that memory (cFlags1.freeText).

OpenStep introduced reference counting and the NSCopying protocol, and removed the `copyWithZone:` implementation in NSObject.

So the equivalent implementation in OpenStep's NSCell class could be:

    - (id)copyWithZone:(NSZone *)zone
    {
        NSCell *retval;
        retval = NSCopyObject(self, 0, zone);  // bytewise copy: the contents pointer comes across without a retain
        [retval->contents retain];             // so the copy takes its own reference to it
        return retval;
    }


> If your superclass uses NSCopyObject, it’s now your problem just as much as if you’d used NSCopyObject directly, whether you like it or not.

One of the more insidious drawbacks of OOP: The tight coupling introduced by inheritance.


AFAIK the pure platonic ideal of OOP (i.e. the thing that Smalltalk and Self do) makes no mention of subclassing — that’s bolted-on brain-damage that’s become conflated with OOP.

The way a “real” OOP language would do subclassing would just be composition: wrapping an instance of the “parent” and proxying messages up to it.

If such a language wanted to “support inheritance in the language” (i.e. put syntax sugar around that wrapping), the result would probably look something like Golang’s anonymous embedded struct fields: embedding a “parent” object of type X gives your wrapping object Y default proxy methods that call the X instance’s methods of the same name, plus default proxy getters/setters for the X instance’s fields; both of which you can override as you please, or which are implicitly shadowed by methods and fields of the same name declared on the wrapper type.
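
Objective-C can already express that composition-plus-proxying idea without any subclassing, via NSObject's forwardingTargetForSelector: hook. A minimal sketch, with hypothetical Parent/Wrapper classes standing in for the "parent" and the wrapper:

    #import <Foundation/Foundation.h>

    @interface Parent : NSObject
    - (void)doWork;
    @end

    @implementation Parent
    - (void)doWork { NSLog(@"Parent doing the work"); }
    @end

    // "Subclassing" by composition: Wrapper owns a Parent and proxies
    // any message it doesn't handle itself up to it.
    @interface Wrapper : NSObject
    @property (nonatomic, strong) Parent *parent;
    @end

    @implementation Wrapper
    - (instancetype)init
    {
        if ((self = [super init]))
            _parent = [Parent new];
        return self;
    }

    // Runtime hook on NSObject: unhandled selectors are re-sent to the wrapped
    // parent, so [(id)wrapper doWork] reaches Parent even though Wrapper never
    // declared or inherited it.
    - (id)forwardingTargetForSelector:(SEL)aSelector
    {
        if ([self.parent respondsToSelector:aSelector])
            return self.parent;
        return [super forwardingTargetForSelector:aSelector];
    }
    @end

"Overriding" then just means implementing the method on Wrapper itself, which shadows the forwarding path.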


Smalltalk and Self use prototyping for inheritance, like JavaScript and Lua. It's a dynamically typed version of subclassing, where your parent is itself a live object, not just a definition. I suppose it's proxy-like, except forwarding of messages/method invocations is automatic and part of the language semantics, and the context object (self) is the one used in the caller's invocation.

I've not used Objective-C much, but I believe the core of Objective-C works the same: non-overridden methods (message handlers) of an instantiated object are invoked using the parent prototype's instance definition. Objective-C derives both its semantics and its technical jargon (e.g. selectors) straight from Smalltalk. But it also supports so-called protocols, aka interfaces in other languages, which, being a more familiar paradigm, seem to be how most people associate OOP with Objective-C.

I guess I'm making a hash of it, but in any event the end result is quite similar to, if not largely indistinguishable from, class inheritance. And because it's all dynamic, and depending on the language mechanics and your particular masochistic proclivities, it's often quite straightforward to achieve multiple inheritance, e.g. in Smalltalk or Lua using a catch-all/fallback handler to select a method from a second (or third, fourth, etc.) prototype subtree.


Self goes one step further: in Self, to create a new form of object, one first has to clone an existing object, add the required object structure to it dynamically, and then use it as the prototype for newer instances of that form.

Additionally, Self in its original form, back while the Java vs Self vs Smalltalk wars were going on at Sun, was a full single-user workstation: only the kernel code/JIT was written in C++, everything else was Self.

It is this kind of plasticity that provided the breeding ground for high-performance JITs, which eventually found their way into JavaScript, and it is why the usual "Python is too dynamic" excuse is just that: an excuse for not wanting to spend the engineering resources on a trail that Smalltalk and Self JITs have already blazed.


Smalltalk is definitely class based, though you are right about Self. It's rather unrelated to types, though I see the analogy; that said, some paper on statically typed Wyvern mentioned adopting the Self model for delegation.


"Smalltalk encourages well-factored designs through inheritance. Every class inherits behavior from its superclass … all forms of ordered collections in the system will instantly acquire this new capability through inheritance."

1981 "Design Principles Behind Smalltalk" Daniel H. H. Ingalls

https://www.cs.virginia.edu/~evans/cs655/readings/smalltalk....


Apparently, the earliest versions of Smalltalk as designed by Alan Kay didn't have inheritance. It was only added later on, seemingly in the name of performance:

> After significant revisions which froze some aspects of execution semantics to gain performance (by adopting a Simula-like class inheritance model of execution), Smalltalk-76 was created. This system had a development environment featuring most of the now familiar tools, including a class library code browser/editor. Smalltalk-80 added metaclasses, to help maintain the "everything is an object" (except variables) paradigm by associating properties and behavior with individual classes, and even primitives such as integer and Boolean values (for example, to support different ways to create instances).

https://en.wikipedia.org/wiki/Smalltalk#History

https://wiki.squeak.org/squeak/989


If you'd like to know about the early history of Smalltalk, then I'd recommend Alan Kay's "The Early History of Smalltalk" (in this case say pages 19-31)

[pdf] https://web.archive.org/web/20230503050530/http://www.metaob...

"… since things can be done with a dynamic language that are difficult with a statically compiled one, I just decided to leave inheritance out as a feature in Smalltalk-72, knowing that we could simulate it back using Smalltalk's LISPlike flexibility. … By the time Smalltalk-76 came along, Dan Ingalls had come up with a scheme that was Simula-like in its semantics but could be incrementally changed on the fly to be in accord with our goals of close interaction." p31


Only for those who conflate OOP with inheritance, and never learned that there are many forms of OOP.


I know that, but the majority of OOP code today uses languages with inheritance, ObjC (what the article is about) being just one example.


Not only that: protocols, categories, object swizzling, and message passing are all mechanisms for doing OOP in Objective-C with zero inheritance.
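
As a toy sketch of two of those (the Describable protocol and the Describing category name are made up), a protocol plus a category add shared, polymorphic behavior to an existing class with no subclass anywhere:

    #import <Foundation/Foundation.h>

    // The protocol defines the interface; no common base class required.
    @protocol Describable <NSObject>
    - (NSString *)shortDescription;
    @end

    // The category retrofits the behavior onto an existing class.
    @interface NSArray (Describing) <Describable>
    @end

    @implementation NSArray (Describing)
    - (NSString *)shortDescription
    {
        return [NSString stringWithFormat:@"array of %lu items", (unsigned long)self.count];
    }
    @end

    // Callers only care about the protocol:
    //   id<Describable> thing = @[@1, @2];
    //   NSLog(@"%@", [thing shortDescription]);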


The issue that stuck out to me is the fact that the top hits for "how to do X" on the web often lead to bad solutions, bad practices, and outdated info. And they keep spreading forever: someone finds the bad solution, copies it into their code, someone else references their code, and copies the bad solution again.

I have no solutions to fix this but I sometimes see it happen where someone asks a question on S.O. or Reddit. Someone posts a bad-practice solution that happens to work. It's marked as the answer, upvoted, and now it spreads like an infection.


If only the software in question had been written by a company worth nearly four trillion dollars, which would allow them to spend some money on comprehensive documentation that was kept up to date.


This would be less of an issue if official documentation was less universally terrible.


They need to filter out possible malicious actors so a downvote alone wouldn't cause the question to be disregarded... but still, whenever you see that, cast your vote!

Also, a comment is the most powerful tool you have. I've benefited multiple times from kind souls who added a comment on an incorrect answer, along the lines of "as of 2024 this is a bad solution and it's much better to do as told here [link...]". So try to add a comment too if you spot bad answers, please :-)


I agree S.O. is good at this since it has comments and you can add new answers as well as edit old ones. But, the rest of the internet is covered in outdated articles that never get updated. Also Reddit which auto-closes its threads.


It's good for a career, though, especially if you enjoy cleaning up messes


And now we have AI coding assistants training on that code as well


Sometimes I even notice that when you go into comment sections trying to say that certain techniques are not best practices, people call you elitist, say you're making nothing more than appeals to authority, and insist there's nothing wrong with (their equivalent of) NSCopyObject, so get off your high horse.


As with many problems this can be handled by a layer of indirection.

Have your subclass push all its data into a storage object that itself is a simple ObjC type held in a standard ARC-managed property/ivar. Then the NSCopyObject memcpy copies the pointer to your storage and the ARC fixup ensures the retain count of the storage object is correct.

This is the simplest way to resolve the problem for something like NSCell without anyone making future code changes needing to think about it too hard or accidentally introducing a regression.
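
A rough sketch of that layout, under the assumption spelled out above about the ARC fixup; MyCell and MyCellStorage are hypothetical names:

    #import <AppKit/AppKit.h>

    // All of the subclass's state lives in one plain, ARC-managed object...
    @interface MyCellStorage : NSObject
    @property (nonatomic, copy) NSString *title;
    @property (nonatomic) NSInteger clickCount;
    @end

    @implementation MyCellStorage
    @end

    @interface MyCell : NSCell
    // ...held through a single strong property/ivar.
    @property (nonatomic, strong) MyCellStorage *storage;
    @end

    @implementation MyCell
    // No -copyWithZone: override needed: if NSCell's implementation goes through
    // NSCopyObject(), the bytewise copy duplicates only this one `storage`
    // pointer, and (per the comment above) the ARC fixup keeps its retain count
    // correct, so the copied cell simply shares the same storage object.
    @end

Whether a copy should then share or duplicate that storage becomes a single explicit decision rather than one spread across many ivars.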


> And yet, Apple still use NSCopyObject themselves to this very day

On one hand, I'm not surprised. Change at that low a level is dangerous. The most risk-averse path is probably what Apple did: leave the classes calling NSCopyObject() alone, and really really warn developers not to use it (and be sure not to introduce any new Apple classes that call it). (The author raises the obvious problem with subclassing, though ... it would be embarrassing for Apple to also tell devs to stop subclassing NSCell.)

I think with how dynamic the OS has shown itself to be, we generally hope that problems like this just "go away". Apple is probably hoping people, decades on now, are no longer subclassing NSCell — have moved off of NSAnimation....

Interesting read. I always like to learn how things work under the hood.

It's a funny thing: when you are introduced to a new language or framework, like Objective-C and Cocoa were to me, it feels like a thing that works in a magical way. And initially you treat it like the black box it appears to be. You adhere to rules you are told, like "call autorelease before returning an object if you expect the caller to retain it" or whatever, not really knowing what all that means. Only later do you find out that the "magic" of retain/release is just a count that is incremented and decremented.

In time the veneer of magic fades and you remind yourself that the thing was written in C after all.
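
For anyone who never saw the pre-ARC convention being described: a minimal manual retain/release sketch (the Widget class is hypothetical, and this only compiles without ARC):

    #import <Foundation/Foundation.h>

    @interface Widget : NSObject
    @end

    @implementation Widget
    @end

    // Built with -fno-objc-arc; under ARC these calls are inserted for you.
    Widget *MakeWidget(void)
    {
        Widget *w = [[Widget alloc] init];  // +alloc starts the count at 1; we own it
        return [w autorelease];             // hand ownership to the autorelease pool:
                                            // the balancing -release arrives when the
                                            // pool drains, so the caller must -retain
                                            // if it needs the object to live longer
    }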


I doubt there’s too many people subclassing NSCell directly these days, but unless I’m missing something (entirely possible), it’s difficult to avoid subclassing NSButtonCell in a lot of cases where you want a stock yet lightly customized control. That’s where I’d expect to see the bulk of third party NSCell subclassing.


> Only later finding out that the "magic" of retain/release is just a count that is incremented and decremented.

Pretty sure most devs know that's what reference counting ultimately boils down to (it's right in the name after all), but many papers over the decades testify to the fact that maintaining that counter in a way that's both fast and reliable turns out to be surprisingly tricky.


This is a good overview of how, even with ARC, Objective-C's automatic memory management is rather fragile, and how the initial tracing-GC approach was a herculean attempt to make it work right without crashes and memory corruption.


From my time at then-Facebook: ARC is a huge pain at scale (hundreds of developers contributing to the same codebase) because anyone anywhere can cause a retain cycle.

E.g. imagine you open some news feed story, then go back to the news feed so the view controller you opened should be deallocated, but the person who implemented that page was an Android dev who recently switched to iOS and doesn’t know about weak pointers, and had something deep in the view hierarchy retaining a pointer to the root… now everything associated with that page is leaked, which can include megabytes of decoded images or, worse, videos…
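
A minimal sketch of that failure mode and the weak-pointer fix, with hypothetical FeedViewController/StoryView classes standing in for the real UIKit ones:

    #import <Foundation/Foundation.h>

    @class FeedViewController;

    @interface StoryView : NSObject
    // A strong back-reference here would close the loop
    // (controller -> view -> controller) and neither object could ever be
    // deallocated. `weak` breaks the cycle and is zeroed automatically when
    // the controller goes away.
    @property (nonatomic, weak) FeedViewController *owner;
    @end

    @implementation StoryView
    @end

    @interface FeedViewController : NSObject
    @property (nonatomic, strong) StoryView *storyView;  // downward edges stay strong
    @end

    @implementation FeedViewController
    - (instancetype)init
    {
        if ((self = [super init])) {
            _storyView = [StoryView new];
            _storyView.owner = self;  // back-reference up the hierarchy: must be weak
        }
        return self;
    }
    @end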

The issue was compounded by the fact that, at least at the time, their interview process primarily rewarded regurgitating memorized answers to leetcode problems as quickly as possible.


Random color commentary you might appreciate: I grew up on iOS for several years, then went to Google, and after a couple years migrated to working on Android's Springboard equivalent.

One of the odder things to me about Android is this could happen too, at what felt like about the same rate as iOS. There wasn't nearly as good tooling as Instruments; instead it was almost always caught by testing, but that was so much more work: testing was done on a batch of commits, so if there was a regression, there needed to be an additional layer of bisecting plus a test engineer in the loop to let you know it happened.

If someone does Android regularly and sees this, I'm very curious what you use for detecting cycles: my iOS habit was to, every month or so, and with each release, sit down with this tool called "Instruments" that displayed live objects, run through the app, and see if there were any unexpected objects representing screens / core workflows piling up.

I have a feeling this is totally possible on Android (maybe that Square leak canary library?), and it was more manually processed at Google because there was no way to enforce SWEs actually did it, and we couldn't(?)/didn't put the Square library in the OS.


Doesn't Android have a tracing garbage collector? How do retain cycles cause leaks?


_Really_ handwaving here; tl;dr in a pure cycle scenario it's resolved, but it seemed it didn't help much compared to the number of footguns available. E.g. it felt like on iOS you'd cover 90% of cases by just doing [weak self] in every dispatch_async.

In Android you have to remember that, and there's this god-object called Context that you use to retrieve strings, images, and layouts (~XIBs). Each Activity (ViewController) has one and it's incredibly easy to accidentally retain it, and thus the whole screen.

The standard way of handling button presses (whatever IBActions are now) would also cause it, and these heavily used things called BroadcastReceivers would usually be spun up when an Activity was started and instantly form a cycle with the Activity that, for some reason, can't be noticed easily.

Sorry for the handwaving, it's been very strange spending so much calendar time working on something I had to understand a lot less. (tl;dr: solo founder who shipped a point of sale system to 2,000 restaurants starting on iPad 1.0, had to get to 0 leaks all by myself, versus the warm safe blanket of bigco.)

This Medium article is an excellent overview I used to remind myself, though the AI art is incredibly shitty, enough to make me wonder if the article is plagiarized. https://medium.com/@naeem0313/top-10-android-memory-leak-cau...


Makes sense!


I just avoid using the copy(_:) method. I feel that it's one of those "code smell" flags. If I find myself contemplating using it, I probably need to revisit my design. This is the kind of stuff that does not age well.

Those of us "of a certain age," may remember the pre-MacOSX CopyBits[0] function.

That was fun. :P

[0] http://preserve.mactech.com/articles/develop/issue_06/Othmer...


If you have objects storing NS(String|Array|Dictionary)s, you ought to copy them before storing them; otherwise you might inadvertently end up storing an NSMutable(String|Array|Dictionary), which can get updated behind your back, producing hard-to-debug issues.
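
Hence the usual convention of declaring such properties with the copy attribute, so the setter snapshots whatever it is handed; a small sketch with a hypothetical Person class:

    #import <Foundation/Foundation.h>

    @interface Person : NSObject
    // `copy` makes the synthesized setter call -copy, so passing in an
    // NSMutableString/NSMutableArray stores an immutable snapshot rather than
    // the live mutable object.
    @property (nonatomic, copy) NSString *name;
    @property (nonatomic, copy) NSArray *tags;
    @end

    @implementation Person
    @end

    static void Demo(void)
    {
        NSMutableString *name = [NSMutableString stringWithString:@"Alice"];
        Person *p = [Person new];
        p.name = name;             // the setter stores [name copy]
        [name appendString:@"!"];  // p.name is still @"Alice"
    }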


`copy` alone is not a red flag. There are definitely cases where you need to use it for safety -- mutable pair types like property list types, for example. But you can't trust an object like -- for example -- NSWindow to do something sane with it unless the behavior is explicitly documented.


Well, most of my work, these days, is in UIKit, so I'm fairly used to "let" not meaning much.

I also tend to rewrite the code completion implicit unwraps (!) with explicit ones (?). Forces me to acknowledge that I'm skating on thin ice.

If there's something that I need to keep around, I make a local let copy of just that property, as opposed to the whole shooting match that contains it, or I look up the value in a JiT manner. I tend to have a lot of computed properties. I'll often extend base classes and structs, to add computed accessors.


>I also tend to rewrite the code completion implicit unwraps (!) with explicit ones (?). Forces me to acknowledge that I'm skating on thin ice.

Hold on. Do you prefer your code failing silently instead of crashing?


Nope.

I can't use the properties without explicitly unwrapping them. Implicit means that I can pretend they aren't optional.

But I often have lines that use ?? to bring in alternatives, and will, if the occasion calls for it, allow the app to fail silently. I tend to like using things like assertions and preconditions, if I want to make the app crash.


Lots of fuss to make sure you stay on the CopyBits fast path… no masking, rectangular regions, no format conversions, color table has identical ctSeed, and you can even get a pointer to a lower-level, specialized version of CopyBits if you need.


> And yet, Apple still use NSCopyObject themselves to this very day

Of course they do. The existing behavior has involved NSCopyObject for decades at this point, and user code written against those existing APIs therefore depends on that behavior. Even if removing the NSCopyObject usage would make things "better" in principle, doing so would likely break existing code, so it simply isn't an option.

Hence it's deprecated, and all the documentation says "don't use this", but the system still does; not because Apple is being hypocritical, but because it has to support existing code. That's just the reality of providing ABI stability on any OS: sometimes you have to do distasteful things. An example I'm acutely familiar with is JSC's API on 32-bit platforms: the internal representation of JS values is 64-bit, but the API is 32-bit, so sometimes accessing a property can result in GC allocation of an object to wrap immediate values.


Yeah, it's wildly hyperbolic. Stuff like

> Blindly copying the bytes of an object instance, and just hoping that somehow that works correctly – in an object-oriented language derived from Smalltalk where even numbers are often reference types – is farcical.

is a nonsensical thing to say, when NSNumber started using tagged pointers 15ish years after NSCopyObject came into existence. But I'm sure it gets more clicks than just pointing out that a very old piece of software has some API warts which should be avoided.


> Ironically, Swift’s attempts to prevent incorrect code actually make it harder to write correct code

Yes, and strict concurrency deserves a post of its own. I wonder how NSCopyObject works with actor isolation?


It's been deprecated since 2012, so it probably shouldn't be called from APIs introduced 3 years ago anyway.


I think probably Xcode should say something if one tries to. (concurrency checking is brand new in Swift 6).

Like the post says, it’s not always clear whether you’re working with a class that uses NSCopyObject, so how does one “break out” of strict concurrency to copy an object like this?


The article says it's used mostly in old UI components like NSCell that can only be used on the main thread. NSImage & NSImageRep have questionable thread safety, and I don't know how often NSAnimation is copied on a non-main thread.


And is SwiftUI's @MainActor annotation "main thread" enough for this? E.g. if a View with the @MainActor annotation renders a child which is an NSCell/NSImage/NSImageRep/NSAnimation wrapped in a representable, is all good? I don't know why Apple keeps adding things to Swift without removing the old things.


For two main reasons: a) because there is an untold number of apps relying on the old things, and b) because the people working on the language are not the same people working on the frameworks.


So we just have to live with every mistake and “mood” Apple has dropped into Swift, accumulating complexities and “time to ponder what I’m doing wrong” while debugging?


NSCopyObject isn't something that was "dropped into Swift," it's a function that has existed since the mid 90s, and is available in Swift because Swift can call C and Objective-C functions.

But yes, you do have to live with the history of the platform you're developing for. Opportunities to break backwards-compatibility are few and far-between, because users don't like when OS updates make their apps stop working.



