Swift 3.0 Unsafe World (meronapps.com)
74 points by robjperez 20 days ago | 45 comments



The article includes the following example:

    let a = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    a.pointee = 42
FYI, you should never do this.

Assignment to `pointee` is only safe when the pointee is already initialized. If the pointee is uninitialized, you should be using `initialize`, e.g.

    let a = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    a.initialize(to: 42)
While it doesn't actually matter for `Int`, it does matter for classes, unspecialized generics and other Swift reference-counted types, since Swift will try to release the destination before assigning a new value into it. If the destination is uninitialized garbage, this can cause a crash or other misbehavior. Following proper practice is a good idea – even if you're using a hard-coded pure value type like `Int` – since it prevents problems down the road if you decide to change the type later.

Similarly, you should `deinitialize` the buffer when you're finished, not simply `deallocate`.
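
Putting that advice together, a minimal sketch of the full allocate/initialize/deinitialize/deallocate cycle (Swift 3 API, same single-`Int` buffer as the article):

    let a = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    a.initialize(to: 42)       // make the memory a valid Int
    // ... use a.pointee ...
    a.deinitialize()           // run deinitialization (a no-op for a trivial Int)
    a.deallocate(capacity: 1)  // give the memory back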


From the doc comments:

  /// Accesses the `Pointee` instance referenced by `self`.
  ///
  /// - Precondition: Either the pointee has been initialized with an
  ///   instance of type `Pointee`, or `pointee` is being assigned to
  ///   and `Pointee` is a trivial type.
  public var pointee: Pointee
So, `a.pointee = 42` is fine since we all know that `Int` is a trivial type.


Jaywalking when a street is empty is safe; it does not mean that jaywalking is safe in general, and that one should get used to it.

In this case, `.initialize` is likely also faster, because any previous-value-checking logic that might be there is explicitly not used.


When it's explicitly allowed by the rules, it's no longer "jaywalking."

You're right that it doesn't mean this is safe in general, but the original comment is wrong that you should never do this.

What previous-value-checking logic are you referring to? I would expect .pointee = 42 to compile down to a single store instruction.


> FYI, you should never do this.

That's a bit too strong. This is safe if the type is trivial, which it was in the article. The author should have added the caveat you wrote, though.


Is garbage collection strictly worse than other memory management techniques?

Got distracted one sentence in by an assertion that's so at odds with my understanding of the world.


The Garbage Collection Handbook [http://gchandbook.org/] has a good overview of the advantages and disadvantages of reference counting as garbage collection.

In summary,

Advantages:

A. Objects are collected immediately when they become garbage.

B. Programming language runtime support is simpler.

C. Memory can be reclaimed easily in programs with distributed heaps.

Disadvantages:

A. Natural race conditions exist in multi-threaded programs when reference counting is used. Atomic read/writes are necessary when incrementing/decrementing the reference count for an object.

B. Cyclic data structures cannot be reclaimed. (This is why you have to jump through hoops, such as weak references, to avoid cycles in Objective-C and Swift; see the sketch after this list.)

C. Every object in your language which is reference counted needs a field to hold the reference count. Or you need a special object in its stead which "points" to your object, like a shared pointer.

D. Pauses with reference counting are still possible. As stated in the book, "When the last reference to the head of a large pointer structure is deleted, reference counting must recursively delete the descendant of the root."
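
Regarding disadvantage B, the usual hoop in Swift is marking one side of the relationship `weak` (or `unowned`) so the cycle never forms. A minimal sketch, with made-up `Parent`/`Child` types:

    class Parent {
        var child: Child?
    }

    class Child {
        // `weak` breaks the cycle: the child does not keep its parent alive,
        // so both objects are reclaimed once outside references are gone.
        weak var parent: Parent?
    }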


Two more advantages which are relevant for Swift:

1. Memory high-water-mark is generally lower with RC, which is important for mobile.

2. RC lets you easily determine when there is only one reference to an object. This enables Swift's collections to be value types, one of Swift's key semantics.


> 2. RC lets you easily determine when there is only one reference to an object. This enables Swift's collections to be value types, one of Swift's key semantics.

It also enables some optimisations e.g. Python has immutable strings, but CPython can take advantage of rc=1 to extend strings in-place (avoiding accidentally quadratic concatenation loops until you start storing intermediate references or you try to run the software on alternate implementations)
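
Swift exposes the same rc=1 check to library code as `isKnownUniquelyReferenced`, which is how its copy-on-write value types are typically built. A rough sketch, with made-up `Box` and `IntStack` types:

    final class Box {
        var values: [Int]
        init(_ values: [Int]) { self.values = values }
    }

    struct IntStack {
        private var storage = Box([])

        mutating func push(_ x: Int) {
            // Copy the shared storage only if someone else still references it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Box(storage.values)
            }
            storage.values.append(x)
        }
    }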


Another major advantage: No need for "finalizers" which can often run on an unpredictable thread, resurrect objects, etc. There is extensive literature about how problematic finalizers are in practice. deinit in Swift suffers from none of these problems.


There are better alternatives to finalisers for GC'ed languages. Phantom refs in Java are a good example of a finalisation mechanism that explicitly prevents resurrection of objects.


> C. Every object in your language which is reference counted needs a field to hold the reference count. Or you need a special object in its stead which "points" to your object, like a shared pointer.

For a long time (until the 64-bit class pointers, I think) Objective-C used external tables of reference counts instead of embedding refcounts in objects.


The side table only gets a slot for an object when the object is retained (has its reference count incremented) after creation. (Every object starts with a count of 1 at creation.) Before ARC, you could certainly have lots of objects that were never retained after creation. Since ARC tends to be more careful than a human about retaining objects (meaning it retains objects in more places), I don't know how much more load ARC puts on the side table, but I suspect it's not zero.


Regarding D, you can even get stack overflows if the destructors require nested calls.


Re: ref-counting pauses from cascading frees, you can lazily free big data structures by deferring the release to a later event-loop tick or idle event.
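
One hedged way to do that in Swift is to hand the last reference to a block scheduled for a later main-queue pass, so the cascading release runs then instead of blocking the current tick (assuming the caller drops its own reference afterwards; `Node` is just a stand-in for any large linked structure):

    import Dispatch

    final class Node {
        var next: Node?
    }

    func releaseOnLaterTick(_ root: Node) {
        // The closure holds the last reference; when it runs and is destroyed
        // on a later main-queue pass, the whole chain is freed there.
        DispatchQueue.main.async {
            _ = root
        }
    }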


No, absolutely not. Swift uses reference counting under the hood because of its Objective-C runtime ties, not because reference counting is better than tracing garbage collection. The author is just not well-informed on that issue.


Chris Lattner on the advantages of reference counting over tracing garbage collection: https://lists.swift.org/pipermail/swift-evolution/Week-of-Mo...


Swift also uses reference counting because the people at Apple think it's outright better, not just because of the Objective-C heritage. Whether they're correct is certainly debatable, but it's not a compromise in their view, it's the best choice.


>Swift uses reference counting under the hood because of its Objective-C runtime ties, not because reference counting is better than tracing garbage collection

Not really. Obj-C already had a GC and dropped it to implement ARC, because ARC is faster and more memory efficient.


Completely wrong.

GC was dropped from Objective-C, because Apple did not manage to make it work safely with C semantics.

Objective-C developers weren't able to mix Frameworks compiled with and without GC, there were lots of corner cases and forum discussions regarding GC runtime crashes were quite common.

Apple's decision in that scenario was to take the alternative route: make the compiler do what the Cocoa manual reference counting patterns already required.

It had nothing to do with being faster and more memory efficient; rather, it was about having something in Objective-C that would work at all.

However, this urban myth keeps being repeated.


Saying that GC "didn't work at all" isn't right: it did work and it shipped. Xcode dogfooded GC, and Xcode is not a small app.

There were definitely pain points, including C interop. But these could have been worked through. Apple is good at these sorts of transitions - think x86 or 64 bit, which had the same issues (can't mix 32 bit and 64 bit frameworks, etc).

Probably if Apple had not started making mobile devices, ObjC would have retained GC. But mobile devices had performance requirements that GCs struggle to meet. It's too broad to say that ARC is "faster and more memory efficient"; rather it's more efficient in the ways that are important to Apple. For example, avoiding pause times is more important than total allocation throughput (have to avoid dropped frames). Memory high-water-mark is more important than fragmentation. etc. Servers are the reverse.

Note that GC has been painful on all mobile platforms, not just iOS. That's why Microsoft has taken a step back from GC (WinRT uses ARC). And in Android, GC is one of the most common drivers of performance issues, and avoiding GCs is a key optimization technique. The compiler even warns on allocations inside functions like onDraw().


Microsoft has not taken a step back from GC, rather WinRT is built using COM, which requires RC to make C++ happy.

Android's GC has been quite far behind the state of the art compared with what production embedded JVMs like Aonix and WebSphere Real Time are capable of. Actually, Dalvik was left in limbo for quite a while before they eventually came up with ART.

So it is easy to just generalize GC vs RC without regard to what algorithms are actually being used and the quality of said implementations.

There are military weapons control systems and factory robots using real-time GC implementations of Java on hardware much more constrained than a smartphone.


Microsoft built their next-generation platform around RC instead of GC, despite their enormous investment in .NET, and years spent pushing it. That's clearly taking a step back.

It may be the embedded world has very advanced GCs that would perform well on modern mobile platforms, and just nobody has gotten around to integrating them yet. I don't know! But embedded devices with predictable workload that you can model and test are quite a different beast from a general-purpose computer, so that remains to be proven.


The problem with GC vs RC decisions taken by companies is that many times the decisions are political, not technical.

The outcome of Singularity and Midori is proof of it: in spite of the results proven by their research, the projects were killed by management.

At least they allowed some of the tech to be repurposed for .NET Native.


Basically, in the unrealistic situation of a closed system, GC can be faster. But it makes FFI significantly harder (e.g. the V8 API).


Not necessarily; in many GC-enabled systems programming languages you can also allocate outside the GC heap.


It's OK to call something "completely wrong" and an "urban myth", but a source is appreciated.


The sources are the Apple developer forums and WWDC sessions.


How about Chris Lattner, who implemented that in Obj-C and then Swift?

>> Has a GC been considered at all?

> GC [also] has several *huge* disadvantages that are usually glossed over: while it is true that modern GC's can provide high performance, they can only do that when they are granted much more memory than the process is actually using. Generally, unless you give the GC 3-4x more memory than is needed, you'll get thrashing and incredibly poor performance. Additionally, since the sweep pass touches almost all RAM in the process, they tend to be very power inefficient (leading to reduced battery life). I'm personally not interested in requiring a model that requires us to throw away a ton of perfectly good RAM to get a "simpler" programming model - particularly one that adds so many tradeoffs.

So it's hardly "completely wrong", especially for iOS.


I would say that in the specific case of this article (talking about passing pointers to managed objects to C) it is easier without a tracing GC simply because reference counting doesn't ever move the object and so pointers won't be broken (except, as the author shows, if you let the only reference go out of scope before you're done using the pointer). A GC language _may_ implement pinning to overcome this, but not all do.
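
For the "reference goes out of scope" hazard mentioned there, a common Swift-side guard is `withExtendedLifetime`. A sketch; the `AudioState` class and the `registerCallback` C function are hypothetical:

    final class AudioState {
        var sampleRate: Double = 44_100
    }

    // Hypothetical imported C function that stores a context pointer.
    func registerCallback(_ context: UnsafeMutableRawPointer) { /* ... */ }

    let state = AudioState()
    withExtendedLifetime(state) {
        // The raw pointer stays valid only while `state` is alive;
        // withExtendedLifetime guarantees that for the closure's duration.
        registerCallback(Unmanaged.passUnretained(state).toOpaque())
    }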


Tracing GC was supported in Objective-C for a while but was deprecated/dropped a while back because of performance issues. Reference counting does have some advantages over tracing GC, but it's all really a bunch of trade-offs to make.


Performance wasn't the main problem, rather crashes and memory corruption.

https://news.ycombinator.com/item?id=12589998


Ah, that makes sense. I remember them putting it in and then yanking it out for some reason.

Reference counting seems to be in vogue these days because it is somewhat more predictable on mobile systems; at least this is what I heard about the Metro/Store/UWP API at Microsoft (C# is still garbage collected, but the APIs are primarily ref counted).


WinRT is just COM with a new base interface extending IUnknown, and .NET metadata instead of TLB files.

If you search for Ext-VOS on Don Syme's blog, it resembles quite a lot the genesis of .NET before they went with the CLR.


There is nothing inherently wrong with garbage collection, but it does take your control away. You have to allow for a certain level of "magic" to have elements be freed. This doesn't always work so memory leaks are a bit harder to track down.

In a system where you control the allocations and deallocations, you have finer control, but it's easier to forget to deallocate memory that is no longer in use. That is easier to track down, but a headache nonetheless.


Also, if we're talking about passing pointers to objects to C code, GC frequently moves objects in memory which will invalidate the pointer. It's not possible to reliably predict when this will happen, so you need extra language support for pinning to do any of the things in this tutorial without accessing invalid memory.


Well, passing raw pointers to garbage collected objects is impossible without language support for "pinning". This is because the GC will move objects around in memory breaking any pointers you're using in your C code. That may be a generous interpretation of the author (it could just be bile), but in the context of the article (about C interoperability) that's what comes to mind.


That's not quite true.

For example, in Cedar, Modula-3, Active Oberon, D you can mark a pointer as not being under GC control.

Other GC-enabled systems programming languages offer a similar feature.


> This is because the GC will move objects around in memory

Not necessarily, you could use (and guarantee) a non-moving GC. Moving is but one of the attributes/details you can use for your GC.


And this is definitely not just theoretical. Any conservative collector must also be non-moving. The Boehm GC for C and C++ and Apple's GC for Objective-C are both existing examples.


IIRC Go's GC is also non-moving for now (or has that changed in a recent version? I believe it was still non-moving around 1.5)


Without trying to be unfair to the author, this article is not super well-written so intentions may not be expressed clearly.


Great read on a topic I've been looking for more info on; I'm currently coding an audio app that needs to manipulate audio buffers, and this is the only way to access them. I've been working in Swift since 1.0, and this is the first time this all makes sense.


It's one of the clearest articles I've read on the subject, too. I would have liked it if there had been a section on calling "malloc" and "free" directly from Swift, though. I do that in a few places in my audio app, and there are a few gotchas involved (e.g. at least in Swift 1, you shouldn't use "free" and "dealloc" interchangeably; probably still the case).
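
For reference, a hedged sketch of the malloc route in Swift 3, keeping the pairing straight (memory from `malloc` goes back with `free`; memory from `allocate(capacity:)` goes back with `deallocate(capacity:)`). The buffer size and `Float` element type here are arbitrary:

    import Darwin  // Glibc on Linux

    let count = 512
    // C-allocated buffer: must be released with free(), never deallocate().
    guard let raw = malloc(count * MemoryLayout<Float>.stride) else { fatalError() }
    let samples = raw.bindMemory(to: Float.self, capacity: count)
    samples.initialize(to: 0, count: count)
    // ... hand `samples` to a C audio API ...
    free(raw)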


To someone unfamiliar with Swift, it seems quite informative about the mechanics of unsafe pointers. Where it misses is in discussing how best to segregate the "parts with unsafe" from "the bulk of your code (hopefully)", since the parts with unsafe by definition deserve more careful review and auditing (or even in pointing out that you should do this sort of segregation at all).

There are probably many ways you could design the partition; for instance, I would probably try to design a safe API that was patterned closely after the underlying one (Posix, OpenGL, etc); others would prefer to design more Swift-ian APIs that still expose all the useful capabilities of the underlying libraries.
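
For instance, a sketch of the first approach: a thin safe wrapper over POSIX `read(2)` so all the unsafe-pointer handling is confined to one small, auditable function (the name and the error handling are just illustrative):

    import Darwin  // Glibc on Linux

    /// Reads up to `count` bytes from a file descriptor into a Swift array.
    func readBytes(fd: Int32, count: Int) -> [UInt8]? {
        var buffer = [UInt8](repeating: 0, count: count)
        let n = buffer.withUnsafeMutableBytes { raw in
            read(fd, raw.baseAddress, count)
        }
        guard n >= 0 else { return nil }
        return Array(buffer.prefix(n))
    }

Callers only ever see `[UInt8]`, so the careful-review surface stays small.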




