Atomics in Objective-C (biasedbit.com)
41 points by ovokinder on May 5, 2015 | 20 comments



This is called a semaphore; it's already implemented in GCD.

Also, why talk about performance and then make obj-c method calls...?

It's quite easy using NSProxy to create a throttler that will wrap any object, then you can abstract throttling from the behavior of the underlying object.

  #import <Foundation/Foundation.h>

  @interface Throttler : NSProxy {
     dispatch_semaphore_t _semaphore;
     id _object;
  }
  - (id)initWithObject:(id)obj concurrentOperations:(int)ops;
  @end

  @implementation Throttler

  - (id)initWithObject:(id)obj concurrentOperations:(int)ops {
     // NSProxy has no -init, so just set up the ivars directly.
     _semaphore = dispatch_semaphore_create(ops);
     _object = obj;
     return self;
  }

  // NSProxy needs this so the runtime can build the NSInvocation.
  - (NSMethodSignature *)methodSignatureForSelector:(SEL)sel {
     return [_object methodSignatureForSelector:sel];
  }

  - (void)forwardInvocation:(NSInvocation *)invocation {
     // Returns 0 when a slot was acquired immediately; fail fast otherwise.
     if (dispatch_semaphore_wait(_semaphore, DISPATCH_TIME_NOW) == 0) {
        @try {
           [invocation setTarget:_object];
           [invocation invoke];
        }
        @finally {
           // Release the slot even if the invocation threw.
           dispatch_semaphore_signal(_semaphore);
        }
        return;
     }
     @throw [NSException
          exceptionWithName:@"InsufficientResourceException"
          reason:@"Insufficient Resource"
          userInfo:nil];
  }

  @end
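
Usage is then just a matter of wrapping the real object ("Worker" and -doExpensiveThing are made-up names for illustration):

    Worker *worker = [Worker new];
    id throttled = [[Throttler alloc] initWithObject:worker
                             concurrentOperations:4];
    // At most 4 concurrent invocations proceed; extra callers get the
    // InsufficientResourceException immediately instead of blocking.
    [throttled doExpensiveThing];
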
https://developer.apple.com/library/mac/documentation/Genera...


Why no mention of GCD here? GCD is very, very good at synchronizing access to shared resources.

The most Cocoa-compatible way of handling background execution of expensive procedures is always going to be best executed, quickest, using Grand Central Dispatch.

For example:

    @interface Foo () {
        int _counter;   // in-flight counter, only touched on backgroundQueue
    }
    @property (nonatomic) dispatch_queue_t backgroundQueue;
    @end

    @implementation Foo

    // Async work can't return a BOOL directly, so the result is handed
    // to the completion block. N is the max number of in-flight calls.
    - (void)veryExpensiveMethod:(id)arg completion:(void (^)(BOOL))completion {
        dispatch_async(self.backgroundQueue, ^{
            if (_counter++ > N) {
                _counter--;
                dispatch_async(dispatch_get_main_queue(), ^{ completion(NO); });
                return;
            }
            // Critical section
            _counter--;
            dispatch_async(dispatch_get_main_queue(), ^{ completion(YES); });
        });
    }

    @end
That will ensure every call to -veryExpensiveMethod is run in sequence, and won't require waiting on your end.

These problems have been solved, better.


You're missing the most important point of the entire Throttler: gracefully returning fast, with success or failure. Nowhere is it stated that the goal is to enqueue tasks for execution.

If you had read 'til the end you would have found multiple statements that OSAtomic* is merely an alternative. Not a silver bullet. Not the fastest.

From the conclusion:

"It's very important to understand that every example in this article could have legitimately been solved with different concurrency primitives — like semaphores and locks — without any noticeable impact to a human playing around with your app."

Also, "(...) is always going to be best executed, quickest, using GCD." is kind of a blanket statement. I'd be careful around the use of "always".


> This post talks about the use of OS low level atomic functions

This is a pet peeve of mine, to call that an "OS" feature. In all recent CPUs I know of, atomic ops are not a privileged operation, and there is absolutely nothing for the operating system to manage in a traditional sense. You don't trap into the kernel and have it compare-and-swap, you just, um, compare and swap.

Maybe your OS provides a convenient C API, but it is not "OS" functionality. It's just instructions on your CPU. You could just as well write them inline. In many common uses, that's what ends up happening - the atomic ops are put inline with the rest of your code.
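
For example, the Clang/GCC builtins compile straight down to the instruction (a single lock xadd on x86), with no call into any library, let alone the kernel; a minimal sketch:

    #include <stdint.h>

    int32_t counter = 0;

    void increment(void) {
        // Emitted inline as an atomic add; no function call, no syscall.
        __sync_fetch_and_add(&counter, 1);
    }
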


If your use of atomics consists entirely of calling functions provided in a library as part of the OS, what's wrong with calling them "OS atomic functions"? That is what they are. The fact that you can accomplish the "atomic" part without the "OS functions" part doesn't change the fact that, as written, the article discusses the "OS functions" part.


Well, it looks like Apple named this particular API "OS atomics", a name which makes me cringe a bit.

But, a few points:

1. There is historically a distinction between "operating system" and "shared library shipping with the operating system". I think I am losing this battle though.

2. On a number of platforms (not sure if Apple's "OS atomics" concept counts here), the atomic wrappers are not even functions in a shared library. They may be declared inline in a header file. Or they may be compiler intrinsics, where the compiler doesn't generate any function call in any circumstances. Is that an "OS function"? Not really.

In either case #1 or case #2, I think "OS atomics" is a dumb name. We are really talking about CPU features, not OS features. If it doesn't generate a trap into kernel mode, it doesn't sound much like the OS is doing the work.

Calling it "the OS" sounds more like a fundamental misunderstanding of dynamic linking and what it is. I hear so many variants of this core misunderstanding all over the place. Thinking that spinlocks need kernel help is one such manifestation.


#1 sounds like a pretty minority viewpoint. OSes these days are a lot more than just the kernel and things that call into the kernel. If you do a fresh install of the OS then I think what you find on the disk afterwards can all be reasonably referred to as "the OS."

#2 does not apply here. This particular API isn't implemented inline, or with compiler intrinsics, or assembly. These are actual function calls.

If you're talking about the general concept of atomics, then I agree that "OS atomics" is a dumb name. But we're not talking about that. We're talking about a specific API that happens to implement that general concept. That specific API is part of the OS.


Fair. I just didn't know how to rephrase that small sentence without unfolding into the two paragraphs you just wrote.

How would you rephrase that? Just "low level atomic", "atomic"?


If atomic operations were compatible across processor "families" then you'd have "the family atomics." (Obscure Dune reference.)


That was worth 10 upvotes, good sir. Sadly, I can only provide one.


hardware-provided atomic?


At least for ARM on Linux, the kernel does provide cmpxchg in the VDSO, but I think that is for support on older ARM architectures. IIRC, ARMv7 does not need to use the kernel helpers.

https://www.kernel.org/doc/Documentation/arm/kernel_user_hel...


Yes, pre-ARMv6 does not have the load-link/store-conditional instructions.

The Linux kernel hack for that is actually kind of awesome. Notably, it's not a syscall and you don't enter the kernel to do them. Since pre-ARMv6 is always single core, it simply becomes a matter of detecting you are in the middle of an atomic op at interrupt time, and patching the result. This means the atomic op has to happen at a well-known (kernel-provided) address. Details here: http://lwn.net/Articles/314561/
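
Per that kernel doc, userspace just calls a helper living at a fixed address; a sketch of how it's used (the address and return convention come from Documentation/arm/kernel_user_helpers.txt):

    /* __kuser_cmpxchg: returns 0 iff *ptr was updated from oldval to newval. */
    typedef int (kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);
    #define kuser_cmpxchg (*(kuser_cmpxchg_t *)0xffff0fc0)

    static int atomic_add(volatile int *ptr, int val) {
        int old;
        do {
            old = *ptr;                 /* retry until no one races us */
        } while (kuser_cmpxchg(old, old + val, ptr) != 0);
        return old + val;
    }
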

I'm also aware of some older systems that needed kernel intervention for atomic ops. But on x86, ARMv6+, even no longer relevant arches like SPARC, POWER, ... this is not the case. It really is rare that the kernel needs to do this job these days.

Edit: ARMv6, not ARMv7, per the link I provided...


This is all claimed to be for 'performance', but there are no figures in this document as to whether incrementAndGet / decrementAndGet is any faster than @synchronized.

(I suspect it probably is, but fundamentally, @synchronized is implemented using compare-and-swap / other processor atomics, so it's probable that the difference is very slight - e.g. there's only a measurable difference if the thread is descheduled while holding a lock.)


The goal of the article isn't sheer performance; there are plenty of notes about that. If it were about pure performance, it'd be recommending moving away from Obj-C classes and methods and using C functions or C++ classes instead, like std::atomic<>.

It's meant to be a somewhat-easy-to-digest introduction to lock-free design, where applicable.
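
For a taste of the pattern the article builds, its fail-fast counter boils down to something like this sketch (kMaxConcurrent is a placeholder for whatever cap you pick):

    #import <Foundation/Foundation.h>
    #include <libkern/OSAtomic.h>

    static volatile int32_t _inFlight;

    BOOL TryPerformWork(void) {
        // Reserve a slot; back out and fail fast if we're over the cap.
        if (OSAtomicIncrement32Barrier(&_inFlight) > kMaxConcurrent) {
            OSAtomicDecrement32Barrier(&_inFlight);
            return NO;
        }
        // ... critical section ...
        OSAtomicDecrement32Barrier(&_inFlight);
        return YES;
    }
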

What @synchronized ends up doing is far more complex — it has to be, to ensure the correctness of its purposes: https://github.com/opensource-apple/objc4/blob/master/runtim...


@synchronized calls objc_sync_enter and objc_sync_exit. Source code is available[0]. Best case, the thread already has the object locked and only needs to increment a lock count. Worst case, it spin locks, searches through a linked list of existing locks, then needs to malloc a new entry and create a new mutex for it.

0: http://www.opensource.apple.com/source/objc4/objc4-646/runti...
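
Roughly, the compiler rewrites a @synchronized block into paired runtime calls, something like:

    // @synchronized (obj) { ... } becomes, approximately:
    objc_sync_enter(obj);
    @try {
        // ... the protected code ...
    }
    @finally {
        objc_sync_exit(obj);
    }
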


It's actually more than you'd think -- @synchronized has to deal with re-entrant locks, try/catches that also release locks accordingly while bubbling exceptions, etc. There's a lot more to it than compare and swap.


Or just use std::atomic and std::mutex in Objective-C++. In the Objective-C world, memory semantics aren't well-defined, and all of these are hacks piled on other hacks.
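
In a .mm file that's straightforward; a minimal sketch:

    // Objective-C++: C++11 atomics with well-defined memory ordering.
    #include <atomic>

    static std::atomic<int> inFlight{0};

    bool TryEnter(int max) {
        if (inFlight.fetch_add(1, std::memory_order_acquire) >= max) {
            inFlight.fetch_sub(1, std::memory_order_release);
            return false;   // over the cap; fail fast
        }
        return true;        // caller must fetch_sub when done
    }
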


Can you explain more about hacks piled on other hacks?


@synchronized implies a memory barrier, as I understand it.



