Concurrent Operations Demystified

NSOperation and NSOperationQueue are available on Leopard and the iPhone to help you parallelize your code. The idea is that if you have code that takes a long time to execute, you create an NSOperation subclass, override main, and put your long-running code in there:

@implementation CalculatePiOperation

- (void)main
{
    // Calculate PI to 1,000,000 digits
}

@end

To execute an operation, you typically add it to an NSOperationQueue:

NSOperationQueue * queue = [[NSOperationQueue alloc] init];
NSOperation * piOperation = [[[CalculatePiOperation alloc] init] autorelease];
[queue addOperation:piOperation];

If you add multiple operations to a queue, they all execute in parallel on background threads, allowing your main thread to deal with the user interface. The queue will intelligently schedule the number of parallel operations based on the number of CPU cores your users have, thus effectively taking advantage of your users’ hardware.

The only caveat is that the lifetime of an operation is the main method. Once that method returns, the operation is finished and it gets removed from the queue. If you want to use a class that has an asynchronous API, you have to jump through some hoops. Typically you have to play games with the run loop to ensure that the main method doesn’t return prematurely.
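To make "playing games with the run loop" concrete, here is a minimal sketch of a non-concurrent operation that keeps main alive until an asynchronous callback arrives. The class name RunLoopWaitingOperation, the _done flag, and the asyncWorkDidFinish callback are all made up for illustration, and the delayed perform is only a stand-in for a real asynchronous API. Like the rest of this post, it assumes manual reference counting.

```objc
#import <Foundation/Foundation.h>

// Hypothetical operation, not from the original post.
@interface RunLoopWaitingOperation : NSOperation
{
    BOOL _done;
}
@end

@implementation RunLoopWaitingOperation

// Stand-in for the completion callback of an asynchronous API.
- (void)asyncWorkDidFinish
{
    _done = YES;
}

- (void)main
{
    // Stand-in for starting asynchronous work: schedule a callback on
    // this thread's run loop half a second from now.
    [self performSelector:@selector(asyncWorkDidFinish)
               withObject:nil
               afterDelay:0.5];

    // Spin the run loop in short slices so that main does not return
    // until the callback has fired or the operation has been cancelled.
    while (!_done && ![self isCancelled])
    {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode
                                 beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.1]];
    }

    // If we were cancelled early, drop the pending callback.
    [NSObject cancelPreviousPerformRequestsWithTarget:self];
}

@end
```

It works, but every operation that wraps an asynchronous API needs its own version of that loop.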

While there are times when you want to do this, it can also be a pain. In other cases, you may not be allowed to use the API on a background thread because it is designed to only work on the main thread. Enter concurrent operations.

Operations come in two flavors: concurrent and non-concurrent. In an unfortunate case of confusing terminology, the default NSOperation subclass is called non-concurrent. I say unfortunate because the way they are used on an operation queue, they run in parallel. So, yes, operations that run in parallel are called non-concurrent.

Concurrent operations are created by overriding the isConcurrent method in your subclass to return YES:

- (BOOL)isConcurrent
{
    return YES;
}

When a concurrent operation is added to an operation queue, it is started not on a background thread, but on the thread on which it was added. So, yes, concurrent operations all run on the same thread whereas non-concurrent operations execute in parallel on different threads. Got that? Good.

Update 2009-09-13: This is no longer true as of 10.6. The start method is always called on a background thread as of 10.6. To work properly with main-thread only and asynchronous APIs that rely on the run loop, we need to shunt our work over to the main thread. More on this in a followup post.

In any case, another major difference with concurrent operations is that you override start, instead of main. Also, the operation is not finished once the start method returns. This allows you to control the lifetime of the operation.

When dealing with asynchronous APIs, we can begin the asynchronous call on the main thread in start and keep the operation running until it finishes.

We also have a few more responsibilities. We need to keep track of isExecuting and isFinished ourselves, and we need to modify them in a key-value observing (KVO) compliant manner. I typically do this using instance variables. The operation is only considered finished when the isFinished property changes to YES.
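As a rough sketch of the instance-variable approach, the class below declares the two flags and overrides the accessors that the queue reads. The class name StateTrackingOperation is made up for illustration. The ivar names match the ones used in the snippets in this post, but the class layout itself is an assumption rather than the code from the example project.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, not from the original post: it only holds the two
// state flags and exposes them through the accessors NSOperation defines.
@interface StateTrackingOperation : NSOperation
{
    BOOL _isExecuting;
    BOOL _isFinished;
}
@end

@implementation StateTrackingOperation

// Mark the operation as concurrent so that start controls its lifetime.
- (BOOL)isConcurrent
{
    return YES;
}

// Report our own flags instead of the state NSOperation tracks by default.
- (BOOL)isExecuting
{
    return _isExecuting;
}

- (BOOL)isFinished
{
    return _isFinished;
}

@end
```

Because the accessors read plain instance variables, every change to those variables has to be bracketed by willChangeValueForKey: and didChangeValueForKey: so that observers are notified.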

For example, if we want to write an operation that downloads data from a URL using NSURLConnection, its initializer would be:

- (id)initWithUrl:(NSURL *)url
{
    self = [super init];
    if (self == nil)
        return nil;
    
    _url = [url copy];
    _isExecuting = NO;
    _isFinished = NO;
    
    return self;
}

The start method shunts itself to the main thread, kicks off an asynchronous NSURLConnection, and returns:

- (void)start
{
    if (![NSThread isMainThread])
    {
        [self performSelectorOnMainThread:@selector(start) withObject:nil waitUntilDone:NO];
        return;
    }

    NSLog(@"operation for <%@> started.", _url);
    
    [self willChangeValueForKey:@"isExecuting"];
    _isExecuting = YES;
    [self didChangeValueForKey:@"isExecuting"];

    NSURLRequest * request = [NSURLRequest requestWithURL:_url];
    _connection = [[NSURLConnection alloc] initWithRequest:request
                                                  delegate:self];
    if (_connection == nil)
        [self finish];
}

There are three important points here. First, we have to make sure we are running on the main thread. Second, we have to change the isExecuting property to YES. Third, our start method returns before the NSURLConnection has completed, but the operation is still executing. This means our operation stays on the queue while the NSURLConnection is running, all without having to play games with the run loop.

We are using a private finish method to end the operation:

- (void)finish
{
    NSLog(@"operation for <%@> finished. "
          @"status code: %ld, error: %@, data size: %lu",
          _url, (long)_statusCode, _error, (unsigned long)[_data length]);
    
    [_connection release];
    _connection = nil;
    
    [self willChangeValueForKey:@"isExecuting"];
    [self willChangeValueForKey:@"isFinished"];

    _isExecuting = NO;
    _isFinished = YES;

    [self didChangeValueForKey:@"isExecuting"];
    [self didChangeValueForKey:@"isFinished"];
}

The key point here is that we change the isExecuting and isFinished flags. Only when these are set to NO and YES, respectively, will the operation be removed from the queue. The queue monitors their values using key-value observing.
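To see those notifications from the outside, we can register our own observer on an operation. This is only a sketch of how client code might watch isFinished, not a description of how NSOperationQueue is implemented internally. The FinishObserver class and its sawFinish accessor are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical observer class, not from the original post.
@interface FinishObserver : NSObject
{
    BOOL _sawFinish;
}
- (BOOL)sawFinish;
@end

@implementation FinishObserver

- (BOOL)sawFinish
{
    return _sawFinish;
}

// Called by KVO after the observed operation announces a change to
// isFinished. It runs on whichever thread made the change.
- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context
{
    if ([keyPath isEqualToString:@"isFinished"]
        && [(NSOperation *)object isFinished])
    {
        _sawFinish = YES;
    }
}

@end
```

Register it with [operation addObserver:observer forKeyPath:@"isFinished" options:0 context:NULL] before the operation starts, and remove it with removeObserver:forKeyPath: before either object goes away. If finish skipped the willChange/didChange calls, this observer would never fire.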

The NSURLConnection delegate methods accumulate data or end the operation, as appropriate:

- (void)connection:(NSURLConnection *)connection
didReceiveResponse:(NSURLResponse *)response
{
    [_data release];
    _data = [[NSMutableData alloc] init];

    // Only HTTP responses carry a status code.
    if ([response isKindOfClass:[NSHTTPURLResponse class]])
        _statusCode = [(NSHTTPURLResponse *)response statusCode];
}

- (void)connection:(NSURLConnection *)connection
    didReceiveData:(NSData *)data
{
    [_data appendData:data];
}

- (void)connectionDidFinishLoading:(NSURLConnection *)connection
{
    [self finish];
}

- (void)connection:(NSURLConnection *)connection
  didFailWithError:(NSError *)error
{
    _error = [error copy];
    [self finish];
}

As you can see, we don’t have to turn an asynchronous API into a synchronous one, and yet we are still able to package up this task as an operation. While it may seem a little counterintuitive to use an operation, it does have its benefits. For example, you can use the queue to limit the number of parallel downloads to two:

    _queue = [[NSOperationQueue alloc] init];
    [_queue setMaxConcurrentOperationCount:2];

Also, you can use operation dependencies to make sure tasks occur in a proper order.
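A dependency is a single call on the operation that has to wait. Here is a minimal sketch, where the helper name EnqueueInOrder is made up for illustration and first and second stand for any two operations:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not from the original post: run `second` only
// after `first` has finished.
static void EnqueueInOrder(NSOperationQueue * queue,
                           NSOperation * first,
                           NSOperation * second)
{
    // Set up the dependency before adding either operation to the queue.
    [second addDependency:first];

    [queue addOperation:first];
    [queue addOperation:second];
}
```

The queue will not start second until first reports isFinished, which is one more reason a concurrent operation has to get its isFinished bookkeeping right.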

In Textcast, we use concurrent operations almost exclusively. We package up NSSpeechSynthesizer, PubSub, and WebKit as concurrent operations since they all have asynchronous APIs. All of these APIs also have thread safety issues of some sort and are better run on the main thread. Concurrent operations make this easier to manage.

Download a full example project demonstrating how to use concurrent operations: Concurrent.tgz