Wednesday, December 27, 2006

How to Write a Cocoa Web Server

A couple of weeks ago a friend asked me to write a Mac-based HTTP server for an intranet web service he was building. I hadn't forgotten how Mac OS X was praised because of its Unix underpinnings, and the strong networking capabilities that come along with that. As a Cocoa developer, I was confident that I could find a nice Cocoa API that would do the hard work for me, so I accepted...

There are indeed many powerful Cocoa APIs, but none of them seemed to address exactly the problem of writing a small homegrown web server. For example, the Distributed Objects (DO) framework allows objects to send messages to each other across the network. Although extremely convenient, it is limited to communication between Objective-C objects and cannot be used to build a more general server. In fact, it turned out that I could not build the HTTP server I wanted without using some Unix and Core Foundation APIs as well.

However, there are some rather useful, if somewhat non-obvious, Cocoa classes that are most helpful for writing a web server. In this article I will show you what they are and how to use them. I will also show you when it is necessary, or at least beneficial, to use Unix or Core Foundation APIs. This article assumes a basic knowledge of Cocoa, so I won't explain general Cocoa topics here, but I will show everything you need to get started with networking. To have a concrete example at hand, we will use the general classes we are building to create a server that accepts a URL and returns a PDF rendering of the web page at the given address. The full source code of the Xcode project is available at culturedcode.com/cocoa.

NSFileHandle and Network Communication

One of the problems in writing server software is the need to make sure that the server is always able to accept new requests no matter how busy it is fulfilling the old ones. Usually this requires the use of multiple threads. But fortunately, Cocoa provides us with a class that relieves us of creating a multi-threaded application: NSFileHandle. Yes, that's right. Despite the word "file" in the class' name, and the fact that it is not listed under Networking in Apple's documentation, it is just the right choice for handling all of our network communication needs. The explanation for this odd naming lies in Mac OS X's Unix foundation.

In Unix, pretty much everything is treated as a file. Physical devices, network connections, and yes, also ordinary files on your hard disk are all accessed using a common abstraction: file descriptors. A file descriptor is simply an integer that provides an index into a per-process table that references resources available to your process. Once we have the file descriptor fd of an already available resource, we can use it to create a new NSFileHandle object, as in the following code fragment:

NSFileHandle *fileHandle;
fileHandle = [[NSFileHandle alloc] initWithFileDescriptor:fd
                                           closeOnDealloc:YES];

NSFileHandle does not create network resources for us, but manages existing ones. For example, it will listen in the background for new connections or incoming data, and post a notification when something arrives. This capability relieves us of spawning a new thread to do that work. Passing YES for closeOnDealloc: makes sure that the resource managed by fileHandle is properly closed when we are finished with it.
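
To see why a single class can serve files and network connections alike, it helps to look at descriptors at the Unix level. The following sketch in plain C (which compiles unchanged inside an Objective-C file) pushes a short message through a pipe and through a pair of connected sockets using the very same read and write calls. The function names are made up for this illustration; only pipe, socketpair, read, write and close are actual system calls.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes msg to write_fd and reads it back from read_fd into out.
   Neither call cares what kind of resource a descriptor refers to.
   msg must be short and non-empty. Returns the number of bytes read,
   or -1 on error. */
static ssize_t echo_through(int write_fd, int read_fd,
                            const char *msg, char *out, size_t cap)
{
    size_t len = strlen(msg);
    assert(len > 0 && len < cap);          /* leave room for the terminator */

    if (write(write_fd, msg, len) != (ssize_t)len)
        return -1;
    ssize_t n = read(read_fd, out, cap - 1);
    if (n >= 0)
        out[n] = '\0';
    return n;
}

/* Hypothetical helper: round trip through an anonymous pipe. */
ssize_t echo_through_pipe(const char *msg, char *out, size_t cap)
{
    int fds[2];                            /* fds[0] reads, fds[1] writes */
    if (pipe(fds) != 0)
        return -1;
    ssize_t n = echo_through(fds[1], fds[0], msg, out, cap);
    close(fds[0]);
    close(fds[1]);
    return n;
}

/* Hypothetical helper: round trip through two connected local sockets. */
ssize_t echo_through_sockets(const char *msg, char *out, size_t cap)
{
    int fds[2];                            /* both ends can read and write */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    ssize_t n = echo_through(fds[0], fds[1], msg, out, cap);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

In both helpers the descriptors are ordinary ints, which is all that initWithFileDescriptor:closeOnDealloc: asks for.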

But how do we get an appropriate network resource and corresponding file descriptor? The Unix abstraction for network resources is called BSD Sockets, and we will see shortly how to get our hands on one.

But first we need to choose on which port to make our service available to the outside world. Remember that a computer in a network is typically available at a specific IP address, which is commonly specified using four numbers (in the range from 0 to 255) separated by dots, e.g., 127.0.0.1. However, a single computer may provide many different services. To make this possible a single IP address is further divided into 65,536 different ports. All incoming messages specify a destination port. An application (or rather a process) can bind to specific ports, which means that incoming communications on those ports will be forwarded to the bound process.

For experimentation it is advisable to choose a port from the dynamic (or private) range, 49152-65535, which is never registered. Ports below that range can be registered with the Internet Assigned Numbers Authority (IANA).
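
At the Unix level, an IP address and a port are stored together in one small structure. The four numbers from 0 to 255 occupy 32 bits, and the port occupies 16 bits, which is where the 65,536 possible ports come from. The following plain C sketch fills in such a structure and checks whether a port lies in the range recommended above for experimentation. The function names make_endpoint and is_dynamic_port are hypothetical; struct sockaddr_in, htons and inet_pton are the standard socket API.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Returns 1 if port lies in the dynamic (private) range 49152-65535. */
int is_dynamic_port(long port)
{
    return port >= 49152 && port <= 65535;
}

/* Fills *out with the IPv4 endpoint ip:port.
   Returns 0 on success, -1 if ip is not a valid dotted-quad address. */
int make_endpoint(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);       /* ports are stored in network byte order */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

An address such as 127.0.0.256 is rejected because each of the four numbers has to fit into a single byte.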

The Easy Way to Sockets

To receive communications, we need to create an entry point in our process where data can flow in, and bind it to the IP address of our computer and the port number we have chosen. Such entry points are known as BSD sockets, or simply sockets, and are identified via file descriptors.

Again, Cocoa has a class that handles the necessary details for us. This class is known by the not so surprising name of NSSocketPort. This time, however, the gotcha is that its documentation is buried deeply within the Distributed Objects API, so much so that some developers even doubt that it can be used reliably for purposes other than that of serving the DO architecture. Rest assured that NSSocketPort will do a nice job for us. For all those still in doubt, the end of this article shows how to replace NSSocketPort with direct Unix calls and why one might want to do so.

To obtain the file descriptor fd we are after, we use:

NSSocketPort *socketPort;
socketPort = [[NSSocketPort alloc] initWithTCPPort:PORT_NUMBER];
int fd = [socketPort socket];

PORT_NUMBER is the port number of our service (0...65535). It is important not to release socketPort before being finished with fileHandle.
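
For comparison, here is roughly what creating such a socket looks like when done by hand. This is a minimal sketch in plain C of the usual socket, bind, listen sequence. It is an assumption about the kind of work NSSocketPort takes off our hands, not a description of its actual implementation. The function name open_listening_socket is hypothetical, and the backlog of 16 is an arbitrary choice.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP socket bound to the given port on all local IPv4 addresses
   and puts it into the listening state. Passing 0 lets the kernel choose a
   free port; the port actually used is stored in *bound_port.
   Returns the file descriptor, or -1 on error. */
int open_listening_socket(uint16_t port, uint16_t *bound_port)
{
    assert(bound_port != NULL);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow the server to be restarted quickly on the same port. */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(fd, 16) != 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

The returned descriptor is an ordinary int, so it could be wrapped in an NSFileHandle in the same way as the one obtained from [socketPort socket].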

Now that we have established a connection to the outside world, it would be nice if we could actually start accepting incoming connections. And we are only one method call away from reaching this goal:

[fileHandle acceptConnectionInBackgroundAndNotify];

This method call immediately returns, but fileHandle will keep listening to the socket until the first connection comes in. Then it issues an NSFileHandleConnectionAcceptedNotification, which carries further information about the incoming connection.
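
What the file handle spares us becomes clearer when the same step is written against the Unix API. The sketch below, again in plain C, waits with poll for a pending connection on a listening socket and then calls accept, which hands back a new descriptor for that one client. It only illustrates the underlying mechanism under these assumptions and makes no claim about how NSFileHandle is implemented. All function names are hypothetical; listen_on_loopback and connect_to_loopback exist only so that the example can talk to itself.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Builds the address 127.0.0.1:port. */
static struct sockaddr_in loopback(uint16_t port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    return addr;
}

/* Listens on a kernel-chosen loopback port, which is stored in *port. */
int listen_on_loopback(uint16_t *port)
{
    assert(port != NULL);
    struct sockaddr_in addr = loopback(0);
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(fd, 16) != 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Plays the client: connects to 127.0.0.1:port. */
int connect_to_loopback(uint16_t port)
{
    struct sockaddr_in addr = loopback(port);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Waits up to timeout_ms for a client, then accepts it.
   Returns a new descriptor for that client, or -1 if nobody connected. */
int accept_if_pending(int listen_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = listen_fd, .events = POLLIN, .revents = 0 };
    if (poll(&pfd, 1, timeout_ms) <= 0 || !(pfd.revents & POLLIN))
        return -1;
    return accept(listen_fd, NULL, NULL);
}
```

Note that the listening descriptor stays open after accept returns. Each accepted client gets a descriptor of its own, and the listening socket remains available for the next one.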

SimpleHTTPServer

To wrap up we will create a SimpleHTTPServer class and put everything we have learned so far into an initWithPortNumber:delegate: initializer method as follows:

@interface SimpleHTTPServer : NSObject {
    int portNumber;
    id delegate;

    NSSocketPort *socketPort;
    NSFileHandle *fileHandle;

    NSMutableArray *connections;
    NSMutableArray *requests;

    ...
}

- (id)initWithPortNumber:(int)pn delegate:(id)dl;

@end

- (id)initWithPortNumber:(int)pn delegate:(id)dl
{
    if( self = [super init] ) {
        portNumber = pn;
        delegate = [dl retain];

        connections = [[NSMutableArray alloc] init];
        requests = [[NSMutableArray alloc] init];

        ...

        // Create the listening socket and wrap its descriptor in a file handle.
        socketPort = [[NSSocketPort alloc] initWithTCPPort:portNumber];
        int fd = [socketPort socket];
        fileHandle = [[NSFileHandle alloc] initWithFileDescriptor:fd
                                                   closeOnDealloc:YES];

        // Have newConnection: called whenever a client connects.
        NSNotificationCenter *nc = [NSNotificationCenter defaultCenter];
        [nc addObserver:self
               selector:@selector(newConnection:)
                   name:NSFileHandleConnectionAcceptedNotification
                 object:nil];

        [fileHandle acceptConnectionInBackgroundAndNotify];
    }
    return self;
}

- (void)dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver:self];
    ...
    [requests release];
    [connections release];
    [fileHandle release];
    [socketPort release];
    [delegate release];
    [super dealloc];
}

The requests and connections instance variables will be used to keep track of open connections and pending requests, while delegate holds a pointer to an object responsible for doing the actual processing necessary to fulfill the request. In our example it will render a web page to PDF.

Once a SimpleHTTPServer object is created it will immediately begin listening to the specified port for incoming connections. As soon as a client begins to talk to our computer on the right port, NSFileHandle will issue an NSFileHandleConnectionAcceptedNotification which causes our newConnection: method to be called, since this is what we specified when we registered with NSNotificationCenter.

The implementation of the newConnection: method is as follows. Note that we are already using a soon-to-be-declared SimpleHTTPConnection class. While SimpleHTTPServer is responsible for accepting newly connecting clients, all communication with a specific client will be channeled through a corresponding SimpleHTTPConnection object:

- (void)newConnection:(NSNotification *)notification
{
    NSDictionary *userInfo = [notification userInfo];
    NSFileHandle *remoteFileHandle = [userInfo objectForKey:
                                        NSFileHandleNotificationFileHandleItem];

    NSNumber *errorNo = [userInfo objectForKey:@"NSFileHandleError"];
    if( errorNo ) {
        NSLog(@"NSFileHandle Error: %@", errorNo);
        return;
    }

    [fileHandle acceptConnectionInBackgroundAndNotify];

    if( remoteFileHandle ) {
        SimpleHTTPConnection *connection;
        connection = [[SimpleHTTPConnection alloc]
                         initWithFileHandle:remoteFileHandle
                                   delegate:self];
        if( connection ) {
            NSIndexSet *insertedIndexes;
            insertedIndexes = [NSIndexSet indexSetWithIndex:
                                  [connections count]];
            [self willChange:NSKeyValueChangeInsertion
             valuesAtIndexes:insertedIndexes forKey:@"connections"];
            [connections addObject:connection];
            [self didChange:NSKeyValueChangeInsertion
            valuesAtIndexes:insertedIndexes forKey:@"connections"];
            [connection release];
        }
    }
}

The most notable thing here is the fact that through NSNotification's user info dictionary we are passed another instance of NSFileHandle. This new file handle represents the specific client who has connected to our service. We are using it to initialize a new SimpleHTTPConnection object. It is important not to forget to call acceptConnectionInBackgroundAndNotify on fileHandle again, since NSFileHandle will otherwise stop listening for incoming connections.

As a bonus, the code presented here makes sure it plays nicely with Cocoa bindings. It is possible to bind the connections instance variable to an NSArrayController, which in turn can easily be hooked up to a user interface widget, all from within Interface Builder. Bracketing our changes of the connections array with explicit calls to the observer notification methods willChange:valuesAtIndexes:forKey: and didChange:valuesAtIndexes:forKey: makes sure that the bindings layer will take notice.

Usually, such notifications are created automatically when instance variables are set. Here, however, we are not setting the connections instance variable itself, but mutating the array it references. This will go unnoticed by the Cocoa bindings layer unless we either use so-called indexed accessor methods or create observer notifications manually. We have chosen the latter. Care has to be taken, though, not to mix automatic and manual notification. Our newConnection: method will never participate in a binding. If it did, we would have to disable automatic observer notification first, since otherwise unexpected behavior and even crashes could result. More details on manual versus automatic observer notification can be found in Apple's Key-Value Observing Programming Guide.