Deep Dive into the SwiftNIO Architecture

SwiftNIO is an elegant low-level networking framework for Swift applications. I say low-level because you typically wouldn’t use SwiftNIO directly to build web applications; you’d use a web framework built on top of it, such as Vapor or the Smoke Framework.

This is a deep dive into the various pieces of SwiftNIO and is meant to supplement the official documentation. I find that the best way for me to learn something new is to try to explain it to others, so this is my attempt at making SwiftNIO a little less complex.

Thread per connection model

When I first learned about networking and building servers, I was introduced to the thread per connection model. In this model, the server has one thread that runs continuously and blocks while listening on the socket for new connections. Every time a client connects, the server spawns a new thread that is responsible for handling that connection. Each of these threads blocks while performing I/O on its socket to read and write data. This is known as multi-threaded blocking I/O.

image

Unfortunately, this solution doesn’t scale. If there are 100,000 client connections, 100,000 threads are created to handle them. This introduces significant overhead from context switching and high memory usage, since every thread needs its own stack. Also, I/O is not a CPU-bound task: while a thread is blocked waiting for data to arrive on its socket, it does no useful work, yet it still holds resources that could be used for CPU-bound work.
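To put a rough number on the memory cost, here is a back-of-the-envelope calculation. The 512 KiB stack size is an assumption for illustration (it is a common default for secondary threads on Apple platforms; other systems often reserve more), and most of this is reserved address space rather than memory that is actually touched.

```swift
// Assumption: every connection thread reserves a 512 KiB stack.
let connections = 100_000
let assumedStackBytesPerThread = 512 * 1024

// Total stack space reserved if each connection gets its own thread.
let totalStackBytes = connections * assumedStackBytesPerThread
let totalStackGiB = Double(totalStackBytes) / Double(1 << 30)

print("Reserved stack space: \(totalStackGiB) GiB") // roughly 48.8 GiB
```

Even as an estimate, it shows why one thread per connection stops being practical long before the CPU becomes the bottleneck.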

SwiftNIO and Non-blocking I/O

With non-blocking I/O, the application no longer blocks when reading or writing data. Instead, it relies on the OS to notify it when an event of interest has occurred, such as a new socket connection being opened. With this approach, there is an EventLoop that runs constantly, waiting for new events. When the EventLoop is notified of an event, such as data being ready to be consumed from a socket, it processes that event. Now the thread is always doing work as long as there are events. The following sections take a deeper look at the SwiftNIO architecture.

File Descriptors

While not SwiftNIO specific, file descriptors play a role in its functionality. File descriptors are integer indices into a per-process table maintained by the OS. Each entry maps a file descriptor (FD) to a kernel data structure that describes the open resource; for a regular file, that structure ultimately leads to where the data is stored on disk. On *nix systems nearly everything is treated as a file, including sockets, so sockets are accessed through file descriptors too.
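To make this concrete, here is a minimal sketch that uses a pipe instead of a socket (a pipe is simply easier to set up in a few lines). The function name roundTripThroughPipe is my own; pipe(), write(), read() and close() are the standard POSIX calls. The point is that the descriptors are plain integers and that a non-file resource is read and written through them.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical helper: creates a pipe, sends `message` through it, and
/// returns both descriptors together with the bytes that were read back.
func roundTripThroughPipe(_ message: [UInt8]) -> (readFD: Int32, writeFD: Int32, received: [UInt8]) {
    var fds: [Int32] = [-1, -1]
    let status = pipe(&fds)              // fds[0] is the read end, fds[1] the write end
    precondition(status == 0, "pipe() failed")
    defer { close(fds[0]); close(fds[1]) }

    let length = message.count
    let written = write(fds[1], message, length)
    precondition(written == length, "short write")

    var received = [UInt8](repeating: 0, count: length)
    let count = read(fds[0], &received, length)
    return (fds[0], fds[1], Array(received.prefix(max(count, 0))))
}

let result = roundTripThroughPipe(Array("ping".utf8))
print("read FD: \(result.readFD), write FD: \(result.writeFD)")
print(String(decoding: result.received, as: UTF8.self))
```

The exact descriptor numbers depend on what the process already has open, which is why the sketch prints them rather than assuming particular values.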

Channels

Channels manage the lifetime of a file descriptor (like a socket). They are responsible for processing any inbound or outbound events on that file descriptor. Whenever the EventLoop has an event on that file descriptor, it notifies the Channel associated with that file descriptor. Channels provide a consistent API for accessing file descriptors whether they are sockets or something else under the hood.

EventLoop

An EventLoop can be thought of as a while loop that blocks until there is work ready to be processed. It fires a callback when an event of interest occurs, such as data being received on a socket or a connection being accepted. It typically runs for the entirety of the application's life cycle.

The following pseudo-code from the Apple docs demonstrates at a high level the operation of the EventLoop:

while eventLoop.isOpen {
    /// Block until there is something to process for 1...n Channels
    let readyChannels = blockUntilIoOrTasksAreReady()
    /// Loop through all the Channels
    for channel in readyChannels {
        /// Process IO and / or tasks for the Channel.
        /// This may include things like:
        ///    - accept new connection
        ///    - connect to a remote host
        ///    - read from socket
        ///    - write to socket
        ///    - tasks that were submitted via EventLoop methods
        /// and others.
        processIoAndTasks(channel)
    }
}

If we dig into the SwiftNIO source code, we can see there is an implementation of EventLoop called SelectableEventLoop. It observes events from the system using kqueue on macOS or epoll on Linux. kqueue and epoll are kernel facilities that monitor multiple file descriptors and report which ones are ready for I/O. The EventLoop does not block while performing I/O; it only waits in the selector until at least one descriptor is ready, so as long as there are events to process, its thread is fully utilized. I/O-bound tasks do not need the CPU while they wait, so blocking a thread on them wastes cycles that could go toward other work. However, while this architecture enables non-blocking I/O, CPU-bound tasks can still block the EventLoop.

Selectors

epoll on Linux and kqueue on macOS are examples of selectors or I/O multiplexers. Their job is to monitor certain file descriptors for particular events and notify those interested when the events occur. The EventLoop in SwiftNIO uses these selectors to determine when a socket has data to be read or is ready for writing. For more details on epoll, see this article.
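The idea behind a selector can be sketched with poll(), an older and more portable POSIX multiplexer. To be clear, this is not what SwiftNIO uses (it uses epoll and kqueue as described above); I'm substituting poll() only because the same few lines run on both macOS and Linux. The function names below are my own.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

/// Hypothetical helper: returns the descriptors that can be read without blocking.
func readableDescriptors(among descriptors: [Int32]) -> [Int32] {
    var pollSet = descriptors.map { pollfd(fd: $0, events: Int16(POLLIN), revents: 0) }
    let count = nfds_t(pollSet.count)
    // A timeout of 0 makes poll() report the current state and return immediately.
    let readyCount = poll(&pollSet, count, 0)
    guard readyCount > 0 else { return [] }
    return pollSet.filter { $0.revents & Int16(POLLIN) != 0 }.map { $0.fd }
}

/// Checks a pipe's read end before and after a byte is written to it.
func demonstratePolling() -> (before: [Int32], after: [Int32], readEnd: Int32) {
    var fds: [Int32] = [-1, -1]
    let status = pipe(&fds)
    precondition(status == 0, "pipe() failed")
    defer { close(fds[0]); close(fds[1]) }

    let before = readableDescriptors(among: [fds[0]])   // nothing written yet
    let byte: [UInt8] = [42]
    let written = write(fds[1], byte, 1)
    precondition(written == 1, "write() failed")
    let after = readableDescriptors(among: [fds[0]])    // now there is data to read
    return (before, after, fds[0])
}
```

An event loop does essentially this in a loop, except with a blocking timeout and many descriptors registered at once.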

The following diagram shows the EventLoop accepting two new socket connections. The EventLoop creates the file descriptors and Channels for the sockets and registers the file descriptors with epoll (on Linux) to receive notifications when events occur on them. The EventLoop will block until there is work to be done on its task queue or an event fires on one of the file descriptors it is interested in.

In this example, I show that task 2 gets blocked because task 1 is taking too long to finish. In practice, we must do what we can to ensure this doesn’t happen as this will block the EventLoop from making progress on task 2. In general, we want to try to reduce the amount of time a task is stuck in the blocked state and if necessary, move any long running tasks onto a background thread.

image
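The effect of a slow task can be reproduced with a toy loop. This is a deliberately simplified, single-threaded sketch of the idea and not SwiftNIO's implementation; ToyEventLoop and its methods are names I made up for illustration.

```swift
/// A toy, single-threaded task loop (hypothetical, for illustration only).
final class ToyEventLoop {
    private var tasks: [(name: String, work: () -> Void)] = []
    private(set) var log: [String] = []

    /// Queues a task; nothing runs until `run()` is called.
    func submit(_ name: String, _ work: @escaping () -> Void) {
        tasks.append((name: name, work: work))
    }

    /// Runs queued tasks one at a time, in submission order.
    func run() {
        while !tasks.isEmpty {
            let task = tasks.removeFirst()
            log.append("start \(task.name)")
            task.work()
            log.append("finish \(task.name)")
        }
    }
}

let toyLoop = ToyEventLoop()
toyLoop.submit("task 1") {
    // Simulated CPU-bound work that occupies the loop's only thread.
    var total = 0
    for i in 0..<1_000_000 { total &+= i }
    _ = total
}
toyLoop.submit("task 2") { }
toyLoop.run()
print(toyLoop.log)
```

Task 2 is trivial, but it cannot start until task 1 has finished, because there is only one thread running the loop. That is the situation we want to avoid on a real EventLoop.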

EventLoopGroup

An EventLoopGroup can be thought of as a thread pool. It gathers EventLoops together and distributes work among them. MultiThreadedEventLoopGroup implements EventLoopGroup: it creates a number of threads and places a SelectableEventLoop on each.

// This call initializes a number of threads and places an EventLoop on each one
let group = MultiThreadedEventLoopGroup(numberOfThreads: System.coreCount)

image

Each EventLoop is associated with its own thread and has Channels assigned to it. For the entire lifetime of a Channel, all operations on it are executed on the EventLoop it is assigned to, and therefore on the same thread, which avoids data races on the Channel's state.
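Here is a small sketch of how a group hands out its loops and how work ends up on a loop's own thread. The function name inspectGroup is mine; next(), submit, inEventLoop and syncShutdownGracefully() are SwiftNIO APIs. I'm assuming a two-thread group so the round-robin behaviour is easy to see.

```swift
import NIOCore
import NIOPosix

/// Hypothetical helper that inspects a two-thread group.
func inspectGroup() throws -> (wrapsAround: Bool, distinctLoops: Bool, ranOnLoop: Bool, callerOnLoop: Bool) {
    let group = MultiThreadedEventLoopGroup(numberOfThreads: 2)
    defer { try? group.syncShutdownGracefully() }

    // next() hands out the group's EventLoops in round-robin order.
    let first = group.next()
    let second = group.next()
    let third = group.next()

    // Work submitted to a loop runs on that loop's thread, not on the caller's.
    let ranOnLoop = try first.submit { first.inEventLoop }.wait()
    let callerOnLoop = first.inEventLoop

    return (first === third, first !== second, ranOnLoop, callerOnLoop)
}
```

Note that wait() blocks the calling thread, so it is fine here on the main thread but should never be called from code running on an EventLoop.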

ChannelHandlers

The Channel hands off the event it has received to its registered ChannelPipeline which is just a sequence of ChannelHandlers. The ChannelHandlers process the event in sequence, performing any mutations as they go. This is like a data processing pipeline.

ChannelHandlers can be inbound (handling inbound events like reading from a socket), outbound (handling outbound events like writing to a socket), or both.

The ChannelHandlerContext lets a handler keep track of its position within a ChannelPipeline. It links the handler to the previous and next handlers in the pipeline. It is also how the handler interacts with the outside world, by sending write requests or passing the message on to the subsequent handler.

All code on the ChannelPipeline is executed on the same thread as the EventLoop that received the event so we must make sure we don't write any blocking code.

image

In the above image, the EventLoop uses epoll (or the equivalent on other operating systems) to receive notifications when an event has arrived on a particular file descriptor. Once an event is received, say some data has arrived on the socket, the EventLoop notifies the Channel associated with that file descriptor. The data is then handed off to the Channel’s ChannelPipeline.

The data received is considered an inbound event, so it will be handled by the InboundChannelHandlers (in SwiftNIO, these are types conforming to the ChannelInboundHandler protocol). There are N_i InboundChannelHandlers in the above example. Each handler processes the data, performing some business logic, and passes the result to the next inbound handler by calling fireChannelRead() on the ChannelHandlerContext object.

Writing back to the socket is an outbound event, so it is handled by the OutboundChannelHandlers (types conforming to ChannelOutboundHandler). In the above example, the last InboundChannelHandler calls write() on the context object, which passes the data to the OutboundChannelHandlers. This flow is similar to before: each handler performs some logic and passes the result to the next OutboundChannelHandler using write().

When the final OutboundChannelHandler in the pipeline calls write() on the ChannelHandlerContext, it initiates the write to the socket. It doesn’t write directly to the socket, but to the associated Channel instead. Just calling write() will only queue the data in a buffer. To actually initiate the write, flush() must be called. The Channel then submits the write event to the EventLoop’s task queue. The EventLoop will eventually write the data to the socket when the socket is ready to be written to and the write task has been popped off its task queue.
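The flow above can be tried out without a real socket by using EmbeddedChannel from the NIOEmbedded module, which is SwiftNIO's in-memory Channel intended for testing. The two handlers and the runPipeline function below are hypothetical examples of mine, not part of SwiftNIO.

```swift
import NIOCore
import NIOEmbedded

/// Inbound handler: upper-cases incoming text and passes it to the next handler.
final class UppercaseHandler: ChannelInboundHandler {
    typealias InboundIn = ByteBuffer
    typealias InboundOut = ByteBuffer

    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        let incoming = unwrapInboundIn(data)
        let text = String(buffer: incoming).uppercased()
        let transformed = context.channel.allocator.buffer(string: text)
        context.fireChannelRead(wrapInboundOut(transformed))
    }
}

/// Last inbound handler: sends whatever it receives back out.
final class EchoHandler: ChannelInboundHandler {
    typealias InboundIn = ByteBuffer
    typealias OutboundOut = ByteBuffer

    func channelRead(context: ChannelHandlerContext, data: NIOAny) {
        // write() only queues the data for sending.
        context.write(data, promise: nil)
    }

    func channelReadComplete(context: ChannelHandlerContext) {
        // flush() is what actually pushes the queued writes out.
        context.flush()
    }
}

/// Feeds `text` into the pipeline and returns what comes out the other side.
func runPipeline(with text: String) throws -> String? {
    let channel = EmbeddedChannel()
    try channel.pipeline.syncOperations.addHandlers([UppercaseHandler(), EchoHandler()])

    // Simulates data arriving on the socket.
    try channel.writeInbound(ByteBuffer(string: text))
    // Reads what the pipeline would have written to the socket.
    let echoed = try channel.readOutbound(as: ByteBuffer.self)
    _ = try channel.finish()
    return echoed.map { String(buffer: $0) }
}
```

Calling runPipeline(with: "hello") should give back "HELLO": the first handler transforms the data, the second one writes it, and the flush in channelReadComplete makes the write visible on the outbound side.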

ByteBuffer

The ByteBuffer is a copy-on-write byte buffer that provides an abstraction for reading and writing raw bytes. These buffers are used extensively within the ChannelHandlers in SwiftNIO. get APIs don’t mutate the buffer, i.e. the reader index is not updated. read APIs will move the internal reader index and "consume" the bytes. This lets the system remove bytes that have already been read.

The same ByteBuffer is used for both reading and writing. Its storage is divided into three regions:

  • Discardable bytes: bytes that have already been read and can be reclaimed by the system
  • Readable bytes: bytes that are currently available to be read using the read APIs
  • Writable bytes: bytes beyond the writerIndex. Their contents are undefined, so we should not call the get APIs on them. When you write to the buffer, writing begins at the writerIndex

image

Read data from buffer

var buffer = ... // defined elsewhere
// Lets us know how many bytes are available to be read
let readableBytes = buffer.readableBytes
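// Reads `readableBytes` bytes starting at the current `readerIndex` and then
// increments `readerIndex` by `readableBytes`. Returns `[UInt8]?` (nil if
// fewer bytes are available). `readString(length:)` and `readSlice(length:)`
// work the same way for other result types.
let readData = buffer.readBytes(length: readableBytes)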

Write data to buffer

var buffer = ... // defined elsewhere
// Increments the `writerIndex` by the length of the string in bytes
// Can also use `writeBytes`, `writeInteger` APIs
buffer.writeString("Some string to write")
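To see how the get and read APIs affect the indices, here is a small self-contained sketch. The function name exploreIndices is mine; the ByteBuffer calls are from NIOCore.

```swift
import NIOCore

/// Hypothetical helper that writes, peeks, and reads, then reports the indices.
func exploreIndices() -> (peeked: String?, consumed: String?, readerIndex: Int, writerIndex: Int, readableBytes: Int) {
    var buffer = ByteBufferAllocator().buffer(capacity: 16)

    // Writing advances writerIndex by the number of bytes written (0 -> 5).
    buffer.writeString("hello")

    // get APIs peek at the bytes; readerIndex stays at 0.
    let peeked = buffer.getString(at: buffer.readerIndex, length: 2)

    // read APIs consume the bytes; readerIndex moves from 0 to 2.
    let consumed = buffer.readString(length: 2)

    return (peeked, consumed, buffer.readerIndex, buffer.writerIndex, buffer.readableBytes)
}
```

Both calls return "he", but only the read moves the readerIndex, which leaves three readable bytes ("llo") between readerIndex 2 and writerIndex 5.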

EventLoopFuture & EventLoopPromise

An EventLoopFuture acts as a container for a result that will be provided at a later time, after some asynchronous work has been completed. A function that promises to do work asynchronously can return an EventLoopFuture. The recipient can add callbacks to the result so it can be notified when the underlying result is available.

In essence, an EventLoopFuture is a read-only placeholder the caller can use to attach handlers to access the value when it is eventually defined. An EventLoopPromise is a writable container which is used by the asynchronous code to write the value once it is ready after performing the async work. So, the easiest way to remember the distinction: EventLoopPromises are written to, EventLoopFutures are read from.

func getAsyncData() -> EventLoopFuture<NetworkData> {
    // Create the promise on the current event loop (`eventLoop` is assumed to be
    // available in the enclosing scope). makePromise returns an EventLoopPromise.
    let promise = eventLoop.makePromise(of: NetworkData.self)

    // Do the work on a background queue so the event loop is not blocked
    DispatchQueue.global().async {
        do {
            // doSomeWork() is assumed to be a throwing function defined elsewhere
            let result = try doSomeWork()
            promise.succeed(result)
        } catch {
            promise.fail(error)
        }
    }

    // The function returns immediately. Once the promise is completed, either
    // successfully or with an error, the future is fulfilled with that result.
    return promise.futureResult
}

You can transform the result of the EventLoopFuture using map() which will be called with the eventual result. Use map if the work to be done is synchronous:

let asyncData = getAsyncData()

let processedResult: EventLoopFuture<ProcessedValue> = asyncData.map { data -> ProcessedValue in
    // Do some work
    return syncProcess(data)
}

Use flatMap() if the work to be done is asynchronous. The return value of the flatMap closure is another EventLoopFuture.

let result: EventLoopFuture<AsyncResult> = processedResult.flatMap { result -> EventLoopFuture<AsyncResult> in
    return doAsyncWork(result)
}
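Putting map and flatMap together, here is a small runnable chain. The helper lookUpLength and the function runChain are hypothetical names of mine, standing in for real asynchronous work; the future and promise calls are SwiftNIO APIs.

```swift
import NIOCore
import NIOPosix

/// Hypothetical asynchronous step: completes its promise from the event loop.
func lookUpLength(of word: String, on eventLoop: EventLoop) -> EventLoopFuture<Int> {
    let promise = eventLoop.makePromise(of: Int.self)
    eventLoop.execute { promise.succeed(word.count) }
    return promise.futureResult
}

/// Chains a synchronous and an asynchronous transformation, then waits for the result.
func runChain() throws -> String {
    let group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
    defer { try? group.syncShutdownGracefully() }
    let eventLoop = group.next()

    let description = eventLoop.makeSucceededFuture("swift")
        .map { $0.uppercased() }                            // synchronous: returns a plain value
        .flatMap { lookUpLength(of: $0, on: eventLoop) }    // asynchronous: returns another future
        .map { "length: \($0)" }

    // wait() blocks the calling thread; never call it from an EventLoop.
    return try description.wait()
}
```

The rule of thumb is visible in the closure return types: map closures return values, flatMap closures return futures.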

Async/Await

The NIOCore module provides async/await support for SwiftNIO, which lets us bridge EventLoopFutures to the async/await world. The code can be found here, and it lets us call .get() on any EventLoopFuture and await it instead of chaining flatMap or map calls. An example of this can be found below in the Bootstrap section.
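As a minimal sketch of the bridge, the functions below await a future with .get() instead of chaining. fetchNumber and doubledNumber are hypothetical names of mine; submit and get() are SwiftNIO APIs.

```swift
import NIOCore
import NIOPosix

/// Hypothetical future-based API that produces a value on the event loop.
func fetchNumber(on eventLoop: EventLoop) -> EventLoopFuture<Int> {
    eventLoop.submit { 21 }
}

/// The same result consumed with async/await instead of map or flatMap.
func doubledNumber(on eventLoop: EventLoop) async throws -> Int {
    let number = try await fetchNumber(on: eventLoop).get()
    return number * 2
}
```

From an async context, this can be called with something like try await doubledNumber(on: group.next()), where group is an EventLoopGroup you have already created. Unlike wait(), awaiting get() suspends the task rather than blocking a thread.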

Bootstrap

Bootstrap objects put all of the above together. They streamline the creation of Channels and setting up the ChannelPipelines.

Server Bootstrap

ServerBootstrap is an object that provides an easy way to create ServerSocketChannels which can be used to listen for network connections. Some key points:

  • Calling bind() on the ServerBootstrap instance creates a ServerSocketChannel on an EventLoop from the bootstrap's group. Accepted SocketChannels run on the child EventLoopGroup (childGroup), which is supplied when the bootstrap is created and defaults to the same group
  • The ServerSocketChannel wraps a ServerSocket instance
  • ServerSocket has an accept() method which returns a new Socket once a connection has been established
  • The returned Socket is then wrapped in a SocketChannel that is assigned an EventLoop from the child EventLoopGroup. This is how the ServerSocketChannel creates child channels

Example

let bootstrap = ServerBootstrap(group: eventLoopGroup)
            .serverChannelOption(ChannelOptions.backlog, value: 256)
            .serverChannelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)
            .childChannelInitializer { channel in
                channel.pipeline.addHandlers([
                    BackPressureHandler(),
                    MyChannelHandler()
                ])

            }
            .childChannelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)
            .childChannelOption(ChannelOptions.maxMessagesPerRead, value: 16)
            .childChannelOption(ChannelOptions.recvAllocator, value: AdaptiveRecvByteBufferAllocator())


let serverChannel = try await bootstrap.bind(host: "0.0.0.0", port: 5000).get()

// This suspends until the closeFuture completes (i.e. the server is shut down)
try await serverChannel.closeFuture.get()

I leave it as an exercise for the reader to figure out what the different channel options mean, but the key point of this example is how channel pipelines are set up.

The childChannelInitializer is used to set up a ChannelPipeline. It runs once for every child channel created, which means a new ChannelPipeline is set up for every connection the server accepts (not for every request). ChannelHandlers are attached to the channel's pipeline through the addHandlers() method.
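ServerBootstrap also has an initializer that takes a separate child group, so one set of loops accepts connections while another handles them. Here is a hedged sketch of that setup; the function name startAndStopServer, the group sizes, and the choice of port 0 (which asks the OS for any free port) are my own assumptions for illustration.

```swift
import NIOCore
import NIOPosix

/// Hypothetical helper: binds a server on a free local port, reports the port, and shuts down.
func startAndStopServer() throws -> Int? {
    // One loop accepts connections; a separate pool runs the accepted connections.
    let acceptGroup = MultiThreadedEventLoopGroup(numberOfThreads: 1)
    let connectionGroup = MultiThreadedEventLoopGroup(numberOfThreads: 2)
    defer {
        try? connectionGroup.syncShutdownGracefully()
        try? acceptGroup.syncShutdownGracefully()
    }

    let serverChannel = try ServerBootstrap(group: acceptGroup, childGroup: connectionGroup)
        .childChannelInitializer { channel in
            // Runs once per accepted connection; handlers would be added here.
            channel.eventLoop.makeSucceededVoidFuture()
        }
        .bind(host: "127.0.0.1", port: 0)
        .wait()

    let port = serverChannel.localAddress?.port
    try serverChannel.close().wait()
    return port
}
```

This uses the blocking wait() style instead of async/await only to keep the sketch callable from synchronous code.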

Client Bootstrap

This class is used for bootstrapping TCP client Channels. It is similar to the ServerBootstrap but it doesn’t create a ServerSocketChannel, only a SocketChannel that represents the connection to the server.

Example

let bootstrap = ClientBootstrap(group: eventLoopGroup)
            .channelOption(ChannelOptions.socketOption(.so_reuseaddr), value: 1)
            .channelInitializer { channel in
                channel.pipeline.addHandler(MyChannelHandler())
            }

let channel = try await bootstrap.connect(host: host, port: port).get()

// Do something with the channel, such as write to it
let buffer = channel.allocator.buffer(string: "Hello, server!")
try await channel.writeAndFlush(buffer)

In this example, we're allocating a new ByteBuffer that will be used to hold our message to the server. writeAndFlush will send the data through our ChannelPipeline invoking any OutboundChannelHandlers we have. Any server response will be handled by any InboundChannelHandlers we have.

Where to go from here?

Now that we have a better idea of how the various SwiftNIO pieces work, the world of server-side Swift has really opened up for us. Like I mentioned at the beginning of this article, it's unlikely you'll ever need to use SwiftNIO directly unless you're in the business of building web frameworks, networking libraries and the like, but having this knowledge introduces a greater appreciation for the craft, in my opinion.

So, the next step is really dependent on whether you want to build out web APIs or work with the lower-level networking primitives. My goal with these articles is to demystify the world of server-side Swift and demonstrate its viability outside of mobile development, so I will be covering various use-cases where I think Swift can really shine on the server.