June | 2020 | A p p D e c e n t r a l

It’s perhaps suicidal to talk about the Combine framework on the eve of WWDC2020, but I’m going to take a stab at it. Combine will no doubt change a lot on Monday, but I don’t expect the aspect I wish to discuss to change drastically.

I have worked a lot with asynchronous code over the years. When developing an app, you can usually get away with a bare minimum of it, perhaps just the code needed to query a web server. The rest, from disk store queries to UI updates, is generally synchronous.

But when you develop sync frameworks like Ensembles and LLVS, asynchronicity is everywhere. There is the obvious networking involved, such as querying CloudKit, but there are also long running consolidation tasks that cannot be executed synchronously on the main thread. Performing a “sync” operation in a framework like Ensembles can take a few seconds, but I even know extreme cases that can take hours. In a system like that, just about everything is asynchronous.

When you are developing in a codebase like that, it is of fundamental importance to begin with a strategy for handling asynchronicity. You can do a lot worse than good ol’ completion handlers, but they cop a bit of flack for the dreaded Pyramid of Doom. Don’t let that put you off: completion handlers are extremely easy to understand, because they offer a clear serial path through the code. And you can combat the ‘pyramid’ by using extra functions, or an asynchronous queueing approach, which is what I usually do.

Combine is Apple’s official approach to asynchronous programming. It is still too new for me to consider using in any of my existing products, but I have been working with it from time to time in smaller side projects, in order to learn the framework, and understand how to think in this new paradigm.

I won’t provide a complete introduction to Combine here. There are plenty of resources out there to get started. (Personally, I took a course with the great Daniel Steinberg to kick off my learning.) The nutshell intro is that Combine is a framework for building asynchronous pipelines. You have a number of tasks that need to be done asynchronously, often in order, and sometimes concurrently. Combine gives you tools to build a dynamic structure in memory that will ensure that these tasks execute in the required order, passing the results of one task on to the next, and also offering a means of propagating and handling errors. I like to think about it as building the same sort of operations and structures you have in standard serial Swift code, but dynamically with types. (I’m sure one day programming languages will evolve to do these same things, but for now it is all in the framework space.)

Standard introductions usually start with a URLSession fetching some data from a web service, which then channels the downloaded data into a map operator that transforms it into some model objects, with the pipeline terminating in a sink that presents the data in the UI. This is a completely logical way to get started, but it doesn’t offer much more than your standard completion handler.

I want to skip over that to a much more advanced aspect of pipeline building that took me a long time to really grok. But until you do grok it, you won’t be able to do much more with Combine than replace some of your completion blocks with slightly more elegant code.

The operator I want to discuss is flatMap. A lot of Swift developers still carry the scars of the early days of the language, and twitch at the sight of it. When we first started using flatMap, on arrays and other containers, it was wrongly named; it wasn’t a true flatMap in the functional sense, and that confused things greatly. It has since been renamed to compactMap, but the confusion reigns on. (If you want to know all about the theory behind flatMap, you can better ask Daniel Steinberg — functional programming is one of his passions.)

If you still carry these scars, and are assuming that flatMap will remove nil values from an array, let that go right now! And also, let go of the notion that it is related to map. Maybe that is true in some theoretical way, but if you start from that position, you will just end up confused. Combine’s map transforms the data values that get passed along the pipeline in the same way the map function transforms the elements in an array one at a time. flatMap is a completely different animal: The way I think about flatMap is that it squashes down the upstream part of the pipeline, replacing it with an entirely new upstream pipeline. In other words, flatMap flattens the pipeline itself — the publishers and operators that came before — into some other publisher.

“OK, …”, I hear you say, “… let’s assume I accept this. So what? Why does that matter?” It matters a whole lot. It is this quality that allows you to build powerful asynchronous Combine pipelines. Without it, you can only build static, unchanging pipelines, which only solve the simplest of problems. What flatMap introduces is the possibility of changing the pipeline on-the-fly. It provides a mechanism for branching. And I want to spend the rest of this post looking at a concrete example to demonstrate.

I first realized the importance of flatMap in Combine when I was working with CloudKit. I was writing code to setup a CKRecordZone. The goal was to come up with a fire-and-forget Combine pipeline to query the cloud to see if the zone already existed, and if not, create the zone. This sort of branching is extremely common in every piece of code we write, but how do you setup a Combine pipeline to handle the same branching?

First, let’s take a look at the zone setup function.

var zone: CKRecordZone?

func setupZone() -> AnyCancellable {
    retrieveZone()
        .retry(3)
        .map { CKRecordZone?.init($0) }
        .replaceError(with: nil) // We are giving up
        .receive(on: RunLoop.main)
        .assign(to: \.zone, on: self)
}

When you see code like this, you are seeing Combine at its best. It is simple to read, and yet there are some pretty powerful asynchronous operations going on behind the scenes.

We will look at retrieveZone in a minute. For now just accept that it returns a publisher which provides a record zone, or gives an error.

func retrieveZone() -> AnyPublisher<CKRecordZone, Swift.Error>

In setupZone, we take the publisher returned by retrieveZone, and add a retry in case an error occurred during the retrieval. It will retry the retrieval three times, before giving up. When it does give up, the replaceError will come into play, setting the result to nil. The retrieve simply makes sure we are on the main thread, and the final assign sets a property to the zone value that resulted (…or nil if an error arose).

You may be wondering about the map. I had to think twice about it myself when I saw it again. It’s there because we are eventually assigning to an optional CKRecordZone property; to make all the types in the pipeline match up, we have to the convert the non-optional CKRecordZone that retrieveZone gives us, into an optional. We can do that with a map. It’s a bit silly, but hey, that’s strict static typing for you (…and that is a whole other discussion).

To use the setupZone function, we simply call it to build the pipeline, and retain the result in a property. (One of the most common mistakes you will make when first using Combine is to forget to retain the pipeline, in which case it gets cancelled immediately.)

var setupZoneSubscription: AnyCancellable?

init() {
    setupZoneSubscription = setupZone()
}

So far, so good, but what about retrieveZone, and weren’t we going to talk about flatMap? The moment of truth has arrived.

retrieveZone begins by wrapping a CloudKit zone fetch in a future.

func retrieveZone() -> AnyPublisher<CKRecordZone, Swift.Error> {
    let fetchZone = Future<CKRecordZone?, Swift.Error> { 
        [unowned self] promise in
        self.cloudKitDatabase.fetch(withRecordZoneID: self.zoneId) { 
            zone, error in
            if let error = error {
                promise(.failure(error))
            } else {
                promise(.success(zone))
            }
        }
    }

If you are wrapping older asynchronous APIs for use in Combine, you will nearly always reach for Future. It is basically the Combine equivalent of a completion handler. The future has a result type, in this case CKRecordZone?, and an error type. The block where the CloudKit fetch takes place is passed a promise; when the asynchronous task is complete, the promise is called, passing in either a .failure result, or a .success result, with the corresponding associated value.

If the fetchZone future indicates there is no zone in CloudKit, we want to create one. For that we need another future, again wrapping around a CloudKit operation.

    let createZone = Future<CKRecordZone, Swift.Error> { 
        [unowned self] promise in
        let newZone = CKRecordZone(zoneID: self.zoneId)
        let operation =
            CKModifyRecordZonesOperation(recordZonesToSave: [newZone], 
                recordZoneIDsToDelete: nil)
        operation.modifyRecordZonesCompletionBlock = { 
            savedZones, _, error in
            if let error = error {
                promise(.failure(error))
            } else {
                promise(.success(savedZones!.first!))
            }
        }
        self.cloudKitDatabase.add(operation)
    }

This future returns a non-optional CKRecordZone. The CKModifyRecordZonesOperation will either succeed, leading to a CKRecordZone, or it will fail, and give an error. (The fetchZone future used an optional, because it was valid for it to return a nil value, indicating the zone was not found.)

Now we are ready to put it all together, and more importantly, see how flatMap allows us to adapt dynamically as the pipeline executes.

    return fetchZone
        .flatMap { existingZone in
            if let existingZone = existingZone {
                return Result<CKRecordZone, Swift.Error>
                           .Publisher(.success(existingZone))
                           .eraseToAnyPublisher()
            } else {
                return createZone.eraseToAnyPublisher()
            }
        }.eraseToAnyPublisher()
}

Our returned pipeline begins with the fetchZone. A flatMap intercepts the result, and we add our branching code. If the zone is non-nil, we just put it straight into a Result publisher with a success value and the associated zone. If, on the other hand, the zone from fetchZone was nil, we return the createZone future we built above. Whichever publisher is returned from flatMap will replace whatever came earlier, in this case, the fetchZone.

Of course, you aren’t restricted to just simple if-else statements in a flatMap. You can do whatever you like in there. But whatever it is that you do, you have to end up returning a publisher with the correct result type and error type. That is really the only restriction: the publisher returned can be anything you like, but every return must have matching types.

Hopefully this has delivered an ah-ha moment for you. If you are very new to Combine, it may be a bridge too far, but keep it in the back of your head, because you will need it at some point. The first time you come across some Combine code where you wish you could branch when the pipeline is halfway though executing, think about this post, and most of all, think about flatMap.

WWDC is just around the corner, and most of us will be fully focussed on what happens with SwiftUI. The unveiling last year generated great excitement, but there are still plenty of holes to be filled before it is a solid replacement for UIKit and AppKit (…or even a solid complement).

One of the big questions on my mind is how well the fully declarative approach scales to complex apps. You can already build quite reasonable portal apps for your favorite web service with SwiftUI, which is 90% of the iOS app market, but I am much more interested in the other 10%. Can you build an advanced iOS app or serious macOS app with SwiftUI? Could you develop an app like Keynote or Xcode using SwiftUI?

I spent a lot of time this year — probably more than desirable — working with the framework. At Agenda, we decided to ‘kick the tyres’ by adopting it in a new internal tool that we wanted to develop. The tool was not trivial either: it loaded significant amounts of data, shared it between multiple users with LLVS backed by the CloudKit public store, and allowed collaborative editing and discussion. Effectively, it was like a fully decentralized web app, built with native UI.

Ambitious? Sure. Too ambitious? Undoubtedly. But we learnt a lot before abandoning the idea. And it did actually work, though after a while we concluded ironing out the performance issues was not really worth the effort.

And this brings me back to the present day, and what we should be looking for at WWDC. One thing I learned from our SwiftUI tool is that a fully declarative style presents some performance challenges. By its nature, SwiftUI is based around view tree diff-ing. When something in the model changes, it triggers a binding to fire, which in turn invalidates the view tree. In order to update the screen, SwiftUI now compares the new tree with the old, and determines what changes are needed to the scene that the user sees.

This process is fast for a few standard views, but there are bigger challenges looming. For example, take a large array of values that you want to present in a List. We encountered some serious performance issues when updating data in such a List, as SwiftUI would attempt to diff thousands of View nodes. You could restructure your UI in order to avoid such large lists, but that is a bit of a cop out, because the technologies SwiftUI is designed to supplant can handle such large lists.

The problem is perhaps even more apparent in a text view. We briefly saw a TextView type appear during the SwiftUI beta in 2019, but it didn’t make the final release. My guess — and it is just a guess — is that someone in Apple concluded that it worked fine if you were presenting tweets, but things fell apart when you wanted real-time editing on Chapter 5 of “War and Peace”.

You can bind a TextField to a model String property, and it will work fine, but do you really want to bind the contents of a TextView to an AttributedString the length of a novel? Diff-ing that is not going to cut it.

One way they could take SwiftUI in order to solve this is to introduce a new type of text model, one with internal hierarchy that makes diff-ing extremely efficient. Think about a tree-like structure, so that it could very quickly establish which subtrees were altered, and ignore all other parts of the tree. Effectively, it would be introducing a structure similar to the view tree itself, but inside the model.

This option would see NSTextStorage replaced with a highly diff-able structure, but it may also be possible to introduce a similar approach at a different level. Perhaps the model data would be left unchanged, and a specialized TextStorageBinding type added, which would embellish the model with metadata to make diff-ing efficient. Or could they even introduce View types such as Paragraph which shadow the model text content, and allow exactly the same diff-ing mechanism to be used as in the rest of SwiftUI?

Some of these proposals may seem far fetched, but this is exactly how we structured the text in the Agenda notes app. First, our model has finer granularity than just putting the complete text of a note into a single property. Each note is broken into paragraphs, meaning that a user edit doesn’t require a complete refresh of the whole note. Instead, just the affected paragraphs can be saved, refreshed on screen, and synced to other devices. This approach does require extra bookkeeping — we have to pass around the ‘edited’ paragraph identifiers — but it is this type of data that SwiftUI will no doubt need to introduce if it is ever to host an editable TextView containing the whole of Tolstoy’s classic novel.

At this point, it is anyone’s guess which direction SwiftUI will take. Perhaps Apple figure the days of complex apps are over, and they are not going to do anything to address such concerns. Or perhaps it is just a question of priorities, and we will see how they intend to tackle these scaling issues at WWDC2020. Either way, I will be seeking out the documentation for TextView; that’ll be my canary for where SwiftUI is headed, and whether it will soon evolve into a capable replacement for UIKit/AppKit.

A p p D e c e n t r a l

Swift development in a data decentral universe

Monthly Archives: June 2020

Why grokking flatMap is essential to effective Combine

Why TextView is my SwiftUI canary