Monthly Archives: September 2020

The Danger of Playing it Safe

It’s an old chestnut that Swift developers love to sink their teeth into: Should you force unwrap optionals? Everyone has an opinion about it, and I’m going to state mine as clearly as I can — forcefully unwrap early and often.

Of course, this is just one opinion. We even disagree about it within our two man Agenda team, so there is probably not much hope for the world. But I want to explain when and why I think it is good to force unwrap, and give a recent real world example that demonstrates the cost of not doing it.

First, my reasoning. I think you should force unwrap in any situation where logic tells you that an optional simply cannot be nil. A standard example might look like this.

var count: Int?
count = 0
print("\(count!)")

Could count end up being nil in this case? Perhaps, through some strange memory corruption or solar flare. Do I want my app to continue to run if count is nil? No. To put it bluntly, that is cancer. The app would be in an undefined state, which may well be dangerous. I don’t want it deleting user data or causing any other havoc — it should just crash. I will be informed of the problem, and it can’t do any more damage.

But what is wrong with the softly, softly approach? Shouldn’t we be defensive here? Here is how you could do it.

var count: Int?
count = 0
if let count = count {
    print("\(count)")
} else {
    print("error!")
}

There are several problems with this. Firstly, it is more verbose, making it more difficult to read. Secondly, and even more importantly, it disguises developer intentions. It says to the next developer “count could conceivably be nil, so we will check for it”, even though it was clear to the original developer that it should never be nil in any well-defined circumstance. Lastly, where is the app at after this code runs? Who knows? The state is completely unclear.

So be assertive with forced unwrapping. If there is no case where the optional should ever be nil, force unwrap. But don’t force unwrap in cases where there is a valid situation in which nil could arise. If a foreseen error could lead to nil, that requires explicit handling.

I mentioned that I wanted to finish off with a recent real-world example. When Apple announced iOS 14 would ship, they only gave about a day’s notice, leaving most of us scrambling to get ready. For Agenda, we had done a lot of work on widgets, but we had never had them in a beta, or extensively tested on the device. That was planned for the week or so that we usually have following release of the Xcode GM, but before the new iOS is available. Alas, in 2020, that was a luxury we didn’t have.

So I merged our branches in Git, and uploaded the binary. Downloaded from TestFlight, and — you guessed it — the widgets were broken. It was just showing a placeholder, which I later learned is what you get when a widget crashes.

I dug into the code, and I found that the problem was an incorrect file path. I had been testing with a special development target, and hadn’t properly tested the production target. The faulty code was as follows:

let identifierSuffix = (Bundle.main.object(forInfoDictionaryKey: "AgendaIdentifierSuffix") as? String) ?? ".development"

It is attempting to get a string from the Info.plist, which it will then use in the aforementioned path, but I had forgotten to put the string in the Info.plist. That wasn’t the real crime though: As you can see, in the case of nil being returned, it tries to play it safe by falling back on the development target’s string.

This was bad. It meant the production app had the wrong path, so it crashed, but I didn’t see the crash until very late in the development cycle, even though it was there the whole time. And the cost? I probably wasted an hour or two uploading the binary, then sluthing the problem. “What’s an hour or two?” I hear you ask. Well, Apple were pushing us to submit our new widget feature so that Agenda could be considered for featuring. I don’t think two hours was enough to jeopardize that, but you never know. I’m sure the App Store is a madhouse on release day.

Here is how I think the code should have been written.

let identifierSuffix = (Bundle.main.object(forInfoDictionaryKey: "AgendaIdentifierSuffix") as? String)!

It is simpler, clearly stating that I am assuming the entry is in the Info.plist, and the widget only works if that assumption is fulfilled. If there is no entry in there, the widget is entirely useless, and should crash before causing more damage. And, with this code, it will crash early in development. I would have discovered the issue as soon as I tried to run a debug version, and fixed it long before it hit the App Store.

(Update: It has been pointed out that the last line of code is a bit convoluted. It would read better as

let identifierSuffix = Bundle.main.object(forInfoDictionaryKey: "AgendaIdentifierSuffix") as! String

That’s very true. I wanted to change the original code as little as possible, to stay on message. The point is the same: defensive coding did more damage than good, and an assertive force-unwrap would have been better.)

Replicants All the Way Down

In the previous installment of this ongoing series on Conflict-Free Replicated Data Types (CRDTs) in Swift, I introduced you to tombstones and Lamport counts, in constructing a replicating set. This time we’ll take a similar approach for a more common data type: the dictionary or map.

The code for our ReplicatingDictionary will be quite similar to what we used in the ReplicatingSet, so I won’t focus too much on those repetitious aspects. Instead, we will look in detail at a new trick the dictionary type has up its sleeve: the ability to recursively merge whenever the values it contains are also of a replicating type.

This ability to compose new types out of existing replicating types is one of the strengths of the CRDT approach. We’ll see in a few posts from now that you can construct your app’s data model entirely out of replicating types — the whole model becomes one monstrous replicating type, and merging across devices becomes a single function call. Simple — and completely robust — sync.

Introducing the ReplicatingDictionary

As usual, you can find all the code from the series in this GitHub repo. The core of the dictionary type is very similar to the set type from last time.

public struct ReplicatingDictionary<Key, Value> where Key: Hashable {
    
    fileprivate struct ValueContainer {
        var isDeleted: Bool
        var lamportTimestamp: LamportTimestamp
        var value: Value
        
        init(value: Value, lamportTimestamp: LamportTimestamp) {
            self.isDeleted = false
            self.lamportTimestamp = lamportTimestamp
            self.value = value
        }
    }
    
    private var valueContainersByKey: Dictionary<Key, ValueContainer>
    private var currentTimestamp: LamportTimestamp

The data is stored in the valueContainersByKey dictionary, but instead of just storing the values directly, like in a ‘normal’ dictionary, we wrap the values in a struct with some extra metadata. The metadata in question is a Lamport timestamp, and an isDeleted flag, which gives us tombstone functionality.

Just as for the replicating set type, this metadata allows values modified on different peers to be merged in a deterministic way, so that each peer ends up with the same set of key-value pairs. The mechanics of the dictionary type are almost identical to the set type; the main difference is that the set is just storing values, and the dictionary is storing values and keys. Whereas last time the entries in the set were used as the keys of the storage dictionary, this time we have a dedicated key for that, and the values get boxed with the metadata.

The rest of the type is fairly straightforward. The main challenge is taking into account the tombstones. For example, here is the set block of the subscript func.

        set(newValue) {
            currentTimestamp.tick()
            if let newValue = newValue {
                let container = ValueContainer(value: newValue, lamportTimestamp: currentTimestamp)
                valueContainersByKey[key] = container
            } else if let oldContainer = valueContainersByKey[key] {
                var newContainer = ValueContainer(value: oldContainer.value, lamportTimestamp: currentTimestamp)
                newContainer.isDeleted = true
                valueContainersByKey[key] = newContainer
            }
        }

It begins by incrementing the Lamport timestamp, because the contents of the type are being changed. It then checks if the newValue being set is non-nil; if it is, it creates a new ValueContainer box, with the new timestamp and value, and puts that in the valueContainersByKey storage. If newValue is nil, the function looks to see if there is an old value already stored for the key, because if there is, it needs to be marked deleted. We don’t just remove the entry from valueContainersByKey, we update it with the isDeleted flag set to true.

Recursive Merging

The RecursiveDictionary type has a standard merge method very similar to the one in ReplicatingSet, but it has an extra string to its merging bow: an extra merge method specialized for the case when the value type is also Replicable.

Why would we want this? The standard merge treats values atomically. If two devices update the value for a given key at the same time, later — when they sync up — one of the two values will win. You can never get half of one value, and half of the other — it’s an atomic merge.

But what if it actually would be useful to partially merge the values? Imagine you have written a nice replicating type to store the first name and last name of a contact, and you store values of this type in a ReplicatingDictionary. With the standard merge, if the first name of a contact was edited on one device, and the last name on another device, the eventual result would not be correct. One of the two values would win, and either the first name or the last name would still be wrong. But if the merge func of the values themselves could be called, we could have our replicating name type properly integrate the two edits, and both names would end up correct.

The specialized merged method looks like this.

extension ReplicatingDictionary where Value: Replicable {
   
    public func merged(with other: ReplicatingDictionary) -> ReplicatingDictionary {
        var haveTicked = false
        var resultDictionary = self
        resultDictionary.currentTimestamp = max(self.currentTimestamp, other.currentTimestamp)
        resultDictionary.valueContainersByKey = other.valueContainersByKey.reduce(into: valueContainersByKey) { result, entry in
            let first = result[entry.key]
            let second = entry.value
            if let first = first {
                if !first.isDeleted, !second.isDeleted {
                    // Merge the values
                    if !haveTicked {
                        resultDictionary.currentTimestamp.tick()
                        haveTicked = true
                    }
                    let newValue = first.value.merged(with: second.value)
                    let newValueContainer = ValueContainer(value: newValue, lamportTimestamp: resultDictionary.currentTimestamp)
                    result[entry.key] = newValueContainer
                } else {
                    // At least one deletion, so just revert to atomic merge
                    result[entry.key] = first.lamportTimestamp > second.lamportTimestamp ? first : second
                }
            } else {
                result[entry.key] = second
            }
        }
        return resultDictionary
    }
    
}

The two valueContainersByKey get merged together with a reduce, just like in the other merged functions we have seen, but the details are a bit different. It is very important to take into account the tombstones. If either of the values is deleted, the method just reverts back to the standard merge algorithm, picking the one that is ‘most recent’. If the value exists in both copies of valueContainersByKey, and is not deleted, rather than picking one value or the other, it calls the merged method of the value itself, and stores the result.

                    let newValue = first.value.merged(with: second.value)
                    let newValueContainer = ValueContainer(value: newValue, lamportTimestamp: resultDictionary.currentTimestamp)
                    result[entry.key] = newValueContainer

The rest of the function is just taking care to increase the Lamport timestamp appropriately.

Using Recursive Merge

That’s how the ReplicatingDictionary works, but how do you use it? In particular, how do you use that recursive merging functionality? There’s no better place to find out than in the unit tests.

final class ReplicatingDictionaryTests: XCTestCase {
    
    ...
    
    var dictOfSetsA: ReplicatingDictionary<String, ReplicatingSet<Int>>!
    var dictOfSetsB: ReplicatingDictionary<String, ReplicatingSet<Int>>!

    override func setUp() {
        super.setUp()
        ...
        dictOfSetsA = .init()
        dictOfSetsB = .init()
    }

    ...

    func testNonAtomicMergingOfReplicatingValues() {
        dictOfSetsA["1"] = .init(array: [1,2,3])
        dictOfSetsA["2"] = .init(array: [3,4,5])
        dictOfSetsA["3"] = .init(array: [1])
        
        dictOfSetsB["1"] = .init(array: [1,2,3,4])
        dictOfSetsB["3"] = .init(array: [3,4,5])
        dictOfSetsB["1"] = nil
        dictOfSetsB["3"]!.insert(6)
        
        let dictOfSetC = dictOfSetsA.merged(with: dictOfSetsB)
        let dictOfSetD = dictOfSetsB.merged(with: dictOfSetsA)
        XCTAssertEqual(dictOfSetC["3"]!.values, [1,3,4,5,6])
        XCTAssertNil(dictOfSetC["1"])
        XCTAssertEqual(dictOfSetC["2"]!.values, [3,4,5])
        
        let valuesC = dictOfSetC.dictionary.values.flatMap({ $0.values }).sorted()
        let valuesD = dictOfSetD.dictionary.values.flatMap({ $0.values }).sorted()
        XCTAssertEqual(valuesC, valuesD)
    }

This test sets up two ReplicatingDictionary’s, which contain ReplicatingSet’s as values. We introduced the ReplicatingSet last time; because it also conforms to Replicable, the specialized merged function is used.

Concentrate on the code for the key “3”.

    func testNonAtomicMergingOfReplicatingValues() {
        dictOfSetsA["3"] = .init(array: [1])
        
        ...
        dictOfSetsB["3"] = .init(array: [3,4,5])
        ...
        dictOfSetsB["3"]!.insert(6)
        
        let dictOfSetC = dictOfSetsA.merged(with: dictOfSetsB)
        ...
        XCTAssertEqual(dictOfSetC["3"]!.values, [1,3,4,5,6])

        ...
    }

The set in the first dictionary just contains the value 1. The second dictionary has a set containing 3, 4, and 5, and is updated to also include 6. The two dictionaries get merged to form dictOfSetC, and the test shows that the value for key “3” contains 1, 3, 4, 5 and 6. In other words, it has merged the two sets together, rather than atomically picking the first or the second.

This recursive merging is very powerful, and it doesn’t just work with sets. You can put any replicating data type you like in the dictionary, and it will undergo partial merging. With this tool at our disposal, we can build up complex data models that sync up autonomously.

Next time…

We have one more replicating type to go before we are ready to build our distributed text editing app. Next time we will develop a replicating array — the most complex type we will encounter in this series. It will act as the textual engine of our collaborative editor.