Maintaining Order in Replicating Types

In our continuing series on Conflict-Free Replicated Data Types (CRDTs) in Swift, we have so far introduced a number of types, beginning with the very basic register type, and working up to multi-value containers like sets and dictionaries.

Now we are ready to introduce the most advanced type in the whole series. It is a replicating type that brings order — literally. Unordered collections are one thing, but arrays are at a whole other level. Maintaining order (pun fully intended!) in replicated copies of an array is a challenge, and we will need to pull out all stops to win this one.

Approaches to Capturing Order

During our journey so far, we have been introduced to a variety of tools and techniques that are very useful for building replicated types, and, in particular, for merging them in a deterministic way, leading to eventual consistency when all devices have synced up:

  1. Data elements are uniquely identifiable.
  2. Timestamps can be used to determine when changes were made, and “choose a winner” when merging.
  3. Tombstones allow us to explicitly track removal and deletion.

To handle ordered containers, we will need more. We need a way to relate one element to the next. On a single device, this could be achieved with indexes, like in a standard array, but this doesn’t suffice in a distributed data scenario, because the element of a given index on one device may reference a completely different element on another device. Indexes are fragile, and we need something more supple.

Priorities, Priorities

One way you could introduce order into a type is by simply giving each element a priority. For example, if you use a floating point number as the priority, and sort the elements based on that priority, you have a way to determine order when merging changes made on different devices.

There are a few caveats. For example, you need a deterministic way to handle ‘collisions’, where the same priority is introduced for different elements by edits on two separate devices. But there are ways to deal with this. In fact, we use this exact approach to order the paragraphs in the note taking app Agenda.

We won’t use priorities in this case. The ultimate goal here is to use the replicating array as the basis of a text editor. The priority-based approach doesn’t lead to very pleasant results when merging strings. For example, if you enter “cat” on one device, and “dog” on the other, the merged result is unlikely to be “catdog”, but “cdaotg”. In other words, elements tend to interleave, instead of being logically grouped by edit. We need something a bit more fancy to achieve this.

Not a Linked List, a Linked Tree

In Computer Science 101, you no doubt learned that a linked list can be a useful way to store ordered values. A list has the advantage that it doesn’t use indexes, which we concluded above are quite fragile in a distributed data system.

But a list also doesn’t merge together very easily after being modified separately on two devices. We need something like a list, with the same advantages, but more easily merged together.

A tree is a generalized list. Where elements in a list each have at most one following element (child), an element in a tree may have more than one following elements (children). If we represent our replicating array internally as a tree, it is not so difficult to merge changes, because we can simply combine the children.

Our Tree in Pictures

A picture tells a thousand words, so here come 4000 words worth of pictures to explain how it will work.

Imagine first that we represent our string as a tree of characters. Each character has a pointer to the previous character, which we will call its ‘parent’ or ‘anchor’. Suppose there are two devices, and the user types in the letters “THEAT” on Device A. You get a chain of letters like in the diagram below. Once the text is typed, assume the replicated array is copied over to Device B, and is now ready to be edited simultaneously on the two devices.

The text “THEAT” is typed on Device A, and is transferred to Device B.

Our user — or two separate users if the app is a collaborative editor — now edits the text separately on each device, without an interceding merge. We call this a concurrent edit.

On Device A, the user enters a “C” after the “E”. The parent of the new character is the “E”, so we end up with a small branched tree, as shown to the left of the diagram below.

Concurrent edits are made on Device A and B.

On Device B, the user enters the characters “R” and “E” at the end of the string, again forming a simple tree. (Technically, by adding the characters at the end, the tree on Device B is still a linked list, but that is coincidental.)

We have arrived at the moment of truth. Can we merge these two structures back together and get a sane result for our text editor? All we have to do is combine all the children of each node, to form a new tree.

Trees from Device A and B are merged together.

To get back the string to show to the user, we use a depth-first traversal. When we have a node with more than one child, we choose the newest child, and get all the elements from that subtree. Then, we get all the elements of the subtree of the next child, and so forth. Applying this approach at every node leads to a depth-first traversal.

The diagram above shows that the resulting string is what the user would probably expect. The “C” is in the right place, and the “RE” is at the end. But we aren’t quite finished. What happens when a character is deleted? Do we just remove the corresponding node? No, we can’t do that, because that would also disrupt all the children of the deleted node. They would no longer be attached.

What we will do instead is make use of an old friend: the tombstone. We will keep the deleted node around, and simply mark it as having been deleted. It’s children can stay attached, and be traversed just as if the tombstone was a standard parent node. The only change is that we don’t include the tombstone character in the string we show to the user — we hide it.

Tombstones represent deleted entries, and are hidden in the array.

The diagram above shows how we would handle deleting the last “T”. It is marked as a tombstone — shown here as a black square — and is not included in the final string. Instead of “THECATRE”, the user sees “THECARE”.

Introducing ReplicatingArray

It should be clear that the ReplicatingArray type is going to be more complex than the other types we have developed so far. We have to come up with a tree structure, and functions that can merge copies, and traverse the tree to build the final array. You can find the result, as usual, in the GitHub repository of this series.

The nodes in the tree are represented by the internal ValueContainer type.

public struct ReplicatingArray<T> {
    
    fileprivate struct ValueContainer: Identifiable {
        var anchor: ID?
        var value: T
        var lamportTimestamp: UInt64
        var id: UUID = UUID()
        var isDeleted: Bool = false
        
        ...

This private type is similar to types used for other containers (eg sets). There is a Lamport timestamp, which is needed to order the children of a node when traversing. The id is very important, because that is used as the pointer for referencing the parent — it is how we build up the tree. The isDeleted flag is set to true for a tombstone, and false otherwise.

The one new property is the anchor. This is the id of the parent node. It is optional, because a node at the start of the string has no parent.

The current values are stored in the property valueContainers in the correct order. This makes it easy to access the values on basis of index, as well as to quickly insert or delete an element. Ultimately the type has to work like an array, and this choice makes working with indexes simplest.

    private var valueContainers: Array<ValueContainer> = []
    private var tombstones: Array<ValueContainer> = []

Tombstones are stored separately, in the tombstones array. If we included them in valueContainers, the indexes would not correspond to the indexes in the array of values the user works with.

The choice to split the tombstones out into separate storage makes many methods simple to implement. For example, accessing the values is just a simple map, and the count even simpler.

    public var values: Array<T> { valueContainers.map { $0.value } }
    public var count: UInt64 { UInt64(valueContainers.count) }

Mutating ReplicatingArray

Inserting and appending values is also quite straightforward. A new value container is created, and inserted directly into the valueContainers array at the index passed.

    mutating func insert(_ newValue: T, at index: Int) {
        tick()
        let new = makeValueContainer(withValue: newValue, forInsertingAtIndex: index)
        valueContainers.insert(new, at: index)
    }
    
    mutating func append(_ newValue: T) {
        insert(newValue, at: valueContainers.count)
    }

What is involved in making that new value container? The usual suspects, such as incrementing the Lamport count, but we also need to make sure we setup our tree. We get the existing value container at the index, and use that as the parent. It’s id is set as the anchor of the new value container.

    private func makeValueContainer(withValue value: T, forInsertingAtIndex index: Int) -> ValueContainer {
        let anchor = index > 0 ? valueContainers[index-1].id : nil
        let new = ValueContainer(anchor: anchor, value: value, lamportTimestamp: lamportTimestamp)
        return new
    }

Removing elements is similarly straightforward. We set the isDeleted attribute, move the value container to the tombstones array, and remove it from valueContainers.

    @discardableResult mutating func remove(at index: Int) -> T {
        var tombstone = valueContainers[index]
        tombstone.isDeleted = true
        tombstones.append(tombstone)
        valueContainers.remove(at: index)
        return tombstone.value
    }

Merging

This is all quite simple. So where is the complexity? Most of it is pushed into the merge function. It begins by gathering together tombstones, and making sure there are no duplicates.

    public func merged(with other: Self) -> Self {
        let resultTombstones = (tombstones + other.tombstones).filterDuplicates { $0.id }
        let tombstoneIds = resultTombstones.map { $0.id }

Then we create an unordered set of the non-tombstone containers. In doing this, we filter out any containers that are in the new tombstone array created above.

        var encounteredIds: Set<ValueContainer.ID> = []
        let unorderedValueContainers = (valueContainers + other.valueContainers).filter {
            !tombstoneIds.contains($0.id) && encounteredIds.insert($0.id).inserted
        }

Now that we have a set of undeleted values, we have to order them.

        let resultValueContainersWithTombstones = Self.ordered(fromUnordered: unorderedValueContainers + resultTombstones)
        let resultValueContainers = resultValueContainersWithTombstones.filter { !$0.isDeleted }

And finally we are ready to prepare a result ReplicatingArray, setting the arrays we have built up.

        var result = self
        result.valueContainers = resultValueContainers
        result.tombstones = resultTombstones
        result.lamportTimestamp = Swift.max(self.lamportTimestamp, other.lamportTimestamp)
        return result

Traversing the Tree

Wait a minute…that didn’t look so difficult. Was there some sleight of hand? Well, a bit, yes. The complexity is actually all in that Self.ordered(fromUnordered:) function. It probably looks like it is just doing a sort, but actually, it contains the whole tree traversal mentioned earlier. It goes through the elements in order, traversing the subtrees as described above.

It begins by sorting the value containers, and then builds up a dictionary so that we can quickly find the child containers for a given parent id. This will allow us to traverse into the children through the tree.

    private static func ordered(fromUnordered unordered: [ValueContainer]) -> [ValueContainer] {
        let sorted = unordered.sorted { $0.ordered(beforeSibling: $1) }
        let anchoredByAnchorId: [ValueContainer.ID? : [ValueContainer]] = .init(grouping: sorted) { $0.anchor }

A nested function is called recursively to traverse subtrees, and adds any encountered value containers to the ordered result array.

        var result: [ValueContainer] = []
        
        func addDecendants(of containers: [ValueContainer]) {
            for container in containers {
                result.append(container)
                guard let anchoredToValueContainer = anchoredByAnchorId[container.id] else { continue }
                addDecendants(of: anchoredToValueContainer)
            }
        }

With the addDescendants function in place, all we have to do is call it on the first containers in the array, which are the ones that have an anchor that is nil. addDescendants will recursively call itself to build up the whole result array.

        let roots = anchoredByAnchorId[nil] ?? []
        addDecendants(of: roots)
        return result

Pros and Cons

That’s it. Not a trivial type, but also not impossibly difficult given it is doing some pretty advanced merging of distributed data edits.

The pros of this type should be clear enough from the description above. You can use it to allow concurrent edits to an ordered array, and have the resulting replicas merge in a neat way, particularly suitable for text.

What about the cons? Like other container types, it uses more and more memory as edits are made, because tombstones are never removed, and so accumulate over time. There is also the overhead of having a Lamport count, identifier, and anchor for each entry. In short, the usual suspects.

One More Time…

We are entering the home straight. All of our replicating types are ready to go. All we have to do now is put them together into a real app, and see if it flies. That’s what we will be doing next time — the really fun bit.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s