TLDR; I’m launching a new Swift framework called Forked for working with shared data, both on a single device, and across many.
A few years ago, I was knee-deep developing the collaboration feature of our app Agenda. Agenda is mostly local-first, so it was a challenge. Effectively, Agenda is a decentralized system, and the collaboration feature would allow anyone in a group to edit a shared note at any time — even when they were offline for days. When each copy of a shared note was transferred over to the devices of other members of the group, the result had to be consistent. It would be unacceptable for two people to end up with different versions.
I mentioned that Agenda is a local-first app. That means there is no central server with any understanding of the data model, taking care of conflicts — there is no central truth. Each Agenda client app has to take the data it gets from the cloud, make sense of it, and merge it in such a way that the result is the same as what other devices end up with, even if the data in question is days old.
What I realized back then is that this problem has already been solved very elegantly by a product that is extremely well-known and popular, and right under our noses. It’s called Git.
If you treat each copy of the Agenda data as something akin to the latest commit in the branch of a Git repository, you can use the same approach as Git to merging data. And Git works: developers can go hiking in Alaska, develop code completely offline, come back and merge their changes, and all is good with the world.
Back to Agenda: I decided the solution was a class called BranchedFile. My goal at the time was to create a simplified, embedded version of Git that would operate on a single file. It would support branching, with main and auxiliary branches that could be used to handle concurrent changes to the file, and merging to reach eventual consistency.
The system should not require a complete history of changes, but keep enough versions of the data to facilitate the 3-way merging used in Git. With 3-way merging, you take the two most recent conflicting versions and compare each to a common ancestor. The common ancestor is a copy of the file at the point the two branches diverged.
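To make the 3-way rule concrete, here is a minimal sketch for a single value. It is not code from BranchedFile or Forked; the function name `threeWayMerge` and the `resolve` hook are assumptions made for illustration.

```swift
// Hypothetical helper, not part of BranchedFile or Forked.
// Decides the merged value by comparing both versions to the common ancestor.
func threeWayMerge<T: Equatable>(
    ancestor: T,
    ours: T,
    theirs: T,
    resolve: (T, T) -> T = { ours, _ in ours } // assumed tie-breaker: prefer ours
) -> T {
    if ours == theirs { return ours }       // both sides agree, nothing to merge
    if ours == ancestor { return theirs }   // only their side changed
    if theirs == ancestor { return ours }   // only our side changed
    return resolve(ours, theirs)            // both changed differently: a real conflict
}

let mergedTitle = threeWayMerge(ancestor: "draft", ours: "draft", theirs: "final")
print(mergedTitle) // prints "final"
```

Without the ancestor you could not tell which side had changed, which is why a 2-way comparison has to guess where a 3-way merge does not.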
This approach worked well. I was able to come up with some fairly straightforward rules for which versions of the file I needed to keep around in order to fulfill a merge. All of this is implemented in BranchedFile. Agenda has been using this now for several years whenever two or more people want to collaboratively edit a note.
I hadn’t looked much at that code for several years, but that changed early in 2024. I attended the inaugural Local-First Conf in Berlin. I gave a short talk about Ensembles, the Core Data sync framework I have been developing for more than 10 years, and then I watched the other talks. And I got inspired, and started to wonder: what if I could make my BranchedFile type more generic, and perhaps even turn it into a genuine modeling framework, like a mini version of SwiftData?
I started to dream:
- It should use structs instead of classes
- It should track changes in branches, and have 3-way merging
- It should be possible just to store data with Codable
- Where merging is an afterthought in many data modeling frameworks, this framework should support advanced merging, employing the latest Conflict-free Replicated Data Types (CRDTs)
- It should be possible to sync via iCloud and other cloud services with no change to the model
- It should be useful not only for sync, but even for subsystems within an app on a single device
Today the dream has been fulfilled, at least up to the point of an MVP.
Today, I’m launching Forked, a new approach to working with shared data in Swift. And it has actually worked out better than I expected. I wasn’t even sure it would be possible to build, but with the new Swift macros, I was able to come up with a minimal API that seems to work great. I’m really looking forward to dogfooding it.
Let’s just finish up with a little code, so you can see how simple it turned out to be. Here’s a model from the Forkers sample app, which is essentially a basic contacts app:
    @ForkedModel
    struct Forkers: Codable {
        @Merged(using: .arrayOfIdentifiableMerge)
        var forkers: [Forker] = []
    }

    @ForkedModel
    struct Forker: Identifiable, Codable, Hashable {
        var id: UUID = .init()
        var firstName: String = ""
        var lastName: String = ""
        var company: String = ""
        var birthday: Date?
        var email: String = ""
        var category: ForkerCategory?
        var color: ForkerColor?
        @Merged var balance: Balance = .init()
        @Merged var notes: String = ""
        @Merged var tags: Set<String> = []
    }
What I love the most about Forked models is that they are just simple value types. The @ForkedModel macro doesn’t change the properties at all; it just adds some code in an extension to support 3-way merging. So you can use this on any struct, and the result can do everything your original struct could do, from encoding to JSON, to jumping seamlessly between isolation domains in Swift 6.
The merging that @ForkedModel provides is pretty powerful. It does property-wise merging of structs, and if you attach the @Merged attribute, you can add your own custom merging logic, or use the advanced algorithms built in (like CRDTs).
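As a rough picture of what property-wise merging means, here is a hand-written sketch. It is not the code that @ForkedModel generates; the `Contact` type, the `merged(with:commonAncestor:)` method, and the "prefer self on conflict" rule are all assumptions for illustration.

```swift
// Hypothetical illustration only; the real macro generates its own code.
struct Contact: Equatable {
    var firstName = ""
    var email = ""

    // Merge property by property, using the common ancestor to see who changed what.
    func merged(with other: Contact, commonAncestor: Contact) -> Contact {
        // If our value is unchanged since the ancestor, take the other side's value.
        // Otherwise keep ours (assumed rule: we win when both sides changed).
        func pick<V: Equatable>(_ keyPath: KeyPath<Contact, V>) -> V {
            let mine = self[keyPath: keyPath]
            let theirs = other[keyPath: keyPath]
            let base = commonAncestor[keyPath: keyPath]
            return mine == base ? theirs : mine
        }
        return Contact(firstName: pick(\.firstName), email: pick(\.email))
    }
}
```

The point of the sketch is that edits to different properties never conflict: one device can change the name while another changes the email, and both edits survive the merge.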
To give an example, the notes property above is a String. With @Merged applied, it gets a hidden power: it can resolve conflicts in a more natural way. Rather than discarding one set of changes, or merging to give somewhat arbitrary results, it produces a result a person would likely expect. For example, if we begin with the text “pretty cool”, and change the text to “Pretty Cool” on one device, and to “pretty cool!!!” on another, the merged result will be “Pretty Cool!!!”. Nuff said.
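To show that such a result is reachable with a 3-way merge, here is a naive character-level sketch built on the standard library’s `difference(from:)`. It is not the algorithm Forked uses; `Edits`, `edits(from:to:)` and `mergeText` are hypothetical names, and the approach would duplicate text if both sides made the same insertion.

```swift
// Hypothetical helper types, not part of Forked.
struct Edits {
    var removed: Set<Int> = []              // ancestor offsets deleted in this version
    var inserted: [Int: [Character]] = [:]  // characters inserted before each ancestor offset
}

// Describe how `version` differs from `ancestor`, keyed by ancestor offsets.
func edits(from ancestor: [Character], to version: [Character]) -> Edits {
    var removedOffsets = Set<Int>()
    var insertedOffsets = Set<Int>()
    for change in version.difference(from: ancestor) {
        switch change {
        case .remove(let offset, _, _): removedOffsets.insert(offset)  // offset in ancestor
        case .insert(let offset, _, _): insertedOffsets.insert(offset) // offset in version
        }
    }
    var result = Edits(removed: removedOffsets)
    var i = 0 // position in ancestor
    var j = 0 // position in version
    while j < version.count {
        if insertedOffsets.contains(j) {
            result.inserted[i, default: []].append(version[j])
            j += 1
        } else if removedOffsets.contains(i) {
            i += 1
        } else {
            i += 1 // unchanged character present in both
            j += 1
        }
    }
    return result
}

// Apply both sides' edits to the common ancestor.
func mergeText(ancestor: String, ours: String, theirs: String) -> String {
    let base = Array(ancestor)
    let a = edits(from: base, to: Array(ours))
    let b = edits(from: base, to: Array(theirs))
    var merged: [Character] = []
    for i in 0...base.count {
        merged += a.inserted[i] ?? []
        merged += b.inserted[i] ?? []
        if i < base.count, !a.removed.contains(i), !b.removed.contains(i) {
            merged.append(base[i])
        }
    }
    return String(merged)
}

print(mergeText(ancestor: "pretty cool", ours: "Pretty Cool", theirs: "pretty cool!!!"))
// prints "Pretty Cool!!!"
```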
And this works within your app’s process, between processes (e.g. with sharing extensions), and even between devices via iCloud.
Also worth noting: Forked models work great with Swift 6 structured concurrency, helping to avoid race conditions. When there is a chance you might get a race condition (e.g. due to interleaving in an actor), you can set up a QuickFork, the equivalent of an in-memory Git repo, and use branches (known as forks in Forked) to isolate each set of changes, merging later to get a valid result.
To finish off, consider this: With your model supporting 3-way merging, it knows how to merge itself. All it needs is a conflicting version, and a common ancestor, and Boom! So adding support for CloudKit to your app is next to trivial, and your model can remain completely unchanged. Here is the code that Forkers uses to set up CloudKit sync:
    let forkedModel = try ForkedResource(repository: repo)
    // A bare `.init` needs a type to resolve against; CloudKitExchange is assumed
    // to be the type the original snippet had in context
    let cloudKitExchange = try CloudKitExchange(id: "Forkers",
                                                forkedResource: forkedModel)

    // Listen for incoming changes from CloudKit
    Task {
        for await change in forkedModel.changeStream
            where change.fork == .main &&
                  change.mergingFork == .cloudKit {
            // Update UI...
        }
    }
That’s all of it! We just added sync to our app in less than 10 lines of code. Decentralized systems can sometimes be astounding, and they also work great even when your use case is not technically decentralized!
This looks really interesting! I’ve been thinking about this kind of problem space for a while now, without knowing what to call it or that people were actively developing tools to support it. So now I’m all excited to try making my client/server applications into local-first replicating applications, since the server is logically just acting as a document repository.
But this brings me hard up against two of the other problems that the server addresses: data format versioning and client operating system differences. If I’m storing data locally in, for example, a SQLite database, then I can define migrations to update the stored data when I add a feature that wants to store some new field, and I can put a check early in the “open a document” flow to make sure that the currently running code knows how to read and write the document’s version of the application data. How do I do something similar with a @ForkedModel? I guess it has to happen up above that layer, right?
And then, the question of operating systems. I like to use my laptop to work on stuff, and I’m using a Mac; but let’s say I want to collaborate with my kid, who will only ever use Windows or Android devices. Sure, the document can sync via Dropbox, but the stream of changes…what’s that going to look like?
I get that this is very early, and I’m going to be downloading and poking through all the examples, to see if I can figure out if these questions even make sense. Thank you for putting in this work and for releasing it to the world!
Versioning is a bit of a hassle in any system, and I’m not sure having a server makes it any easier. You still need to check on the client side if you are up to date, and force the user to update if not.
I’ve been making local-first apps for years, and the approach is the same. You embed a version somewhere in the data, or as metadata, and you check it against the latest model version known to the client device. If the data’s version is newer, you tell the user to update, and you switch off sync until they do.
@ForkedModel doesn’t have anything built in yet. I thought about it, and will continue to think about how best to do it. I left it out for now, because it is perhaps better to leave to the app developer. With @ForkedModel, you will generally be using Codable, and people have often already developed ways to version their data.
The simplest approach I would consider is just adding an Int for the version to every struct in the model. After decoding, you could check that, and if it is newer than the latest known version, you throw and don’t try to merge the data.
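A minimal sketch of that idea, using plain Codable and nothing from Forked itself. The names `VersionedNote`, `latestKnownVersion`, `UnsupportedVersion` and `decodeNote(from:)` are made up for the example.

```swift
import Foundation

// Hypothetical model; in a real app each struct in the model would carry a version.
struct VersionedNote: Codable {
    static let latestKnownVersion = 2 // newest format this build understands
    var version: Int = VersionedNote.latestKnownVersion
    var text: String = ""
}

// Thrown when the data was written by a newer build than the one running.
struct UnsupportedVersion: Error {
    let found: Int
}

// Decode, then refuse data from the future so it is never merged.
func decodeNote(from data: Data) throws -> VersionedNote {
    let note = try JSONDecoder().decode(VersionedNote.self, from: data)
    guard note.version <= VersionedNote.latestKnownVersion else {
        throw UnsupportedVersion(found: note.version)
    }
    return note
}
```

The caller would catch `UnsupportedVersion`, ask the user to update, and pause sync until they do. This only works if fields added in newer versions do not break decoding in older builds, so new properties would need to be optional or have defaults.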
The Windows/Android situation is up in the air. I focus on Apple stuff, so that is what I made it for, but the data itself is just Codable compliant. You just get JSON in CloudKit, and you could read that out into an Android app. Perhaps someday a port could be made to Kotlin or whatever.
Interesting. Have you seen Fireproof?
Hadn’t seen it. Looks good. I guess it is using Firebase in the cloud?
There are quite a few local-first stores and formats around. A worthy open source one is Automerge, which also has Swift bindings.
Forked has a slightly different philosophy from most, with its focus on forking and merging on the device. So it isn’t just a sync store; it is more like an embedded Git lite.