Migrations are a tricky business. Consider the Kubernetes model: the client API supports versioned objects, so you can e.g. store an ExternalSecret at version v1beta1 and then read it back as an ExternalSecret v1alpha1. Kubernetes APIs do that by converting resources between versions, upgrading and downgrading them as needed.

Why is this practically useful? A client of your API can keep interacting with it even when the stored objects are of a newer version. Of course, this is only really practical for reads, as mutations through an older version would destroy whatever new data is there.

For tinyvmm I came up with a simpler model (for now): the web API only supports the latest version, and the DB layer transparently upgrades the stored object to it as it’s being read. It took me a bit to structure the APIs properly until I settled on JSON as the lowest common denominator. As such, migrating a VM from v1alpha2 to v1alpha3 looks like:

impl MigratableEntity for VirtualMachine {
    fn migrate(entity: Value) -> Result<Value, Error> {
        let spec = entity.get_existing("spec")?.as_map()?;
        let mut new_spec = spec.clone();

        // v2 has a string field "disk", v3 has an array of disks
        new_spec.insert(
            "disks".into(),
            vec![spec.get_existing("disk")?.clone()].into(),
        );
        new_spec.remove("disk");
        
        // we construct a whole new json and return it back
        Ok(json!({
            "apiVersion": super::v1alpha3::VirtualMachine::API_VERSION,
            "kind": Self::KIND,
            "metadata": entity.get_existing("metadata")?,
            "spec": new_spec,
        }))
    }
}
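
The example leans on a couple of pieces that aren’t shown here: the MigratableEntity trait itself and the get_existing/as_map accessors on JSON values. A minimal sketch of what they could look like (the names and shapes are my reading of the code above, not tinyvmm’s actual definitions):

use serde_json::{Map, Value};

// Illustrative only: a guess at the trait and helpers the migration uses.
#[derive(Debug)]
pub enum Error {
    MissingKey(&'static str),
    NotAnObject,
    FailedMigration {
        kind: &'static str,
        from: String,
        to: Option<String>,
    },
}

pub trait MigratableEntity {
    const KIND: &'static str;
    const API_VERSION: &'static str;

    /// Take the stored JSON at this type's version and produce the JSON
    /// of the next version up (or fail).
    fn migrate(entity: Value) -> Result<Value, Error>;
}

/// Accessors that turn "key is missing" into a proper error.
pub trait ValueExt {
    fn get_existing(&self, key: &'static str) -> Result<&Value, Error>;
    fn as_map(&self) -> Result<&Map<String, Value>, Error>;
}

impl ValueExt for Value {
    fn get_existing(&self, key: &'static str) -> Result<&Value, Error> {
        self.get(key).ok_or(Error::MissingKey(key))
    }

    fn as_map(&self) -> Result<&Map<String, Value>, Error> {
        self.as_object().ok_or(Error::NotAnObject)
    }
}

// The same accessor is handy on nested maps too, e.g. spec.get_existing("disk").
impl ValueExt for Map<String, Value> {
    fn get_existing(&self, key: &'static str) -> Result<&Value, Error> {
        self.get(key).ok_or(Error::MissingKey(key))
    }

    fn as_map(&self) -> Result<&Map<String, Value>, Error> {
        Ok(self)
    }
}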

We receive a serde_json Value and return a Value of the next higher version (or fail). This combines with a migrator function that knows how to migrate any API version known to the binary:

pub fn get_migrator(version: &str) -> Option<fn(Value) -> Result<Value, Error>> {
    match version {
        "v1alpha1" => Some(v1alpha1::VirtualMachine::migrate),
        "v1alpha2" => Some(v1alpha2::VirtualMachine::migrate),
        "v1alpha3" => Some(v1alpha3::VirtualMachine::migrate),
        _ => None,
    }
}
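
To see a single step in action, here’s a hypothetical round through the lookup (the document is made up for illustration, it’s not a real tinyvmm object):

// Hypothetical example: upgrade a stored v1alpha2 document by one step.
// The loop in the DB layer below repeats this until the version matches.
fn upgrade_one_step_example() -> Result<(), Error> {
    let stored = serde_json::json!({
        "apiVersion": "v1alpha2",
        "kind": "VirtualMachine",
        "metadata": { "name": "example" },
        "spec": { "disk": "example.img" }
    });

    let version = stored["apiVersion"].as_str().unwrap();
    let migrate = get_migrator(version).expect("unknown apiVersion");
    let upgraded = migrate(stored)?;

    assert_eq!(upgraded["apiVersion"], "v1alpha3");
    assert!(upgraded["spec"]["disks"].is_array());
    Ok(())
}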

And a simple loop when fetching the entries from the database:

fn get<T>(runtime_dir: &str, name: T) -> Result<Self::Type, super::error::Error>
where
    T: AsRef<str>,
{
    let mut entity = get_entity(runtime_dir, Self::KIND, name.as_ref())?;

    loop {
        let version = entity
            .get_existing("apiVersion")?
            .as_str()
            .ok_or(Error::MissingKey("apiVersion"))?;
        // if the DB version is what we need, then do nothing!
        if version == Self::API_VERSION {
            break;
        }

        // otherwise find a migrator for it, migrate, and see if it fits
        // on the next iteration
        let migrator = Self::migrator(version).ok_or(Error::FailedMigration {
            kind: Self::KIND,
            from: version.to_string(),
            to: None,
        })?;
        entity = migrator(entity)?;
    }

    let unwrapped: Self::Type = serde_json::value::from_value(entity)?;

    Ok(unwrapped)
}
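
The get() above is a method on a storage trait that isn’t shown; roughly, the pieces it relies on look like this (names are my guesses, not the actual tinyvmm API):

// Guessed shape of the storage trait behind get(); the real one will differ.
pub trait Store {
    /// The concrete struct of the latest apiVersion that callers receive.
    type Type: serde::de::DeserializeOwned;

    const KIND: &'static str;
    const API_VERSION: &'static str;

    /// Look up a one-step migration for whatever apiVersion is on disk,
    /// e.g. by delegating to get_migrator() from earlier.
    fn migrator(version: &str) -> Option<fn(Value) -> Result<Value, Error>>;

    fn get<T: AsRef<str>>(runtime_dir: &str, name: T) -> Result<Self::Type, Error>;
}

With that in place, callers only ever deal with the latest type; whatever apiVersion happens to sit in the database never leaks out.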

Combined, this lets me iterate faster, as I don’t need to do schema updates. I just create a new version of the resource (which is much, much easier thanks to procedural macros) and write a JSON-to-JSON mapping from the previous iteration.
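
For a sense of what “a new version of the resource” amounts to on the Rust side, here’s a rough guess at the v1alpha3 shape. Only the disks array is actually implied by the migration above; the rest, including the use of plain serde derives instead of tinyvmm’s own macros, is assumed:

use serde::{Deserialize, Serialize};

// Illustrative guess at the v1alpha3 structs; not tinyvmm's real definitions.
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct VirtualMachine {
    pub api_version: String,
    pub kind: String,
    // Left untyped here; the real project presumably has a metadata struct.
    pub metadata: serde_json::Value,
    pub spec: VirtualMachineSpec,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VirtualMachineSpec {
    /// v1alpha2 carried a single `disk` string; v1alpha3 carries a list.
    pub disks: Vec<String>,
}

A new version then only needs a struct like this plus a migrate() that maps the previous version’s JSON into it.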