Data Model Evolution with Legacy Type Mapping

The Legacy Type Mapping functionality of MicroStream lets you evolve the data model of your application as the application matures.
In almost all projects, the data model changes over time. You need additional fields, you restructure classes to improve your data structure, and so on.

MicroStream helps you to convert the data that is already stored to your new format. Small changes can be handled automatically. For class renames, you can define the mapping yourself. For a large refactoring of the structure, you need to write a Legacy Type Handler.

This blog is accompanied by a video that showcases the scenarios described in this text: https://youtu.be/yYw-DzUdOHQ

The storage

As you might know, MicroStream stores the data in a completely different format than the Standard Java Serialisation. You can read more about the Serialisation engine in this FooJay article.

Since only data is stored, together with some field identification, MicroStream can perform some transformations and conversions when the data is read from the storage. If it determines that the structure of the class loaded in the JVM doesn’t match the structure recorded in the storage, it tries to find a match itself. It might make a wrong guess, or even fail when classes are renamed. In that case, you must provide a refactoring mapping that describes how the old structure relates to the new classes.

Automatic conversion

When the field names of the classes don’t match, MicroStream tries to determine which fields were renamed based on the Levenshtein distance between the old and the new names.

When the ‘probability’ is high enough that the developer renamed the property, MicroStream uses it as a mapping from the old to the new situation. When the ‘probability’ is too low, it treats the field as a new or a deleted property. The mapping is written to the log and applied when the instance is recreated in Java memory.
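
To make the idea of name matching more concrete, here is a minimal sketch in plain Java. It does not use MicroStream's code. The class FieldNameMatcher, the similarity formula and the threshold parameter are assumptions made for this illustration, and the heuristic that MicroStream really applies may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of MicroStream.
final class FieldNameMatcher {

    // Classic dynamic-programming Levenshtein distance between two strings.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity between 0.0 and 1.0, where 1.0 means identical names (assumed formula).
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / max;
    }

    // Maps every old field to the most similar unused new field.
    // An old field without a candidate at or above the threshold maps to null (treated as deleted).
    static Map<String, String> match(List<String> oldFields, List<String> newFields, double threshold) {
        Map<String, String> mapping = new LinkedHashMap<>();
        List<String> unused = new ArrayList<>(newFields);
        for (String oldField : oldFields) {
            String best = null;
            double bestScore = threshold;
            for (String candidate : unused) {
                double score = similarity(oldField, candidate);
                if (score >= bestScore) {
                    best = candidate;
                    bestScore = score;
                }
            }
            if (best != null) {
                unused.remove(best);
            }
            mapping.put(oldField, best);
        }
        return mapping;
    }
}
```

With a threshold of 0.5, an old field firstname is matched to a new field firstName since the distance is only 1, while an old field zip finds no partner in a new field email and is treated as deleted.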

The new structure is only written to the storage when an instance is saved again, for example as a result of calling the store() method. This means that the storage can hold a mixture of data of a certain type in the old and the new structure. MicroStream handles this without problems and performs the mapping whenever it is needed.

An example of such an automatic mapping is demonstrated in this GitHub project: https://github.com/rdebusscher/microstream-legacy-type-mapping/tree/main/automatic

Manual Mapping

The automatic mapping is based on guesses and can make mistakes. As the developer, you know very well how the old situation relates to the new structure. You can define the mapping in the format Foo#field, where Foo is the fully qualified name of the class. The pairs of old and new references can be defined in your code in a Map-like structure.

Or you can define the mapping within a CSV file. The first column indicates the old reference, and the second column the new one. If you want to indicate a removed field, place it in the first column and leave the second column empty. For a new field, the first column must be empty.
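
The column rules above can be sketched with a small parser in plain Java. This is only an illustration and not MicroStream's own CSV reader. The class RefactoringCsv, the package com.example and the choice to pass the delimiter as a parameter are assumptions, since this text does not specify the delimiter of the file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of MicroStream.
final class RefactoringCsv {

    final Map<String, String> renamed = new LinkedHashMap<>(); // old reference -> new reference
    final List<String> removed = new ArrayList<>();            // second column empty
    final List<String> added = new ArrayList<>();              // first column empty

    static RefactoringCsv parse(List<String> lines, String delimiter) {
        RefactoringCsv result = new RefactoringCsv();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // The limit -1 keeps a trailing empty column, which marks a removed field.
            String[] cols = line.split(Pattern.quote(delimiter), -1);
            if (cols.length != 2) {
                throw new IllegalArgumentException("Expected two columns: " + line);
            }
            String oldRef = cols[0].trim();
            String newRef = cols[1].trim();
            if (oldRef.isEmpty() && newRef.isEmpty()) {
                continue; // nothing to map
            } else if (newRef.isEmpty()) {
                result.removed.add(oldRef);
            } else if (oldRef.isEmpty()) {
                result.added.add(newRef);
            } else {
                result.renamed.put(oldRef, newRef);
            }
        }
        return result;
    }
}
```

A line such as com.example.User#name;com.example.User#fullName ends up in renamed, com.example.User#fax; ends up in removed, and ;com.example.User#email ends up in added.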

With this mapping, you tell MicroStream how it must map the fields that you restructured within a class. You can indicate a class rename in the same way, by specifying the old and the new class name.

As mentioned, you can use the CSV format or a Map-based structure in code to specify the mapping. But just as with many other aspects of MicroStream, you can extend its capabilities. By implementing the PersistenceRefactoringMapping interface, you can build your own ideal mapping provider mechanism.

A code example can be found in this repository https://github.com/rdebusscher/microstream-legacy-type-mapping/tree/main/manual.

Legacy Type Handler

The mapping capabilities we have covered until now cannot handle large restructurings of the code. You can rename a class but you cannot split the fields from one class into multiple classes. So it is not possible to extract the address information into a separate instance from the other user information for example.

One option is to perform this conversion yourself in a helper program that you run once. In that program, the mappings described above can be combined with some Java code that restructures the data.

You can also make use of the Legacy Type Handler. A normal Type Handler determines how the data within a Java instance is stored and retrieved from the storage in a binary format. MicroStream has a generic type handler but also some specialised ones for certain classes so that it can handle any class in your application.

The Legacy Type Handler can be used to manipulate the information from the storage when the Java instance is created. In this handler, the developer can perform the kind of refactoring mentioned at the beginning of this section, such as extracting the address information into a separate instance.

This approach is more complex, as you need to use your knowledge of the old class structure (the order of the pointers and primitives and their byte length) to access the data from the old structure.
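
The following sketch in plain Java shows why the order and the byte length of the old fields matter. It does not use MicroStream's Legacy Type Handler API, and the layout is purely hypothetical: three 8-byte references (name, street, city) followed by a 4-byte int (age). The real binary layout of a type is determined by MicroStream, so treat every offset and class name here as an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not part of MicroStream.
final class LegacyLayoutSketch {

    // Assumed offsets of the old, flat User record.
    static final int OFFSET_NAME_REF = 0;    // 8-byte reference
    static final int OFFSET_STREET_REF = 8;  // 8-byte reference
    static final int OFFSET_CITY_REF = 16;   // 8-byte reference
    static final int OFFSET_AGE = 24;        // 4-byte int

    // New structure: the address fields live in their own instance.
    static final class Address {
        final long streetRef;
        final long cityRef;

        Address(long streetRef, long cityRef) {
            this.streetRef = streetRef;
            this.cityRef = cityRef;
        }
    }

    static final class User {
        final long nameRef;
        final int age;
        final Address address;

        User(long nameRef, int age, Address address) {
            this.nameRef = nameRef;
            this.age = age;
            this.address = address;
        }
    }

    // Reads the old flat record and builds the new, split structure.
    static User readLegacyUser(ByteBuffer record) {
        long nameRef = record.getLong(OFFSET_NAME_REF);
        long streetRef = record.getLong(OFFSET_STREET_REF);
        long cityRef = record.getLong(OFFSET_CITY_REF);
        int age = record.getInt(OFFSET_AGE);
        return new User(nameRef, age, new Address(streetRef, cityRef));
    }
}
```

If one of the offsets is wrong by even a single field, every value that follows is read from the wrong bytes. That is the reason this approach needs precise knowledge of the old structure.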

An example can be seen in this repository: https://github.com/rdebusscher/microstream-legacy-type-mapping/tree/main/complex

Legacy Type Mapping

Because MicroStream stores only data and references, and not the class structure of Java itself, some conversion can be applied when data is read into memory to handle changes to your data model.

When MicroStream detects changes, it automatically tries to map the old situation to the new one. When you have renamed or added a field within a class, MicroStream can handle this change without any interaction from the developer.

You can always specify the mapping yourself for the cases where MicroStream is not able to figure it out by itself. You can also use this explicit manual mapping to handle class renames, for example.

A more complex refactoring, like extracting some fields into a separate instance, requires a Legacy Type Handler. It reads the data in the old structure, and you as the developer perform the required changes.
