opjsubst - The OPJ Class Evolution Tool
opjsubst substitutes classes and converts data in a store.
SYNOPSIS
opjsubst [ options | classnames ]
DESCRIPTION
opjsubst is a persistent class evolution tool. It supports evolution of individual classes, class hierarchies, and persistent objects. This means, respectively, that persistent classes can be substituted with their newer versions, that classes can be inserted into or deleted from the middle of an existing persistent hierarchy, and that persistent objects can be converted to conform to the new definitions of their classes.
In the simplest case, you just pass the tool the names of the new versions of classes, which should be in .class files on the CLASSPATH. Given the store and these classes, opjsubst verifies that the classes are substitutable and, if everything is ok, replaces the classes in the store.
The information is passed to opjsubst via the command line; alternatively, the tool can read it from a file, as instructions in a simple change specification language. On the command line, classes that are just to be substituted are specified without any additional options, unlike classes to be inserted, deleted or replaced. For example, the command
opjsubst -store teststore.pjs A B -del D -ins I -rep R1 R2 C
would substitute classes A, B and C, delete class D, insert class I and replace class R1 with class R2.
This implementation of the evolution tool may only be run off-line, that is, only when a store is not in use by the PJama interpreter.
The substitutability verification rules are quite complex; however, there are just a few simple ideas behind them:
If substitutability verification fails at any stage, the whole process stops at once and no substitution at all is performed. This guarantees that the set of classes in the store, as well as the data, always remains in a consistent state. As both opj and opjsubst are early release software, it would be prudent to make a backup copy of your store before running opjsubst.
OPTIONS
LIMITATIONS AND BUGS
DETAILS EXPLAINED
Change Specification Language
This language currently has very few instructions, all of which have equivalent command line options. The example below contains all of them and also illustrates the syntax:
# Example change specification file
substitute A,B,C;
insert I1,I2;
delete D1, D2 mig D2super;
replace R1 with R2, R3 with R4;
substall my.package1, my.package2;
convclass MyConv;
substitute CS sns, CF scf;
Command-line options and one or more change specification files can be mixed in one run of opjsubst; it is the user's responsibility to make sure they don't contradict each other.
Substitutability Verification Rules
opjsubst uses these rules to check whether two versions of the same class, the original version C in the persistent store and the new version C', are substitutable. A new version of a persistent class is a class outside the store that has the same name but a different definition. Replacement of a class can be viewed as a special case of substitution in which the new version is explicitly specified to have a different name.
The actual rules are presented below:
PERSISTENT DATA CONVERSION
Quick Introduction
This section gives a quick explanation of what evolutionary data conversion is and how it works. The details are presented in the following sections. Suppose you have the following class BankAccount
class BankAccount {
    int number;
    long openingDate; // date expressed in ms from the beginning of the epoch
    int balance;
    ...
}
which you wish to evolve, so that its new definition looks like
class BankAccount {
    int number;
    Date openingDate;
    long balance;
    ...
}
In the new class definition, you wish to change the type of the openingDate field from the simple but inconvenient long representation to the more sophisticated, specialised Date class. You also want to change the type of the field balance from int to long. If you simply change the BankAccount.java file, recompile it and run opjsubst, e.g.
>opjsubst -store mystore.pjs BankAccount
the tool will discover that the format of instances of class BankAccount has changed. If you already have some persistent objects of class BankAccount, the tool will generate the following message:
The type of the field openingDate has changed in the substitute class BankAccount
No appropriate conversion method available for class BankAccount.
There are instances of this class in the store.
Do you want to rely on default conversion (d) or cancel (c)?
Please enter d or c :
You now have a choice between default conversion and cancellation of the whole operation. If you choose default conversion, the following will happen. opjsubst will scan the store and, for each instance of BankAccount, create a new instance in the new format. The values of all fields that have the same names and compatible types in both versions of BankAccount, e.g. number and balance, will be copied from the ``old'' instance to the ``new'' one. Note that during default conversion the tool automatically converts the values between fields of the following types, which are not physically (i.e. at the hardware level) compatible:
To solve this problem, you have to use custom conversion. We will describe its simplest form here. The details on more sophisticated forms of custom conversion can be found in the following sections. To convert persistent instances of class BankAccount appropriately, you have to write a conversion method and put it into a conversion class. There are several forms of conversion methods that are recognised by the PJama evolution system, each of which has a predefined name and signature. But there can be only one conversion method for each evolved class in the whole set of conversion classes. In our case the only conversion class can look like:
import java.util.Date;

public class MyConversionClass {
    public static void convertInstance(BankAccount$$_old_ver_ acOld,
                                       BankAccount acNew) {
        // Only openingDate needs explicit handling here
        acNew.openingDate = new Date(acOld.openingDate);
    }
}
The unusual suffix $$_old_ver_ is used to distinguish the old and new versions of the evolved class. This is an entirely valid sequence of symbols to use in a Java identifier; however, it has been chosen so that nobody else is likely to use it for a different purpose. Before conversion starts, the old version of class BankAccount is automatically renamed by the evolution system. After conversion is finished, the old version is invalidated, so the only place where it is possible to operate on two versions of one class and distinguish them this way is in conversion code.
Note that the conversion method initialises only the openingDate field. This is because, when this form of conversion method is used, the evolution system creates a new instance and performs default conversion for it before the method is called. Therefore you don't need to copy the fields with the same names and compatible types from one instance to another yourself.
The $$_old_ver_ suffix tells the compiler that the definition of the corresponding class should be picked up from the persistent store. The ordinary javac compiler does not support this, so a conversion class should be compiled with our own opjc compiler (which is, in fact, a slightly modified javac):
>opjc -store mystore.pjs MyConversionClass.java
After that, invoke opjsubst, specifying the name of the conversion class:
>opjsubst -store mystore.pjs BankAccount -convclass MyConversionClass
This time opjsubst will not ask you any questions, because if there is a conversion class with an appropriate conversion method in it, it is assumed that conversion should be performed. After verification, it will run conversion, i.e. scan the store linearly and, for each instance of BankAccount, create a new one and call the convertInstance method. If a fatal problem, e.g. an uncaught exception, occurs at any moment, the operation will be aborted safely and the store will remain unchanged. If everything is ok, classes will be substituted as usual, and you will get a converted store.
When Conversion is Required
Evolutionary operations on classes (substitution/replacement, insertion, or deletion) may or may not affect the persistent objects. Whether an object is affected depends on whether the modification of its class changes the format of its instances. Defined in an implementation-independent manner, equivalence of instance formats would mean strict equivalence of the number, names, types and physical offsets of the instance (non-static) data fields of both classes. In PJama, however, there are a number of cases where the type of a field can be changed while the format of instances remains the same and compatibility between the current value of the field and the new field's type is guaranteed.
If a modification to a class changes the format of its instances and there are some instances of this class in the store, it is necessary to convert all instances of this class. If a class is nominated for deletion but has some persistent instances, they should be migrated to other classes.
One additional operation available in PJama, which is described here, is not really evolutionary. It is intended for the case where the programmer wants to modify all instances of some class while avoiding the difficulty of looking them up in perhaps complex and deep application data structures. It turns out that a mechanism very similar to evolutionary data conversion can be used for this. Thus this operation, called modification, is also discussed here. Since the mechanism that we exploit looks very similar for all three tasks, in the further discussion we will often use the term ``conversion'' in a wider sense to denote all three kinds of operation: conversion, migration and modification.
Default and Custom Conversion
We have already shown the difference between default conversion and custom conversion and how the former is initiated. Default migration is also applicable when the programmer wants all instances of a class nominated for deletion to migrate to one class. In that case, they should simply specify, together with the class to delete, the class to which they want to migrate the ``orphan'' instances, e.g.
>opjsubst -store mystore.pjs -del D -mig Dsuper
Custom conversion allows the programmer to encode non-trivial conversion operations in standard Java. In PJama, there are two ways of performing custom conversion. The first and simpler one is called bulk conversion.
Bulk custom conversion is intended for cases where all instances of some class should be converted in the same way. To perform it, the programmer should write an appropriate conversion method in Java. One such method can be defined for each evolved class. Conversion methods should have predefined names and signatures so that the evolution system can recognise them and call them correctly. All conversion methods should be placed into one class, called the conversion class. Given this class, the evolution system will scan the store linearly. For each detected instance of an evolved class it will call the appropriate conversion method, which results in the creation of a new, substitute instance. The evolution system remembers all pairs ``old instance - new instance''. After conversion is finished, it will iterate over all persistent objects and replace all references to old instances with references to the corresponding new instances.
Custom conversion can be combined with default conversion, i.e. the programmer can avoid writing the code that copies the contents of fields with the same names and types from ``old'' instances to ``new'' ones. The available conversion methods and the details of combining default and custom conversion are described in the next section.
The task of bulk modification of instances whose class is unchanged, mentioned in the previous section, is not actually an evolutionary problem. However, since it uses Java methods very similar to those used for bulk custom conversion, we describe them in the same section.
In addition to bulk conversion, fully controlled custom conversion is available in PJama evolution technology. It is intended for cases where the programmer wants full control over how and in what order the instances are converted. To run fully controlled custom conversion, the programmer should put a method called conversionMain() into the conversion class. If this method is present, the evolution system will simply call it after verifying the substitutability of the classes, ignoring any other conversion methods. No automatic linear scan of the store is performed. Thus the programmer gets total freedom and full responsibility for the results of such conversion. They should ensure that all instances of the modified class are either converted or made unreachable.
Bulk Conversion
In the following discussion we will denote a class for which conversion is required as C. Csuper means any superclass of C. We first describe the categories of conversion methods that correspond to categories of changes to class C.
Class C Modified
The signatures of the conversion methods recognised by the evolution system when class C is modified are given below. The programmer can choose any one of the available headers and write the appropriate method body:
public static void convertInstance(C$$_old_ver_ c0, C c1)
public static C convertInstance(C$$_old_ver_ c)
public static Csuper convertInstance(C$$_old_ver_ c)
The first form of the convertInstance method is the simplest one, and also the one that allows automatic default conversion. Before the system calls it, it creates an instance c1 of the new version of class C and copies into it from c0 the values of all data fields that have the same name and the same or compatible types in both versions of class C. The new instance is created without invocation of a constructor.
The second and third forms allow the programmer to change the actual class of an instance during conversion. The second form can be used if the programmer wants to change the class of an instance to the new version of C or to a subclass of the latter. The third form permits a replacement class that is a superclass of C or that just has some common superclass with C. Both of these methods should explicitly call the new operator to create a new instance and then explicitly copy all the necessary data from c to the new instance.
The second form of convertInstance is guaranteed to be reference-safe. This means that if there are some data structures in the store that refer to instances of C, for example there is a class CRef that looks like
class CRef {
    C cref;
    ...
}
then after conversion all references from instances of CRef remain valid, although now some or all of them can point to instances of C's subclasses. That's because Java, like any other object-oriented language, allows a class type variable to refer to an instance of a class that is a subclass of the declared class of this variable. However, a reference to an instance of a class that is C's superclass (or extends C's superclass) would be illegal. Therefore the third form of the convertInstance method is unsafe in this sense, and it is the programmer's responsibility to arrange that there are no illegal references after conversion is complete. However, it gives the programmer more freedom in restructuring the persistent data and is justified if, for example, it is necessary to migrate all instances of class C to another class, which has a common superclass with C but is situated on another branch of the class hierarchy tree.
A subcase of class modification is when C is replaced (i.e. modified and renamed). Let us denote C's new name as NewC. Being informed by the programmer, the evolution tool knows that C and NewC are really the old and new names of the same evolved class. Therefore semantically exactly the same set of conversion methods can be used in this case:
public static void convertInstance(C c, NewC nc)
public static NewC convertInstance(C c)
public static C_and_NewC_super convertInstance(C c)
Class C Deleted
If a class C that has some persistent instances is to be deleted from the class hierarchy, its ``orphan'' instances should migrate to other classes. The following methods can be used to perform migration:
public static void migrateInstance(C c0, Csuper_sub c1)
public static Csuper migrateInstance(C c)
Csuper should not be nominated for deletion itself. Csuper_sub is a class which has a common (undeleted) superclass with C. As before, the first form of the migrateInstance method receives an already prepared instance of the replacement class from the evolution system. The values of all fields with the same name and the same or compatible types are already copied from c0 to c1. The second method should call the new operator and copy the necessary data between the instances explicitly. The second form of the migrateInstance method is also reference-unsafe.
Class C Unchanged - Bulk Modification of Instances
The evolution system was extended to handle this non-evolutionary problem because it became clear that the existing mechanism of bulk conversion is well suited to it and only minor modifications to the system are required. So currently, if programmers want to write a short and simple program that modifies all persistent instances of some class, which is itself unchanged, they can use the following methods:
public static void modifyInstance(C c)
public static C modifyInstance(C c)
public static Csuper modifyInstance(C c)
Semi-automatic Copying of Data Between ``Old'' and ``New'' Instances
As mentioned above, conversion methods that get an ``old'' instance as a single argument should create a replacement instance explicitly, and they are fully responsible for copying data from one instance to another. However, even though the classes of these instances are most likely to be different and the class of the replacement instance may change from one invocation of the method to another, there can still be many fields with the same name and compatible types in both instances. To facilitate copying of such fields between instances, the following method is available in the PJama core class org.opj.utilities.PJEvolution:
public static void copyDefaults(Object oldObj, Object newObj)
This method copies the values of all fields that have the same name and the same or compatible types from oldObj to newObj, irrespective of their actual classes. The method uses Java reflection to find all such pairs. To speed up copying, it caches the results (mappings between fields) for every new pair of classes it finds.
Copying and Conversion of Static Variables
PJama supports the persistence of static variables unless they are marked transient. Therefore the evolution system has to provide support for conversion of static variables. When class C is substituted, the values of all its static fields that have the same names and compatible types in both versions are by default copied from the old version of C to the new one. The value of a field f is not copied, however, if f is static final in either the old or the new version of C. This is because ordinary static fields often hold information that is obtained during program execution. Such information is preserved between executions of a persistent program and, similarly, across subsequent evolved versions of the class. In contrast, final fields usually serve as constants. They usually do not accumulate any information during runtime and remain the same in all evolving versions of a persistent class. However, if in some new version of a class such a constant has a different value, it is most likely that this change is intentional and should be propagated into the store. For example, the programmer might want to modify some message that the program prints, or change some normally ``stable'' constants.
The programmer can override the above default rule for the non-final statics of some class using a special command line option of the evolution tool or the similar flag of the tool's change specification language. In that case the static variables of this class will be assigned their values in the usual Java way, i.e. by their static initialisers. Similarly, copying of final static variables between the versions of a class can be enforced.
If simple copying of statics is not enough, a conversion method for statics can be used. This method's signature is:
public static void convertStatics()
If a method with this signature is present in the conversion class, it is called after statics are copied with the default procedure, but before bulk instance conversion. The code in this method can refer to old versions of all classes as usual, i.e. with the help of the $$_old_ver_ suffix.
An Example - an Airline Maintaining a ``Frequent Flyer'' Programme
Having presented all available conversion methods, we will illustrate their use with a simple example. Consider an airline that maintains a database of frequently flying customers. Each person is represented as an instance of class Customer. Every time a customer flies with this airline, miles are credited to their account. When a sufficient number of miles is collected, they can be used to fly somewhere for free. Consider the case where the programmer wants to modify the definition of class Customer to make it work with address data more conveniently:
class Customer { // Old
    String name;
    String address;
    int milesCollected;
    ...
}
class Customer { // Revised
    String name;
    String houseNo, street, city, postcode, country;
    int milesCollected;
    ...
}
The single field address is replaced with several fields: houseNo, street, etc., while other fields remain the same and should retain the same information. In order to convert data, the programmer can write the following conversion class:
class CustomerConverter { // The name can be arbitrary
    public static void convertInstance(Customer$$_old_ver_ oldC, Customer newC) {
        newC.houseNo = extractHouseNo(oldC.address);
        newC.street = extractStreet(oldC.address);
        ...
    }
    ... // Methods extractXXX not shown
}
In the above method, it is enough to deal explicitly only with the fields that have been replaced and added. The values of all others, such as name and milesCollected, are copied from oldC to newC automatically.
Now imagine that the airline decides to divide customers into three categories: Gold Tier, Silver Tier and Bronze Tier, depending on the number of collected miles. Class Customer becomes an abstract superclass of three new classes, and each Customer instance should be transformed into an instance of the appropriate specialised class. In order to do such a transformation, we have to use a conversion method that can create and return an instance of more than one class. The solution may look like:
import org.opj.utilities.PJEvolution;

class CustomerConverter {
    public static Customer convertInstance(Customer$$_old_ver_ oldC) {
        Customer newC;
        // Choose the replacement class from the miles collected so far
        if (oldC.milesCollected > 50000)
            newC = new GoldTierCust();
        else if (oldC.milesCollected > 20000)
            newC = new SilverTierCust();
        else
            newC = new BronzeTierCust();
        PJEvolution.copyDefaults(oldC, newC); // Explicit copying
        return newC;
    }
}
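The reflection-and-cache behaviour described for copyDefaults can be pictured with a small stand-alone sketch. This is not the PJama implementation: the class CopyDefaultsSketch, the OldCustomer and NewCustomer types, and the compatibility rule (assignable types plus int-to-long widening) are all assumptions made for illustration, and only fields declared directly in each class are considered.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in for the idea behind copyDefaults; this is NOT PJama code.
class CopyDefaultsSketch {
    // Hypothetical "old" and "new" shapes of a record, invented for this sketch.
    static class OldCustomer {
        String name = "Ann";
        int milesCollected = 60000;
        String address = "1 Main St";
    }

    static class NewCustomer {
        String name;
        long milesCollected; // widened from int
        String street;       // no counterpart in OldCustomer
    }

    // Cached field mappings, one per (old class, new class) pair.
    private static final Map<String, Map<Field, Field>> CACHE = new HashMap<>();

    // Copies every instance field that has the same name and a compatible type.
    static void copyDefaults(Object oldObj, Object newObj) {
        Class<?> from = oldObj.getClass();
        Class<?> to = newObj.getClass();
        String key = from.getName() + "->" + to.getName();
        Map<Field, Field> mapping = CACHE.computeIfAbsent(key, k -> buildMapping(from, to));
        try {
            for (Map.Entry<Field, Field> e : mapping.entrySet()) {
                // Field.set applies a widening conversion (int to long) where needed.
                e.getValue().set(newObj, e.getKey().get(oldObj));
            }
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Pairs declared fields by name; superclass fields are ignored for brevity.
    private static Map<Field, Field> buildMapping(Class<?> from, Class<?> to) {
        Map<Field, Field> mapping = new LinkedHashMap<>();
        for (Field src : from.getDeclaredFields()) {
            if (Modifier.isStatic(src.getModifiers())) continue;
            Field dst;
            try {
                dst = to.getDeclaredField(src.getName());
            } catch (NoSuchFieldException ex) {
                continue; // no field of that name in the new class
            }
            int mods = dst.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isFinal(mods)) continue;
            if (!compatible(src.getType(), dst.getType())) continue;
            src.setAccessible(true);
            dst.setAccessible(true);
            mapping.put(src, dst);
        }
        return mapping;
    }

    // Assumed compatibility rule: assignable types, plus int-to-long widening.
    private static boolean compatible(Class<?> src, Class<?> dst) {
        return dst.isAssignableFrom(src) || (src == int.class && dst == long.class);
    }

    public static void main(String[] args) {
        OldCustomer o = new OldCustomer();
        NewCustomer n = new NewCustomer();
        copyDefaults(o, n);
        System.out.println(n.name + " " + n.milesCollected + " " + n.street);
    }
}
```

In this sketch the fields name and milesCollected are carried over, while street stays unset because the old class has no field of that name; a conversion method would fill it in explicitly, as the extractXXX helpers do in the earlier example.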
Stability of the ``Old'' Object Graph during Bulk Conversion
An important feature of the bulk conversion mechanism implemented in PJama is the stability of the source (``old'') data. During conversion, newly-created instances are not automatically made reachable from any persistent data structure. ``Old instance - new instance'' reference pairs are kept in a hidden system table instead, and the source object graph remains unaffected. During conversion, reference fields of a freshly created and initialised ``new'' instance point to ``old'' objects. This stability is essential for comprehensible conversion semantics.
When conversion is finished, the persistent store is scanned and all references in persistent objects pointing to ``old'' instances are switched to their ``new'' counterparts, making ``old'' instances unreachable and preserving the identity of converted objects. This has the effect of an instant ``flip'' that transforms the old object graph into the new one. The ``old'' instances will eventually be reclaimed by the garbage collector.
The fact that the old object graph remains stable during conversion and is visible to conversion methods in its entirety gives the programmer free access to all data in the unconverted format at any moment during conversion. This can be used, say, to collect statistics and for similar purposes. The new version of an ``old'' converted object, if it already exists, can be obtained using yet another method declared in the PJEvolution class, called getNewObjectVersion(Object).
Continuing the above example, imagine that there is a field reference of type Customer in both the old and new versions of class Customer. This field points to the person who once referred this customer to the airline. The airline decides that if a customer goes to the Gold Tier, then the person who referred them gets bonus miles:
public static Customer convertInstance(Customer$$_old_ver_ oldC) {
    Customer newC;
    if (oldC.milesCollected > 50000) {
        newC = new GoldTierCust();
        Customer$$_old_ver_ ref = oldC.reference;
        ref.milesCollected += BONUS_MILES;
        // See if ref has already been converted, and if so, update its new version
        Customer refNew = (Customer) PJEvolution.getNewObjectVersion(ref);
        if (refNew != null) // ref has already been converted
            refNew.milesCollected = ref.milesCollected;
    }
    ...
}
Note that during conversion oldC.reference continues to point to an instance of Customer$$_old_ver_ irrespective of whether that particular instance has already been converted or not.
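The two-phase behaviour described in this section, convert first and switch references afterwards, can be modelled in plain Java. The sketch below is a toy in-memory simulation under stated assumptions, not PJama code: the class FlipSketch, the Node type and the convertAll method are hypothetical, an IdentityHashMap stands in for the hidden system table, and only references held by the new instances are redirected, whereas the real system scans all persistent objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of "convert first, flip references afterwards"; NOT PJama code.
class FlipSketch {
    // Hypothetical object type; 'converted' marks instances in the new format.
    static class Node {
        final String name;
        final boolean converted;
        Node ref; // reference to another object in the "store"

        Node(String name, boolean converted) {
            this.name = name;
            this.converted = converted;
        }
    }

    static List<Node> convertAll(List<Node> store) {
        // Stand-in for the hidden system table of old/new pairs, keyed by identity.
        Map<Node, Node> pairs = new IdentityHashMap<>();

        // Phase 1: create new instances. Their references still point to OLD
        // objects, and the old graph itself is never modified.
        for (Node old : store) {
            Node fresh = new Node(old.name, true);
            fresh.ref = old.ref;
            pairs.put(old, fresh);
        }

        // Phase 2: the "flip". Redirect references to the new counterparts.
        List<Node> result = new ArrayList<>();
        for (Node old : store) {
            Node fresh = pairs.get(old);
            if (fresh.ref != null && pairs.containsKey(fresh.ref)) {
                fresh.ref = pairs.get(fresh.ref);
            }
            result.add(fresh);
        }
        return result;
    }

    public static void main(String[] args) {
        Node a = new Node("a", false);
        Node b = new Node("b", false);
        a.ref = b;
        b.ref = a; // a cycle, to show that conversion order does not matter
        List<Node> out = convertAll(Arrays.asList(a, b));
        System.out.println(out.get(0).ref == out.get(1)); // new graph is wired to new objects
        System.out.println(a.ref == b);                   // old graph is left untouched
    }
}
```

Because the old objects are never modified in phase 1, a conversion method in this model could still read any old object at any time, which mirrors the reason oldC.reference keeps pointing to an old-format instance.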
Fully Controlled Conversion
The mechanism of fully controlled conversion can be used if the programmer wants to convert instances of the evolved class in a specific order, or to restructure the data considerably in addition to conversion, or to do something else for which ordinary bulk conversion is not appropriate. To run fully controlled custom conversion, the programmer simply puts a method called conversionMain() into the conversion class. If this method is present, the evolution system will call only this method, ignoring any other conversion methods. No automatic linear scan of the store will be performed, so it is solely the programmer's responsibility to ensure that all instances of the modified class are converted.
If fully controlled conversion is used, the preservation of instance identity described in the previous section cannot be done automatically. The programmer should explicitly inform the system about every ``old instance - replacement instance'' pair. For that, there is a special method in the PJama core class org.opj.utilities.PJEvolution:
public static native void preserveIdentity(Object oldObj, Object newObj);
We will now illustrate the usage of fully controlled conversion with the same airline example. Continuing the story, let us imagine that, in addition to sorting customers into three categories, the programmer also decides to put them into three separate collections instead of one array. Furthermore, they might want to get rid of those instances for which the collected miles have expired. The following method can be added to the conversion class in addition to the already existing Customer convertInstance(Customer$$_old_ver_ oldC) method:
public static void conversionMain() {
    Customer$$_old_ver_ allCustomers[] = getPersistentRoot("allCustomers");
    for (int i = 0; i < allCustomers.length; i++) {
        if (! milesHaveExpired(allCustomers[i])) { // This instance is valid,
            Customer c = convertInstance(allCustomers[i]); // so we convert it
            // Preserve the identity explicitly
            PJEvolution.preserveIdentity(allCustomers[i], c);
            // Put new instance into the appropriate collection
            if (c instanceof GoldTierCust)
                goldC.add(c);
            else if (c instanceof SilverTierCust)
                silverC.add(c);
            else
                bronzeC.add(c);
        }
    }
    makeNewPersistentRoot("bronzeC", bronzeC);
    ...
}
Questions and comments to forest-info@sunlabs.com