------- Table of Contents -------

    - Summary
    - Introduction
    - The serialization algorithm
    - Why serialization?
    - Current status
    - Known problems

-- Summary --

This branch holds a set of changes to ABCL intended to provide save-image functionality, a feature found in most major Lisp implementations, using Java serialization as the underlying technology.

-- Introduction --

Serialization is the capability of a Java runtime environment to write Java objects in binary form to a stream. The objects can be read back later, even by a different instance of the JVM (provided certain conditions are met); this process is called deserialization.

The objects responsible for these processes are instances of the classes ObjectOutputStream and ObjectInputStream. Using a single instance of ObjectOutputStream to serialize an object graph ensures that each object is written to the stream only once, while subsequent references to it are stored as back-references in the stream. This ensures, among other things, that circular structures are supported out of the box, at the cost of some memory consumption due to the hash tables that ObjectOutputStream uses to record known instances.

Serializing an object writes to the stream a descriptor of its class, followed by a representation of the object itself according to the algorithm presented below. On deserialization, the class descriptor is read and a corresponding class is looked up in the currently active ClassLoader. If it is found, an instance of this class is created (using a native method that does not invoke any constructor). The state of the instance is then restored according to the same algorithm.
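
As an illustration of the point about shared references, here is a minimal sketch that round-trips a two-node cycle through one ObjectOutputStream/ObjectInputStream pair. The class names SharedReferenceDemo and Node are made up for this example and are not part of ABCL.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SharedReferenceDemo {
    /** Hypothetical node type, used only for this illustration. */
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node next; // may point back to an earlier node, forming a cycle

        Node(String name) { this.name = name; }
    }

    /** Writes obj through one ObjectOutputStream and reads it back. */
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds the cycle a -> b -> a, round-trips it, and returns the copy of a. */
    static Node roundTripCycle() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a;
        return (Node) roundTrip(a);
    }

    public static void main(String[] args) {
        Node copy = roundTripCycle();
        // The cycle survives: following two links leads back to the same copy.
        System.out.println(copy.next.next == copy);
    }
}
```

Note that Node has no no-argument constructor and none is needed, since deserialization does not invoke the constructor of a serializable class.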

-- The serialization algorithm --

Serialization is quite simple to use, yet very configurable. At the bare minimum, a Java class must implement the marker interface Serializable to be serializable.

The default serialization algorithm saves all non-transient fields of an object: both the fields declared in the object's own class and those declared in its superclasses that implement Serializable. Deserialization simply copies that state back into the deserialized instance. The exception is an object that has already appeared in the stream: each instance is written and read only once, and subsequent references are resolved to it.
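
The effect of the default algorithm on transient fields can be seen in a small sketch like the following. Counter and TransientFieldDemo are hypothetical names chosen for the example. Because no constructor or field initializer runs on deserialization, transient fields come back with their Java default values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientFieldDemo {
    /** Hypothetical class, used only for this illustration. */
    static final class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        int count;                          // saved by the default algorithm
        transient int cache = 7;            // skipped; comes back as 0
        transient String label = "scratch"; // skipped; comes back as null

        Counter(int count) { this.count = count; }
    }

    /** Serializes a Counter to a byte array and reads it back. */
    static Counter roundTrip(Counter original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Counter) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter copy = roundTrip(new Counter(5));
        System.out.println(copy.count + " " + copy.cache + " " + copy.label);
        // prints "5 0 null"
    }
}
```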

Sometimes the basic algorithm is not enough. Serializable classes are allowed to hook into the mechanism by implementing some or all of these four methods:

- void readObject(ObjectInputStream)
- Object readResolve()

- void writeObject(ObjectOutputStream)
- Object writeReplace()

readObject and writeObject can provide custom deserialization/serialization code. Inside these methods you can call defaultReadObject()/defaultWriteObject() on the stream passed as argument to read/write state according to the basic algorithm.

readResolve and writeReplace can designate another object to be read from/written to the stream instead of the current one.
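
The four hooks can be sketched as follows. All class names here (HookDemo, Nothing, Name, Celsius, CelsiusStub) are invented for the example and do not come from ABCL; the sketch only shows the general shape of each hook.

```java
import java.io.*;

final class HookDemo {
    /** readResolve: every deserialized copy is replaced by the one canonical instance. */
    static final class Nothing implements Serializable {
        private static final long serialVersionUID = 1L;
        static final Nothing INSTANCE = new Nothing();
        private Nothing() {}
        private Object readResolve() { return INSTANCE; }
    }

    /** writeObject/readObject: custom code wrapped around the basic algorithm. */
    static final class Name implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        transient int hash; // derived cache, rebuilt in readObject

        Name(String text) { this.text = text; this.hash = text.hashCode(); }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();    // basic algorithm: writes 'text'
            out.writeInt(text.length()); // extra custom data after the defaults
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();      // basic algorithm: restores 'text'
            if (in.readInt() != text.length()) {
                throw new InvalidObjectException("length mismatch");
            }
            hash = text.hashCode();      // recompute the transient field
        }
    }

    /** writeReplace: a stand-in object is written instead of this one. */
    static final class Celsius implements Serializable {
        private static final long serialVersionUID = 1L;
        final double degrees;
        Celsius(double degrees) { this.degrees = degrees; }
        private Object writeReplace() { return new CelsiusStub(Double.toString(degrees)); }
    }

    /** The stand-in turns itself back into a Celsius when it is read. */
    static final class CelsiusStub implements Serializable {
        private static final long serialVersionUID = 1L;
        final String text;
        CelsiusStub(String text) { this.text = text; }
        private Object readResolve() { return new Celsius(Double.parseDouble(text)); }
    }

    /** Serializes obj to a byte array and reads it back. */
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Nothing.INSTANCE) == Nothing.INSTANCE);
        System.out.println(((Name) roundTrip(new Name("car"))).text);
        System.out.println(((Celsius) roundTrip(new Celsius(21.5))).degrees);
    }
}
```

The Celsius/CelsiusStub pair is the same pattern, in miniature, as replacing an object with an externalized form on write and rebuilding it on read.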

-- Why serialization? --

Using serialization to save and restore ABCL state has the following benefits:

- It is built in and quite easy to implement, at least in the common case.
- It handles circularity out of the box.
- It plays well with the fact that most ABCL objects inherit from a single superclass (LispObject).
- Potentially, it allows for finer-grained save/restore functionality, e.g. to send single Lisp objects over the network.

However, it has the following downsides:

- It is purely Java-side, so it can be hard to debug since it cannot leverage ABCL's interactivity.
- It is tricky to integrate with the dynamism of ABCL (e.g. ad-hoc code must be written to serialize/deserialize classes generated at runtime by the compiler).

-- Current status --

Currently the LispObject class, plus some support classes (mostly inner classes), have been marked as Serializable. Here is a summary of the various Lisp object types and how they are handled:

- Java class -- Lisp type ------------- Serialization status --------------
             |            |
             |            |
  LispObject |     T      |   Basic algorithm; special cases in subclasses
             |            |
             |            |
    Symbol   |   symbol   |   Package is transient: only the name of the
             |            |   package is written, and the package is looked up
             |            |   with find-package at deserialization. This allows
             |            |   saving individual symbols should the need arise.
             |            |   readResolve() finds the package and interns the
             |            |   symbol into it, returning the interned instance
             |            |   with state copied from the deserialized instance.
             |            |   This makes the deserialized instance EQ to any
             |            |   preexisting symbol with the same name in the
             |            |   same package.
             |            |
             |            |
     Nil     |    null    |   readResolve() returns Lisp.NIL
             |            |
             |            |
   Function  |  function  |   Compiled functions are instances of runtime-
             |            |   generated classes marked somehow (currently by
             |            |   prepending ABCL_GENERATED_ to their name).
             |            |   Function recognizes such classes and in this
             |            |   case designates as a replacement object an
             |            |   instance of ExternalizedCompiledFunction
             |            |   containing the compiler-generated class as
             |            |   a byte[]. Such an instance is used at
             |            |   deserialization time to reload the class into
             |            |   the classloader, thus reloading the compiled
             |            |   function. Non-compiled functions are serialized
             |            |   using the basic algorithm.
             |            |
             |            |
    Cons     |    cons    |   The basic algorithm seems to suffice.
  HashTable  | hash-table |
  and others | and others |
             |            |
             |            |
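
The Symbol row describes a pattern that can be sketched in isolation. The code below is not ABCL's Symbol or Package implementation: Pkg, Sym, findOrCreate, save and load are hypothetical names, and the toy registry creates a missing package instead of failing as a real lookup might. It only shows the idea of writing the package name, keeping the package itself transient, and interning in readResolve().

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

final class SymbolInternDemo {
    /** Toy package registry; a hypothetical stand-in for Lisp packages. */
    static final class Pkg {
        private static final Map<String, Pkg> ALL = new HashMap<>();
        final String name;
        private final Map<String, Sym> symbols = new HashMap<>();

        private Pkg(String name) { this.name = name; }

        /** Simplification: creates the package when it does not exist yet. */
        static Pkg findOrCreate(String name) {
            return ALL.computeIfAbsent(name, Pkg::new);
        }

        /** Returns the unique symbol with this name, creating it if needed. */
        Sym intern(String symbolName) {
            return symbols.computeIfAbsent(symbolName, n -> new Sym(n, this));
        }
    }

    /** Toy symbol: the package is transient, only its name is written. */
    static final class Sym implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String pkgName; // written in place of the package itself
        transient Pkg pkg;    // never written to the stream
        Object value;         // example of per-symbol state (must be serializable)

        Sym(String name, Pkg pkg) {
            this.name = name;
            this.pkg = pkg;
            this.pkgName = pkg.name;
        }

        /** Interns the symbol and copies the deserialized state onto it. */
        private Object readResolve() {
            Sym interned = Pkg.findOrCreate(pkgName).intern(name);
            interned.value = value;
            return interned;
        }
    }

    static byte[] save(Sym symbol) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(symbol);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Sym load(byte[] image) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(image))) {
            return (Sym) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Sym foo = Pkg.findOrCreate("DEMO").intern("FOO");
        foo.value = "saved";
        byte[] image = save(foo);
        foo.value = "changed";
        Sym loaded = load(image);
        System.out.println(loaded == foo);  // same (EQ) instance as the preexisting symbol
        System.out.println(foo.value);      // state restored from the image
    }
}
```

One consequence of this design, visible in main, is that loading overwrites the state of a preexisting symbol with whatever was saved.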

-- Known problems --

Object{In|Out}putStream have problems working with Lisp streams. For now I have bypassed Lisp streams and used only Java streams to a hardcoded file for testing purposes, but the problem will need to be addressed.

Some objects currently cannot be serialized or deserialized:

- The symbol + cannot be serialized because its plist references a structure
  describing the + method combination, and that structure is not serializable
  (why?)

- Deserialization of compiled functions containing calls to jmethod fails.
  Apparently, when the jmethod compiled function is loaded (instantiated), it
  (or maybe the function jvm::p2-java-jmethod) tries to load
  compiler-pass2-704.cls as a file (not from abcl.jar), which obviously fails.
  (It tries to load it using Lisp.loadCompiledFunction(String).)
  It may be that jmethod is compiled using compile-function-call, which can
  call loadCompiledFunction on the LispObject obtained by instantiating the
  runtime-generated class. It is unclear to me where the file path comes from,
  however.

- Objects with volatile state (streams, threads) have not been addressed.

- Probably many others. Much more experimentation is needed. In particular,
  little or no CLOS stuff has been tested.

2009-03-07, Alessio Stalla