source: trunk/abcl/doc/design/pathnames/jar-pathnames.markdown @ 12607

Last change on this file since 12607 was 12607, checked in by Mark Evenson, 13 years ago

URL pathnames working for OPEN for built-in schemas.

Still need to decide with URI escaping issues, as we currently rely on
the URL Stream handlers to do the right thing. And we still need to
retrofit jar pathname's use of a string to represent a URL.

Updates for URL and jar pathname design documents.

Implemented URL-PATHNAME and JAR-PATHNAME as subtypes of PATHNAME.

Adjusted ABCL-TEST-LISP to use functions provided in
"pathname-test.lisp" in "jar-file.lisp". Added one test for url
pathnames.

Constructor in Java added for a Cons by copying references from the
original Cons.

JARs and JAR entries in ABCL
============================

    Mark Evenson
    Created:  09 JAN 2010
    Modified: 25 MAR 2010

Notes towards an implementation of "jar:" references to be contained
in Common Lisp `PATHNAME`s within ABCL.

Goals
-----

1.  Use Common Lisp pathnames to refer to entries in a jar file.

2.  Use the `'jar:'` scheme as documented in [`java.net.JarURLConnection`][jarURLConnection] for
    namestring representation.

    An entry in a JAR file:

         #p"jar:file:baz.jar!/foo"

    A JAR file:

         #p"jar:file:baz.jar!/"

    A JAR file accessible via URL:

         #p"jar:http://example.org/abcl.jar!/"

    An entry in an ABCL FASL in a URL-accessible JAR file:

         #p"jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls"

[jarUrlConnection]: http://java.sun.com/javase/6/docs/api/java/net/JarURLConnection.html

3.  `MERGE-PATHNAMES` working for jar entries in the following use cases:

        (merge-pathnames "foo-1.cls" "jar:jar:file:baz.jar!/foo.abcl!/foo._")
        ==> "jar:jar:file:baz.jar!/foo.abcl!/foo-1.cls"

        (merge-pathnames "foo-1.cls" "jar:file:foo.abcl!/")
        ==> "jar:file:foo.abcl!/foo-1.cls"

4.  TRUENAME and PROBE-FILE working with "jar:", with TRUENAME
    canonicalizing the JAR reference.

5.  DIRECTORY working within JAR files (and within JAR in JAR).

6.  References of the form `jar:<URL>` work for all strings `<URL>` that
    java.net.URL can resolve.

7.  Make jar pathnames work as a valid argument for OPEN with
    :DIRECTION :INPUT.

8.  Enable the loading of ASDF systems packaged within jar files.

9.  Enable the matching of jar pathnames with PATHNAME-MATCH-P:

        (pathname-match-p
          "jar:file:/a/b/some.jar!/a/system/def.asd"
          "jar:file:/**/*.jar!/**/*.asd")
        ==> t
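
The namestring syntax in goal 2 is the one `java.net.JarURLConnection` already understands for a single level of nesting. The following Java sketch shows how the JDK splits such a reference into the URL of the JAR and the name of the entry. `JarUrlParts` is a hypothetical name introduced for illustration and is not part of ABCL. To my understanding `openConnection()` only constructs the connection object here, so no file or network access takes place.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.URL;

// Illustrative only: JarUrlParts is a hypothetical name, not part of ABCL.
final class JarUrlParts {
    final String jarFileUrl; // URL of the enclosing JAR
    final String entryName;  // entry inside the JAR, or null for the JAR itself

    private JarUrlParts(String jarFileUrl, String entryName) {
        this.jarFileUrl = jarFileUrl;
        this.entryName = entryName;
    }

    // Splits a single-level "jar:" URL using the JDK's own parser.
    // Assumption: the connection is created but never connected.
    static JarUrlParts of(String spec) {
        try {
            JarURLConnection c = (JarURLConnection) new URL(spec).openConnection();
            return new JarUrlParts(c.getJarFileURL().toString(), c.getEntryName());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] examples = {
            "jar:file:baz.jar!/foo",
            "jar:file:baz.jar!/",
            "jar:http://example.org/abcl.jar!/"
        };
        for (String s : examples) {
            JarUrlParts p = of(s);
            System.out.println(s + " -> jar=" + p.jarFileUrl + " entry=" + p.entryName);
        }
    }
}
```

Nested `jar:jar:` references are deliberately left out of this sketch.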

Status
------

As of svn r125??, all the above goals have been implemented and
tested.


Implementation
--------------

A PATHNAME referring to a file within a JAR is known as a JAR PATHNAME.
It can refer either to the entire JAR file or to an entry within the JAR
file.

A JAR PATHNAME always has a DEVICE which is a proper list.  This
distinguishes it from other uses of Pathname.

The DEVICE of a JAR PATHNAME will be a list with either one or two
elements.  The first element of the list can be either a
PATHNAME representing a JAR on the filesystem, or a SimpleString
representing a URL.

A PATHNAME occurring in the list in the DEVICE of a JAR PATHNAME is
known as a DEVICE PATHNAME.

If the first element of the DEVICE is a String, it must be a String that
successfully references a URL via the java.net.URL(String) constructor.

Only the first entry in the DEVICE list may be a String.

Otherwise the DEVICE PATHNAME denotes the PATHNAME of the JAR file.

The DEVICE PATHNAME list of enclosing JARs runs from outermost to
innermost.

The DIRECTORY component of a JAR PATHNAME should be a list starting
with the :ABSOLUTE keyword.  Even though hierarchical entries in jar
files are stored in the form "foo/bar/a.lisp" not "/foo/bar/a.lisp",
the meaning of the DIRECTORY component is better represented as an
absolute path.

A jar Pathname has type JAR-PATHNAME, derived from PATHNAME.
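
The DEVICE rules above can be restated as a small check. This is a minimal Java sketch, not the logic in `Pathname.java`: `DevicePathname` and `JarDeviceRules` are hypothetical stand-ins, and a Lisp list is modelled as a `java.util.List`.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.List;

// Illustrative only: both classes are hypothetical, not taken from ABCL.
final class DevicePathname {
    final String namestring;

    DevicePathname(String namestring) {
        this.namestring = namestring;
    }
}

final class JarDeviceRules {
    // Returns true when the list satisfies the DEVICE rules described above.
    static boolean isValidDevice(List<?> device) {
        if (device == null || device.isEmpty() || device.size() > 2) {
            return false; // one or two elements only
        }
        for (int i = 0; i < device.size(); i++) {
            Object element = device.get(i);
            if (element instanceof DevicePathname) {
                continue; // a DEVICE PATHNAME may appear at either position
            }
            if (i == 0 && element instanceof String && isUrl((String) element)) {
                continue; // only the first element may be a URL string
            }
            return false;
        }
        return true;
    }

    // A string qualifies if the java.net.URL(String) constructor accepts it.
    private static boolean isUrl(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }
}
```

A list holding a URL string followed by a `DevicePathname` passes the check, while the same two elements in reverse order are rejected.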

BNF
---

An incomplete BNF of the syntax of JAR PATHNAME would be:

      JAR-PATHNAME ::= "jar:" URL "!/" [ ENTRY ]

      URL ::= <URL parsable via java.net.URL.URL()>
            | JAR-FILE-PATHNAME

      JAR-FILE-PATHNAME ::= "jar:" "file:" JAR-NAMESTRING "!/" [ ENTRY ]

      JAR-NAMESTRING  ::=  ABSOLUTE-FILE-NAMESTRING
                         | RELATIVE-FILE-NAMESTRING

      ENTRY ::= [ DIRECTORY "/"]* FILE


### Notes

1.  `ABSOLUTE-FILE-NAMESTRING` and `RELATIVE-FILE-NAMESTRING` use the
    local filesystem conventions, meaning that on Windows this could
    contain `\` as the directory separator, while an `ENTRY` always uses `/`
    to separate directories within the jar proper.
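
One way to read the grammar above is that every leading `jar:` announces one `!/` separator further along the string. The Java sketch below splits a namestring on that basis into the enclosing JAR references, outermost first, and the final entry. `JarNamestring` is a hypothetical helper written for this note and does not reflect how ABCL itself parses namestrings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: JarNamestring is a hypothetical helper, not ABCL code.
final class JarNamestring {
    final List<String> jars; // enclosing JAR references, outermost first
    final String entry;      // entry in the innermost JAR; "" means the JAR itself

    private JarNamestring(List<String> jars, String entry) {
        this.jars = Collections.unmodifiableList(jars);
        this.entry = entry;
    }

    static JarNamestring parse(String namestring) {
        // Count the leading "jar:" prefixes; each one pairs with a "!/".
        int depth = 0;
        String rest = namestring;
        while (rest.startsWith("jar:")) {
            depth++;
            rest = rest.substring("jar:".length());
        }
        if (depth == 0) {
            throw new IllegalArgumentException("not a jar namestring: " + namestring);
        }
        // Peel off one JAR reference per prefix, left to right.
        List<String> jars = new ArrayList<>();
        for (int i = 0; i < depth; i++) {
            int bang = rest.indexOf("!/");
            if (bang < 0) {
                throw new IllegalArgumentException("missing \"!/\" in: " + namestring);
            }
            jars.add(rest.substring(0, bang));
            rest = rest.substring(bang + 2);
        }
        return new JarNamestring(jars, rest);
    }
}
```

For `"jar:jar:file:a/baz.jar!/b/c/foo.abcl!/this/that/foo-20.cls"` this yields the references `file:a/baz.jar` and `b/c/foo.abcl`, with `this/that/foo-20.cls` as the entry.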


Use Cases
---------

    // UC1 -- JAR
    pathname: {
      namestring: "jar:file:foo/baz.jar!/"
      device: (
        pathname: {
          device: "jar:file:"
          directory: (:RELATIVE "foo")
          name: "baz"
          type: "jar"
        }
      )
    }


    // UC2 -- JAR entry
    pathname: {
      namestring: "jar:file:baz.jar!/foo.abcl"
      device: ( pathname: {
        device: "jar:file:"
        name: "baz"
        type: "jar"
      })
      name: "foo"
      type: "abcl"
    }


    // UC3 -- JAR file in a JAR entry
    pathname: {
      namestring: "jar:jar:file:baz.jar!/foo.abcl!/"
      device: (
        pathname: {
          name: "baz"
          type: "jar"
        }
        pathname: {
          name: "foo"
          type: "abcl"
        }
      )
    }

    // UC4 -- JAR entry in a JAR entry with directories
    pathname: {
      namestring: "jar:jar:file:a/baz.jar!/b/c/foo.abcl!/this/that/foo-20.cls"
      device: (
        pathname: {
          directory: (:RELATIVE "a")
          name: "baz"
          type: "jar"
        }
        pathname: {
          directory: (:RELATIVE "b" "c")
          name: "foo"
          type: "abcl"
        }
      )
      directory: (:ABSOLUTE "this" "that")
      name: "foo-20"
      type: "cls"
    }

    // UC5 -- JAR Entry in a JAR Entry
    pathname: {
      namestring: "jar:jar:file:a/foo/baz.jar!/c/d/foo.abcl!/a/b/bar-1.cls"
      device: (
        pathname: {
          directory: (:RELATIVE "a" "foo")
          name: "baz"
          type: "jar"
        }
        pathname: {
          directory: (:RELATIVE "c" "d")
          name: "foo"
          type: "abcl"
        }
      )
      directory: (:ABSOLUTE "a" "b")
      name: "bar-1"
      type: "cls"
    }

    // UC6 -- JAR entry in a http: accessible JAR file
    pathname: {
      namestring: "jar:http://example.org/abcl.jar!/org/armedbear/lisp/Version.class"
      device: (
        "http://example.org/abcl.jar"
      )
      directory: (:ABSOLUTE "org" "armedbear" "lisp")
      name: "Version"
      type: "class"
    }

    // UC7 -- JAR Entry in a JAR Entry in a URL accessible JAR FILE
    pathname: {
      namestring: "jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls"
      device: (
        "http://example.org/abcl.jar"
        pathname: {
          name: "foo"
          type: "abcl"
        }
      )
      name: "foo-1"
      type: "cls"
    }

    // UC8 -- JAR in an absolute directory
    pathname: {
      namestring: "jar:file:/a/b/foo.jar!/"
      device: (
        pathname: {
          directory: (:ABSOLUTE "a" "b")
          name: "foo"
          type: "jar"
        }
      )
    }

    // UC9 -- JAR in a relative directory with entry
    pathname: {
      namestring: "jar:file:a/b/foo.jar!/c/d/foo.lisp"
      device: (
        pathname: {
          directory: (:RELATIVE "a" "b")
          name: "foo"
          type: "jar"
        }
      )
      directory: (:ABSOLUTE "c" "d")
      name: "foo"
      type: "lisp"
    }
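
In the use cases that carry an entry, the part after the last `!/` is broken into DIRECTORY, NAME and TYPE. A rough Java sketch of that decomposition follows. `EntryComponents` is a hypothetical helper, and splitting NAME from TYPE at the last dot is an assumption made for illustration rather than a statement of ABCL's actual parsing rules. Entries ending in `/` are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: EntryComponents is a hypothetical helper, not ABCL code.
final class EntryComponents {
    final List<String> directory; // starts with ":ABSOLUTE"; empty if the entry has no directory
    final String name;
    final String type;            // null when the file name has no dot

    private EntryComponents(List<String> directory, String name, String type) {
        this.directory = directory;
        this.name = name;
        this.type = type;
    }

    static EntryComponents of(String entry) {
        // ENTRY always uses '/' as its separator, whatever the platform.
        String[] segments = entry.split("/");
        String file = segments[segments.length - 1];

        List<String> directory = new ArrayList<>();
        if (segments.length > 1) {
            directory.add(":ABSOLUTE");
            directory.addAll(Arrays.asList(segments).subList(0, segments.length - 1));
        }

        // Assumption: NAME and TYPE are separated at the last dot.
        int dot = file.lastIndexOf('.');
        String name = dot > 0 ? file.substring(0, dot) : file;
        String type = dot > 0 ? file.substring(dot + 1) : null;
        return new EntryComponents(directory, name, type);
    }
}
```

Applied to the entry of UC9, `c/d/foo.lisp`, this gives the directory `(:ABSOLUTE "c" "d")`, the name `foo` and the type `lisp`.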


History
-------

Previously, ABCL did have some support for jar pathnames. This support
used the convention that if the device field was itself a
pathname, the device pathname contained the location of the jar.

In the analysis of the desire to treat jar pathnames as valid
locations for `LOAD`, we determined that we needed a "double" pathname
so we could refer to the components of a packed FASL in a jar.  At first
we thought we could support such a syntax by having the device
pathname's device refer to the inner jar.  But with this use of
`PATHNAME`s linked by the `DEVICE` field, we found the problem that UNC
path support also uses the `DEVICE` field, so JARs located on UNC mounts
can't be referenced via `\\`, i.e.

    jar:jar:file:\\server\share\a\b\foo.jar!/this\that!/foo.java

would not have a valid representation.

So instead of having `DEVICE` point to a `PATHNAME`, we decided that the
`DEVICE` shall be a list of `PATHNAME`s, so we would have:

    pathname: {
      namestring: "jar:jar:file:\\server\share\foo.jar!/foo.abcl!/"
      device: (
                pathname: {
                  host: "server"
                  device: "share"
                  name: "foo"
                  type: "jar"
                }
                pathname: {
                  name: "foo"
                  type: "abcl"
                }
              )
    }

Although there is a fair amount of special logic inside `Pathname.java`
itself in the resulting implementation, the logic in `Load.java` seems
to have been considerably simplified.