source: branches/0.22.x/abcl/doc/design/pathnames/jar-pathnames.markdown

Last change on this file was 12613, checked in by Mark Evenson, 15 years ago

Document URL and jar pathname design changes.

JARs and JAR entries in ABCL
============================

    Mark Evenson
    Created:  09 JAN 2010
    Modified: 10 APR 2010

Notes towards an implementation of "jar:" references to be contained
in Common Lisp `PATHNAME`s within ABCL.

Goals
-----

1.  Use Common Lisp pathnames to refer to entries in a jar file.

2.  Use the `'jar:'` scheme as documented in [`java.net.JarURLConnection`][jarURLConnection] for
    namestring representation.

    An entry in a JAR file:

         #p"jar:file:baz.jar!/foo"

    A JAR file:

         #p"jar:file:baz.jar!/"

    A JAR file accessible via URL:

         #p"jar:http://example.org/abcl.jar!/"

    An entry in an ABCL FASL in a URL-accessible JAR file:

         #p"jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls"

[jarUrlConnection]: http://java.sun.com/javase/6/docs/api/java/net/JarURLConnection.html

3.  `MERGE-PATHNAMES` working for jar entries in the following use cases:

        (merge-pathnames "foo-1.cls" "jar:jar:file:baz.jar!/foo.abcl!/foo._")
        ==> "jar:jar:file:baz.jar!/foo.abcl!/foo-1.cls"

        (merge-pathnames "foo-1.cls" "jar:file:foo.abcl!/")
        ==> "jar:file:foo.abcl!/foo-1.cls"

4.  TRUENAME and PROBE-FILE working with "jar:" references, with TRUENAME
    canonicalizing the JAR reference.

5.  DIRECTORY working within JAR files (and within JAR in JAR).

6.  References of the form "jar:<URL>" work for all strings <URL> that
    java.net.URL can resolve.

7.  Make jar pathnames work as a valid argument for OPEN with
    :DIRECTION :INPUT.

8.  Enable the loading of ASDF systems packaged within jar files.

9.  Enable the matching of jar pathnames with PATHNAME-MATCH-P:

        (pathname-match-p
          "jar:file:/a/b/some.jar!/a/system/def.asd"
          "jar:file:/**/*.jar!/**/*.asd")
        ==> t
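
Goal 2 ties the namestring syntax to `java.net.JarURLConnection`. As a rough illustration of what that JDK class reports for a single-level reference, the sketch below asks it for the enclosing jar URL and the entry name. `JarUrlParts` and `split` are names made up for this example and are not part of ABCL; nested `jar:jar:` references are outside what this sketch covers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.JarURLConnection;
import java.net.URI;
import java.net.URL;

// Illustrative only: JarUrlParts is a hypothetical helper, not ABCL code.
class JarUrlParts {
    /** Returns {jar file URL, entry name}; the entry name is null for the jar itself. */
    static String[] split(String spec) {
        try {
            URL url = URI.create(spec).toURL();
            // openConnection() only builds the connection object here;
            // the jar is not opened because connect() is never called.
            JarURLConnection conn = (JarURLConnection) url.openConnection();
            return new String[] { conn.getJarFileURL().toString(), conn.getEntryName() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] specs = { "jar:file:baz.jar!/foo", "jar:file:baz.jar!/" };
        for (String spec : specs) {
            String[] parts = split(spec);
            System.out.println(spec + " -> jar=" + parts[0] + " entry=" + parts[1]);
        }
    }
}
```

For the jar itself (`jar:file:baz.jar!/`) the entry name comes back as `null`, which lines up with the distinction the goals draw between a JAR file and an entry in it.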

Status
------

As of svn r125??, all the above goals have been implemented and
tested.


Implementation
--------------

A PATHNAME referring to a file within a JAR is known as a JAR PATHNAME.
It can either refer to the entire JAR file or an entry within the JAR
file.

A JAR PATHNAME always has a DEVICE which is a proper list.  This
distinguishes it from other uses of PATHNAME.

The DEVICE of a JAR PATHNAME will be a list with either one or two
elements.  The first element of the DEVICE list can be either a
PATHNAME representing a JAR on the filesystem, or a URL PATHNAME.

A PATHNAME occurring in the list in the DEVICE of a JAR PATHNAME is
known as a DEVICE PATHNAME.

Only the first entry in the DEVICE list may be a URL PATHNAME.

Otherwise the DEVICE PATHNAME denotes the PATHNAME of the JAR file.

The DEVICE PATHNAME list of enclosing JARs runs from outermost to
innermost.
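
To make the outermost-to-innermost ordering concrete, here is a minimal sketch of how a namestring could be cut into DEVICE elements plus a trailing entry. It is not the logic in `Pathname.java`: `JarNamestring` and `parse` are hypothetical names, the DEVICE elements are kept as plain strings rather than PATHNAMEs, and the sketch assumes that each leading `jar:` pairs with exactly one `!/` separator. It also does not enforce the one-or-two element limit described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not ABCL's Pathname.java: splits a jar namestring
// into its chain of enclosing jars (outermost first) plus the entry.
class JarNamestring {
    final List<String> device; // e.g. ["file:baz.jar", "foo.abcl"]
    final String entry;        // "" when the namestring denotes the jar itself

    private JarNamestring(List<String> device, String entry) {
        this.device = device;
        this.entry = entry;
    }

    static JarNamestring parse(String namestring) {
        // Each leading "jar:" announces one level of jar nesting.
        int depth = 0;
        String rest = namestring;
        while (rest.startsWith("jar:")) {
            depth++;
            rest = rest.substring("jar:".length());
        }
        if (depth == 0) {
            throw new IllegalArgumentException("not a jar namestring: " + namestring);
        }
        // Every level contributes one "!/" separator; whatever follows the
        // last one is the entry. The positive limit keeps a trailing empty entry.
        String[] pieces = rest.split(Pattern.quote("!/"), depth + 1);
        if (pieces.length != depth + 1) {
            throw new IllegalArgumentException("missing \"!/\" in: " + namestring);
        }
        List<String> device = new ArrayList<>();
        for (int i = 0; i < depth; i++) {
            device.add(pieces[i]);
        }
        return new JarNamestring(device, pieces[depth]);
    }

    public static void main(String[] args) {
        JarNamestring p = parse("jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls");
        System.out.println("device=" + p.device + " entry=" + p.entry);
    }
}
```

Keeping the first element as an unparsed string such as `file:baz.jar` or `http://example.org/abcl.jar` mirrors the rule that only the first DEVICE entry may be a URL PATHNAME; turning it into a real PATHNAME is left out of the sketch.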

The DIRECTORY component of a JAR PATHNAME should be a list starting
with the :ABSOLUTE keyword.  Even though hierarchical entries in jar
files are stored in the form "foo/bar/a.lisp" not "/foo/bar/a.lisp",
the meaning of the DIRECTORY component is better represented as an
absolute path.
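
As a small illustration of that convention, the following sketch maps a jar entry string onto DIRECTORY, NAME, and TYPE components, always starting the directory list with `:ABSOLUTE`. `JarEntryComponents` is a hypothetical class for this example only; splitting the type at the last dot is an assumption made here, not a statement about how ABCL parses names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not ABCL code: maps a jar entry such as
// "this/that/foo-20.cls" onto DIRECTORY, NAME, and TYPE components.
class JarEntryComponents {
    final List<String> directory = new ArrayList<>(); // always starts with ":ABSOLUTE"
    final String name; // null when the entry names no file
    final String type; // null when the file name has no usable dot

    JarEntryComponents(String entry) {
        // Entries are stored without a leading '/', but are treated as absolute.
        // The -1 limit keeps a trailing empty segment for entries ending in '/'.
        String[] segments = entry.split("/", -1);
        directory.add(":ABSOLUTE");
        for (int i = 0; i < segments.length - 1; i++) {
            directory.add(segments[i]);
        }
        String file = segments[segments.length - 1];
        int dot = file.lastIndexOf('.');
        if (file.isEmpty()) {
            name = null;
            type = null;
        } else if (dot > 0) {
            // Assumption: the text after the last dot is the type.
            name = file.substring(0, dot);
            type = file.substring(dot + 1);
        } else {
            name = file;
            type = null;
        }
    }

    public static void main(String[] args) {
        JarEntryComponents c = new JarEntryComponents("this/that/foo-20.cls");
        System.out.println(c.directory + " " + c.name + " " + c.type);
    }
}
```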

A jar Pathname has type JAR-PATHNAME, derived from PATHNAME.


BNF
---

An incomplete BNF of the syntax of JAR PATHNAME would be:

      JAR-PATHNAME ::= "jar:" URL "!/" [ ENTRY ]

      URL ::= <URL parsable via java.net.URL.URL()>
            | JAR-FILE-PATHNAME

      JAR-FILE-PATHNAME ::= "jar:" "file:" JAR-NAMESTRING "!/" [ ENTRY ]

      JAR-NAMESTRING  ::=  ABSOLUTE-FILE-NAMESTRING
                         | RELATIVE-FILE-NAMESTRING

      ENTRY ::= [ DIRECTORY "/"]* FILE


### Notes

1.  `ABSOLUTE-FILE-NAMESTRING` and `RELATIVE-FILE-NAMESTRING` use the
local filesystem conventions, meaning that on Windows they could
contain '\' as the directory separator, while an `ENTRY` always uses '/'
to separate directories within the jar proper.


Use Cases
---------

    // UC1 -- JAR
    pathname: {
      namestring: "jar:file:foo/baz.jar!/"
      device: (
        pathname: {
          device: "jar:file:"
          directory: (:RELATIVE "foo")
          name: "baz"
          type: "jar"
        }
      )
    }


    // UC2 -- JAR entry
    pathname: {
      namestring: "jar:file:baz.jar!/foo.abcl"
      device: ( pathname: {
        device: "jar:file:"
        name: "baz"
        type: "jar"
      })
      name: "foo"
      type: "abcl"
    }


    // UC3 -- JAR file in a JAR entry
    pathname: {
      namestring: "jar:jar:file:baz.jar!/foo.abcl!/"
      device: (
        pathname: {
          name: "baz"
          type: "jar"
        }
        pathname: {
          name: "foo"
          type: "abcl"
        }
      )
    }

    // UC4 -- JAR entry in a JAR entry with directories
    pathname: {
      namestring: "jar:jar:file:a/baz.jar!/b/c/foo.abcl!/this/that/foo-20.cls"
      device: (
        pathname: {
          directory: (:RELATIVE "a")
          name: "baz"
          type: "jar"
        }
        pathname: {
          directory: (:RELATIVE "b" "c")
          name: "foo"
          type: "abcl"
        }
      )
      directory: (:ABSOLUTE "this" "that")
      name: "foo-20"
      type: "cls"
    }

    // UC5 -- JAR Entry in a JAR Entry
    pathname: {
      namestring: "jar:jar:file:a/foo/baz.jar!/c/d/foo.abcl!/a/b/bar-1.cls"
      device: (
        pathname: {
          directory: (:RELATIVE "a" "foo")
          name: "baz"
          type: "jar"
        }
        pathname: {
          directory: (:RELATIVE "c" "d")
          name: "foo"
          type: "abcl"
        }
      )
      directory: (:ABSOLUTE "a" "b")
      name: "bar-1"
      type: "cls"
    }

    // UC6 -- JAR entry in a http: accessible JAR file
    pathname: {
      namestring: "jar:http://example.org/abcl.jar!/org/armedbear/lisp/Version.class"
      device: (
        pathname: {
          namestring: "http://example.org/abcl.jar"
        }
      )
      directory: (:ABSOLUTE "org" "armedbear" "lisp")
      name: "Version"
      type: "class"
    }

    // UC7 -- JAR Entry in a JAR Entry in a URL accessible JAR FILE
    pathname: {
      namestring: "jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls"
      device: (
        pathname: {
          namestring: "http://example.org/abcl.jar"
        }
        pathname: {
          name: "foo"
          type: "abcl"
        }
      )
      name: "foo-1"
      type: "cls"
    }

    // UC8 -- JAR in an absolute directory
    pathname: {
      namestring: "jar:file:/a/b/foo.jar!/"
      device: (
        pathname: {
          directory: (:ABSOLUTE "a" "b")
          name: "foo"
          type: "jar"
        }
      )
    }

    // UC9 -- JAR in a relative directory with entry
    pathname: {
      namestring: "jar:file:a/b/foo.jar!/c/d/foo.lisp"
      device: (
        pathname: {
          directory: (:RELATIVE "a" "b")
          name: "foo"
          type: "jar"
        }
      )
      directory: (:ABSOLUTE "c" "d")
      name: "foo"
      type: "lisp"
    }


History
-------

Previously, ABCL did have some support for jar pathnames. This support
used the convention that if the device field was itself a
pathname, the device pathname contained the location of the jar.

In the analysis of the desire to treat jar pathnames as valid
locations for `LOAD`, we determined that we needed a "double" pathname
so we could refer to the components of a packed FASL in a jar.  At first
we thought we could support such a syntax by having the device
pathname's device refer to the inner jar.  But with this use of
`PATHNAME`s linked by the `DEVICE` field, we found the problem that UNC
path support uses the `DEVICE` field, so JARs located on UNC mounts can't
be referenced via '\\', i.e.

    jar:jar:file:\\server\share\a\b\foo.jar!/this\that!/foo.java

would not have a valid representation.

So instead of having `DEVICE` point to a `PATHNAME`, we decided that the
`DEVICE` shall be a list of `PATHNAME`s, so we would have:

    pathname: {
      namestring: "jar:jar:file:\\server\share\foo.jar!/foo.abcl!/"
      device: (
                pathname: {
                  host: "server"
                  device: "share"
                  name: "foo"
                  type: "jar"
                }
                pathname: {
                  name: "foo"
                  type: "abcl"
                }
              )
    }

Although there is a fair amount of special logic inside `Pathname.java`
itself in the resulting implementation, the logic in `Load.java` seems
to have been considerably simplified.

When we implemented URL Pathnames, the special syntax for URL as an
abstract string in the first position of the device list was naturally
replaced with a URL pathname.