JARs and JAR entries in ABCL
============================

Mark Evenson
Created: 09 JAN 2010
Modified: 21 JUN 2011

Notes towards an implementation of "jar:" references to be contained
in Common Lisp `PATHNAME`s within ABCL.

Goals
-----

1.  Use Common Lisp pathnames to refer to entries in a jar file.

2.  Use the `'jar:'` scheme as documented in [`java.net.JarURLConnection`][jarURLConnection] for
    namestring representation.

    An entry in a JAR file:

        #p"jar:file:baz.jar!/foo"

    A JAR file:

        #p"jar:file:baz.jar!/"

    A JAR file accessible via URL:

        #p"jar:http://example.org/abcl.jar!/"

    An entry in an ABCL FASL in a URL-accessible JAR file:

        #p"jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls"

    [jarUrlConnection]: http://java.sun.com/javase/6/docs/api/java/net/JarURLConnection.html

3.  `MERGE-PATHNAMES` working for jar entries in the following use cases:

        (merge-pathnames "foo-1.cls" "jar:jar:file:baz.jar!/foo.abcl!/foo._")
        ==> "jar:jar:file:baz.jar!/foo.abcl!/foo-1.cls"

        (merge-pathnames "foo-1.cls" "jar:file:foo.abcl!/")
        ==> "jar:file:foo.abcl!/foo-1.cls"

4.  TRUENAME and PROBE-FILE working with "jar:", with TRUENAME
    canonicalizing the JAR reference.

5.  DIRECTORY working within JAR files (and within JAR in JAR).

6.  References of the form `"jar:<URL>"` working for all strings `<URL>`
    that java.net.URL can resolve.

7.  Make jar pathnames work as a valid argument for OPEN with
    :DIRECTION :INPUT.

8.  Enable the loading of ASDF systems packaged within jar files.

9.  Enable the matching of jar pathnames with PATHNAME-MATCH-P:

        (pathname-match-p
          "jar:file:/a/b/some.jar!/a/system/def.asd"
          "jar:file:/**/*.jar!/**/*.asd")
        ==> t
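
To make the namestring convention of goal 2 concrete, here is a minimal Java sketch of how the stock `java.net.JarURLConnection` splits a single-level "jar:" URL into the JAR location and the entry name. The class name `JarUrlParts` and its `split` helper are hypothetical and are not part of ABCL. The sketch only covers one level of "jar:", since opening nested references such as "jar:jar:..." is something ABCL layers on top rather than something we assume the JDK handles.

```java
import java.io.IOException;
import java.net.JarURLConnection;
import java.net.URL;

final class JarUrlParts {
    /**
     * Returns a two-element array: the URL of the JAR file, then the entry
     * name ("" when the reference ends in "!/" and names the whole JAR).
     */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated in recent JDKs
    static String[] split(String spec) {
        try {
            URL url = new URL(spec);
            // openConnection() only builds the connection object; nothing is
            // read from disk or the network until connect() is called.
            JarURLConnection conn = (JarURLConnection) url.openConnection();
            String entry = conn.getEntryName(); // null for a bare "!/"
            return new String[] {
                conn.getJarFileURL().toString(),
                entry == null ? "" : entry
            };
        } catch (IOException e) {
            throw new IllegalArgumentException("not a usable jar: URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        String[] specs = { "jar:file:/a/b/foo.jar!/c/d/foo.lisp", "jar:file:/a/b/foo.jar!/" };
        for (String spec : specs) {
            String[] parts = split(spec);
            System.out.println(spec + " -> jar=" + parts[0] + " entry=\"" + parts[1] + "\"");
        }
    }
}
```

The two halves correspond to what the pathname representation keeps apart: the JAR location ends up in the DEVICE, and the entry supplies the DIRECTORY, NAME and TYPE components.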

Status
------

All the above goals have been implemented and tested.


Implementation
--------------

A PATHNAME referring to a file within a JAR is known as a JAR PATHNAME.
It can refer either to the entire JAR file or to an entry within the JAR
file.

A JAR PATHNAME always has a DEVICE which is a proper list.  This
distinguishes it from other uses of PATHNAME.

The DEVICE of a JAR PATHNAME will be a list with either one or two
elements.  The first element of the DEVICE list can be either a
PATHNAME representing a JAR on the filesystem, or a URL PATHNAME.

A PATHNAME occurring in the list in the DEVICE of a JAR PATHNAME is
known as a DEVICE PATHNAME.

Only the first entry in the DEVICE list may be a URL PATHNAME.

Otherwise the DEVICE PATHNAME denotes the PATHNAME of the JAR file.

The DEVICE PATHNAME list of enclosing JARs runs from outermost to
innermost.  The implementation currently limits this list to have at
most two elements.

The DIRECTORY component of a JAR PATHNAME should be a list starting
with the :ABSOLUTE keyword.  Even though hierarchical entries in jar
files are stored in the form "foo/bar/a.lisp" not "/foo/bar/a.lisp",
the meaning of the DIRECTORY component is better represented as an
absolute path.

A JAR PATHNAME has type JAR-PATHNAME, derived from PATHNAME.
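
The rules above for the DEVICE list can be summarized in a small Java sketch. This is an illustration only: `JarPathnameSketch` and `DevicePathname` are hypothetical names and do not mirror the classes in ABCL's `Pathname.java`. The sketch models the DEVICE as a list of one or two device pathnames, outermost first, where only the first may be a URL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class JarPathnameSketch {
    /** Hypothetical stand-in for a DEVICE PATHNAME: a JAR named by a file path or a URL. */
    static final class DevicePathname {
        final String namestring;
        final boolean isUrl;

        DevicePathname(String namestring, boolean isUrl) {
            this.namestring = namestring;
            this.isUrl = isUrl;
        }
    }

    final List<DevicePathname> device; // enclosing JARs, outermost first
    final List<String> directory;      // entry directories, read as (:ABSOLUTE ...)
    final String name;                 // null when the pathname denotes the JAR itself
    final String type;

    JarPathnameSketch(List<DevicePathname> device, List<String> directory,
                      String name, String type) {
        if (device.isEmpty() || device.size() > 2) {
            throw new IllegalArgumentException("DEVICE must hold one or two pathnames");
        }
        for (int i = 1; i < device.size(); i++) {
            if (device.get(i).isUrl) {
                throw new IllegalArgumentException(
                    "only the first DEVICE element may be a URL pathname");
            }
        }
        this.device = Collections.unmodifiableList(new ArrayList<>(device));
        this.directory = Collections.unmodifiableList(new ArrayList<>(directory));
        this.name = name;
        this.type = type;
    }

    /** True when no entry components are present, i.e. the namestring ends in "!/". */
    boolean denotesWholeJar() {
        return name == null && type == null && directory.isEmpty();
    }

    public static void main(String[] args) {
        // Models jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls
        JarPathnameSketch p = new JarPathnameSketch(
            Arrays.asList(new DevicePathname("http://example.org/abcl.jar", true),
                          new DevicePathname("foo.abcl", false)),
            Collections.<String>emptyList(), "foo-1", "cls");
        System.out.println(p.device.size() + " enclosing JARs, entry " + p.name + "." + p.type);
    }
}
```

Keeping the enclosing JARs in a list rather than chaining them through nested DEVICE fields leaves each device pathname free to use its own HOST and DEVICE components.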


BNF
---

An incomplete BNF of the syntax of JAR PATHNAME would be:

    JAR-PATHNAME ::= "jar:" URL "!/" [ ENTRY ]

    URL ::= <URL parsable via java.net.URL.URL()>
          | JAR-FILE-PATHNAME

    JAR-FILE-PATHNAME ::= "jar:" "file:" JAR-NAMESTRING "!/" [ ENTRY ]

    JAR-NAMESTRING ::= ABSOLUTE-FILE-NAMESTRING
                     | RELATIVE-FILE-NAMESTRING

    ENTRY ::= [ DIRECTORY "/"]* FILE


### Notes

1.  `ABSOLUTE-FILE-NAMESTRING` and `RELATIVE-FILE-NAMESTRING` can use
    the local filesystem conventions, meaning that on Windows they may
    contain '\' as the directory separator, which is always normalized to
    '/'.  An `ENTRY` always uses '/' to separate directories within the
    jar archive.
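
As a rough illustration of the grammar and of Note 1, the following Java sketch splits a jar namestring into the enclosing JARs (outermost first) followed by the entry. It is a plain string splitter under simplifying assumptions, not ABCL's actual parser: the class `JarNamestring` is hypothetical, and the sketch assumes that "!/" never occurs inside a JAR location.

```java
import java.util.ArrayList;
import java.util.List;

final class JarNamestring {
    /**
     * Splits a jar namestring into its "!/"-separated parts: one or two
     * enclosing JARs (outermost first), then the entry, which is "" when
     * the namestring denotes a JAR rather than an entry.
     */
    static List<String> split(String namestring) {
        int depth = 0;
        String rest = namestring;
        while (rest.startsWith("jar:")) {   // one "jar:" prefix per enclosing JAR
            rest = rest.substring(4);
            depth++;
        }
        if (depth == 0 || depth > 2) {
            throw new IllegalArgumentException("expected one or two \"jar:\" prefixes: " + namestring);
        }
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < depth; i++) {
            int bang = rest.indexOf("!/");
            if (bang < 0) {
                throw new IllegalArgumentException("missing \"!/\" in: " + namestring);
            }
            String jar = rest.substring(0, bang);
            // Note 1: only the outermost (filesystem) name may use '\'; normalize it to '/'.
            parts.add(i == 0 ? jar.replace('\\', '/') : jar);
            rest = rest.substring(bang + 2);
        }
        parts.add(rest);                    // the ENTRY, possibly empty
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("jar:jar:file:a/baz.jar!/b/c/foo.abcl!/this/that/foo-20.cls"));
        System.out.println(split("jar:file:C:\\a\\foo.jar!/x.lisp"));
    }
}
```

The number of leading "jar:" prefixes determines how many "!/" separators are consumed, which is why a namestring with two prefixes yields two enclosing JARs before the entry.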


Use Cases
---------

    // UC1 -- JAR
    pathname: {
      namestring: "jar:file:foo/baz.jar!/"
      device: (
        pathname: {
          device: "jar:file:"
          directory: (:RELATIVE "foo")
          name: "baz"
          type: "jar"
        }
      )
    }


    // UC2 -- JAR entry
    pathname: {
      namestring: "jar:file:baz.jar!/foo.abcl"
      device: ( pathname: {
        device: "jar:file:"
        name: "baz"
        type: "jar"
      })
      name: "foo"
      type: "abcl"
    }


    // UC3 -- JAR file in a JAR entry
    pathname: {
      namestring: "jar:jar:file:baz.jar!/foo.abcl!/"
      device: (
        pathname: {
          name: "baz"
          type: "jar"
        }
        pathname: {
          name: "foo"
          type: "abcl"
        }
      )
    }

    // UC4 -- JAR entry in a JAR entry with directories
    pathname: {
      namestring: "jar:jar:file:a/baz.jar!/b/c/foo.abcl!/this/that/foo-20.cls"
      device: (
        pathname: {
          directory: (:RELATIVE "a")
          name: "baz"
          type: "jar"
        }
        pathname: {
          directory: (:RELATIVE "b" "c")
          name: "foo"
          type: "abcl"
        }
      )
      directory: (:ABSOLUTE "this" "that")
      name: "foo-20"
      type: "cls"
    }

    // UC5 -- JAR Entry in a JAR Entry
    pathname: {
      namestring: "jar:jar:file:a/foo/baz.jar!/c/d/foo.abcl!/a/b/bar-1.cls"
      device: (
        pathname: {
          directory: (:RELATIVE "a" "foo")
          name: "baz"
          type: "jar"
        }
        pathname: {
          directory: (:RELATIVE "c" "d")
          name: "foo"
          type: "abcl"
        }
      )
      directory: (:ABSOLUTE "a" "b")
      name: "bar-1"
      type: "cls"
    }

    // UC6 -- JAR entry in an http: accessible JAR file
    pathname: {
      namestring: "jar:http://example.org/abcl.jar!/org/armedbear/lisp/Version.class"
      device: (
        pathname: {
          namestring: "http://example.org/abcl.jar"
        }
      )
      directory: (:ABSOLUTE "org" "armedbear" "lisp")
      name: "Version"
      type: "class"
    }

    // UC7 -- JAR Entry in a JAR Entry in a URL accessible JAR file
    pathname: {
      namestring: "jar:jar:http://example.org/abcl.jar!/foo.abcl!/foo-1.cls"
      device: (
        pathname: {
          namestring: "http://example.org/abcl.jar"
        }
        pathname: {
          name: "foo"
          type: "abcl"
        }
      )
      name: "foo-1"
      type: "cls"
    }

    // UC8 -- JAR in an absolute directory
    pathname: {
      namestring: "jar:file:/a/b/foo.jar!/"
      device: (
        pathname: {
          directory: (:ABSOLUTE "a" "b")
          name: "foo"
          type: "jar"
        }
      )
    }

    // UC9 -- JAR in a relative directory with entry
    pathname: {
      namestring: "jar:file:a/b/foo.jar!/c/d/foo.lisp"
      device: (
        pathname: {
          directory: (:RELATIVE "a" "b")
          name: "foo"
          type: "jar"
        }
      )
      directory: (:ABSOLUTE "c" "d")
      name: "foo"
      type: "lisp"
    }


URI Encoding
------------

As a subtype of URL-PATHNAMES, JAR-PATHNAMES follow all the rules for
that type.  Most notably this means that all #\Space characters should
be encoded as '%20' when dealing with jar entries.
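
For callers assembling jar namestrings from raw file names, a minimal Java sketch of the encoding step might look as follows. The class `JarEntryEncoding` is hypothetical. It relies on the documented behaviour of the multi-argument `java.net.URI` constructors, which quote characters that are illegal in a path, and it assumes the first path segment of the entry contains no ':' (which would otherwise be read as a scheme separator).

```java
import java.net.URI;
import java.net.URISyntaxException;

final class JarEntryEncoding {
    /** Percent-encodes characters such as ' ' that are not legal in a URI path. */
    static String encode(String entry) {
        try {
            // Scheme, host and fragment are left out; only the path is quoted.
            return new URI(null, null, entry, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot encode entry: " + entry, e);
        }
    }

    /** Reverses encode(): decodes the percent escapes of an encoded entry. */
    static String decode(String rawEntry) {
        return URI.create(rawEntry).getPath();
    }

    public static void main(String[] args) {
        String raw = encode("my dir/foo bar.lisp");
        System.out.println("jar:file:/a/b/foo.jar!/" + raw);
        System.out.println(decode(raw));
    }
}
```

The encoded form is what belongs in the namestring after "!/"; the decoded form is what names the entry inside the archive.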


History
-------

Previously, ABCL did have some support for jar pathnames.  This support
used the convention that if the device field was itself a
pathname, the device pathname contained the location of the jar.

In the analysis of the desire to treat jar pathnames as valid
locations for `LOAD`, we determined that we needed a "double" pathname
so we could refer to the components of a packed FASL in a jar.  At first
we thought we could support such a syntax by having the device
pathname's device refer to the inner jar.  But with this use of
`PATHNAME`s linked by the `DEVICE` field, we found the problem that UNC
path support also uses the `DEVICE` field, so JARs located on UNC mounts
couldn't be referenced via '\\', i.e.

    jar:jar:file:\\server\share\a\b\foo.jar!/this\that!/foo.java

would not have a valid representation.

So instead of having `DEVICE` point to a `PATHNAME`, we decided that the
`DEVICE` shall be a list of `PATHNAME`s, so we would have:

    pathname: {
      namestring: "jar:jar:file:\\server\share\foo.jar!/foo.abcl!/"
      device: (
        pathname: {
          host: "server"
          device: "share"
          name: "foo"
          type: "jar"
        }
        pathname: {
          name: "foo"
          type: "abcl"
        }
      )
    }

Although there is a fair amount of special logic inside `Pathname.java`
itself in the resulting implementation, the logic in `Load.java` seems
to have been considerably simplified.

When we implemented URL Pathnames, the special syntax for URL as an
abstract string in the first position of the device list was naturally
replaced with a URL pathname.