Opened 8 years ago
Last modified 17 months ago
#432 assigned defect
CL:OPEN on URL-PATHNAME does not redirect across different schemes
Reported by: | aruttenberg | Owned by: | nobody |
---|---|---|---|
Priority: | critical | Milestone: | 1.9.3 |
Component: | streams | Version: | 1.5.0-dev |
Keywords: | has-test uri | Cc: | |
Parent Tickets: |
Description (last modified by )
java.net.HttpURLConnection does not follow redirects across different schemes.
cf. http://stackoverflow.com/questions/1884230/urlconnection-doesnt-follow-redirect#1884427
Original statement of problem:
So if you want to do something like use uiop/stream:copy-file which does something like open the source, open the dest, read/write, it will not do what you might expect. I don't see a way of controlling this behavior. Arguably the default ought to be to follow redirects and open the redirected-to file.
Change History (25)
comment:1 Changed 8 years ago by
Keywords: | needs-test added |
---|---|
Milestone: | → 1.4.1 |
Owner: | set to aruttenberg |
Status: | new → assigned |
Version: | → 1.5.0-dev |
comment:2 Changed 8 years ago by
(uiop/stream:copy-file (truename (pathname "http://purl.obolibrary.org/obo/iao.owl")) "~/Desktop/test.owl")
yields, in "~/Desktop/test.owl"
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <html><head> <title>302 Found</title> </head><body> <h1>Found</h1> <p>The document has moved <a href="https://raw.githubusercontent.com/information-artifact-ontology/IAO/master/releases/2015-02-23/iao.owl">here</a>.</p> <hr> <address>Apache/2.4.7 (Ubuntu) Server at purl.obolibrary.org Port 80</address> </body></html>
comment:3 Changed 8 years ago by
Keywords: | has-test uri added; needs-test removed |
---|---|
Owner: | changed from aruttenberg to Mark Evenson |
Priority: | major → blocker |
Status: | assigned → accepted |
comment:4 Changed 8 years ago by
Something has gone wonky with the URL-PATHNAME constructor on 1.5.0-dev, a result of the partial merge of the @Alan.Ruttenberg holiday deluge I am working from under.
Stay tuned, Bear fans.
comment:7 Changed 8 years ago by
The URL-PATHNAME constructor is working again, which reveals a more basic problem in that java.net.URLConnection does not "follow" redirects across scheme change, i.e. http://purl.obolibrary.org/obo/iao.owl via scheme http redirects to https://raw.githubusercontent.com/information-artifact-ontology/IAO/master/releases/2015-02-23/iao.owl using scheme https.
Writing code to follow scheme changes across redirects is fairly trivial (see <http://stackoverflow.com/questions/1884230/urlconnection-doesnt-follow-redirect#1884427>), but automatically following a redirect from a secure session to an insecure one has security implications: request headers that one intends to keep secret (which may contain sensitive information used for authentication/authorization) may be revealed.
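For reference, the manual approach can be sketched roughly as follows. This is a minimal illustration in Java, not ABCL's implementation: the class name `RedirectFollower`, its helper methods, and the hop limit of 5 are assumptions made for the example. It turns off the built-in redirect handling and resolves each `Location` header against the current URI, so a scheme change is followed like any other redirect.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of ABCL: follows 3xx redirects by hand,
// including redirects that change scheme (http -> https or https -> http).
class RedirectFollower {
    static final int MAX_HOPS = 5; // assumed limit, guards against redirect loops

    // Resolve a Location header value (absolute or relative) against the current URI.
    static URI resolveLocation(URI current, String location) {
        return current.resolve(location);
    }

    // Status codes that carry a Location header we are willing to follow.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    // Open the final representation, following at most MAX_HOPS redirects.
    static InputStream open(URI uri) throws IOException {
        URI current = uri;
        for (int hop = 0; hop <= MAX_HOPS; hop++) {
            HttpURLConnection conn =
                (HttpURLConnection) current.toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // redirects are handled below
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                return conn.getInputStream();
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header from " + current);
            }
            current = resolveLocation(current, location);
        }
        throw new IOException("Too many redirects starting from " + uri);
    }
}
```

Note that this sketch follows every redirect unconditionally, which is exactly the behaviour whose security implications are at issue here.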
My preference here would be to allow ABCL to follow redirects from http to https but not vice-versa, but this may be confusing to the user.
What would be an appropriate way to inform the end-user of what redirects are being followed?
Should we set up configuration options on what sort of redirects we allow, e.g.
REDIRECT_ALL | Follow all redirections |
REDIRECT_SECURELY | Never follow a redirection from a secure connection to an insecure one |
I need to consider what the right behavior should be here.
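To make the two proposed options concrete, here is a minimal sketch in Java of how such a policy might decide whether a given redirect is followed. Only the constant names `REDIRECT_ALL` and `REDIRECT_SECURELY` come from the table above; the enum name `RedirectPolicy` and the method `allows` are hypothetical and do not exist in ABCL.

```java
// Hypothetical policy type: only the two constant names are taken from the
// proposal; everything else is an assumption made for illustration.
enum RedirectPolicy {
    REDIRECT_ALL,       // follow all redirections
    REDIRECT_SECURELY;  // never follow a secure -> insecure redirection

    // Decide whether a redirect from one URI scheme to another may be followed.
    boolean allows(String fromScheme, String toScheme) {
        // A "downgrade" leaves https for anything that is not https.
        boolean downgrade = "https".equalsIgnoreCase(fromScheme)
                && !"https".equalsIgnoreCase(toScheme);
        return this == REDIRECT_ALL || !downgrade;
    }
}
```

Under this sketch `REDIRECT_SECURELY` still follows http to https and same-scheme redirects, and refuses only https to http.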
comment:8 Changed 8 years ago by
Description: | modified (diff) |
---|---|
Summary: | open http:// pathname doesn't follow redirects → CL:OPEN on URL-PATHNAME does not redirect across different schemes |
comment:9 Changed 7 years ago by
Replying to aruttenberg:
So if you want to do something like use uiop/stream:copy-file which does something like open the source, open the dest, read/write, it will not do what you might expect. I don't see a way of controlling this behavior. Arguably the default ought to be to follow redirects and open the redirected-to file.
ABCL-BUILD now uses uiop/stream:copy-file for the machinery which retrieves XDG rooted Ant and Maven installations from well-known URIs, cf. <http://abcl.org/trac/browser/trunk/abcl/contrib/abcl-build/build/install.lisp#L49>, so the underlying ext:pathname-url implementation on java.net.HttpURLConnection seems to be working for cases in which the URI is already in the canonical form that following HTTP 3xx redirects would leave unchanged.
WORKAROUND: The current usage expected of the user is to introspect any reference to an ext:pathname-url via CL:TRUENAME. If there is an underlying redirect, CL:TRUENAME should somehow offer an API to the user to customize its behavior. I am currently against "jumping" across URI scheme changes without a chance for user intervention. Smells like another restart to me, but comments?
comment:10 Changed 7 years ago by
Not a restart. Restarts should be for exceptional conditions. In this case the common use pattern of PURLs is that they redirect. Always. In addition, the common case for browsers is that they follow redirects by default. ABCL should be no different.
For my case, restricting redirects to http->https would be fine, as would having a setting that controls https://docs.oracle.com/javase/7/docs/api/java/net/HttpURLConnection.html#instanceFollowRedirects
Preferred: default to permissive, warning (one time per session) in the case of potentially insecure https->http redirect with instruction about how to set default to fix.
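The preferred behavior (permissive by default, with a single warning per session on an https->http redirect) could look roughly like the Java sketch below. The class `DowngradeWarning`, the method `warnOnce`, and the wording of the message are all hypothetical; in particular, no actual ABCL setting is named in the message, because none has been defined in this ticket.

```java
import java.net.URI;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not part of ABCL: warns at most once per JVM session
// when a redirect leaves https for http.
class DowngradeWarning {
    private static final AtomicBoolean warned = new AtomicBoolean(false);

    // Returns true only if this call actually printed the warning.
    static boolean warnOnce(URI from, URI to) {
        boolean downgrade = "https".equalsIgnoreCase(from.getScheme())
                && "http".equalsIgnoreCase(to.getScheme());
        // compareAndSet succeeds for the first downgrade only, even across threads.
        if (downgrade && warned.compareAndSet(false, true)) {
            System.err.println("WARNING: following redirect from " + from
                    + " to insecure " + to
                    + "; change the redirect default to disallow this"
                    + " (setting not yet defined).");
            return true;
        }
        return false;
    }
}
```

The redirect is still followed in every case; the sketch only covers the one-time notification.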
comment:11 follow-up: 12 Changed 7 years ago by
BTW, I don't buy that I should have to use truename every time I use a URI to get appropriate behavior with a URI. I don't have to do that every time I use a file name. However, it is reasonable to have, in addition to normal behavior, the ability to use truename to retrieve the final redirected-to URI, if your application wants to know it. In most cases it doesn't care.
While I think your concerns about security are well-motivated, I think they are out of place here. Common Lisp was not engineered for security, and bits and pieces here and there being secure won't change that. If there's a need for a more secure use of Common Lisp, that needs to be implemented by some package, with a new set of APIs and documentation explaining what the "secure" package brings to the table.
comment:12 Changed 7 years ago by
Replying to aruttenberg:
BTW, I don't buy that I should have to use truename every time I use a URI to get appropriate behavior with a URI. I don't have to do that every time I use a file name.
I haven't suggested that one needs to use CL:TRUENAME every time one uses an EXT:PATHNAME-URL, merely that it provides some clue to the user about the need to follow redirects to access the representation.
While I think your concerns about security are well-motivated, I think they are out of place here. Common Lisp was not engineered for security, and bits and pieces here and there being secure won't change that. If there's a need for a more secure use of Common Lisp, that needs to be implemented by some package, with a new set of APIs and documentation explaining what the "secure" package brings to the table.
In creating the possibility to load resources from the network via EXT::PATHNAME-URL references, it is incumbent on us to follow a "principle of least surprise" for the user of these new abstractions, irrespective of the security concerns of Common Lisp, the language (which probably "don't exist" in the first place). As such, having a request for a resource via the 'https' scheme get redirected through an 'http' connection while leaking information would certainly surprise the user, and should be avoided if possible.
comment:14 Changed 5 years ago by
Owner: | changed from Mark Evenson to nobody |
---|---|
Priority: | blocker → critical |
Status: | accepted → assigned |
comment:17 Changed 4 years ago by
Milestone: | 1.6.2 → 1.7.0 |
---|
comment:22 Changed 3 years ago by
Milestone: | 1.8.1 → 1.9.0 |
---|
comment:23 Changed 22 months ago by
Milestone: | 1.9.0 → 1.9.1 |
---|
comment:24 Changed 21 months ago by
Milestone: | 1.9.1 → 1.9.2 |
---|
comment:25 Changed 17 months ago by
Milestone: | 1.9.2 → 1.9.3 |
---|
Have you tried applying CL:TRUENAME to an EXT:URL-PATHNAME? That should explicitly compute a new retrieval (or at least a cache validation) of the representation. Bug me again if that doesn't work…