Opened 8 years ago

Last modified 18 months ago

#432 assigned defect

CL:OPEN on URL-PATHNAME does not redirect across different schemes

Reported by: aruttenberg
Owned by: nobody
Priority: critical
Milestone: 1.9.3
Component: streams
Version: 1.5.0-dev
Keywords: has-test uri
Cc:
Parent Tickets:

Description (last modified by Mark Evenson)

java.net.HttpURLConnection does not follow redirects across different schemes.

cf. http://stackoverflow.com/questions/1884230/urlconnection-doesnt-follow-redirect#1884427

Original statement of problem:

So if you want to do something like use uiop/stream:copy-file which does something like open the source, open the dest, read/write, it will not do what you might expect. I don't see a way of controlling this behavior. Arguably the default ought to be to follow redirects and open the redirected-to file.

Change History (25)

comment:1 Changed 8 years ago by Mark Evenson

Keywords: needs-test added
Milestone: 1.4.1
Owner: set to aruttenberg
Status: new → assigned
Version: 1.5.0-dev

Have you tried applying CL:TRUENAME to an EXT:URL-PATHNAME? That should explicitly compute a new retrieval (or at least a cache validation) of the representation.

Bug me again if that doesn't work…

comment:2 Changed 8 years ago by aruttenberg

(uiop/stream:copy-file 
   (truename (pathname "http://purl.obolibrary.org/obo/iao.owl")) 
   "~/Desktop/test.owl") 

yields, in "~/Desktop/test.owl"

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://raw.githubusercontent.com/information-artifact-ontology/IAO/master/releases/2015-02-23/iao.owl">here</a>.</p>
<hr>
<address>Apache/2.4.7 (Ubuntu) Server at purl.obolibrary.org Port 80</address>
</body></html>

comment:3 Changed 8 years ago by Mark Evenson

Keywords: has-test uri added; needs-test removed
Owner: changed from aruttenberg to Mark Evenson
Priority: major → blocker
Status: assigned → accepted

comment:4 Changed 8 years ago by Mark Evenson

Something has gone wonky with the URL-PATHNAME constructor on 1.5.0-dev, a result of the partial merge of the @Alan.Ruttenberg holiday deluge I am working from under.

Stay tuned, Bear fans.

comment:5 Changed 8 years ago by aruttenberg

Sure, blame it on me!

comment:6 Changed 8 years ago by Mark Evenson

Milestone: 1.4.1 → 1.5.0

Ticket retargeted after milestone closed

comment:7 Changed 8 years ago by Mark Evenson

The URL-PATHNAME constructor is working again, which reveals a more basic problem: java.net.URLConnection does not "follow" redirects across a scheme change. For example, http://purl.obolibrary.org/obo/iao.owl (scheme http) redirects to https://raw.githubusercontent.com/information-artifact-ontology/IAO/master/releases/2015-02-23/iao.owl (scheme https).

Writing code to follow scheme changes across redirects is fairly trivial (see <http://stackoverflow.com/questions/1884230/urlconnection-doesnt-follow-redirect#1884427>), but there are security implications in automatically following a redirect from a secure session to an insecure one: request headers, which may contain sensitive authentication/authorization information that one intends to keep secret, may be revealed.
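For concreteness, here is a minimal Java sketch of following redirects by hand, in the spirit of the Stack Overflow answer linked above. It is not ABCL code: the class name RedirectFollower, its methods, and the cap of 5 hops are assumptions made for illustration, and it deliberately ignores the header-leak concern raised above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of ABCL: follows HTTP 3xx redirects by hand,
// including redirects that change scheme (http -> https or https -> http).
final class RedirectFollower {
    // Arbitrary cap chosen for this sketch.
    static final int MAX_REDIRECTS = 5;

    // True for the status codes that normally carry a Location header.
    static boolean isRedirect(int status) {
        return status == HttpURLConnection.HTTP_MOVED_PERM   // 301
            || status == HttpURLConnection.HTTP_MOVED_TEMP   // 302
            || status == HttpURLConnection.HTTP_SEE_OTHER    // 303
            || status == 307
            || status == 308;
    }

    // Resolve a possibly relative Location value against the requested URI.
    static URI resolve(URI base, String location) {
        return base.resolve(location);
    }

    // Opens the resource, following at most MAX_REDIRECTS redirects.
    // Assumes an http or https URI; any other scheme fails the cast below.
    static InputStream open(URI start) throws IOException {
        URI current = start;
        for (int hops = 0; hops <= MAX_REDIRECTS; hops++) {
            HttpURLConnection conn =
                (HttpURLConnection) current.toURL().openConnection();
            // Turn off the built-in handling so every hop is visible here.
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                return conn.getInputStream();
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location from " + current);
            }
            current = resolve(current, location);
        }
        throw new IOException("Too many redirects starting from " + start);
    }
}
```

With something like this, RedirectFollower.open(URI.create("http://purl.obolibrary.org/obo/iao.owl")) would be expected to return the redirected-to content rather than the "302 Found" page, provided the server still redirects as shown in comment:2.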

My preference here would be to allow ABCL to follow redirects from http to https but not vice-versa, but this may be confusing to the user.

What would be an appropriate way to inform the end-user of what redirects are being followed?

Should we set up configuration options for which sorts of redirects we allow, e.g.

REDIRECT_ALL Follow all redirections
REDIRECT_SECURELY Never follow a redirection from a secure connection to an insecure one

I need to consider what the right behavior should be here.
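One way the two proposed options could be expressed is as a small policy type that is asked before each hop. This is only a sketch under assumptions: the enum and its allows method are hypothetical, only the option names REDIRECT_ALL and REDIRECT_SECURELY come from this ticket, and "insecure" is taken to mean any scheme other than https.

```java
// Hypothetical policy type; only the two constant names come from the ticket.
enum RedirectPolicy {
    // Follow all redirections.
    REDIRECT_ALL,
    // Never follow a redirection from a secure connection to an insecure one.
    REDIRECT_SECURELY;

    // Decide whether a redirect from one scheme to another may be followed.
    // Assumption: "secure" means the scheme is https (compared case-insensitively).
    boolean allows(String fromScheme, String toScheme) {
        if (this == REDIRECT_ALL) {
            return true;
        }
        boolean fromSecure = "https".equalsIgnoreCase(fromScheme);
        boolean toSecure = "https".equalsIgnoreCase(toScheme);
        // Only a downgrade (secure -> insecure) is refused.
        return !(fromSecure && !toSecure);
    }
}
```

Under REDIRECT_SECURELY, the http to https redirect from comment:2 is allowed, while a redirect from https back to http is refused.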

comment:8 Changed 8 years ago by Mark Evenson

Description: modified (diff)
Summary: open http:// pathname doesn't follow redirects → CL:OPEN on URL-PATHNAME does not redirect across different schemes

comment:9 in reply to:  description Changed 8 years ago by Mark Evenson

Replying to aruttenberg:

So if you want to do something like use uiop/stream:copy-file which does something like open the source, open the dest, read/write, it will not do what you might expect. I don't see a way of controlling this behavior. Arguably the default ought to be to follow redirects and open the redirected-to file.

ABCL-BUILD now uses uiop/stream:copy-file for the machinery which retrieves XDG-rooted Ant and Maven installations from well-known URIs, cf. <http://abcl.org/trac/browser/trunk/abcl/contrib/abcl-build/build/install.lisp#L49>, so the underlying EXT:URL-PATHNAME implementation on java.net.HttpURLConnection seems to be working for cases in which the URI is already canonical, in the sense that following HTTP 3xx redirects would leave it unchanged.

WORKAROUND: The current usage expected of the user is to introspect any reference to an EXT:URL-PATHNAME via CL:TRUENAME. If there is an underlying redirect, CL:TRUENAME should somehow offer an API to the user to customize its behavior. I am currently against "jumping" across URI scheme changes without a chance for user intervention. Smells like another restart to me, but comments?

Last edited 8 years ago by Mark Evenson

comment:10 Changed 8 years ago by aruttenberg

Not a restart. Restarts should be for exceptional conditions. In this case the common use pattern of PURLs is that they redirect. Always. In addition, the common case for browsers is that they follow redirects by default. ABCL should be no different.

For my case, restricting redirects to http->https would be fine, as would having a setting that controls https://docs.oracle.com/javase/7/docs/api/java/net/HttpURLConnection.html#instanceFollowRedirects

Preferred: default to permissive, with a warning (once per session) on a potentially insecure https->http redirect, including instructions on how to change the default.
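The "permissive by default, warn once per session" behavior might look roughly like the following sketch. The class DowngradeWarning and its methods are hypothetical, "session" is assumed to mean the lifetime of the JVM, and the setting named in the message is the REDIRECT_SECURELY option floated in comment:7, which does not exist in ABCL as far as this ticket shows.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: permits every redirect, but warns the first time a
// redirect goes from https to a non-https scheme.
final class DowngradeWarning {
    // Flips to true once the warning has been printed in this JVM.
    private static final AtomicBoolean WARNED = new AtomicBoolean(false);

    // A downgrade is a hop from https to anything that is not https.
    static boolean isDowngrade(String fromScheme, String toScheme) {
        return "https".equalsIgnoreCase(fromScheme)
            && !"https".equalsIgnoreCase(toScheme);
    }

    // Call on every redirect hop. Returns true only when this call printed
    // the warning; the redirect itself is never blocked here.
    static boolean noteRedirect(String fromScheme, String toScheme) {
        if (!isDowngrade(fromScheme, toScheme)) {
            return false;
        }
        if (!WARNED.compareAndSet(false, true)) {
            return false; // already warned earlier in this session
        }
        System.err.println("WARNING: following a redirect from https to "
            + toScheme + "; request headers may be exposed. "
            + "A stricter policy such as REDIRECT_SECURELY (hypothetical setting) "
            + "would refuse such redirects.");
        return true;
    }
}
```

The compare-and-set keeps the warning to a single line of output even if several threads hit a downgrade at the same time.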

comment:11 Changed 8 years ago by aruttenberg

BTW, I don't buy that I should have to use truename every time I use a URI to get appropriate behavior with a URI. I don't have to do that every time I use a file name. However, it is reasonable to have, in addition to normal behavior, the ability to use truename to retrieve the final redirected-to URI, if your application wants to know it. In most cases it doesn't care.

While I think your concerns about security are well-motivated, I think they are out of place here. Common Lisp was not engineered for security, and bits and pieces here and there being secure won't change that. If there's a need for a more secure use of Common Lisp, that needs to be implemented by some package, with a new set of APIs and documentation explaining what the "secure" package brings to the table.

comment:12 in reply to:  11 Changed 8 years ago by Mark Evenson

Replying to aruttenberg:

BTW, I don't buy that I should have to use truename every time I use a URI to get appropriate behavior with a URI. I don't have to do that every time I use a file name.

I haven't suggested that one needs to use CL:TRUENAME every time one uses an EXT:URL-PATHNAME, merely that it provides some clue to the user about the need to follow redirects to access the representation.

While I think your concerns about security are well-motivated, I think they are out of place here. Common Lisp was not engineered for security, and bits and pieces here and there being secure won't change that. If there's a need for a more secure use of Common Lisp, that needs to be implemented by some package, with a new set of APIs and documentation explaining what the "secure" package brings to the table.

In creating the possibility to load resources from the network via EXT:URL-PATHNAME references, it is incumbent on us to follow a "principle of least surprise" for the user of these new abstractions, irrespective of the security concerns of Common Lisp, the language (which probably "don't exist" in the first place). As such, having a request for a resource via the 'https' scheme get redirected through an 'http' connection while leaking information would certainly surprise the user, and should be avoided if possible.

comment:13 Changed 8 years ago by Mark Evenson

Milestone: 1.5.0 → 1.6.0

Ticket retargeted after milestone closed

comment:14 Changed 5 years ago by Mark Evenson

Owner: changed from Mark Evenson to nobody
Priority: blocker → critical
Status: accepted → assigned

comment:15 Changed 5 years ago by Mark Evenson

Milestone: 1.6.0 → 1.6.1

Ticket retargeted after milestone closed

comment:16 Changed 5 years ago by Mark Evenson

Milestone: 1.6.1 → 1.6.2

Ticket retargeted after milestone closed

comment:17 Changed 5 years ago by Mark Evenson

Milestone: 1.6.2 → 1.7.0

comment:18 Changed 5 years ago by Mark Evenson

Milestone: 1.7.0 → 1.7.1

Ticket retargeted after milestone closed

comment:19 Changed 4 years ago by Mark Evenson

Milestone: 1.7.1 → 1.7.2

Ticket retargeted after milestone closed

comment:20 Changed 4 years ago by Mark Evenson

Milestone: 1.7.2 → 1.8.0

Milestone renamed

comment:21 Changed 4 years ago by Mark Evenson

Milestone: 1.8.0 → 1.8.1

Ticket retargeted after milestone closed

comment:22 Changed 3 years ago by Mark Evenson

Milestone: 1.8.1 → 1.9.0

comment:23 Changed 23 months ago by Mark Evenson

Milestone: 1.9.0 → 1.9.1

comment:24 Changed 22 months ago by Mark Evenson

Milestone: 1.9.1 → 1.9.2

comment:25 Changed 18 months ago by Mark Evenson

Milestone: 1.9.2 → 1.9.3