Opened 6 months ago

Last modified 2 weeks ago

#432 accepted defect

CL:OPEN on URL-PATHNAME does not redirect across different schemes

Reported by: aruttenberg
Owned by: mevenson
Priority: blocker
Milestone: 1.6.0
Component: streams
Version: 1.5.0-dev
Keywords: has-test uri
Cc:
Parent Tickets:

Description (last modified by mevenson)

java.net.HttpURLConnection does not follow redirects across different schemes.

cf. http://stackoverflow.com/questions/1884230/urlconnection-doesnt-follow-redirect#1884427

Original statement of problem:

So if you want to do something like use uiop/stream:copy-file which does something like open the source, open the dest, read/write, it will not do what you might expect. I don't see a way of controlling this behavior. Arguably the default ought to be to follow redirects and open the redirected-to file.

Change History (13)

comment:1 Changed 6 months ago by mevenson

  • Keywords needs-test added
  • Milestone set to 1.4.1
  • Owner set to aruttenberg
  • Status changed from new to assigned
  • Version set to 1.5.0-dev

Have you tried applying CL:TRUENAME to an EXT:URL-PATHNAME? That should explicitly compute a new retrieval (or at least a cache validation) of the representation.

Bug me again if that doesn't work…

comment:2 Changed 5 months ago by aruttenberg

(uiop/stream:copy-file 
   (truename (pathname "http://purl.obolibrary.org/obo/iao.owl")) 
   "~/Desktop/test.owl") 

yields, in "~/Desktop/test.owl", the body of the redirect response rather than the ontology:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://raw.githubusercontent.com/information-artifact-ontology/IAO/master/releases/2015-02-23/iao.owl">here</a>.</p>
<hr>
<address>Apache/2.4.7 (Ubuntu) Server at purl.obolibrary.org Port 80</address>
</body></html>

comment:3 Changed 5 months ago by mevenson

  • Keywords has-test uri added; needs-test removed
  • Owner changed from aruttenberg to mevenson
  • Priority changed from major to blocker
  • Status changed from assigned to accepted

comment:4 Changed 5 months ago by mevenson

Something has gone wonky with the URL-PATHNAME constructor on 1.5.0-dev, a result of the partial merge of the @Alan.Ruttenberg holiday deluge I am working from under.

Stay tuned, Bear fans.

comment:5 Changed 5 months ago by aruttenberg

Sure, blame it on me!

comment:6 Changed 4 months ago by mevenson

  • Milestone changed from 1.4.1 to 1.5.0

Ticket retargeted after milestone closed

comment:7 Changed 3 months ago by mevenson

The URL-PATHNAME constructor is working again, which reveals a more basic problem: java.net.URLConnection does not "follow" redirects across a scheme change. For example, http://purl.obolibrary.org/obo/iao.owl (scheme http) redirects to https://raw.githubusercontent.com/information-artifact-ontology/IAO/master/releases/2015-02-23/iao.owl (scheme https).

Writing code to follow redirects across scheme changes is fairly trivial (see <http://stackoverflow.com/questions/1884230/urlconnection-doesnt-follow-redirect#1884427>), but automatically following a redirect from a secure session to an insecure one has security implications: request headers that one intends to keep secret (which may contain sensitive information used for authentication/authorization) may be revealed.
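For illustration, manual redirect handling along the lines of the linked answer might look roughly like the Java sketch below. This is not ABCL's implementation: the class name, the helper names, and the hop limit are all assumptions made for the example, and it only handles http/https URIs. Note that it follows every scheme change, so it does nothing about the security concern raised above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

public class FollowRedirectsSketch {
    // Assumed upper bound on redirect hops, to avoid looping forever.
    static final int MAX_HOPS = 5;

    /** True for the HTTP status codes that carry a Location to follow. */
    public static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /** Resolves a (possibly relative) Location header against the current URI. */
    public static URI resolveLocation(URI base, String location) {
        return base.resolve(location);
    }

    /** Opens a stream on the final representation, following redirects by hand. */
    public static InputStream open(URI start) throws IOException {
        URI current = start;
        for (int hop = 0; hop <= MAX_HOPS; hop++) {
            // Assumes an http or https URI; other schemes would fail the cast.
            HttpURLConnection conn =
                    (HttpURLConnection) current.toURL().openConnection();
            // Disable the built-in handling so every hop is decided here.
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                return conn.getInputStream();
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header from " + current);
            }
            current = resolveLocation(current, location);
        }
        throw new IOException("Too many redirects starting from " + start);
    }
}
```

With this sketch, `FollowRedirectsSketch.open(URI.create("http://purl.obolibrary.org/obo/iao.owl"))` would be expected to follow the http to https hop instead of returning the 302 page, assuming the server still answers as reported in comment:2.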

My preference here would be to allow ABCL to follow redirects from http to https but not vice-versa, but this may be confusing to the user.

What would be an appropriate way to inform the end-user of what redirects are being followed?

Should we set up configuration options for what sort of redirects we allow, e.g.

REDIRECT_ALL Follow all redirections
REDIRECT_SECURELY Never follow a redirection from a secure connection to an insecure one

I need to consider what the right behavior should be here.
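To make the two options concrete, here is a minimal Java sketch of the decision they imply. The enum constants reuse the names proposed above; the class and method names are hypothetical and do not exist in ABCL.

```java
import java.net.URI;

public class RedirectPolicySketch {
    /** Hypothetical policy type; the constants mirror the options listed above. */
    public enum Policy { REDIRECT_ALL, REDIRECT_SECURELY }

    /** Returns true when a redirect from `from` to `to` may be followed. */
    public static boolean mayFollow(Policy policy, URI from, URI to) {
        // A downgrade is a hop that leaves https for any other scheme.
        boolean downgrade = "https".equalsIgnoreCase(from.getScheme())
                && !"https".equalsIgnoreCase(to.getScheme());
        return policy == Policy.REDIRECT_ALL || !downgrade;
    }

    public static void main(String[] args) {
        URI insecure = URI.create("http://purl.obolibrary.org/obo/iao.owl");
        URI secure = URI.create("https://raw.githubusercontent.com/information-artifact-ontology/IAO/master/releases/2015-02-23/iao.owl");
        System.out.println(mayFollow(Policy.REDIRECT_SECURELY, insecure, secure)); // true
        System.out.println(mayFollow(Policy.REDIRECT_SECURELY, secure, insecure)); // false
        System.out.println(mayFollow(Policy.REDIRECT_ALL, secure, insecure));      // true
    }
}
```

Under REDIRECT_SECURELY the http to https hop from comment:2 is allowed, while the reverse direction is refused.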

comment:8 Changed 3 months ago by mevenson

  • Description modified
  • Summary changed from "open http:// pathname doesn't follow redirects" to "CL:OPEN on URL-PATHNAME does not redirect across different schemes"

comment:9 in reply to: ↑ description Changed 3 weeks ago by mevenson

Replying to aruttenberg:

So if you want to do something like use uiop/stream:copy-file which does something like open the source, open the dest, read/write, it will not do what you might expect. I don't see a way of controlling this behavior. Arguably the default ought to be to follow redirects and open the redirected-to file.

ABCL-BUILD now uses uiop/stream:copy-file for the machinery which retrieves XDG-rooted Ant and Maven installations from well-known URIs (cf. <http://abcl.org/trac/browser/trunk/abcl/contrib/abcl-build/build/install.lisp#L49>), so the underlying EXT:URL-PATHNAME implementation on java.net.HttpURLConnection seems to be working for cases in which the URI is already canonical, i.e. in a form that following HTTP 3xx redirects would leave unchanged.

WORKAROUND: The current usage expected of the user is to introspect any reference to an EXT:URL-PATHNAME via CL:TRUENAME. If there is an underlying redirect, CL:TRUENAME should somehow offer an API to the user to customize its behavior. I am currently against "jumping" across URI scheme changes without a chance for user intervention. Smells like another restart to me, but comments?

Last edited 3 weeks ago by mevenson

comment:10 Changed 3 weeks ago by aruttenberg

Not a restart. Restarts should be for exceptional conditions. In this case the common use pattern of PURLs is that they redirect. Always. In addition, the common case for browsers is that they follow redirects by default. ABCL should be no different.

For my case, restricting http->https would be fine, as would having a setting that controls https://docs.oracle.com/javase/7/docs/api/java/net/HttpURLConnection.html#instanceFollowRedirects

Preferred: default to permissive, warning (one time per session) in the case of potentially insecure https->http redirect with instruction about how to set default to fix.
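A rough Java sketch of that preference, assuming a per-process flag stands in for "one time per session". The class name, method names, and the wording of the warning are hypothetical; the REDIRECT_SECURELY setting it mentions is only the option name floated in comment:7, not an existing ABCL setting.

```java
import java.net.URI;
import java.util.concurrent.atomic.AtomicBoolean;

public class WarnOnceSketch {
    // Flipped the first time a warning is printed in this JVM session.
    private static final AtomicBoolean warned = new AtomicBoolean(false);

    /** True when the redirect leaves https for any other scheme. */
    public static boolean isDowngrade(URI from, URI to) {
        return "https".equalsIgnoreCase(from.getScheme())
                && !"https".equalsIgnoreCase(to.getScheme());
    }

    /**
     * Permissive default: the caller always follows the redirect.
     * Returns true only if this call emitted the one-time warning.
     */
    public static boolean noteRedirect(URI from, URI to) {
        if (isDowngrade(from, to) && warned.compareAndSet(false, true)) {
            System.err.println("Warning: following redirect from " + from
                    + " to insecure " + to
                    + "; a (hypothetical) REDIRECT_SECURELY setting would refuse this.");
            return true;
        }
        return false;
    }
}
```

The first https to http redirect prints the warning; later ones in the same session are followed silently, and http to https redirects never warn.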

comment:11 follow-up: ↓ 12 Changed 3 weeks ago by aruttenberg

BTW, I don't buy that I should have to use truename every time I use a URI to get appropriate behavior with a URI. I don't have to do that every time I use a file name. However, it is reasonable to have, in addition to normal behavior, the ability to use truename to retrieve the final redirected-to URI, if your application wants to know it. In most cases it doesn't care.

While I think your concerns about security are well-motivated, I think they are out of place here. Common Lisp was not engineered for security, and bits and pieces here and there being secure won't change that. If there's a need for a more secure use of Common Lisp, that needs to be implemented by some package, with a new set of APIs and documentation explaining what the "secure" package brings to the table.

comment:12 in reply to: ↑ 11 Changed 3 weeks ago by mevenson

Replying to aruttenberg:

BTW, I don't buy that I should have to use truename every time I use a URI to get appropriate behavior with a URI. I don't have to do that every time I use a file name.

I haven't suggested that one needs to use CL:TRUENAME every time one uses an EXT:URL-PATHNAME, merely that it provides some clue to the user about the need to follow redirects to access the representation.

While I think your concerns about security are well-motivated, I think they are out of place here. Common Lisp was not engineered for security, and bits and pieces here and there being secure won't change that. If there's a need for a more secure use of Common Lisp, that needs to be implemented by some package, with a new set of APIs and documentation explaining what the "secure" package brings to the table.

In creating the possibility to load resources from the network via EXT:URL-PATHNAME references, it is incumbent on us to follow a "principle of least surprise" for the user of these new abstractions, irrespective of the security concerns of Common Lisp, the language (which probably "don't exist" in the first place). As such, having a request for a resource via the 'https' scheme get redirected through an 'http' connection while leaking information certainly would surprise the user, and should be avoided if possible.

comment:13 Changed 2 weeks ago by mevenson

  • Milestone changed from 1.5.0 to 1.6.0

Ticket retargeted after milestone closed
