Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#340 closed defect (fixed)

DIRECTORY applied to symbolic links with non-existent targets returns errors

Reported by: mevenson Owned by:
Priority: major Milestone: 1.3.0
Component: interpreter Version: 1.3.0-dev
Keywords: Cc:
Parent Tickets:

Description

Alan Ruttenberg notes http://article.gmane.org/gmane.lisp.armedbear.devel/3048:

Emacs creates files like this:
lrwxr-xr-x   1 alanr  staff     30 Jan  6 01:40 .#foo.sql <at>  ->
alanr@...

as lock files, IIRC.

In such cases, the target of the link names a file that doesn't exist.
#'directory calls #'truename on the filenames that are listed, and
truename does a probe file. Incidentally, it looks like that is done in
the Java as well as the Lisp function, making the latter redundant.

It seems wrong that there is a situation in which one can't call
directory at all without getting an error.

The crux of the matter is how to interpret the documentation of
truename in the presence of symbolic links.  The doc for truename
says: "If filespec is a pathname it represents the name used to open
the file. This may be, but is not required to be, the actual name of
the file."

"actual name of the file" suggests that there is a file, but also
suggests that it is the name that is important. Also whereas directory
allows implementation-dependent keywords, truename is not so defined.

I'll take the position that truename should *not* depend on the file
existing, but rather focus on what names are present in directory
structures. So resolve-truenames nil should return the names in
the directory and for resolve-truenames t it should follow symbolic
links as long as they can be followed without hitting a name that
doesn't designate an existing file.
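
A minimal Java sketch of the two modes proposed above, assuming Java 7 or later for java.nio.file. The class and method names are hypothetical and this is not ABCL's implementation; it only illustrates "return the names" versus "follow links only when they lead to an existing file".

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of ABCL. */
final class DirectoryListing {
    /**
     * Lists the entries of dir. With resolve == false the names are returned
     * as they are stored in the directory. With resolve == true an entry is
     * replaced by its real path when the link chain ends at an existing file;
     * otherwise the entry's own name is kept rather than raising an error.
     */
    static List<Path> list(Path dir, boolean resolve) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                // Files.exists follows symbolic links, so it is false for a
                // link whose target is missing.
                if (resolve && Files.exists(entry)) {
                    result.add(entry.toRealPath());
                } else {
                    result.add(entry);
                }
            }
        }
        return result;
    }
}
```

With a directory holding an Emacs-style lock link, both modes return a list; neither call fails merely because one link is dangling.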

Note that follow-up comments by Alan and Marco on the behavior of SBCL and CCL, and on the relation to the Hyperspec, were not picked up in the Gmane archives.

Subtickets

Change History (9)

comment:1 Changed 3 years ago by alan ruttenberg

[discussion from mailing list]

From: Marco Antoniotti <marcoxa@…>
Date: Tue, Jan 7, 2014 at 5:05 AM
To: abcl-dev <armedbear-devel@…>

Hello,

anything that deals with PATHNAMES is inherently a “hic sunt leones” area.

My suggestion - as a user! - regarding this TRUENAME issue is that the top concern is to minimize differences with other implementations. Another suggestion would be to see what Faré’s UIOP library is doing in this case.

Cheers

MA
--
Marco Antoniotti


From: Alan Ruttenberg <alanruttenberg@…>
Date: Tue, Jan 7, 2014 at 1:40 PM
To: Marco Antoniotti <marcoxa@…>
Cc: abcl-dev <armedbear-devel@…>

Well, SBCL returns the name of the directory entries (without resolving
symbolic links) when I use the directory function and truename returns
the directory entry unmolested for the problematic entry. I guess
sbcl precedes Faré ;-)

Note this is despite the fact that sbcl also has a :resolve-symlinks
argument to directory, and that it also defaults to true.

I think this should be the behavior for ABCL, with the resolving and
probing separated altogether from the truename and directory code.

I'm downloading ccl to see what it does.

-Alan


From: Alan Ruttenberg <alanruttenberg@…>
Date: Tue, Jan 7, 2014 at 1:53 PM
To: Marco Antoniotti <marcoxa@…>
Cc: abcl-dev <armedbear-devel@…>

Clozure CCL's behavior is more complicated and confusing.

Without keywords directory doesn't return the file *at all* while
truename fails saying that the file does not exist.

However it also has keywords :include-emacs-lock-files and :follow-links

With :include-emacs-lock-files t and :follow-links nil, directory
returns the problematic entry's name, and with :follow-links t (or not
supplied) it gives an error.

I would be satisfied with this behavior as well, though I think SBCL's
is cleaner.

The problem with ABCL's function is that there is no way to avoid
getting an error when applying directory to it.

-Alan


From: Mark Evenson <evenson@…>
Date: Fri, Jan 10, 2014 at 6:44 AM
To: abcl-dev <armedbear-devel@…>
Cc: Marco Antoniotti <marcoxa@…>, Alan Ruttenberg <alanruttenberg@…>

On Jan 7, 2014, at 19:40, Alan Ruttenberg <alanruttenberg@…> wrote:

Well, SBCL returns the name of the directory entries (without resolving
symbolic links) when I use the directory function and truename returns
the directory entry unmolested for the problematic entry. I guess
sbcl precedes Faré ;-)

Note this is despite the fact that sbcl also has a :resolve-symlinks
argument to directory, and that it also defaults to true.

I think this should be the behavior for ABCL, with the resolving and
probing separated altogether from the truename and directory code.

It is [quite clear from the Hyperspec][DIRECTORY] that DIRECTORY “returns a fresh list of pathnames corresponding to the truenames of those files.”

And it is [quite clear as well][TRUENAME] that calling TRUENAME on a non-existent file should signal a file error.

What is not clear is whether the “fresh list of pathnames” returned by DIRECTORY is the actual result of calling TRUENAME on the filespec. I guess it doesn’t have to be, which is what SBCL and CCL seem to be implementing.

I guess I am leaning towards adding a further arg to DIRECTORY “:RESOLVE-ENTRIES-VIA-TRUENAME” (terrible name, should be shorter) which defaults to NIL, which would preserve the current behavior. Otherwise, we would treat symbolic links in the same manner as SBCL: populate the PATHNAME with what would be used in making a TRUENAME call, but don’t actually make the call.

Adding an application specific (“ignore Emacs backup files”) mechanism like CCL to the Common Lisp part of things smells wrong: one should at least implement the rudiments of an extensible API that would allow other filespecs to be added to the list of directory entries to be treated in this manner.

But I do think that the ANSI spec intends that once one has the results of the DIRECTORY call, and the relevant parts of the filesystem are not changed in the meantime, applications expect that the results remain valid TRUENAMEs (i.e. they can be accessed).

In practice, I guess this is why everyone has their own toolsets for dealing with filesystems.

Comments?

[DIRECTORY]: http://www.lispworks.com/documentation/HyperSpec/Body/f_dir.htm
[TRUENAME]: http://www.lispworks.com/documentation/HyperSpec/Body/f_tn.htm

--
"A screaming comes across the sky. It has happened before but there is nothing
to compare to it now."


From: Alan Ruttenberg <alanruttenberg@…>
Date: Fri, Jan 10, 2014 at 9:51 AM
To: Mark Evenson <evenson@…>
Cc: abcl-dev <armedbear-devel@…>, Marco Antoniotti <marcoxa@…>

There are, I would offer, several arguments suggesting that the
current behavior should not be the default.

1) Exceptions: Generally, listing the contents of a directory is not
considered an exception unless the directory structure is corrupted.
In some cases (e.g. unix "rm") you can't even do anything to the
target of the link. So ABCL signals an exception when most would not
expect it. Consider the case where the link points to a file on a
device that may or may not be mounted. If the device is taken offline,
nothing in the directory changes, and yet the behavior does. It seems
that this situation is more properly handled by an exception when
opening the file.

The doc for truename says, about conditions: "An error of type
file-error is signaled if an appropriate file cannot be located within
the file system for the given file spec or if the file system cannot
perform the requested operation." The question is whether or not the
symbolic link is a file. Certainly for some cases it is, for example when
handled by archiving or certain version control systems, or in the
cases I list in (4). In the case of "rm", the answer is yes, it is a
file, implicit from the documentation: "The rm utility attempts to
remove the non-directory type files specified on the command line.
[...] The rm utility removes symbolic links, not the files referenced
by the links." In ABCL, (directory "/") -> (#P"/"). This means that a)
ABCL is inconsistent in that it sometimes returns truenames for things
that can not be "opened" or b) ABCL admits that there can be truenames
for entities other than files, which makes its treatment of symbolic
links inconsistent.

2) Pragmatism: With the current default, the only possible
programmatic repair in the common case that you don't care about these
entries - for example if you are looking for a file whose name matches
a pattern not expressible using the directory wild card expressivity -
is to use an implementation-specific keyword. CCL's implementation
also has this property, which I also consider to be a fail. In
practical terms this means that the unsuspecting programmer must wrap
all calls to directory with a catch and have the handler respond in an
implementation-specific way. It is much more common to protect
accesses to a file than accesses to a directory.

3) Truth in advertising: The exception happens independent of the
value of the :resolve-symlinks keyword. If the function is told not to
resolve symlinks, one would expect it doesn't resolve symlinks. Yet it
does, since that's the only way that it could figure out that the
"file does not exist".

4) File system operations other than opening: There are legitimate
operations on unresolvable links. For example, such files can be
removed and renamed, and dates and other metadata can be retrieved
about them. One would not expect directory to balk in the
case that you are retrieving file names for one of these purposes.
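
To make point (4) concrete, here is a small Java sketch (assuming Java 7 or later; the class and method names are hypothetical) of operations that act on a link itself and so work even when its target is missing. It relies on the documented java.nio.file behavior that NOFOLLOW_LINKS reads the link's own attributes and that move and delete act on the link, not its target.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

/** Hypothetical helper for illustration; not part of ABCL. */
final class DanglingLinkOps {
    /** Modification time of the link itself, not of whatever it points to. */
    static FileTime linkModified(Path link) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(
                link, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        return attrs.lastModifiedTime();
    }

    /** The stored target of the link, whether or not that target exists. */
    static Path linkTarget(Path link) throws IOException {
        return Files.readSymbolicLink(link);
    }

    /** Renames the link within its directory and returns the new path. */
    static Path rename(Path link, String newName) throws IOException {
        return Files.move(link, link.resolveSibling(newName));
    }

    /** Removes the link; the (possibly missing) target is not touched. */
    static void remove(Path link) throws IOException {
        Files.delete(link);
    }
}
```

Each of these needs the link's name to start from, which is the name a directory listing would have to return.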

My conclusion:

  • If resolve-symlinks is false, the behavior should be either that of
    SBCL's (return the name) or CCL's (don't return the name) depending on
    which answer the implementation takes towards the question "is a
    symbolic link a file?".
  • If resolve-symlinks is true then signal an error.
  • The default should be :resolve-symlinks nil, because cognate
    directory operations in every operation's default case that I'm aware
    of is to not consider this case an exception.

Regarding CCL's application-specific reference to emacs, I simply
think the argument is named poorly. In fact directory will return
paths to any links that do not resolve, not only emacs lock files. A
better argname would be :include-unresolvable-links.

This email was too long :)

-Alan


From: Alessio Stalla <alessiostalla@…>
Date: Fri, Jan 10, 2014 at 10:06 AM
To: Alan Ruttenberg <alanruttenberg@…>
Cc: Mark Evenson <evenson@…>, abcl-dev <armedbear-devel@…>, Marco Antoniotti <marcoxa@…>

On Fri, Jan 10, 2014 at 3:51 PM, Alan Ruttenberg <alanruttenberg@…> wrote:
There are, I would offer, several arguments suggesting that the
current behavior should not be the default.

[...]


My conclusion:

  • If resolve-symlinks is false, the behavior should be either that of
    SBCL's (return the name) or CCL's (don't return the name) depending on
    which answer the implementation takes towards the question "is a
    symbolic link a file?".
  • If resolve-symlinks is true then signal an error.
  • The default should be :resolve-symlinks nil, because cognate
    directory operations in every operation's default case that I'm aware
    of is to not consider this case an exception.

I agree wholeheartedly. CL's filesystem handling is already bad as it is without implementation-specific warts.

comment:2 Changed 3 years ago by mevenson

java.nio.file.Path would be helpful here, but it was only introduced in Java 7.

http://docs.oracle.com/javase/7/docs/api/java/nio/file/Path.html#toRealPath%28java.nio.file.LinkOption...%29
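
For reference, a minimal sketch of how Path.toRealPath could separate "name the entry" from "resolve the entry", assuming Java 7 or later. The class and method names are hypothetical; only toRealPath and LinkOption.NOFOLLOW_LINKS come from the JDK.

```java
import java.io.IOException;
import java.nio.file.LinkOption;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of ABCL. */
final class RealPaths {
    /** Absolute path of the entry itself; symbolic links are not followed. */
    static Path nameOnly(Path entry) throws IOException {
        return entry.toRealPath(LinkOption.NOFOLLOW_LINKS);
    }

    /** Fully resolved path; throws an IOException if the link chain is broken. */
    static Path resolved(Path entry) throws IOException {
        return entry.toRealPath();
    }

    /** True when the entry can be resolved to an existing file. */
    static boolean resolves(Path entry) {
        try {
            entry.toRealPath();
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

For a dangling link, nameOnly returns the link's own path while resolved throws, which matches the split between listing and probing discussed in the thread.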

comment:3 Changed 3 years ago by mevenson

Apache Commons uses the mismatch between the canonical version of the directory of the java.io.File object and the object itself to determine whether the file possibly refers to a symbolic link.

Reportedly this doesn't work on Windows.

http://stackoverflow.com/questions/813710/java-1-6-determine-symbolic-links
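
A rough sketch of that canonical-path comparison, using only java.io.File so that it fits Java 6. The class and method names are hypothetical and this is a paraphrase of the idea, not the Apache Commons source. As an assumption worth checking per platform: a link whose target is missing may come back with an unchanged canonical path, in which case this test would not flag it.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper for illustration; not the Apache Commons code. */
final class CanonicalCheck {
    /**
     * Canonicalizes the parent directory first, so that links higher up in
     * the path do not count, then reports whether canonicalizing the entry
     * itself changes its path.
     */
    static boolean looksLikeSymlink(File file) throws IOException {
        File absolute = file.getAbsoluteFile();
        File parent = absolute.getParentFile();
        File inCanonicalDir = (parent == null)
                ? absolute
                : new File(parent.getCanonicalFile(), absolute.getName());
        return !inCanonicalDir.getCanonicalFile()
                .equals(inCanonicalDir.getAbsoluteFile());
    }
}
```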

comment:4 Changed 3 years ago by alan ruttenberg

For the resolve-pathnames nil case, why not return

(map 'list #"getName" (#"listFiles" (new 'file "/Users/alanr/repos/iop/")))

post-filtered for any :wilds

and

(#"isFile" (new 'file "/Users/alanr/repos/iop/.#compare-queries.lisp"))
-> nil

to test for symbolic links? (or at least files that one might try to resolve as symbolic links)
The irony of the nil result does not escape me :)

This is java 1.6
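
The same idea written directly in Java, as a sketch limited to java.io.File so that it stays within Java 1.6. The class and method names are hypothetical. Since isFile also returns false for directories, this variation tests exists() instead, which is false for a listed entry whose link target is missing.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of ABCL. */
final class PlainFileListing {
    /** Names of all entries in dir, without resolving anything. */
    static List<String> entryNames(File dir) {
        List<String> names = new ArrayList<String>();
        File[] entries = dir.listFiles(); // null if dir cannot be listed
        if (entries != null) {
            for (File entry : entries) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Names that are listed in dir but do not resolve to an existing file. */
    static List<String> unresolvedNames(File dir) {
        List<String> names = new ArrayList<String>();
        File[] entries = dir.listFiles();
        if (entries != null) {
            for (File entry : entries) {
                if (!entry.exists()) { // follows links; false when dangling
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```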

comment:5 Changed 3 years ago by mark evenson (2)

comment:6 Changed 3 years ago by mevenson

Which seems to have broken :wild-inferiors (directory "~/work/**/*.lisp"). I swear I had this working: seems to have been something I touched in directory.lisp in r14619.

comment:7 Changed 3 years ago by mevenson

Current implementation definitely breaks DIRECTORY's use of :WILD-INFERIORS.

Closing this ticket as implemented, with the following bug: http://abcl.org/trac/ticket/344

comment:8 Changed 3 years ago by mevenson

  • Resolution set to fixed
  • Status changed from new to closed

comment:9 Changed 3 years ago by mevenson

DIRECTORY's use of :WILD-INFERIORS was fixed in r14624.
