Opened 12 years ago

Closed 12 years ago

Last modified 4 years ago

#196 closed defect (fixed)

STABLE-SORT is only stable for lists

Reported by: Mark Evenson Owned by: ehuelsmann
Priority: blocker Milestone: 1.1.0
Component: interpreter Version: 1.0.1
Keywords: ansi-conformance Cc:
Parent Tickets:

Description

Jorge Tavares reports on armedbear-devel@:

Recently I started to investigate the different CL sort implementations and found that stable-sort in ABCL does not execute as required. The problem is that stable-sort is only stable for lists and not for all type of sequences (i.e., elements considered equal by the predicate should stay in their original order).

A very simple test shows this error:

idesk:~ jast$ abcl
Armed Bear Common Lisp 1.0.1
Java 1.6.0_29 Apple Inc.
Java HotSpot(TM) 64-Bit Server VM
Low-level initialization completed in 0.399 seconds.
Startup completed in 1.069 seconds.
Loading /Users/jast/.abclrc completed in 4.8 seconds.
Type ":help" for a list of available commands.
CL-USER(1): (defparameter *stuff* #((1 0.31648433) (0 0.2704757) (0 0.48931926) (1 0.9958766) (0 0.49251676)))
*STUFF*
CL-USER(2): (setf *stuff* (stable-sort *stuff* #'< :key #'first))
#((0 0.49251676) (0 0.2704757) (0 0.48931926) (1 0.31648433) (1 0.9958766))

The above test is short and simple to understand: *stuff* contains an array where each element is a pair. The first element of a pair is the key and the second the data. With this in mind, the expected result should have been: #((0 0.2704757) (0 0.48931926) (0 0.49251676) (1 0.31648433) (1 0.9958766)). 

The same error happens with larger arrays and other types of data. I also tested it in SBCL, CCL, ECL and CLisp and for these implementations the results are the expected. Only ABCL differs form the major open source implementations (I didn't test in ACL or LW).

The reason for this bug is actually quite simple. In sort.lisp, in the definition of stable-sort, seq-dispatch calls for the non-list sequences the quicksort algorithm, which is not stable. For lists it calls merge sort (which is stable) and the problem does not arise. 

As a quick fix, I send in attach a patch that uses in stable-sort merge sort for all sequences. This is done by coercing the sequence to list, calling merge sort and coercing it back to the original sequence type. However, as a long term improvement, the best solution would be to implement a merge sort for non-list sequences.

Change History (7)

comment:1 Changed 12 years ago by Mark Evenson

Resolution: fixed
Status: newclosed

comment:2 Changed 12 years ago by Mark Evenson

Priority: majorblocker
Resolution: fixed
Status: closedreopened

Just noticed that the change of implementation for STABLE-SORT has induced the failure of the following ANSI tests:

STABLE-SORT-VECTOR.4
STABLE-SORT-VECTOR.6
STABLE-SORT-VECTOR.7
STABLE-SORT-VECTOR.8
STABLE-SORT-BIT-VECTOR.2
STABLE-SORT-BIT-VECTOR.3
STABLE-SORT-STRING.1
STABLE-SORT-STRING.2
STABLE-SORT-STRING.3
STABLE-SORT-STRING.4

Re-opening this ticket; marking as blocker for the abcl-1.1 release.

comment:3 Changed 12 years ago by Mark Evenson

(In [13870]) See #196: further patch for STABLE-SORT from Jorge Tavares.

easye: still seeing the ANSI failures, but this is a much more
plausible "final" implementation with the appropiate optimizations
which should be easier to fix modulo the possible hairy macro
debugging part. But that's why they call it trunk, right?

I send in attach a patch with further improvements to sort and
stable-sort for sequences other than lists. In short, the patch
includes a merge sort for vectors. To allow different types I've
written the algorithm using macros and these generate the appropriate
code according to the vector type. This way the algorithm is in a
single place avoiding duplication of code. The macros also take care
of the situation of when no key is present, avoiding the use of
unnecessary funcalls. The quicksort algorithm was also refactored in
the same way.

I've tested the algorithms and they seem to be working correct. Stable
sort is now considerably faster since the fix before converted the
sequences to a list and used the sort-list function. I've made some
benchmarking to verify how fast is sort and stable-sort. The tables
with the results are also in a file sent in attach [1]. For
stable-sort I've compare the current trunk version with the patched
one while for sort I've compared 1.0.1, the trunk and with the
patch. For unsorted vectors sort has a speed up of 7.5 from 1.0.1 and
this considers only vectors of size 8 to 8192 (1.0.1 hits the
worst-case quite fast). For stable-sort the speed up is around 90.2
from vectors of size 8 to 32768. The sort functions become even faster
for the nearly sorted vectors. I think the tables clearly show t he
speed-ups

Naturally these benchmarks cannot be used to draw definite conclusions
since they lack rigorous testing but I think they can provide some
indications. With this patch, I think ABCL gets good performant
sorting functions, especially for large vectors. As for lists, I
haven't looked at them so probably they can also be improved (but I
really don't know).

Cheers,
Jorge

[1] The tables result from the generation of simple-vectors of sizes 8
to 524288 (powers of 2 from 3 to 19) with distinct integer: unsorted,
nearly sorted (distances 0, 4 and 16), sorted and reversed sorted. The
nearly sorted vectors were constructed by selecting pairs where they
would swap with a neighbor at a certain distance. I did 100 runs and
timed only the sorting operation. The tables contain the averages of
the 100 runs. They were performed in an iMac (2.5GHz i5, 4GB) with Mac
OS X 10.7.3.

[1]: http://article.gmane.org/gmane.lisp.armedbear.devel/2220

comment:4 Changed 12 years ago by Evenson Not Org

After the further patch from Jorge of 11-FEB-2012 commited in r13870 , easye still sees errors from the ANSI, something about unexpected types:

Test STABLE-SORT-VECTOR.4 failed
Form: (LET ((A (MAKE-ARRAY 10 :INITIAL-CONTENTS (QUOTE (10 40 20 50 30 15 45 25 55 35)) :FILL-POINTER 5))) (STABLE-SORT A (FUNCTION <)))
Expected value:
          #(10 20 30 40 50)
Actual value:
          #<TYPE-ERROR {678042C2}> [The value #(10 40 20 50 30) is not of type SIMPLE-VECTOR.]
Test STABLE-SORT-VECTOR.6 failed
Form: (DO-SPECIAL-INTEGER-VECTORS (V #(1 4 7 3 2 6 5) NIL) (LET ((SV (STABLE-SORT V (FUNCTION <)))) (ASSERT (EQUALP SV #(1 2 3 4 5 6 7)))))
Expected value:
         

comment:5 Changed 12 years ago by Mark Evenson

(In [13873]) Restore autoload CL:MERGE as part of ANSI sort triage (See #196).

This was mistakenly removed as part of Jorge Tavares' last commit.

As an optimization, we attempt to invoke the original quicksort
implementation if the new one fails while emitting a warning to the
user.

comment:6 Changed 12 years ago by Mark Evenson

Resolution: fixed
Status: reopenedclosed

(In [13931]) Fixes #196: STABLE-SORT is only stable for lists.

Somewhat kludgily fix the macrology submitted by Jorge Tavares to pass
all the newly failing ANSI tests introduced. The macrology of
MERGE-VECTORS-BODY and MERGE-SORT-BODY required that the sequences
were of type SIMPLE-VECTOR. But somehow, MERGE-SORT-BODY was not
working when asked to stable sort sequences of type BIT-VECTOR or
STRING, both of which are subtypes of VECTOR but not SIMPLE-VECTOR.

comment:7 in reply to:  6 Changed 4 years ago by Mark Evenson

Replying to Mark Evenson:

(In [13931]) Fixes #196: STABLE-SORT is only stable for lists.

Somewhat kludgily fix the macrology submitted by Jorge Tavares to pass
all the newly failing ANSI tests introduced.

Apparently I am able to learn basic stuff eventually.

Fixed my kludges to restore svref accessor optimization in the sort.lisp macrology with <https://abcl.org/trac/changeset/15327>.

Note: See TracTickets for help on using tickets.