Sorting documentation fixups for 1.9 #48440

Merged
merged 5 commits into from
Jan 30, 2023
Changes from 4 commits
26 changes: 24 additions & 2 deletions base/sort.jl
@@ -524,7 +524,7 @@ Base.size(v::WithoutMissingVector) = size(v.data)
send_to_end!(f::Function, v::AbstractVector; [lo, hi])

Send every element of `v` for which `f` returns `true` to the end of the vector and return
the index of the last element which for which `f` returns `false`.
the index of the last element for which `f` returns `false`.

`send_to_end!(f, v, lo, hi)` is equivalent to `send_to_end!(f, view(v, lo:hi))+lo-1`
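
As a rough illustration of the contract this docstring describes, here is a minimal sketch. It is not the implementation in `base/sort.jl`: the name `send_to_end_sketch!` is hypothetical, and the relative order of the elements moved to the end is left unspecified.

```julia
# Hypothetical helper illustrating the documented contract of `send_to_end!`;
# this is NOT the Base implementation.
function send_to_end_sketch!(f::Function, v::AbstractVector)
    i = firstindex(v)                 # next slot for an element where f is false
    for j in eachindex(v)
        if !f(v[j])
            v[i], v[j] = v[j], v[i]   # move the kept element forward
            i += 1
        end
    end
    return i - 1                      # index of the last element for which f is false
end

v = [1, 2, 0, 3, 0, 4]
last_kept = send_to_end_sketch!(iszero, v)   # v is now [1, 2, 3, 4, 0, 0]; last_kept == 4

# The range form can be expressed through a view, mirroring the equivalence above.
w = [0, 5, 0, 6, 7, 0]
lo, hi = 2, 5
last_in_range = send_to_end_sketch!(iszero, view(w, lo:hi)) + lo - 1   # == 4
```

The view-based call only rearranges `w[2:5]`, so the leading and trailing zeros of `w` stay where they are.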

@@ -1242,7 +1242,7 @@ Otherwise, we dispatch to [`InsertionSort`](@ref) for inputs with `length <= 40`
perform a presorted check ([`CheckSorted`](@ref)).

We check for short inputs before performing the presorted check to avoid the overhead of the
check for small inputs. Because the alternate dispatch is to [`InseritonSort`](@ref) which
check for small inputs. Because the alternate dispatch is to [`InsertionSort`](@ref) which
has efficient `O(n)` runtime on presorted inputs, the check is not necessary for small
inputs.
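
The order of checks described here can be sketched roughly as follows. This is a simplified illustration, not the internal dispatch: the function name `sketch_small_then_presorted!` is hypothetical, the threshold of 40 is taken from the docstring, and the real `CheckSorted` pass may do more than a plain `issorted` scan.

```julia
# Hypothetical sketch of the order of checks; not the Base.Sort internals.
function sketch_small_then_presorted!(v::AbstractVector)
    if length(v) <= 40
        # Short input: insertion sort is already O(n) on presorted data,
        # so a separate presorted check would only add overhead.
        return sort!(v; alg=InsertionSort)
    end
    # Longer input: an O(n) scan can skip the sort entirely.
    issorted(v) && return v
    return sort!(v)   # fall back to the default algorithm
end

short = sketch_small_then_presorted!([3, 1, 2])
long_sorted = collect(1:100)
same = sketch_small_then_presorted!(long_sorted)
reversed = sketch_small_then_presorted!(collect(100:-1:1))
```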

@@ -1891,6 +1891,26 @@ Characteristics:
ignores case).
* *in-place* in memory.
* *divide-and-conquer*: sort strategy similar to [`MergeSort`](@ref).

Note that `PartialQuickSort(k)` does not necessarily sort the whole array. For example,

```jldoctest
julia> x = rand(100);

julia> k = 50:100;

julia> s1 = sort(x; alg=QuickSort);

julia> s2 = sort(x; alg=PartialQuickSort(k));

julia> map(issorted, (s1, s2))
(true, false)

julia> map(x->issorted(x[k]), (s1, s2))
(true, true)

julia> s1[k] == s2[k]
true
```
"""
struct PartialQuickSort{T <: Union{Integer,OrdinalRange}} <: Algorithm
k::T
@@ -1927,6 +1947,8 @@ Characteristics:
case).
* *not in-place* in memory.
* *divide-and-conquer* sort strategy.
* *good performance* for large collections but typically not quite as
fast as [`QuickSort`](@ref).
"""
const MergeSort = MergeSortAlg()

70 changes: 15 additions & 55 deletions doc/src/base/sort.md
@@ -21,7 +21,8 @@ julia> sort([2,3,1], rev=true)
1
```

To sort an array in-place, use the "bang" version of the sort function:
`sort` constructs a sorted copy leaving its input unchanged. Use the "bang" version of
the sort function to mutate an existing array:

```jldoctest
julia> a = [2,3,1];
@@ -134,74 +135,33 @@ Base.Sort.partialsortperm!

## Sorting Algorithms

There are currently four sorting algorithms available in base Julia:
There are currently four sorting algorithms publicly available in base Julia:

* [`InsertionSort`](@ref)
* [`QuickSort`](@ref)
* [`PartialQuickSort(k)`](@ref)
* [`MergeSort`](@ref)

`InsertionSort` is an O(n²) stable sorting algorithm. It is efficient for very small `n`,
and is used internally by `QuickSort`.
By default, the `sort` family of functions uses stable sorting algorithms that are fast
on most inputs. The exact algorithm choice is an implementation detail to allow for
future performance improvements. Currently, a hybrid of `RadixSort`, `ScratchQuickSort`,
`InsertionSort`, and `CountingSort` is used based on input type, size, and composition.
Implementation details are subject to change but are currently available in the extended help
of `??Base.DEFAULT_STABLE` and the docstrings of internal sorting algorithms listed there.

`QuickSort` is a very fast sorting algorithm with an average-case time complexity of
O(n log n). `QuickSort` is stable, i.e., elements considered equal will remain in the same
order. Notice that O(n²) is worst-case complexity, but it gets vanishingly unlikely as the
pivot selection is randomized.

`PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only
sorted in the range of `k`. For example:

```jldoctest
julia> x = rand(1:500, 100);

julia> k = 50:100;

julia> s1 = sort(x; alg=QuickSort);

julia> s2 = sort(x; alg=PartialQuickSort(k));

julia> map(issorted, (s1, s2))
(true, false)

julia> map(x->issorted(x[k]), (s1, s2))
(true, true)

julia> s1[k] == s2[k]
true
```

!!! compat "Julia 1.9"
The `QuickSort` and `PartialQuickSort` algorithms are stable since Julia 1.9.

`MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary
array of half the size of the input array – and is typically not quite as fast as `QuickSort`.
It is the default algorithm for non-numeric data.

The default sorting algorithms are chosen on the basis that they are fast and stable.
Usually, `QuickSort` is selected, but `InsertionSort` is preferred for small data.
You can also explicitly specify your preferred algorithm, e.g.
`sort!(v, alg=PartialQuickSort(10:20))`.

The mechanism by which Julia picks default sorting algorithms is implemented via the
`Base.Sort.defalg` function. It allows a particular algorithm to be registered as the
default in all sorting functions for specific arrays. For example, here is the default
method from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort.jl):

```julia
defalg(v::AbstractArray) = DEFAULT_STABLE
```

You may change the default behavior for specific types by defining new methods for `defalg`.
You can explicitly specify your preferred algorithm with the `alg` keyword
(e.g. `sort!(v, alg=PartialQuickSort(10:20))`) or reconfigure the default sorting algorithm
for custom types by adding a specialized method to the `Base.Sort.defalg` function.
For example, [InlineStrings.jl](https://github.com/JuliaStrings/InlineStrings.jl/blob/v1.3.2/src/InlineStrings.jl#L903)
defines the following method:
```julia
Base.Sort.defalg(::AbstractArray{<:Union{SmallInlineStrings, Missing}}) = InlineStringSort
```

!!! compat "Julia 1.9"
The default sorting algorithm (returned by `Base.Sort.defalg`) is guaranteed
to be stable since Julia 1.9. Previous versions had unstable edge cases when sorting numeric arrays.
The default sorting algorithm (returned by `Base.Sort.defalg`) is guaranteed to
be stable since Julia 1.9. Previous versions had unstable edge cases when
sorting numeric arrays.
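
As a small illustration of adding a `defalg` method for a custom element type, consider the sketch below. The type `DemoVersion` and the choice of `MergeSort` are assumptions made purely for demonstration; whether overriding the default pays off depends on the type and the Julia version.

```julia
# Hypothetical element type used only for this example.
struct DemoVersion
    major::Int
    minor::Int
end

# Sorting needs an ordering; compare (major, minor) lexicographically.
Base.isless(a::DemoVersion, b::DemoVersion) =
    isless((a.major, a.minor), (b.major, b.minor))

# Make MergeSort the default for arrays of DemoVersion (illustrative choice).
Base.Sort.defalg(::AbstractArray{<:DemoVersion}) = MergeSort

versions = [DemoVersion(1, 2), DemoVersion(0, 9), DemoVersion(1, 0)]
sorted_versions = sort(versions)   # uses the algorithm returned by defalg
```

Callers can still override this choice per call with the `alg` keyword.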

## Alternate orderings
