Skip to content

Commit

Permalink
[wip] start report
Browse files Browse the repository at this point in the history
  • Loading branch information
pgujjula committed Apr 4, 2024
1 parent 424c7ae commit e115b54
Show file tree
Hide file tree
Showing 5 changed files with 194 additions and 0 deletions.
9 changes: 9 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,12 @@ SPDX-License-Identifier: CC-BY-SA-4.0

⚠️ Under construction. No stability guarantees, even commit history may be
overwritten.

See [ALGORITHM.md](docs/ALGORITHM.md) for a full description of the algorithm.

## Licensing
* Code is licensed under BSD-3-Clause.
* Documentation is licensed under CC-BY-SA-4.0.
* Some config is licensed under CC0-1.0.

See the licensing information in the header of each file for exact licensing information. This project follows [REUSE](https://reuse.software) licensing principles.
1 change: 1 addition & 0 deletions apply-merge.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ tested-with:
extra-doc-files:
README.md
ChangeLog.md
docs/ALGORITHM.md

source-repository head
type: git
Expand Down
172 changes: 172 additions & 0 deletions docs/ALGORITHM.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,172 @@
<!--
SPDX-FileCopyrightText: Copyright Preetham Gujjula
SPDX-License-Identifier: CC-BY-SA-4.0
-->

# Motivation

## The problem
Say that we want to compute the [3-smooth numbers](https://en.wikipedia.org/wiki/Smooth_number),
i.e., the integers that can be written in the form $2^{i} 3^{j}$, where $i$ and
$j$ are non-negative integers.

It's easy enough to compute the powers of 2 and 3:

```haskell
powersOf2 :: [Integer]
powersOf2 = iterate (*2) 1

powersOf3 :: [Integer]
powersOf3 = iterate (*3) 1
```

But how do we implement the following?

```haskell
-- take 10 smooth3 == [1, 2, 3, 4, 6, 9, 12, 16, 18, 24]
smooth3 :: [Integer]
smooth3 = ..
```

If we limited `powersOf2` and `powersOf3`, say to the first 10 elements, we
could write:

```haskell
powersOf2Bounded :: [Integer]
powersOf2Bounded = take 10 (iterate (*2) 1)

powersOf3Bounded :: [Integer]
powersOf3Bounded = take 10 (iterate (*3) 1)

smooth3Bounded :: [Integer]
smooth3Bounded = sort (liftA2 (*) powersOf2Bounded powersOf3Bounded)
```

And `smooth3Bounded` would consist of $\\{2^{i} 3^{j} \mid 0 \le i, j < 10\\}$.
But this doesn't work when `powersOf2` and `powersOf3` are infinite. We need a
way to lazily generate `smooth3` given `powersOf2` and `powersOf3`.

## A closer look
Let's lay out the elements of `smooth3` in a 2D grid:

<pre>
| 1 2 4 8 ..
---------------------------------
1 | 1 2 4 8 ..
|
3 | 3 6 12 24 ..
|
9 | 9 18 36 72 ..
|
27 | 27 54 108 216 ..
|
: | : : : : ॱ.
</pre>

Since `(*)` is monotonically increasing in both arguments, each element is ≤
the elements below and to the right. Therefore, when an element appears in
`smooth3`, the element above and to the left must already have appeared.

Let's think about `smooth3` after 3 elements have been produced:

<pre>
| 1 2 4 8 ..
---------------------------------
1 | <del>1</del> <del>2</del> <b>4</b> 8 ..
|
3 | <del>3</del> <b>6</b> 12 24 ..
|
9 | <b>9</b> 18 36 72 ..
|
27 | 27 54 108 216 ..
|
: | : : : : ॱ.
</pre>
After producing `1, 2, 3`, the next element in `smooth3` can only be one of `{4, 6, 9}`. We know this without preforming any comparisons, just by the positions of these elements in the grid, as these are the only elements whose up- and left-neighbors have already been produced.

After producing the next element, `4`, our grid becomes

<pre>
| 1 2 4 8 ..
---------------------------------
1 | <del>1</del> <del>2</del> <del>4</del> <b>8</b> ..
|
3 | <del>3</del> <b>6</b> 12 24 ..
|
9 | <b>9</b> 18 36 72 ..
|
27 | 27 54 108 216 ..
|
: | : : : : ॱ.
</pre>

Now, the "frontier" of possible next elements is `{6, 8, 9}`. We added `8` to the frontier as it is no longer "blocked" by `4`. Note that we cannot yet add `12` to the frontier even though it's no longer blocked by `4`, as it's still blocked by `6`.

This investigation suggests the following algorithm for producing `smooth3`:
```
1. Initialize a "frontier" with the top left element
2. Delete the smallest element x from the frontier and yield it as the next element of smooth3.
3. If the down- or right-neighbors of x are unblocked, add them to the frontier.
4. Go to step 2.
```

# The applyMerge function

The idea in the algorithm above is quite general, and instead of using it on
`(*)`, `powersOf2`, and `powersOf3`, we can apply it to any binary function `f`
that is monotonically increasing in both arguments, and any ordered lists `xs`
and `ys`. Then we can define a general
```haskell
applyMerge :: (Ord c) => (a -> b -> c) -> [a] -> [b] -> [c]
```
where `applyMerge f xs ys` is an ordered list of all `f x y`, where `x xs` and `y ys`. In particular, `smooth3 = applyMerge (*) powersOf2 powersOf3`.
We can also define a more general
```haskell
applyMergeBy :: (c -> c -> Ordering) -> (a -> b -> c) -> [a] -> [b] -> [c]
```

## Implementation and complexity
We can use a priority queue to maintain the frontier of elements. To determine in step 3 whether a down- or right-neighbor is unblocked, we maintain two `IntSet`s, for the `x`- and `y`- indices of all elements in the frontier. Then in step 3, if the `x`- and `y`-index of a neighbor are not in the respective `IntSet`s, the neighbor is unblocked and can be added to the frontier.

After producing $n$ elements, the size of the frontier is at most $O(\sqrt{n})$ elements. <i>(TODO: prove this)</i> Manipulating the priority queue and the `IntSet`s to produce the next element takes $O(\text{log } \sqrt{n}) = O(\text{log } n)$ time. Therefore, producing $n$ elements of `applyMerge f xs ys` takes $O(n \log n)$ time and $O(\sqrt{n})$ auxiliary space.

## Other applications

`applyMerge` is versatile!
```haskell
-- Sieve of Erastosthenes
primes :: [Int]
primes = 2 : ([3..] `minus` composites) -- `minus` from data-ordlist

composites :: [Int]
composites = applyMerge (*) primes [2..]

-- Square-free integers
squarefrees :: [Int]
squarefrees = [1..] `minus` applyMerge (*) (map (^2) primes) [1..]

-- Gaussian integers in the first quadrant, ordered by norm
gaussianInts :: [Complex Int]
gaussianInts = applyMergeBy (comparing (\a b -> a*a + b*b)) (:+) [1..] [1..]
```

# Prior work
In <code>[data-ordlist](https://www.stackage.org/lts/package/data-ordlist)</code>,
there is
<code>[mergeAll](https://www.stackage.org/haddock/lts/data-ordlist/Data-List-Ordered.html#v:mergeAll) :: Ord a => [[a]] -> [a]</code>,
which merges a potentially infinite list of ordered lists, where the heads of
the inner lists are sorted. This is more general than `applyMerge`, in the sense
that we can implement `applyMerge` in terms of `mergeAll`:

```haskell
applyMerge :: (Ord c) => (a -> b -> c) -> [a] -> [b] -> [c]
applyMerge f xs ys =
mergeAll (map (\y -> map (\x -> f x y) xs) ys)
```

However, `mergeAll` uses $O(n)$ auxiliary space in the worst case, while our
implementation of `applyMerge` uses just $O(\sqrt{n})$ auxiliary space.

---

###### Copyright Preetham Gujjula, licensed under [CC-BY-SA-4.0](../LICENSES/CC-BY-SA-4.0.txt).
1 change: 1 addition & 0 deletions package.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ copyright: "Preetham Gujjula"
extra-doc-files:
- README.md
- ChangeLog.md
- docs/ALGORITHM.md

description: Please see the README on GitHub at
<https://github.com/pgujjula/apply-merge#readme>
Expand Down
11 changes: 11 additions & 0 deletions src/Data/List/ApplyMerge.hs
Original file line number Diff line number Diff line change
@@ -1,9 +1,20 @@
-- SPDX-FileCopyrightText: Copyright Preetham Gujjula
-- SPDX-License-Identifier: BSD-3-Clause

-- |
-- Module: Data.List.ApplyMerge
-- License: BSD-3-Clause
-- Maintainer: Preetham Gujjula <[email protected]>
-- Stability: experimental
module Data.List.ApplyMerge (applyMerge) where

import Data.List.ApplyMerge.IntSet qualified

-- | Given a binary function @f@ that is monotonically increasing in each
-- argument, and ordered lists @xs@ and @ys@,
--
-- > applyMerge f xs ys
--
-- is an ordered list of all @f x y@, where @x ∈ xs@ and @y ∈ ys@.
applyMerge :: (Ord c) => (a -> b -> c) -> [a] -> [b] -> [c]
applyMerge = Data.List.ApplyMerge.IntSet.applyMerge

0 comments on commit e115b54

Please sign in to comment.