LinkedHash(Map|Set) have weird complexity #2727

j-baker · 2022-11-29T12:17:51Z

Mostly leaving this as a note. LinkedHashMap and LinkedHashSet are, as implemented in the Java standard library, pretty cool structures. Map entries contain a doubly linked list to next and previous elements, and because of the mutability, the cost of maintaining the list structure is always O(1). Effectively, you have a structure which can be treated as both a list and a set without the downsides in most practical cases.

In Vavr, these structures are implemented as a pair of a HashMap and a Queue. This pairing is rather uncool, because the complexity of various map operations becomes case-dependent.

For example (all complexities are listed as complexities on the queue, not on the paired HashMap):

When doing a put, if the element does not exist, the put is O(1). However, doing a put on an element that exists already is O(n) because the element's node is replaced in the queue (and I think the entire queue is copied at this time).

Likewise, doing a remove is O(n) because the queue must be rewritten.

The queue is stored as a head and tail with the tail possibly being reversed on dequeue. However, this is only amortized efficient when the reading operation actually dequeues (e.g. n enqs followed by n deqs will do one queue reversal, but n enqs followed by n iterations through the queue will do n queue reversals).

// This function is O(n^2) though the user will expect it to be O(nlogn + n), since calling head will reverse the entire queue.
int doSomethingExpensive(int n) {
    LinkedHashSet<Integer> myInts = LinkedHashSet.range(0, 100_000);
    int sum = 0;
    for (int i = 0; i < 100_000; i++) {
        // the head call is presently implemented as .iterator().next() which will prepare the second element for reading, and
        // thus read the entire queue. This could be improved, but the issue remains that reading prefixes of an unchanging map can be expected linear in the entire size of the map.
        sum += doSomethingCheapWith(myInts.head());
}

In practice this means that the datastructure is genuinely much less effective than the normal HashMap - one converting their HashMap into a LinkedHashMap should expect a huge performance degradation which is unexpected to one coming from normal Java maps.

An implementation with perhaps slightly poorer top-end performance but that would have much more forgiving worst case performance might be to do something more like:

- state of LinkedHashMap is (HashMap<K, Tuple2<V, Long>> map, TreeMap<Long, Tuple2<K, V>> indexes, Long nextIndex).
- Inserting a new element returns (map.put(key, tuple(value, nextIndex)), indexes.put(nextIndex, tuple(key, value)), nextIndex++)
- overwriting returns (map.put(key, tuple(value, map.get(key)._2), indexes.put(map.get(key)._2, tuple(key, value), nextIndex);
- remove returns (map.remove(key), indexes.remove(map.get(key)._2), nextIndex)
- iteration is indexes.values().

This structure would be O(logn) on all operations, and generally avoid the sharp edges of the current structure. It would use quite a bit more memory, although this could be minimized by e.g. avoiding boxing of longs or at least sharing them. Of course, it's possible that top-end performance would suffer, especially since inserting in sorted order is the worst case for a red-black tree.

The text was updated successfully, but these errors were encountered:

While writing up vavr-io#2727 I noticed that `LinkedHashSet.head()` is implemented `iterator().head()`. This is inefficient because `queue.iterator().next()` makes a call to `.head()`, but also to `.tail()` so as to prepare for the next `head()` call. Given the structure of the underlying `Queue`, the `head()` call is worst case `O(1)` but the tail call is worst case `O(n)`. The present worst case will be achieved if there have never been overwrites or removals from the set, which is probably a fairly common case.

While writing up #2727 I noticed that `LinkedHashSet.head()` is implemented `iterator().head()`. This is inefficient because `queue.iterator().next()` makes a call to `.head()`, but also to `.tail()` so as to prepare for the next `head()` call. Given the structure of the underlying `Queue`, the `head()` call is worst case `O(1)` but the tail call is worst case `O(n)`. The present worst case will be achieved if there have never been overwrites or removals from the set, which is probably a fairly common case.

wrandelshofer · 2023-04-02T19:44:02Z

I made a draft implementation of a CHAMP-trie based VAVR-collection, that performs all operations in constant time.
The class has currently the name SequencedChampMap. Performance is not stellar, but at least, there are no performance cliffs anymore. The code is in an experimental state.
https://github.com/wrandelshofer/vavr/blob/6569c67e3d5c3b4dae681fa762c714c8d074d7ab/README.md

wrandelshofer · 2023-05-11T15:40:29Z

My pull-request 2745 is supposed to address this problem for LinkedHashSet and LinkedHashMap.
#2745

While writing up #2727 I noticed that `LinkedHashSet.head()` is implemented `iterator().head()`. This is inefficient because `queue.iterator().next()` makes a call to `.head()`, but also to `.tail()` so as to prepare for the next `head()` call. Given the structure of the underlying `Queue`, the `head()` call is worst case `O(1)` but the tail call is worst case `O(n)`. The present worst case will be achieved if there have never been overwrites or removals from the set, which is probably a fairly common case.

wrandelshofer · 2024-10-13T14:43:40Z

I have made now a release that is binary compatible with vavr 0.10.5, but which is implemented with CHAMP-based collections.

You can get it here: https://github.com/wrandelshofer/vavr/releases/tag/v0.10.5

michaelhkay · 2025-01-02T13:53:08Z

I hit this issue (presumably) today and reported it by email. FWIW, here's a copy of what I found.

I have been experimenting with VAVR collections with a view to replacing our (*) current HashTrie implementation (due to Michael Froh) with a map implementation that maintains ordering.

I constructed a use case that generates a word frequency from a Shakespeare corpus - 900K tokens, of which about 66K are distinct, so the dominant operation is LinkedHashMap.put() replacing an existing entry with an updated value.

This gets progressively slower as the task progresses. Profiling suggests that the dominant factor is calls on io.vavr.collections.List.replace(). It's also spending a lot of time in my equals() method comparing keys.

The number of calls on my equals() method seems excessive: Between put() call 199001 (with map size 25612) and put() call 200001 (at size 25724) there are around 20 million calls on my equals() method.

Total elapsed time is something like 25min compared with 6s for the current unordered map implementation.

(*) "our" here means the implementation in the Saxon XSLT/XQuery processor

wrandelshofer · 2025-01-02T13:59:05Z

@michaelhkay
Can you give it a try with the Champ collections?

Ideally, you can swap out the file vavr.jar with vavr-champ-0.10.5.jar that I have published here:
https://github.com/wrandelshofer/vavr/releases/tag/v0.10.5

michaelhkay · 2025-01-02T14:04:43Z

Yes, that's next on my list. After redecorating my study.

pivovarit · 2025-01-02T14:13:47Z

Generally, the classic implementation of persistent/immutable data structures has... suboptimal performance characteristics.

Going CHAMP (Compressed Hash-Array Mapped Prefix-tree) is one of the ways to tackle it: Compressed Hash-Array Mapped Prefix-tree

I would love to have this in 1.0.0

wrandelshofer · 2025-01-02T14:44:43Z

The code below runs in about 1.4 seconds with vavr-champ LinkedHashMap.
(and in about 0.85 seconds with vavr-champ HashMap).
(and in about 0.75 seconds with the regular vavr HashMap).

package outside_of_vavr;

import io.vavr.collection.LinkedHashMap;
import io.vavr.collection.Map;
import io.vavr.control.Option;
import org.junit.jupiter.api.Test;

import java.util.ArrayList;
import java.util.UUID;

import static org.junit.jupiter.api.Assertions.assertEquals;

public class ShakespeareCorpusTest {
    /**
     * This use case generates a word frequency from a
     * simulated Shakespeare corpus - 900K tokens,
     * of which about 66K are distinct,
     * so the dominant operation is LinkedHashMap.put()
     * replacing an existing entry with an updated value.
     */
    @Test
    public void shouldPerformReasonablyWell() {
        // GIVEN the Shakespeare corpus
        int distinctWords = 66_000;
        int totalWords = 900_000;
        java.util.List<String> corpus = new ArrayList<>(distinctWords);
        for (int i = 0; i < distinctWords; i++) {
            corpus.add(UUID.randomUUID().toString());
        }
        while (corpus.size() < totalWords) {
            corpus.addAll(corpus.subList(0, Math.min(distinctWords, totalWords - corpus.size())));
        }
        java.util.Collections.shuffle(corpus);
        System.out.println("corpus.size=" + corpus.size());

        // WHEN generate a word frequency from the corpus
        //Map<String, Integer> m = HashMap.empty();
        Map<String, Integer> m = LinkedHashMap.empty();
        for (String word : corpus) {
            Option<Integer> frequency = m.get(word);
            if (frequency.isEmpty()) {
                m = m.put(word, 0);
            } else {
                m = m.put(word, frequency.get() + 1);
            }
        }

        // THEN the size must match the expected size
        assertEquals(distinctWords, m.size());
    }
}

michaelhkay · 2025-01-02T17:49:07Z

Took a bit of an effort to work out how to change my gradle build to pick it up (not my build script...) but Wow! this certainly fixes the performance bug.

Average execution time over 20 runs 710ms, Memory used: 1095Mb

Previously, with the unordered HashTrie map from Michael Froh: 662ms, Memory 976Mb

I wouldn't place much confidence in the memory measurements.

The timings include the XML parsing cost. For the record, this is using the Shakespeare corpus at GitHub/PlayShakespeare.com-XML, and the Saxon XQuery 4.0 code is

collection('.?select=*.xml')//l/tokenize(.) => map:build(value:=fn{1}, combine:=op('+')) => map:get('the')

producing the result (the number of occurrences of "the") of 24157. Each XML document in the collection is parsed in a separate thread but the query then executes in a single thread.

wrandelshofer · 2025-01-02T18:49:35Z

Thank you for testing! 😀

The memory overhead for a sequenced (linked) map is expected to be higher than for an non-sequenced (non-linked) map.

The memory overhead of a vavr-champ LinkedHashMap is about 50 bytes per element.
The memory overhead of a vavr-champ HashMap is about 21 bytes per element.

j-baker mentioned this issue Nov 29, 2022

Faster LinkedHashSet head() #2728

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LinkedHash(Map|Set) have weird complexity #2727

LinkedHash(Map|Set) have weird complexity #2727

j-baker commented Nov 29, 2022 •

edited

Loading

wrandelshofer commented Apr 2, 2023

wrandelshofer commented May 11, 2023

wrandelshofer commented Oct 13, 2024

michaelhkay commented Jan 2, 2025

wrandelshofer commented Jan 2, 2025

michaelhkay commented Jan 2, 2025

pivovarit commented Jan 2, 2025

wrandelshofer commented Jan 2, 2025

michaelhkay commented Jan 2, 2025

wrandelshofer commented Jan 2, 2025

LinkedHash(Map|Set) have weird complexity #2727

LinkedHash(Map|Set) have weird complexity #2727

Comments

j-baker commented Nov 29, 2022 • edited Loading

wrandelshofer commented Apr 2, 2023

wrandelshofer commented May 11, 2023

wrandelshofer commented Oct 13, 2024

michaelhkay commented Jan 2, 2025

wrandelshofer commented Jan 2, 2025

michaelhkay commented Jan 2, 2025

pivovarit commented Jan 2, 2025

wrandelshofer commented Jan 2, 2025

michaelhkay commented Jan 2, 2025

wrandelshofer commented Jan 2, 2025

j-baker commented Nov 29, 2022 •

edited

Loading