Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scala 2.12 map.keySet leaks reference to map #11727

Closed
fanf opened this issue Sep 5, 2019 · 12 comments · Fixed by scala/scala#10544
Closed

Scala 2.12 map.keySet leaks reference to map #11727

fanf opened this issue Sep 5, 2019 · 12 comments · Fixed by scala/scala#10544
Assignees
Milestone

Comments

@fanf
Copy link

fanf commented Sep 5, 2019

Hello, we discovered with prod pain and crashes that Map.keySet leaks a reference to the Map object so that it is not GC'ed and leaks memory - huge memory if objects in the map are big and you wanted to only return light identifiers to them.

It can be seen for ex here: https://gist.github.com/fanf/ff39065f7f9738467e40929ad7490da1

Is this known/wanted and I'm just 13 years late in that knowledge, or is it a bug (at least a bad documentation)?

The problem is likely the references to self in DefaultKeySet (contains, size...) https://github.com/scala/scala/blob/2.12.x/src/library/scala/collection/MapLike.scala#L176

@plokhotnyuk
Copy link

@fanf it looks that DefaultKeySet was implemented like a view (not a separated collection of keys)...

@szeiger
Copy link

szeiger commented Sep 5, 2019

I think it makes sense that keySet references the source map because it is usually the fastest and most straight-forward implementation.. But you wouldn't expect so from documentation such as

Collects all keys of this map in an iterable collection.

@fanf
Copy link
Author

fanf commented Sep 5, 2019

Yes, it would be very appreciated to make obvious the fact that it's a view and not a new object, AND what is the best (most efficient) method to get a fresh, independant Set.
For reference, the naive Set()++map.keySet() we tested first does not work, most likelly because ++ is smart and see that one of the set is empty and just returns the second one.

@fanf
Copy link
Author

fanf commented Sep 5, 2019

(I knew about values which is never what you want it to be, keySet is a new one :)

@SethTisue
Copy link
Member

best (most efficient) method to get a fresh, independant Set.

.keysIterator.toSet?

@fanf
Copy link
Author

fanf commented Sep 5, 2019

@SethTisue most likely from my user point of view, but I don't know if there's better way :)

@plokhotnyuk
Copy link

plokhotnyuk commented Sep 5, 2019

Both keySet.toSeq.toSet and keysIterator.toSet methods for getting a separated set are vulnerable for hash collision based DoS attacks even if the original map (like scala.collection.mutable.CollisionProofHashMap or java.collection.LinkedHashMap) is safe: #11203

@SethTisue
Copy link
Member

@scala/collections is 2.13 also affected? (if it's 2.12-only we should close the ticket)

@som-snytt
Copy link

2.13 is similarly a view onto the enclosing Map:

  /** A generic trait that is reused by keyset implementations */
  protected trait GenKeySet { this: Set[K] =>
    def iterator: Iterator[K] = MapOps.this.keysIterator
    def contains(key: K): Boolean = MapOps.this.contains(key)
    override def size: Int = MapOps.this.size
    override def knownSize: Int = MapOps.this.knownSize
    override def isEmpty: Boolean = MapOps.this.isEmpty
  }

The 2.13 nomenclature dictates that keySet should be a view and keySetted should be a copy. Good joke alert!

Agree that a tweak to the Scaladoc is deemed advisable.

@som-snytt som-snytt self-assigned this Sep 14, 2023
@som-snytt
Copy link

Today's complaint in chat was that keys is an Iterable of no known properties.

Generally, keysIterator uses the map's iterator, keySet is a special set that uses keysIterator, and keys is just your keySet.

ListMap has a test expecting keys.map(f) to preserve order (insertion order in this case). However, I would expect to require keys.iterator.map(f). (It looks like sorted maps try to preserve the sort order of their keys.)

@fommil
Copy link

fommil commented Sep 15, 2023

heh, I hit this problem this week. The side effect was that I thought I'd discovered a new way to add 4 bit numbers together that used a lot less diodes and transistors than the usual encoding. Suffice to say, my excitement has been suitably squashed upon discovery that I was accidentally deduplicating before applying my cost function... le sigh.

@som-snytt
Copy link

To quote Jesus from "The Chosen" tv show, "In this world, bones will break, and hearts will break." He goes on to say that in Scala 4, matters are much improved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants