Remove direct uses of hppc #84735
Labels
:Core/Infra/Core
Core issues without another label
Meta
Team:Core/Infra
Meta label for core/infra team
Comments
Pinging @elastic/es-core-infra (Team:Core/Infra)
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 7, 2022
This commit cleans up the allocation awareness integration tests so that they no longer use hppc. relates elastic#84735
elasticsearchmachine
pushed a commit
that referenced
this issue
Mar 7, 2022
This commit cleans up the allocation awareness integration tests so that they no longer use hppc. relates #84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 8, 2022
This commit removes a couple trivial uses of hppc in x-pack tests. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 9, 2022
This commit removes a couple random uses of hppc maps in favor of HashMap. The uses should not be memory sensitive. relates elastic#84735
elasticsearchmachine
pushed a commit
that referenced
this issue
Mar 9, 2022
This commit removes a couple trivial uses of hppc in x-pack tests. relates #84735
rjernst
added a commit
that referenced
this issue
Mar 9, 2022
This commit removes a couple random uses of hppc maps in favor of HashMap. The uses should not be memory sensitive. relates #84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 10, 2022
The histogram field parses values and counts from the document. Parsing the document should not be the bottleneck in indexing, so using hppc classes here appears unnecessary, especially given all the other I/O overhead of parsing a document. This commit converts these two uses to ArrayList. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 10, 2022
This use was trivial: the values are Strings, so it was already storing references. This commit converts it to use an ArrayList. relates elastic#84735
rjernst
added a commit
that referenced
this issue
Mar 10, 2022
The histogram field parses values and counts from the document. Parsing the document should not be the bottleneck in indexing, so using hppc classes here appears unnecessary, especially given all the other I/O overhead of parsing a document. This commit converts these two uses to ArrayList. relates #84735
rjernst
added a commit
that referenced
this issue
Mar 10, 2022
This use was trivial: the values are Strings, so it was already storing references. This commit converts it to use an ArrayList. relates #84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 10, 2022
The LinearizabilityChecker has one use of hppc, where it uses a Long to Object map. Given that the rest of the maps in the Cache class are Java Maps, and this is only for testing, hppc does not seem necessary. This commit converts the usage to Map. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 10, 2022
This method is only used for an assertion in test clusters. This commit converts the hppc map to Java Map. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Mar 14, 2022
A handful of server integ tests used hppc for local state to compare with. None of these should be performance critical. This commit converts remaining uses within server integ tests to Java collections. relates elastic#84735
rjernst
added a commit
that referenced
this issue
Mar 15, 2022
A handful of server integ tests used hppc for local state to compare with. None of these should be performance critical. This commit converts remaining uses within server integ tests to Java collections. relates #84735
rjernst
added a commit
that referenced
this issue
Mar 15, 2022
The LinearizabilityChecker has one use of hppc, where it uses a Long to Object map. Given that the rest of the maps in the Cache class are Java Maps, and this is only for testing, hppc does not seem necessary. This commit converts the usage to Map. relates #84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 20, 2022
The percentiles bucket agg uses an hppc arraylist of doubles to store the parsed percent values. This is a very small list and does not need to be a native array. This commit changes to using a standard ArrayList. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 20, 2022
The http server transport uses an hppc integer set to find the first open port. This commit changes it to use a standard HashSet. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 20, 2022
The builder for ImmutableOpenMap was changed to not inherit from hppc types, but the builder from ImmutableOpenIntMap was missed. This commit removes the base hppc interface from the builder and removes unused methods, and converts one place that was using the leaked types. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 20, 2022
The CombinedDeletionPolicy keeps ref counts for each index snapshot using an hppc primitive map. This commit converts it to use a standard HashMap. relates elastic#84735
rjernst
added a commit
that referenced
this issue
Apr 20, 2022
The builder for ImmutableOpenMap was changed to not inherit from hppc types, but the builder from ImmutableOpenIntMap was missed. This commit removes the base hppc interface from the builder and removes unused methods, and converts one place that was using the leaked types. relates #84735
rjernst
added a commit
that referenced
this issue
Apr 20, 2022
The http server transport uses an hppc integer set to find the first open port. This commit changes it to use a standard HashSet. relates #84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 21, 2022
The LocalCheckpointTracker keeps mappings between sequence numbers and bitsets, using hppc primitive maps. This commit converts these to use standard HashMaps. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 21, 2022
The translog writer uses hppc maps for mapping sequence numbers to bitsets, and keeping track of which sequence numbers have been flushed or not. This commit converts these uses to Java collections. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 21, 2022
The nested and reverse nested aggs use hppc maps and lists for mapping buckets to ordinals. This commit converts these to use HashMaps and ArrayLists. relates elastic#84735
rjernst
added a commit
that referenced
this issue
Apr 21, 2022
The CombinedDeletionPolicy keeps ref counts for each index snapshot using an hppc primitive map. This commit converts it to use a standard HashMap. relates #84735
rjernst
added a commit
that referenced
this issue
Apr 21, 2022
The nested and reverse nested aggs use hppc maps and lists for mapping buckets to ordinals. This commit converts these to use HashMaps and ArrayLists. relates #84735
rjernst
added a commit
that referenced
this issue
Apr 25, 2022
The LocalCheckpointTracker keeps mappings between sequence numbers and bitsets, using hppc primitive maps. This commit converts these to use standard HashMaps. relates #84735
rjernst
added a commit
that referenced
this issue
Apr 25, 2022
The percentiles bucket agg uses an hppc arraylist of doubles to store the parsed percent values. This is a very small list and does not need to be a native array. This commit changes to using a standard ArrayList. relates #84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 25, 2022
The top hits aggregator uses hppc for keeping track of leaf bucket collectors. This commit converts it to use HashMap. relates elastic#84735
rjernst
added a commit
to rjernst/elasticsearch
that referenced
this issue
Apr 25, 2022
The terms aggs take include/excludes for terms, and for numeric fields use an hppc set of longs. This commit converts to using a HashSet. relates elastic#84735
rjernst
pushed a commit
that referenced
this issue
Apr 26, 2022
Remove LongObjectHashMap, LongArrayList and LongProcedure from index translog files. relates #84735
rjernst
added a commit
that referenced
this issue
Apr 26, 2022
The terms aggs take include/excludes for terms, and for numeric fields use an hppc set of longs. This commit converts to using a HashSet. relates #84735
rjernst
added a commit
that referenced
this issue
Apr 27, 2022
The top hits aggregator uses hppc for keeping track of leaf bucket collectors. This commit converts it to use HashMap. relates #84735
With the exception of ImmutableOpenIntMap and ImmutableOpenMap, the uses of hppc have been removed. I opened a follow-up for those classes, as they are used quite extensively: #86239
This is a meta issue for removing most uses of hppc from Elasticsearch.
Like any complex software project, Elasticsearch uses various data structures from computer science: lists, maps, sets, etc. Java provides interfaces for these structures (Java Collections Framework, JCF), and concrete implementations using well known techniques, like a hashtable based Map implementation, HashMap. The Java provided implementations are used throughout Elasticsearch.
Additionally, the Elasticsearch server depends on the HPPC library. It provides alternatives to Java's built-in collections that aim to be efficient, fast, and "open" (open here means easily hackable). The library is mostly wrapped with our own classes, namely ImmutableOpenMap and ImmutableOpenIntMap, but it is also used sporadically throughout the codebase. However, when to use the HPPC-based collections versus the Java ones has never been well defined within the team.
HPPC has been around for more than a decade, and while it once provided major performance benefits over JCF, Java has changed a lot in that time, especially in HotSpot. It is unclear whether HPPC still provides advantages over JCF. On top of this, the HPPC-based collections are more difficult to use and don't easily interoperate with JCF interfaces. They require conversion or, more often, force a developer to use HPPC because an upstream object used it, as when dealing with cluster state.
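The interoperability cost described above can be sketched with JDK classes only. In this sketch, `PrimitiveLongList` is a hypothetical stand-in for a primitive-specialized collection and is not HPPC's actual API; `maxSeqNo` and `toJcf` are likewise illustrative names. The point is only that a type outside the JCF hierarchy needs a copy step before it can be passed to code written against `java.util.List`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class InteropSketch {

    /** Hypothetical primitive-backed list. It does NOT implement java.util.Collection. */
    static final class PrimitiveLongList {
        private long[] buffer = new long[4];
        private int size;

        void add(long value) {
            if (size == buffer.length) {
                // Grow the backing array when it is full.
                buffer = Arrays.copyOf(buffer, size * 2);
            }
            buffer[size++] = value;
        }

        int size() {
            return size;
        }

        long get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index + ", size " + size);
            }
            return buffer[index];
        }
    }

    /** A JCF-typed method: any List<Long> implementation can be passed in. */
    static long maxSeqNo(List<Long> seqNos) {
        return Collections.max(seqNos);
    }

    /** The conversion a caller has to write when the source is not a JCF type. */
    static List<Long> toJcf(PrimitiveLongList source) {
        List<Long> copy = new ArrayList<>(source.size());
        for (int i = 0; i < source.size(); i++) {
            copy.add(source.get(i)); // each long is boxed into a Long here
        }
        return copy;
    }
}
```

A caller holding a `List<Long>` can call `maxSeqNo` directly, while a caller holding a `PrimitiveLongList` must go through `toJcf` first, paying for a copy and for boxing.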
We have decided to try to move away from using hppc. Doing so is tricky, since swapping in JCF classes is not trivial in most uses. This meta issue lists direct uses of hppc that should be either removed entirely or evaluated for performance differences with JCF. Whether performance tests are needed depends on the use case, and on whether it is expected to be sensitive to memory footprint and to the objects' lifetime.
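As a rough illustration of what one of these conversions can look like, here is a minimal sketch of a reference counter keyed by a long id and backed by a standard `HashMap`. The class and method names (`RefCounts`, `acquire`, `release`, `isHeld`) are hypothetical and are not taken from the Elasticsearch codebase. The trade-off is that keys and values are boxed (`Long`, `Integer`), which is why uses that are sensitive to memory footprint may still need a performance comparison.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical ref-count tracker using only JDK collections. */
final class RefCounts {
    // Keys and values are boxed; a primitive-specialized map would avoid that.
    private final Map<Long, Integer> counts = new HashMap<>();

    /** Increments the count for the given id and returns the new count. */
    int acquire(long id) {
        return counts.merge(id, 1, Integer::sum);
    }

    /** Decrements the count, dropping the entry at zero, and returns the remaining count. */
    int release(long id) {
        Integer current = counts.get(id);
        if (current == null) {
            throw new IllegalStateException("id " + id + " is not held");
        }
        if (current == 1) {
            counts.remove(id);
            return 0;
        }
        counts.put(id, current - 1);
        return current - 1;
    }

    /** Returns true if at least one reference to the id is outstanding. */
    boolean isHeld(long id) {
        return counts.containsKey(id);
    }
}
```

`Map.merge` handles both the first acquisition and later increments in a single call, so the JCF version needs no separate "put or add" helper.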
server
o.e.action.admin.cluster.stats (Remove hppc from ClusterStatsNodes #85639)
o.e.action Multi* request/response (Remove hppc from multi*shard request and responses #85888)
o.e.action.admin.indices (Remove unnecessary uses of ObjectHashSet #85911)
o.e.common (Remove hppc from some "common" classes #85957)
o.e.http (Remove hppc from http server transport #86024)
o.e.index.engine (Remove hppc from deletion policy #86072)
o.e.index.mapper.BinaryFieldMapper (Rewrite CollectionUtils dedup to work with any type #85352)
o.e.index.mapper (Use Set instead of LongSet in long script field #85417, Use Set instead of LongSet in double script field #85475)
o.e.index.seqno (Remove hppc from LocalCheckpointTracker #86073)
o.e.index.translog (Remove hppc from index translog #85657)
o.e.indices (Remove unnecessary uses of ObjectHashSet #85911)
o.e.rest.action.cat (Remove hppc from cat allocation api #85842)
o.e.search.aggregations.bucket.nested (Remove hppc from nested aggs #86078)
o.e.search.aggregations.bucket.terms (Remove hppc from terms agg #86151)
o.e.search.aggregations.metrics (Remove hppc from top hits agg #86150)
o.e.search.aggregations.pipeline (Remove hppc from percentiles agg #86023)
o.e.search.dfs (Remove hppc from dfs search #84688)
o.e.tasks (Remove hppc from task manager #85889)
o.e.transport (Remove hppc from tcp transport #85843)

server unit tests
o.e.action.fieldcaps (Remove hppc from RequestDispatcherTests #85236)
FieldMemoryStatsTests (Remove hppc from FieldMemoryStats #85240)
o.e.common.util (Remove hppc from o.e.c.util tests #85406)
o.e.index (Remove hppc from o.e.index tests #85297)
SearchServiceTests (Remove hppc from FetchSearchPhase #85188)
BinaryRangeAggregatorTests (Remove hppc from search and aggs tests #85468)
o.e.search.aggregations.metrics (Remove hppc from search and aggs tests #85468)
o.e.search.runtime
o.e.search.slice

server integ tests
AllocationAwarenessIT (Remove hppc uses from AwarenessAllocationIT #84736)
aggs and search tests (Remove hppc uses from server integ tests #84963)
LinearizabilityChecker (Remove hppc from LinearizabilityChecker #84878)
InternalTestCluster (Convert ReplicationTracker insync global checkpoints getter to Map #84882)
TestCluster (Remove hppc use from TestCluster #84850)
JdbcAssert (Remove hppc uses from xpack tests #84782)