This track is based on a geonames dump of the file allCountries.zip retrieved as of April 27, 2017.
For further details about the semantics of individual fields, please see the geonames dump README.
Modifications:
- The original CSV data have been converted to JSON.
- We combine the original `longitude` and `latitude` fields into a new `location` field of type `geo_point`.
{
  "geonameid": 2986043,
  "name": "Pic de Font Blanca",
  "asciiname": "Pic de Font Blanca",
  "alternatenames": "Pic de Font Blanca,Pic du Port",
  "feature_class": "T",
  "feature_code": "PK",
  "country_code": "AD",
  "admin1_code": "00",
  "population": 0,
  "dem": "2860",
  "timezone": "Europe/Andorra",
  "location": [
    1.53335,
    42.64991
  ]
}
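The modifications described above can be sketched in Python as follows. This is a minimal illustration, not the script that produced the track's corpus: the function name `to_document` is hypothetical, the column order is taken from the geonames dump README, and the choices to drop empty columns and to cast `geonameid` and `population` to integers are assumptions inferred from the example document.

```python
import json

# Column order of the tab-separated geonames dump, per the geonames README.
# Names are normalized to the underscore style used in the example document.
COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


def to_document(line):
    """Convert one line of the dump into a JSON-ready dict (hypothetical helper)."""
    values = line.rstrip("\n").split("\t")
    row = dict(zip(COLUMNS, values))

    # Remove the separate coordinate fields; they are merged below.
    lat = float(row.pop("latitude"))
    lon = float(row.pop("longitude"))

    # Assumption: empty columns are omitted from the document.
    doc = {key: value for key, value in row.items() if value != ""}

    # Assumption: these two fields are stored as integers.
    for key in ("geonameid", "population"):
        if key in doc:
            doc[key] = int(doc[key])

    # The array form of a geo_point is ordered [longitude, latitude].
    doc["location"] = [lon, lat]
    return doc


def convert(lines):
    """Yield one JSON string per input line."""
    for line in lines:
        yield json.dumps(to_document(line), ensure_ascii=False)
```

Note that this sketch keeps every non-empty column, so its output may contain fields (for example `modification_date`) that the track's documents do not.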
This track allows the following parameters to be overridden with Rally 0.8.0+ using `--track-params`:
- `bulk_size` (default: 5000)
- `bulk_indexing_clients` (default: 8): Number of clients that issue bulk indexing requests.
- `ingest_percentage` (default: 100): A number between 0 and 100 that defines how much of the document corpus should be ingested.
- `conflicts` (default: "random"): Type of id conflicts to simulate. Valid values are "sequential" (a document id is replaced with a sequentially increasing id) and "random" (a document id is replaced with a random other id).
- `conflict_probability` (default: 25): A number between 0 and 100 that defines the probability of id conflicts. This requires running the respective challenge. Combining `conflicts=sequential` and `conflict-probability=0` makes Rally generate index ids by itself, instead of relying on Elasticsearch's automatic id generation <https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#_automatic_id_generation>.
- `on_conflict` (default: "index"): Whether to use an "index" or an "update" action when simulating an id conflict.
- `recency` (default: 0): A number between 0 and 1 that defines whether to bias towards more recent ids when simulating conflicts. See the Rally docs for the full definition of this parameter. This requires running the respective challenge.
- `number_of_replicas` (default: 0)
- `number_of_shards` (default: 5)
- `max_num_segments`: The maximum number of segments to force-merge to.
- `source_enabled` (default: true): A boolean defining whether the `_source` field is stored in the index.
- `index_settings`: A list of index settings. Index settings defined elsewhere (e.g. `number_of_replicas`) need to be overridden explicitly.
- `cluster_health` (default: "green"): The minimum required cluster health.
- `error_level` (default: "non-fatal"): Available for bulk operations only; specifies the `ignore-response-error-level`.
- `post_ingest_sleep` (default: false): Whether to pause after ingest and prior to subsequent operations.
- `post_ingest_sleep_duration` (default: 30): Sleep duration in seconds.
- `include_non_serverless_index_settings` (default: true for non-serverless clusters, false for serverless clusters): Whether to include non-serverless index settings.
- `include_force_merge` (default: true for non-serverless clusters, false for serverless clusters): Whether to include the force merge operation.
- `include_target_throughput` (default: true for non-serverless clusters, false for serverless clusters): Whether to apply target throughput.
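As an illustration of how some of the parameters listed above might be combined, the following Python sketch checks a few of the documented value ranges and renders the result as JSON. The helper `check_track_params` and the file name `params.json` are hypothetical; the sketch also assumes that your Rally version accepts a JSON file for `--track-params`, so please verify this against the Rally documentation.

```python
import json

# Valid values as documented in the parameter list above.
VALID_CONFLICTS = {"sequential", "random"}
VALID_ON_CONFLICT = {"index", "update"}


def check_track_params(params):
    """Validate a few documented ranges (hypothetical helper, not part of Rally)."""
    for key in ("ingest_percentage", "conflict_probability"):
        if key in params and not 0 <= params[key] <= 100:
            raise ValueError(f"{key} must be between 0 and 100")
    if "recency" in params and not 0 <= params["recency"] <= 1:
        raise ValueError("recency must be between 0 and 1")
    if params.get("conflicts", "random") not in VALID_CONFLICTS:
        raise ValueError("conflicts must be 'sequential' or 'random'")
    if params.get("on_conflict", "index") not in VALID_ON_CONFLICT:
        raise ValueError("on_conflict must be 'index' or 'update'")
    return params


# Example values chosen purely for illustration.
params = check_track_params({
    "bulk_size": 1000,
    "bulk_indexing_clients": 4,
    "ingest_percentage": 20,
    "conflicts": "sequential",
    "conflict_probability": 10,
    "on_conflict": "update",
})

# Render as JSON; the text could be saved as params.json (hypothetical name)
# and referenced via --track-params=params.json.
params_json = json.dumps(params, indent=2)
print(params_json)
```

Keeping the overrides in a file rather than on the command line makes it easier to reuse the same settings across races.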
We use the same license for the data as the original data from Geonames:
This work is licensed under a Creative Commons Attribution 3.0 License,
see http://creativecommons.org/licenses/by/3.0/
The Data is provided "as is" without warranty or any representation of accuracy, timeliness or completeness.