-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fork cortex chunk storage into Loki. #3842
Conversation
* Refactor enocding.Chunk.Add Signed-off-by: Ganesh Vernekar <[email protected]> * Undo pointer-to-slice changes in varbit Signed-off-by: Bryan Boreham <[email protected]> * Error on setting Delta encoding Signed-off-by: Ganesh Vernekar <[email protected]> * Don't replace Delta with DoubleDelta Signed-off-by: Ganesh Vernekar <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Ganesh Vernekar <[email protected]>
* Adds v11 schema to store label names within index. Stores only label names and not the entire metric. Storing entire metric will bloat the index by 30% and it doesn't really make sense to do it right now. Adding just label names adds a tolerable 7% to the index. Also, in Prometheus, we don't treat __name__ as a special label. We always return it when calling /labels API and we should do the same here. Signed-off-by: Cyril Tovena <[email protected]> Signed-off-by: Goutham Veeramachaneni <[email protected]> * Finishing touches fix lint issue Signed-off-by: Cyril Tovena <[email protected]> removes useless loop Signed-off-by: Cyril Tovena <[email protected]> This should be on v11 not v10. Signed-off-by: Cyril Tovena <[email protected]> s/metricConstRangeKeyV1/labelNamesRangeKeyV1/ The code was first written to store the entire series, but now changed to do just labelNames. Signed-off-by: Goutham Veeramachaneni <[email protected]> Add note about v11 being experimental. Signed-off-by: Goutham Veeramachaneni <[email protected]> Co-authored-by: cyriltovena <[email protected]> Co-authored-by: Goutham Veeramachaneni <[email protected]>
* ingester: v2 add tsdb block storage per user Signed-off-by: Thor <[email protected]> * ingester: transfer tsdb on shutdown Signed-off-by: Thor <[email protected]> * querier: query s3 tsdb blocks Signed-off-by: Thor <[email protected]> * vendor Signed-off-by: Thor <[email protected]> * ingester.v2.enable Signed-off-by: Thor <[email protected]> * split v2 into separate file Signed-off-by: Thor <[email protected]> * log error Signed-off-by: Thor <[email protected]> * check for creation of db in-between locks Signed-off-by: Thor <[email protected]> * anchor regexp Signed-off-by: Thor <[email protected]> * configurable ship interval Signed-off-by: Thor <[email protected]> * fixed transfer_test flake Signed-off-by: Thor <[email protected]> * skip store init when v2 enabled Signed-off-by: Thor <[email protected]> * s3insecure flag Signed-off-by: Thor <[email protected]> * ignore eof errors on block sync Signed-off-by: Thor <[email protected]> * moved transfer function to common wrapper Signed-off-by: Thor <[email protected]> * don't stop stores if nil Signed-off-by: Thor <[email protected]> * use TSDB base dir for querier directories Signed-off-by: Thor <[email protected]> * ignore table manager in v2 Signed-off-by: Thor <[email protected]> * Refactored TSDB config file structure and CLI flags Signed-off-by: Thor <[email protected]> * fixed lint errors, mark flags as experimental Signed-off-by: Thor <[email protected]> * use separate directory for syncing block data Signed-off-by: Thor <[email protected]> * fixed unit test with new config changes Signed-off-by: Thor <[email protected]> * sync-dir -> sync_dir Signed-off-by: Thor <[email protected]> * validate configs Signed-off-by: Thor <[email protected]> * fixed rebase overflow changes Signed-off-by: Thor <[email protected]>
Empirically, from data stored at Weaveworks, the code used a different version marker and a different size calculation. Allow that data to be read without affecting chunks written more recently. Signed-off-by: Bryan Boreham <[email protected]>
This is required to support IAM roles for service accounts in AWS EKS Signed-off-by: Adam Johnson <[email protected]>
…l components (grafana#1728) Signed-off-by: Roger Steneteg <[email protected]>
%q is better than '%v' because it will show unprintable characters as escape sequences. Signed-off-by: Bryan Boreham <[email protected]>
* Fixes store label name and values duplicate. Signed-off-by: Cyril Tovena <[email protected]> * Normalize string unicity control Signed-off-by: Cyril Tovena <[email protected]> * Adds benchmark to compare the new unique algorithm Signed-off-by: Cyril Tovena <[email protected]>
Use session.NewSession in order to use AssumeRoleWithWebIdentity
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…rafana#1618 (grafana#1900) Signed-off-by: Samiur Arif <[email protected]>
Signed-off-by: Ken Haines <[email protected]>
Signed-off-by: Ken Haines <[email protected]>
Signed-off-by: Ken Haines <[email protected]>
Signed-off-by: Ken Haines <[email protected]>
* Added new KV Store client, MultiClient. This client is configured with multiple stores, one of them is designated as primary. All client operations are forwarded to the primary store. MultiClient also does "mirroring" of values from primary to secondary store. MultiClient can listen on changes in runtime configuration (via overrides mechanism), and switch primary store and enable/disable mirroring. Signed-off-by: Peter Štibraný <[email protected]> * Use Stop, which is now part of kv.Client. Signed-off-by: Peter Štibraný <[email protected]> * Put back setting of defaultLimits -- used when loading YAML files. Signed-off-by: Peter Štibraný <[email protected]> * Moved setting of default limits for YAML unmarshal to separate function. Signed-off-by: Peter Štibraný <[email protected]> * Pass multi-client context as argument. Signed-off-by: Peter Štibraný <[email protected]> * watchConfigChannel now reacts on context being done as well Signed-off-by: Peter Štibraný <[email protected]> * Changed Mirroring to *bool. Signed-off-by: Peter Štibraný <[email protected]> * Ignore mock by yaml. Signed-off-by: Peter Štibraný <[email protected]> * Renamed mirroring to mirror-enabled to be consistent with MultiConfig. Signed-off-by: Peter Štibraný <[email protected]> * Renamed 'multi' to 'multi_kv_config' in overrides.yaml. Signed-off-by: Peter Štibraný <[email protected]> * Forward writes done via CAS function to secondary client. Mirroring goroutine was removed, and replaced by forwarding writes done via CAS function to secondary client. Rate limits config was removed, but there is now timeout for secondary write, to avoid blocking CAS function for too long, if secondary write is slow (eg. etcd being down can cause very long writes). Only WatchKey and WatchPrefix functions now react on change of primary client. Signed-off-by: Peter Štibraný <[email protected]> * Added metrics to multi client. Signed-off-by: Peter Štibraný <[email protected]> * Removed equality check when writing to secondary store. Without watch-and-mirror functionality, there is no need to check if value is already present in the secondary store. Signed-off-by: Peter Štibraný <[email protected]> * Renamed OverridesManager and moved it to its own package. Signed-off-by: Peter Štibraný <[email protected]> * Make lint happy, 3. Signed-off-by: Peter Štibraný <[email protected]> * Make lint happy, 4. Signed-off-by: Peter Štibraný <[email protected]> * Add metric type to variable names, yaml name changes, fixed metric names, removed forgotten log. Signed-off-by: Peter Štibraný <[email protected]> * Fixed yet one more yaml name. Signed-off-by: Peter Štibraný <[email protected]> * Fixed tests after changing yaml fields. Signed-off-by: Peter Štibraný <[email protected]> * Fix bug when default limits are not applied until next overrides reload. Signed-off-by: Peter Štibraný <[email protected]> * Ignore LoadConfig if LoadPath is empty. Signed-off-by: Peter Štibraný <[email protected]> * Use channels to communicate config updates. Instead of spawning new goroutine for each config update, we now use channels to communicate config updates. Signed-off-by: Peter Štibraný <[email protected]> * Initialize limits before starting runtimeconfig Manager. Signed-off-by: Peter Štibraný <[email protected]> * Updated CHANGELOG.md and arguments.md. Signed-off-by: Peter Štibraný <[email protected]> * Typo Signed-off-by: Peter Štibraný <[email protected]> * Fix compilation error in ingester_v2_test.go. Signed-off-by: Peter Štibraný <[email protected]> * Fixed error after rebase. Signed-off-by: Peter Štibraný <[email protected]> * Fixed error after rebase. Signed-off-by: Peter Štibraný <[email protected]> * Use logger with component="multikv" to log messages. Signed-off-by: Peter Štibraný <[email protected]> * Improve log message when runtime config file is not specified. Signed-off-by: Peter Štibraný <[email protected]> * Don't use memberlist in the example, as it is still experimental. Addressed other review feedback. Signed-off-by: Peter Štibraný <[email protected]>
…na#1952) * Added tool to automatically generate config file documentation Signed-off-by: Marco Pracucci <[email protected]> * Make doc due to config changes after rebasing master Signed-off-by: Marco Pracucci <[email protected]> * Added support for inline yaml field Signed-off-by: Marco Pracucci <[email protected]> * Fixed linter issues Signed-off-by: Marco Pracucci <[email protected]> * Fixed linter in the alertmanager Signed-off-by: Marco Pracucci <[email protected]> * Fixed Cortex module name type detection Signed-off-by: Marco Pracucci <[email protected]> * Commented why SchemaConfig doc is not auto-generated Signed-off-by: Marco Pracucci <[email protected]> * Updated config file doc Signed-off-by: Marco Pracucci <[email protected]>
…hema considering grace period (grafana#1976) * fix calculation of expected tables and create tables from upcoming schema considering grace period Signed-off-by: Sandeep Sukhani <[email protected]> * updated changelog Signed-off-by: Sandeep Sukhani <[email protected]> * updated comment to make it clearer Signed-off-by: Sandeep Sukhani <[email protected]>
All code paths return nil as the error, so we can simplify the code. Signed-off-by: Bryan Boreham <[email protected]>
Signed-off-by: Thor <[email protected]>
* adding retry options for cassandra and updating docs * adjusting some naming based on feedback Signed-off-by: Ken Haines <[email protected]>
Signed-off-by: yeya24 <[email protected]>
…#1786) This is more idiomatic Go and more efficient because it doesn't have to loop over the slice. Signed-off-by: Bryan Boreham <[email protected]>
See the conversation that led to this: https://cloud-native.slack.com/archives/CCYDASBLP/p1580342552117900 I personally think that 600ms is quite short, and bumping it shouldn't have a bad effect. Signed-off-by: Goutham Veeramachaneni <[email protected]>
* switching azblob fetching to lower level function to avoid unnecessary parallel requests Signed-off-by: Ken Haines <[email protected]> * updating CHANGELOG.md Signed-off-by: Ken Haines <[email protected]> * PR Feedback Signed-off-by: Ken Haines <[email protected]>
…ject stores (grafana#2049) * support for registering custom index clients, added new methods to object store NewIndexClient accepts factory methods for creating custom index clients added new methods to object stores to work on objects(io.Reader) instead of just chunks Signed-off-by: Sandeep Sukhani <[email protected]> * splitted s3 config from dynamodb config and updated docs Signed-off-by: Sandeep Sukhani <[email protected]> * removed unwanted code Signed-off-by: Sandeep Sukhani <[email protected]> * added List method to azure and addressed other feedback in PR review Signed-off-by: Sandeep Sukhani <[email protected]> * addressed some of the feedback from PR review Signed-off-by: Sandeep Sukhani <[email protected]> * changes suggested from PR review Signed-off-by: Sandeep Sukhani <[email protected]> * fixed an issue with reporting errors in PutObject for GCS object store Signed-off-by: Sandeep Sukhani <[email protected]>
Signed-off-by: Peter Štibraný <[email protected]>
… (grafana#4069) Signed-off-by: ChinYing-Li <[email protected]> Co-authored-by: Peter Štibraný <[email protected]>
* Remove config options deprecated in Cortex 1.6 Signed-off-by: Marco Pracucci <[email protected]> * Marked some features as stable Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Sandeep Sukhani <[email protected]>
* Deprecated -store.query-chunk-limit in favour of the new config -querier.max-fetched-chunks-per-query which is applied both to ingesters and long-term storage Signed-off-by: Marco Pracucci <[email protected]> * Fixed PR number in CHANGELOG Signed-off-by: Marco Pracucci <[email protected]> * Moved CHANGELOG entry to unreleased Signed-off-by: Marco Pracucci <[email protected]> * Addressed review feedback Signed-off-by: Marco Pracucci <[email protected]>
) Fixes following error: ``` msg="error adding delete request to the store" err="Invalid null value in condition for column range" ``` Resolves: cortexproject/cortex#3237 Signed-off-by: Jakub Sokołowski <[email protected]>
* Implement DynamoDB blocksconvert (v9 schema only) Remember which series we have processed, so we only emit entries to the plan once for each series. Signed-off-by: Bryan Boreham <[email protected]> Includes: * Move IndexEntryProcessor to chunk package, so it can be shared across other packages * Pre-check if user is allowed, and make use of map of ignored users * Move IndexReader type beside IndexEntryProcessor * Stop returning unexported type
Signed-off-by: Benjamin Charron <[email protected]> Co-authored-by: Benjamin Charron <[email protected]>
… not found in Azure blob storage (grafana#4200) * detect and return ErrStorageObjectNotFound when object requested for download is not found in storage Signed-off-by: Sandeep Sukhani <[email protected]> * use azblob.StorageError to get detect the object missing error Signed-off-by: Sandeep Sukhani <[email protected]>
…eleted (grafana#4221) Signed-off-by: Sandeep Sukhani <[email protected]>
* chunks/cache: redis cluster support Signed-off-by: kamijin_fanta <[email protected]> * add redis client test Signed-off-by: kamijin_fanta <[email protected]> * update miniredis Signed-off-by: kamijin_fanta <[email protected]> * redis_client: changed the test to simple. Signed-off-by: kamijin_fanta <[email protected]> Co-authored-by: Marco Pracucci <[email protected]>
* Adds the ability to remove OpenCensus instrumentation on GCS. Signed-off-by: Cyril Tovena <[email protected]> * Add the ability to disable OpenCensus within GCS client. The rationale is that we're pretty biaised about Prometheus instrumentation here and it should be possible to disable this if we don't want it at all. Signed-off-by: Cyril Tovena <[email protected]> * Adds changelog entry. Signed-off-by: Cyril Tovena <[email protected]> Co-authored-by: Marco Pracucci <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should maybe move pkg/storage/chunk/storage
to pkg/storage
but that can be done in another PR but I am not sure if we can retain the history. Overall the changes look good to me.
I think we should also decide on CLA before merging this.
@sandeepsukhani it's easier to keep the same package name to avoid renaming everything. |
Signed-off-by: Cyril Tovena <[email protected]>
It is not required to get the CLA signed by all previous contributors per my understanding. However we should include and make clear the copyright information on the forked files (excluding vendored files). Using the softwarefreedom.org guidance I suggest we add the following header to all incoming files (excluding the vendor directory): /*
* Grafana Loki
* Copyright (C) 2021 Grafana Labs
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*
* This file incorporates work covered by the following copyright and
* permission notice:
*
* Copyright 2016-2021 Cortex Project Contributors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/ Note: the Cortex copyright line was created using the CNCF Copyright Guidance |
Through some further discussion I am revising my advice. I do not think it is necessary to add any specific copyright notice, primarily referencing:
As such the existing information in Git should be sufficient for determining copyright and that reason as well as the others listed on the CNCF copyright guidance[1] are enough for me to be convinced there is little value in adding a specific copyright to these files. |
I agree that sleuthing up all copyright holders is near-endless work and not of much benefit. I do think that cross-linking to make code, copyright, and legal spelunking easier is A Good Thing, though. What about a simple
and we can ask SPDX if they would consider standardizing more headers to allow for better cross-linking? |
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
Signed-off-by: Cyril Tovena <[email protected]>
This PR forks cortex chunk package into our storage package since Cortex doesn't plan to use it anymore.
This will give us the ability to change object store client for hedging request for instance and include more fixes tailored for Loki.
I had to upgrade protobuf and gogoproto up to what Cortex is using. I've used subtree to maintain the history but that seems to make CLA not happy. @slim-bean WDYT ? do we care ?