Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Populate ClickHouse timeseries schema from TOML definitions #5942

Open
bnaecker opened this issue Jun 24, 2024 · 0 comments
Open

Populate ClickHouse timeseries schema from TOML definitions #5942

bnaecker opened this issue Jun 24, 2024 · 0 comments
Assignees

Comments

@bnaecker
Copy link
Collaborator

As part of #5941, we'll move timeseries schema from ClickHouse into CockroachDB as the primary source of truth. ClickHouse will still need these definitions, however, to support querying and as a self-contained record of how to understand all the data in the database. This issue tracks populating the schema in ClickHouse from the underlying TOML definitions, rather than deriving those from the samples oximeter receives.

I'm not entirely sure how this will work yet, but as a strawman:

  • Add an API to oximeter for setting the schema
  • oximeter will insert (or update) these in ClickHouse. It should be the case that it can simply insert or update them all, since the TOML definitions will have been checked both at compile time and by Nexus when inserting into CockroachDB. It's not clear how to express that in ClickHouse yet, given the rather confusing ways modifications can be made.
  • oximeter should probably drop its internal schema cache. This is an optimization, and has also led to some bugs. It should just read the schema from ClickHouse whenever it needs them.
@bnaecker bnaecker self-assigned this Jun 24, 2024
david-crespo added a commit that referenced this issue Dec 3, 2024
#7047 moves the existing `/v1/timeseries/query` endpoint (which requires
the fleet viewer role) to `/v1/system/timeseries/query`. This PR is on
top of #7047 and adds `/v1/timeseries/query?project=my-project`, using a
sneaky trick to let us ensure that the user has read permissions on a
project before we let them see metrics for that project. See
#5298 (comment)
for a discussion of the OxQL authz problem more broadly.

- [x] Add `/v1/timeseries/query` with required project query param
- [x] Integration tests showing the authz works in an expected scenarios
- ~~Add a list schemas endpoint that only lists schemas with a
`project_id` field~~
- ~~Move schema filtering logic inside oximeter client~~
- ~~Fully test schema list endpoint~~

## The trick

1. Require the user to say (in a query param) what project they're
interested in
2. Look up that project and make sure they can read it
3. Jam `| filter silo_id == "<silo_id>" && project_id == "<project_id>"`
on the end of whatever query they passed in

It sounds silly, but I talked it over with @bnaecker and we couldn't
find any holes. If the user tries to fetch metrics from another project
inside their query, the query will end up with the filter `project_id ==
"def" && project_id == "xyz"` and the result set will always be empty.
If they try to query a metric that doesn't have a `project_id` on it, it
will error out. It works!

## API design

Initially I had the endpoint as
`/v1/timeseries/query/project/{project}`. This is really horrible. It
should be `/projects/`, but that doesn't feel any better. I also
considered `/v1/projects/{project}/timeseries/query`, which has some
precedent:


https://github.com/oxidecomputer/omicron/blob/45f5f1cc2c7eec2a1de0a5143e85e0794134f175/nexus/external-api/output/nexus_tags.txt#L59-L64

But it also feels awful. 

I like the query param approach. Right now, there are only fleet-scoped
metrics and project-scoped metrics, so requiring the project ID makes
sense, but the query params are nicely flexible in that we could make
project optional or add other optional fields and just do different auth
checks based on what's passed in. Neither path nor operation ID mention
projects.

<details>
<summary>Examples of fleet- and project-scoped metrics</summary


https://github.com/oxidecomputer/omicron/blob/45f5f1cc2c7eec2a1de0a5143e85e0794134f175/oximeter/oximeter/schema/virtual-machine.toml#L3-L18


https://github.com/oxidecomputer/omicron/blob/45f5f1cc2c7eec2a1de0a5143e85e0794134f175/oximeter/oximeter/schema/switch-data-link.toml#L3-L19
</details>


## No list endpoint yet

I backed out the schema list endpoint. We can't list project-scoped
schemas because `authz_scope` is not in the database! See
#5942. Currently the
schema list endpoint hard-codes `authz_scope: Fleet`.


https://github.com/oxidecomputer/omicron/blob/69de8b6288fde36fbcd6cabb4d632d62851230ad/oximeter/db/src/model/from_block.rs#L142

I am using the tag `hidden` on the query endpoint so that it goes in the
OpenAPI definition (so I can use it in the console) but it will not show
up on the docs site.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant