Thanos rule doesn't honor --web.external-prefix #3260

Closed
perkons opened this issue Sep 30, 2020 · 16 comments

perkons commented Sep 30, 2020

Thanos, Prometheus and Golang version used:

# thanos --version
thanos, version 0.15.0 (branch: HEAD, revision: fbd14b49f95e7543883fedf9381a21127d4dda2b)
  build user:       root@d35b081d3e19
  build date:       20200907-09:29:21
  go version:       go1.14.2
#

What happened:

I am running Thanos Rule behind Nginx with --web.external-prefix=https://external.com/thanos/rule, so that Rule is accessible at https://external.com/thanos/rule.
As a result, https://external.com/thanos/rule redirects to https://external.com/thanos/rule/alerts. The page opens, but in "text mode": the style sheet (CSS) is not loaded.

The links in the UI:

Opening https://external.com/thanos/rule/rules shows the rules, but again the style sheet is not loaded.
Opening https://external.com/thanos/rule/new/ redirects to https://external.com/thanos/rule/new/alerts and displays "Error: Error fetching Alerts: Failed to fetch"; here the style sheet is loaded.

Links in the new UI:

http://internal.org:19902 redirects to http://internal.org:19902/thanos/rule/alerts
http://internal.org:19902/alerts opens up but it is in text mode, style sheet (css) not loaded.

What you expected to happen:

  1. https://external.com/thanos/rule/alerts opens with the style loaded
  2. All the buttons (Alerts, Rules, New UI) direct to the right urls
  3. The New UI actually works

How to reproduce it (as minimally and precisely as possible):

  1. Run Thanos Rule with --web.external-prefix=https://external.com/thanos/rule on host internal.org
  2. Configure Nginx for Rule on host external.com with the following:
  location /thanos/rule/ {
    proxy_pass http://internal.org:19902/;
  }
  3. Open up your browser, go to https://external.com/thanos/rule and try switching to "Alerts", "Rules", "New UI".

Member

onprem commented Sep 30, 2020

Unlike Prometheus' --web.external-url, Thanos has --web.external-prefix, which means that you only need to pass the path prefix of the URL, not the full URL. In your case, --web.external-prefix=/thanos/rule is the correct way to use this flag.

Can you please check if you can reproduce this bug even after correctly passing the flag?
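
To make the difference between a full external URL and a path prefix concrete, here is a minimal Python sketch. The helper name `external_prefix_from` is hypothetical and is not part of Thanos; it only illustrates how a deployment script might derive the value for --web.external-prefix from a full URL.

```python
from urllib.parse import urlsplit


def external_prefix_from(value: str) -> str:
    """Return only the path part of `value`, e.g. '/thanos/rule'.

    Hypothetical helper for deployment scripts; not part of Thanos.
    """
    # urlsplit drops the scheme and host when a full URL is given and
    # leaves a bare path such as '/thanos/rule' untouched.
    path = urlsplit(value).path
    # Normalise to a single leading slash and no trailing slash.
    trimmed = path.strip("/")
    return "/" + trimmed if trimmed else ""


print(external_prefix_from("https://external.com/thanos/rule"))  # /thanos/rule
print(external_prefix_from("/thanos/rule"))                      # /thanos/rule
```

The result could then be passed as --web.external-prefix=/thanos/rule, which is the form suggested above.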

Author

perkons commented Oct 9, 2020

I just switched to --web.external-prefix=/thanos/rule.

The results for Classic UI:
https://external.com/thanos/rule redirects to https://external.com/thanos/rule/alerts. The page opens, but in "text mode": the style sheet (CSS) is not loaded.
Links in the Classic UI:

The results for New UI:
https://external.com/thanos/rule/new/ redirects to https://external.com/thanos/rule/new/alerts. It does open with the CSS loaded, but displays the error "Error: Error fetching Alerts: Failed to fetch".
Links in the New UI:

Member

onprem commented Oct 10, 2020

Thanks for the detailed report, I'll take a look at how things are handled in Ruler and hopefully a fix will land soon.

ncastrocosta commented Oct 19, 2020

@prmsrswt I'm facing a similar issue with --web.external-prefix for the Query component.

Using version 0.14.0, I had --web.external-prefix=/thanos and everything was working fine. Then I updated to version 0.15.0 and started to get "404 page not found". I then changed to --web.external-prefix=thanos (no slash), and now I get "too many redirects" errors.

For the Ruler, I was using --web.external-prefix=/ruler. Then I dropped the slash and everything worked again.

Any thoughts on that?

Member

onprem commented Oct 22, 2020

Okay, so I tried this on the latest master and this issue seems to be fixed as a side effect of #3234. Everything should work as expected with the config you specified above.

But one thing that caught my eye is that Ruler handles the external prefix and route prefix differently from the Querier.

In Prometheus, route-prefix's default value is equal to the external-prefix specified, because that is the most general use case. This way, Prometheus and its UI can be accessed with and without the reverse proxy. In Thanos, we were not doing this initially, so if you forget to specify route-prefix the same as the external prefix, you will not be able to access the UI internally (that is, without a reverse proxy; it'll work as expected with a reverse proxy though). This was changed in the Querier some time ago and now it also does what Prometheus does. We added some documentation around it here. But we are not doing this in other components yet. I think we need to follow the same approach in all other components as well, so that it stays consistent throughout.
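
The defaulting rule described above can be sketched in a few lines of Python. This is a simplified model and not Thanos or Prometheus code: the function names `effective_route_prefix` and `serves` are assumptions made up for illustration, and real routing involves more than a prefix check.

```python
from typing import Optional


def effective_route_prefix(external_prefix: str,
                           route_prefix: Optional[str] = None) -> str:
    # Prometheus-style default described above: an unset route prefix
    # falls back to the external prefix.
    return external_prefix if route_prefix is None else route_prefix


def serves(path: str, route_prefix: str) -> bool:
    """True if a server mounted under `route_prefix` would handle `path`."""
    prefix = route_prefix.strip("/")
    if not prefix:
        return path.startswith("/")  # mounted at the root
    mount = "/" + prefix
    return path == mount or path.startswith(mount + "/")


# Route prefix defaults to the external prefix: the UI lives under
# /thanos/rule both behind the proxy and when accessed directly.
default = effective_route_prefix("/thanos/rule")
print(serves("/thanos/rule/alerts", default))  # True
print(serves("/alerts", default))              # False

# Route prefix explicitly empty: the server answers at the root, so the
# reverse proxy has to strip /thanos/rule before forwarding.
print(serves("/alerts", effective_route_prefix("/thanos/rule", "")))  # True
```

Under this model, a component that applies the default must be proxied with the prefix kept in the upstream path, while a component that serves at the root must be proxied with the prefix stripped.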

@ncastrocosta If we modify this behavior in Ruler, you would have to modify your nginx config to something like

location /thanos/rule/ {
    proxy_pass http://internal.org:19902/thanos/rule/;
}

Also, even internally, the ruler would be accessible at /thanos/rule prefix.

@ncastrocosta

@prmsrswt Using v0.16.0 for Query, whether I configure --web.external-prefix=/thanos or --web.external-prefix=thanos, I'm redirected to "/thanos/graph" and get "404 page not found". "/thanos/metrics", for example, works fine. What happened to graph?

Author

perkons commented Oct 29, 2020

I got the query and rule working like this:

location /thanos/rule/ {
    proxy_pass http://internal.org:10911/;
}

location /thanos/query/ {
    proxy_pass http://internal.org:10913/thanos/query/;
}

When using:

location /thanos/query/ {
    proxy_pass http://internal.org:10913/;
}

I get the error "404 page not found".

There is an inconsistency between services.

#  thanos --version
thanos, version 0.16.0 (branch: HEAD, revision: dc6a1666bb68cd5c11be54452842e823f57668c5)
  build user:       root@ba806318d94d
  build date:       20201026-13:56:53
  go version:       go1.15
#
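
The two proxy_pass variants above differ in the path the upstream actually receives. When proxy_pass includes a URI part, nginx replaces the portion of the request path that matched the location with that URI. The Python sketch below models only this rule for prefix locations; the function name `nginx_upstream_path` is hypothetical and the model ignores regex locations, variables and URI normalisation.

```python
def nginx_upstream_path(location: str, proxy_pass_uri: str,
                        request_path: str) -> str:
    """Model nginx prefix-location rewriting when proxy_pass has a URI part."""
    if not request_path.startswith(location):
        raise ValueError("request does not match this location")
    # The matched location prefix is replaced by the proxy_pass URI.
    return proxy_pass_uri + request_path[len(location):]


# Rule: the prefix is stripped, so the upstream sees /alerts.
print(nginx_upstream_path("/thanos/rule/", "/", "/thanos/rule/alerts"))
# -> /alerts

# Query, working config: the prefix is kept in the upstream path.
print(nginx_upstream_path("/thanos/query/", "/thanos/query/",
                          "/thanos/query/graph"))
# -> /thanos/query/graph

# Query, failing config: the upstream sees /graph.
print(nginx_upstream_path("/thanos/query/", "/", "/thanos/query/graph"))
# -> /graph
```

The last case is consistent with the 404 reported above, assuming Query in 0.16.0 only serves under its route prefix while Rule still serves at the root.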

Member

onprem commented Oct 30, 2020

Yup, that's what I pointed out above, rule and querier are doing things differently.

stale bot commented Dec 31, 2020

Hello 👋 Looks like there was no activity on this issue for the last two months.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

@stale stale bot added the stale label Dec 31, 2020
@onprem onprem removed the stale label Jan 5, 2021
@stale stale bot added the stale label Mar 16, 2021
@onprem onprem removed the stale label Mar 17, 2021

dmilind commented Apr 16, 2021

I was also facing the same issue until I added a leading / to both flags, --web.route-prefix and --web.external-prefix.
I used the prefixes below to resolve this:
--web.route-prefix=/thanos/ruler
--web.external-prefix=/thanos/ruler
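
A small pre-flight check along these lines can catch the missing leading slash before a component is started. This is only a sketch: the function `prefix_flag_warnings` is hypothetical, and the mismatch warning reflects the setup in this comment, since differing prefixes can be intentional when the reverse proxy strips the prefix.

```python
from typing import List


def prefix_flag_warnings(route_prefix: str, external_prefix: str) -> List[str]:
    """Return human-readable warnings for a pair of prefix flag values."""
    warnings = []
    flags = (("--web.route-prefix", route_prefix),
             ("--web.external-prefix", external_prefix))
    for name, value in flags:
        # A leading slash is what resolved the problem in the comment above.
        if not value.startswith("/"):
            warnings.append(f"{name} should start with '/': {value!r}")
    # Differing values are not always wrong; flag them for review only.
    if route_prefix.rstrip("/") != external_prefix.rstrip("/"):
        warnings.append("route and external prefixes differ")
    return warnings


print(prefix_flag_warnings("/thanos/ruler", "/thanos/ruler"))  # []
```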

@stale stale bot added the stale label Jun 16, 2021

stale bot commented Jun 30, 2021

Closing for now as promised, let us know if you need this to be reopened! 🤗

@stale stale bot closed this as completed Jun 30, 2021

Member

onprem commented Jun 30, 2021

This is still a valid issue I guess.

@onprem onprem reopened this Jun 30, 2021
@stale stale bot removed the stale label Jun 30, 2021
@stale stale bot added the stale label Sep 3, 2021

stale bot commented Sep 19, 2021

Closing for now as promised, let us know if you need this to be reopened! 🤗
