Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix] Long funnels -> bad request bug #4561

Closed
wants to merge 1 commit into from
Closed

[Fix] Long funnels -> bad request bug #4561

wants to merge 1 commit into from

Conversation

alexkim205
Copy link
Contributor

@alexkim205 alexkim205 commented Jun 1, 2021

Problem

See #4554.

User created a long funnel (17 steps) which resulted in a request to api/insight/funnel that had a path longer than 4094 bytes. Because Gunicorn sets the default limit_request_line to 4094 to protect against DDOS attacks [docs], a bad request from the gunicorn layer was being propagated to the user.

Changes

  1. Change gunicorn limit_request_line to 8190 (max). 8190 bytes still allows user to create large funnels AND mitigates security risk.
  2. Add e2e test that makes sure long funnels can be made.

Checklist

  • N/A All querysets/queries filter by Organization, by Team, and by User
  • N/A Django backend tests
  • N/A Jest frontend tests
  • Cypress end-to-end tests
  • N/A Migrations are safe to run at scale (e.g. PostHog Cloud) – present proof if not obvious

Testing

We don't see this bug in our local dev setups, because wsgi configurations (like limit_request_line) are only applied on
deployed production instances of PostHog.

Because the bug only shows up in prod, I spent a lot of time trying to figure out how to test this. Finally decided to self host bug branch alexkim205/posthog#fix/long-request-lines on Heroku and spot test there.

Before

posthog cloud

Screen Shot 2021-06-01 at 2 42 34 PM

After

heroku app

Screen Shot 2021-06-01 at 2 38 24 PM

@alexkim205 alexkim205 changed the title Fix long funnels bad request bug (#4554) Fix long funnels bad request bug Jun 1, 2021
@alexkim205 alexkim205 changed the title Fix long funnels bad request bug [Fix] Long funnels -> bad request bug Jun 2, 2021
@@ -1,5 +1,5 @@
release: REDIS_URL='redis://' python manage.py migrate
web: gunicorn posthog.wsgi --log-file -
web: gunicorn posthog.wsgi --limit-request-line 8190 --log-file -
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only solve the issue for people using Heroku, but won't solve it for anyone else (including our cloud instance).

What was wrong with the approach of using a POST request instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timgl I generally agree that we should use POST for API requests, though here the case is that with URLs such as:

https://app.posthog.com/insights?insight=FUNNELS&interval=day&display=FunnelViz&actions=%5B%7B%22id%22%3A2671%2C%22math%22%3Anull%2C%22name%22%3A%22User%20Signed%20Up%22%2C%22type%22%3A%22actions%22%2C%22order%22%3A0%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%5D&events=%5B%7B%22id%22%3A%22%24pageview%22%2C%22math%22%3Anull%2C%22name%22%3A%22%24pageview%22%2C%22type%22%3A%22events%22%2C%22order%22%3A1%2C%22properties%22%3A%5B%7B%22key%22%3A%22%24active_feature_flags%22%2C%22type%22%3A%22event%22%2C%22value%22%3A%22%5C%22project-home-exp-5%5C%22%22%2C%22operator%22%3A%22icontains%22%7D%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A2%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A3%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A4%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A5%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A6%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A7%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A8%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A9%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A10%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A11%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A12%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A13%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A14%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A15%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%2C%7B%22id%22%3A%22first%20team%20event%20ingested%22%2C%22math%22%3Anull%2C%22name%22%3A%22first%20team%20event%20ingested%22%2C%22type%22%3A%22events%22%2C%22order%22%3A16%2C%22properties%22%3A%5B%5D%2C%22math_property%22%3Anull%7D%5D&properties=%5B%5D&filter_test_accounts=true&date_from=dStart&new_entity=%5B%5D

... the insights page itself doesn't open.

image

The GET URL of the API request being long is just a side-effect of this bigger problem. Replacing that with POST will not solve the core issue.

Solving this effectively requires untangling all frontend state that depends on the URL via an approach like #4329 . This would benefit from having prior experience taming propertyFilterLogic and related spaghetti.

Alternatively to just limit the amount of data reaching Django, we can move the filters from ?... to #... in the URL. Everything after the hash will not be sent to Django, and can be as long as your M1's swap space.

This fix does however include a delicate tightrope one must walk in the frontend codebase, including some effort to make things backwards compatible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solving this effectively requires untangling all frontend state that depends on the URL via an approach like #4329 . This would benefit from having prior experience taming propertyFilterLogic and related spaghetti.

We should definitely do this, however for the scope of this specific issue the insights URL wasn't the issue, just the API request

Copy link
Contributor Author

@alexkim205 alexkim205 Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: @mariusandra

+1 that POST won't fix the issue. Also wanted to add that if you're building up the funnel in the console by adding more steps (as opposed to clicking a direct link), we propagate the error a bit nicer:

스크린샷 2021-05-28 오후 7 15 14 (1)

Digging into network requests brings the bad API request to surface.

I also don't believe the ?... -> #... stopgap solution will work, because we would be sending a >4094 byte request across the network in both cases, and the WSGI layer will still complain.

re: @timgl

Do you think the scope of this project belongs in the context of #4329 (seems like it would require a large refactor but would be cleaner in the longrun)? Or should we look for a hacky stopgap?

This PR would be a really easy (reversible) stopgap solution that patches PH Cloud and Heroku self-hosted customers. It would unblock the original customer who raised the issue as they are using app.posthog.com.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexkim205 I don't see anything that will work for cloud in this PR?

Also this issue is completely separate from #4329. #4329 is about the URL the user sees in their browser, this PR is about the API request. I see POST as a long term solution rather than a hacky stopgap unless I'm missing something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timgl got it re: API vs frontend split with regards to the scope of this PR.

Let's tackle the "insights page won't open in the first place with a huge URL" issue in #4329 , and indeed just convert the API to use a POST request.


Offtopic then, but adding for completion:

The ?... -> #... fix will work as anything after # is never sent to the server. The user might copy/paste a longer than 4096 byte URL into their slack/email/wherever, but when opening the link, Django will only get everything until "#".

However, that's outside the scope here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexkim205 I don't see anything that will work for cloud in this PR?

Also this issue is completely separate from #4329. #4329 is about the URL the user sees in their browser, this PR is about the API request. I see POST as a long term solution rather than a hacky stopgap unless I'm missing something?

Ah I misunderstood the separation of URL and API, my bad. Going with the original POST solution

@alexkim205
Copy link
Contributor Author

Closing in favor of #4595.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants