url_ignorelist on Autocapture config doesn't seem to be working #1402

Open
jperezr21 opened this issue Sep 6, 2024 · 4 comments

Comments

@jperezr21

I want to ignore events from the Vercel screenshots crawler (user agent vercel-screenshot/1.0).

These events have $current_url set to https://project-and-deployment-id.vercel.app/.

I tried adding all of the following:

    autocapture: {
      url_ignorelist: [/.*\.vercel\.app\/.*/, "vercel.app", "vercel.app/.*"],
    },

but I'm still receiving these events.

Am I doing something wrong or is this a bug?

@pauldambra
Member

pauldambra commented Sep 6, 2024

Hey,

Are you receiving only $autocapture events? That ignore list only stops $autocapture events (e.g. a click on a div); it doesn't affect other event types.

I'd expect it's better to edit the user agent blocker: in your config you can set additional user agents that we'll detect as bots.

For this crawler you'd set custom_blocked_useragents: ['vercel-screenshot/1.0']
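Roughly, as a sketch rather than a drop-in config: the options object below adds the crawler's user agent to `custom_blocked_useragents`. The `api_host` value and the `posthogOptions` name are placeholders for illustration, and the `posthog.init` call is left as a comment because it needs posthog-js loaded.

```javascript
// Sketch only: options to pass as the second argument of posthog.init().
// "posthogOptions" and the api_host value are illustrative placeholders.
const posthogOptions = {
  api_host: "/ingest",
  capture_pageleave: true,
  // Extra user agent strings to treat as bots, in addition to the defaults.
  custom_blocked_useragents: ["vercel-screenshot/1.0"],
};

// Usage (requires posthog-js, which is not imported in this sketch):
// posthog.init(env.NEXT_PUBLIC_POSTHOG_KEY, posthogOptions);
```

This only helps when the crawler actually sends that user agent string on every request.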

(ofc if you're fine with other events and it's just autocapture you want to block, we can check that too :))

@jperezr21
Author

These are $pageleave events, which I have enabled in the config. So maybe they aren't $autocapture. Here's my whole config:

  posthog.init(env.NEXT_PUBLIC_POSTHOG_KEY, {
    api_host: "/ingest",
    ui_host: "https://app.posthog.com",
    person_profiles: "identified_only",
    capture_pageview: false,
    capture_pageleave: true,
    autocapture: {
      url_ignorelist: [/.*\.vercel\.app\/.*/, "vercel.app", "vercel.app/.*"],
    },
  });

I'll try adding the custom_blocked_useragents param.

Thanks!

@jperezr21
Author

jperezr21 commented Sep 6, 2024

Turns out Vercel doesn't always set the user agent to that 🤦

Their latest request came with Mozilla/5.0 (iPhone; CPU iPhone OS 17_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148.

I think the PostHog client should have a global url_ignorelist option.

@pauldambra
Member

Hey,

In the short term, you can detect the URL and not initialize PostHog at all. That is equivalent to a URL ignore list (if slightly more work on your part) and should work if you never want to collect on that URL. It's similar to how folk check whether they're running on localhost and don't send data.
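A minimal sketch of that check, assuming every `*.vercel.app` host should be ignored (which may be too broad if you also serve real traffic from such a domain). `shouldInitAnalytics` is a hypothetical helper name, not part of posthog-js, and the `posthog.init` call is commented out because it needs the library and a browser.

```javascript
// Hypothetical helper: decide from the hostname whether analytics should start.
function shouldInitAnalytics(hostname) {
  const host = hostname.toLowerCase();
  // Same idea as the usual "don't send data from localhost" check.
  if (host === "localhost" || host === "127.0.0.1") return false;
  // Assumption: every *.vercel.app host is a deployment we never want to track.
  if (host === "vercel.app" || host.endsWith(".vercel.app")) return false;
  return true;
}

// Usage in the browser (requires posthog-js, which is not imported here):
// if (shouldInitAnalytics(window.location.hostname)) {
//   posthog.init(env.NEXT_PUBLIC_POSTHOG_KEY, { api_host: "/ingest" });
// }
```

Because the check runs before initialization, it does not depend on the crawler's user agent at all.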

I think this is maybe another vote for a global onEvent or similar that lets folk mask / edit / reject events. In your case it'd be if (url === blah) return null 🤔
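To make the idea concrete, here is a sketch of what such a hook function could look like. Everything about it is an assumption for illustration: the function name `rejectVercelPreviewEvents`, the event shape (a `properties` object carrying `$current_url`), and the convention that returning `null` drops the event. How it would be registered with the client is deliberately left out.

```javascript
// Hypothetical hook: takes an event, returns it unchanged or null to reject it.
// The event shape ({ event, properties: { $current_url } }) is an assumption.
function rejectVercelPreviewEvents(event) {
  const url = event?.properties?.$current_url;
  if (typeof url !== "string") return event;

  let hostname;
  try {
    hostname = new URL(url).hostname;
  } catch {
    // Not an absolute URL: leave the event alone rather than guess.
    return event;
  }

  // Drop anything captured on a *.vercel.app host, keep everything else.
  return hostname.endsWith(".vercel.app") ? null : event;
}
```

Unlike the autocapture ignore list, a hook like this would see every event type, including $pageleave.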

2 participants