[🚀 Feature]: Implement download tracking in selenium manager #11211
Comments
I agree that it would be very interesting to know how many users are using Selenium Manager. But I am unsure whether proxying each Selenium Manager request is the best choice, since it can be problematic. Even in the best case, driver downloads will be slower with an intermediate page intercepting the traffic. Moreover, the selenium.dev server could become a bottleneck. Finally, we will need backend support to track all these requests. An alternative might be to keep downloading directly from each driver repository (chromedriver, geckodriver, etc.) and create a tracking endpoint on selenium.dev that Selenium Manager invokes after resolving a driver. But even in that case, we are adding overhead just to monitor the use of Selenium Manager. |
Just curious. What do we intend to do with this data? |
As a start, we want to track usage: how many users, in which languages, are now relying on Selenium Manager. We have invested a great amount of time, and it'd be very useful to know if the code we are shipping is being used. After learning usage, we can start figuring out what else we want to measure and how. Also, following what @bonigarcia mentions, coding and/or hosting a tracking solution ourselves is shooting ourselves in the foot. It would be the ideal solution because we could ensure some privacy first. Currently we use https://plausible.io/ to track selenium.dev, but the quota we are paying for would not be enough to hold the data we would capture. I guess the easiest solution would be to use Google Analytics. I implemented a client to track Zalenium usage back then, but we had help from the legal area in my previous company to draft the terms and conditions to be compliant. If we want to use that, we need to check with the PLC and ask the SFC for their help with the legal terms. |
How much extra traffic are we expecting if each driver download sends an HTTP call with a few bytes of data that doesn't expect a response? |
Just saying it might be worth paying plausible a little more if it can track that info. Would still want to double check with SFC on privacy |
I will check with Plausible and see how much they could charge us. |
I am lost with this feature. Is it still required? |
It would be nice to have. But it adds overhead to each request. Maybe we can wait until we release Selenium Manager officially to track downloads in that way? Tracking every single request might be too much. |
I totally agree that it would be great to monitor the use of Selenium Manager, but adding that overhead is too much, IMO. |
I just realized that having an official release won't track downloads since we package SM in the bindings. |
Yes, that's true. |
@diemol @bonigarcia Would it make sense for Selenium Manager to emit a beacon that contains the tracking information and can be collected separately (optionally, of course)? That way we wouldn't be interfering with the primary responsibility of Selenium Manager, nor would we be creating a single point of failure or bottleneck, but we would still be able to gather the usage data. But yes, it would involve a bit more work and won't necessarily be the need of the hour right now. |
That is the overhead we want to avoid. Besides, we need a place to store that. |
We could implement a stupid basic page with a couple lines of JS that just redirects the download request and records what was requested: Could add additional metadata: |
What if we do it as a straight url redirect: I really do think this needs to be part of our final release. We currently have no way of knowing what Selenium usage looks like. If that moves us to a new Plausible tier, I think it's worth it. If it adds a fraction of a second to the download, I think that's worth it as well. |
I would avoid using |
I think github is more reliable than most services? Also we can have a fallback in Selenium Manager. The advantage of selenium.dev is also that we already have Plausible hooked up. |
We should send requests directly to Plausible then. |
That works |
I reached out to Plausible, and they said we can start sending events. With that, they can figure out the new pricing. We can go over our quota for 2 months without incurring extra costs. This is the events API https://plausible.io/docs/events-api |
Don't we want something we can disable, though? I was thinking we could start with: https://plausible.io/docs/custom-query-params and then use: If something unforeseen happens, we can turn off tracking to |
Right, we need to write down first what we want to collect. |
I've been analyzing how to implement this issue and have different questions. If I'm not wrong, first, we need to create an HTML page in the Selenium site, for instance, called https://www.selenium.dev/manager/0.4.16/stats.html Wouldn't it be better to store this page directly in the https://www.selenium.dev/manager/stats.html Then, the In the Rust, each time Selenium Manager is executed, SM sends an HTTP request to something like: Regarding the parameters proposed by @titusfortner, Regarding the Regarding the I have further comments but would like to resolve these questions first. |
The reason I suggested that it have a version is to make it so we can turn off tracking for a specific version if there was something wrong with it, without affecting everything else. If something in a release got formatted incorrectly or logged something differently in a way we don't want... Maybe it isn't necessary, but that was my thought process. The rest makes sense to me, use SM version & add |
I don't think we should build a page to parse the values we send in the query parameters when we can build all those in a payload and send it to Plausible. Maybe I am missing something? |
Indeed, building a custom webpage on our site won't work since that page is supposed to be parsed in a browser that executes the JavaScript. @diemol Maybe your idea is that Selenium Manager creates a POST request as specified in the event API doc to a non-existing URL as follows:
Is that correct? |
Yes, that was the initial idea. Using a different URL, though. So the stats do not mix with the website stats. |
What URL then? Also, does the Selenium project already have an account on Plausible? |
Yes, we do. We can create a new subdomain, like sm.selenium.dev. |
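For readers following along, the POST request discussed above can be sketched roughly as follows. This is only an illustration in Python (Selenium Manager itself is written in Rust) and not the code from the PR. It assumes the `name`, `domain`, `url` and `props` fields and the `https://plausible.io/api/event` endpoint described in the Plausible events API docs linked earlier; the `sm.selenium.dev` domain is the subdomain suggested in this thread, and the function name, user agent and property names are hypothetical.

```python
import json
import urllib.request

# Endpoint from the Plausible events API docs linked in this thread.
PLAUSIBLE_EVENT_URL = "https://plausible.io/api/event"
# Assumption: the separate subdomain suggested above, so stats do not mix
# with the website stats.
STATS_DOMAIN = "sm.selenium.dev"


def build_stats_request(props, user_agent="selenium-manager-sketch"):
    """Build (but do not send) a POST request carrying one usage event."""
    payload = {
        "name": "pageview",
        "domain": STATS_DOMAIN,
        "url": f"https://{STATS_DOMAIN}/",
        # Custom properties; the keys used below are hypothetical.
        "props": {key.lower(): str(value).lower() for key, value in props.items()},
    }
    return urllib.request.Request(
        PLAUSIBLE_EVENT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "User-Agent": user_agent},
        method="POST",
    )


# Nothing is sent here; the request object is only constructed.
request = build_stats_request({"Browser": "Chrome", "OS": "Linux", "Lang": "Java"})
```

Keeping the request construction separate from the network call makes it easy to check what would be sent, and to skip sending entirely if tracking is turned off.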
I'm going to open a separate ticket for the bindings to send language info. Primarily we need to make sure the data is showing up properly in Plausible from the SM in the PR before we close this out. |
@bonigarcia I think there should be a connection/read timeout of only a few seconds to avoid issues when the service is offline or not reachable. As I read the code and the docs, there is currently no timeout at all, so it could block Selenium Manager? |
As explained in the comments of #13173:
Moreover, the call to Plausible already has a short timeout of 3 seconds. See line 68 in |
…3173)
* [rust] Send stats to Plausible (#11211)
* [rust] Read selenium version from clap cargo feature
* [rust] Set language binding in test
* [rust] Include assertion about stats error message
* [rust] Use selenium domain (selenium.dev) as plausible stats target
* [rust] Make asynchronous request to plausible
* [rust] Include argument --avoid-stats to not send statistics to plausible.io
* [rust] Include request timeout and improve error handling
* [rust] Spawn new thread for calling to stats function
* [rust] Use message channel for passing errors when sending stats
* [rust] Update checksum un lock file
* Update rust/src/stats.rs
* [rust] Check if SM is offline before sending the stats
* [rust] Send custom properties to plausible in lowercase
---------
Co-authored-by: Diego Molina <[email protected]>
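The pattern these commits describe (an opt-out flag, an offline check, a request timeout, a separate thread, and a channel for passing errors back) can be sketched as below. This is a hedged illustration in Python rather than the actual Rust implementation. The function and parameter names are hypothetical, the sender is injected so the sketch never touches the network, and the 3-second default mirrors the timeout mentioned in the comment above.

```python
import queue
import threading


def send_stats_in_background(sender, avoid_stats=False, offline=False, timeout=3.0):
    """Run sender(timeout) on a worker thread.

    Returns (thread, errors), where errors is a queue that receives a message
    if sending fails. When stats are disabled or the tool is offline, nothing
    is sent and thread is None.
    """
    errors = queue.Queue()
    if avoid_stats or offline:
        return None, errors

    def worker():
        try:
            sender(timeout)
        except Exception as exc:
            # A stats failure is reported, never raised to the caller.
            errors.put(f"Error sending stats: {exc}")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, errors


def failing_sender(timeout):
    """Hypothetical sender that simulates an unreachable stats service."""
    raise TimeoutError(f"no response within {timeout} seconds")


thread, errors = send_stats_in_background(failing_sender)
thread.join()
message = errors.get_nowait()

skipped_thread, skipped_errors = send_stats_in_background(failing_sender, avoid_stats=True)
```

The point of this design is that resolving a driver never waits on, or fails because of, the stats call: the worst case is an error message on the queue that the caller can log or ignore.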
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Feature and motivation
The idea is to get basic usage information about who is using Selenium Manager.
We could create a basic page on https://www.selenium.dev/selenium/manager. The Selenium Manager would route all requests for downloads through that page. Something like:
Which would redirect and return the correct file, but would allow us to see what language/os/browser/browser versions are being used.