Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http, logs: only count GET requests as downloads. #171

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Jehan
Copy link

@Jehan Jehan commented Apr 10, 2024

Other request methods such as POST or HEAD or anything else than GET should not be counted as a new download.


GNOME admins updated our MirrorBits instance for GIMP now that #128 is merged, but I was disappointed to still see a lot of downloads (too many that I would believe, i.e. 2.6 million downloads a day!). I looked closer at MirrorBits code and realized it is counting as download any request. In particular, I am guessing that HEAD may be quite a common request run by download accelerators or torrent clients on webseeds… well I am mostly guessing that this may be another source of false positives which may explain our huge stats.

In any case, whether it's a reason or not for our over-high stats, I think it does make sense that only GET requests should be counted as downloads, right?

Note that it would still log all requests (adding an additional "method:GET" or "method:HEAD" for instance in the log file), but only the GET requests would now increase the download stats.

Other request methods such as POST or HEAD or anything else than GET
should not be counted as a new download.
dlogger.l.Printf("%s %d \"%s\" ip:%s mirror:%s%s %sasn:%d distance:%skm countries:%s",
typ, statuscode, p.FileInfo.Path, p.IP, m.Name, fallback, sameASNum, m.Asnum, distance, countries)
dlogger.l.Printf("%s %d \"%s\" method:%s ip:%s mirror:%s%s %sasn:%d distance:%skm countries:%s",
typ, statuscode, p.FileInfo.Path, method, p.IP, m.Name, fallback, sameASNum, m.Asnum, distance, countries)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not log method:%s in other cases, while you're at it? Ie. in the cases below, with statuscode == 404, 500 and the final else?

logs.LogDownload(resultRenderer.Type(), status, results, err)
if len(mlist) > 0 {
logs.LogDownload(resultRenderer.Type(), r.Method, status, results, err)
if len(mlist) > 0 && r.Method == "GET" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the impression that you could tighten it a bit more, and check the resultRenderer. In case it's a JSONRenderer, it's not really a download.

So it would be about adding && resultRenderer.Type() == "REDIRECT" I think

Copy link
Contributor

@elboulangero elboulangero Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another comment: why not move the check r.Method == "GET" two lines above, at the level of if !ctx.IsMirrorlist()? This way, there's no need to modify logs.LogDownload at all.

It could be argued that the file downloads.log is meant to log downloads, and HEAD requests are not downloads, so shouldn't be logged in this file. I have the impression that it was the original intention. You can always look at HEAD requests in the nginx logs (assuming you run mirrorbits being a reverse proxy).

OTOH it could be argued that the file downloads.log was misnamed from the start, and it's useful to log HEAD requests in there, even if it's not a download?

This is just for the sake of discussion, I don't think it matters much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants