[meta] alexa webhook to prioritize the issues based on their domain name #1533

MDTsai · 2017-04-26T07:58:33Z

I would like to implement an Alexa webhook. When a new issue is created, the webhook can look up the website's Alexa ranking and attach that information to the issue. This could help us prioritize issues during triage.

Implement a python script to dump alexa top sites to db + Unit tests #1579
Create 3 labels for issue priority from alexa ranking #1592
Implement alexa webhook for priority labels #1594

MDTsai · 2017-05-03T10:32:51Z

I think that in helpers.py I can extend parse_and_set_label to get the base URL from the issue URL, query the Alexa ranking, and set it as a new label (?)
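A minimal sketch of the "get the base URL" step, using only the standard library. The helper name `base_domain` is hypothetical and is not part of helpers.py; it simply returns the hostname without a leading `www.`, which is an assumption about what "base URL" should mean here. Finding the true registrable domain would probably need the Public Suffix List.

```python
from urllib.parse import urlsplit


def base_domain(url):
    """Return the lowercased hostname of `url` without a leading 'www.'.

    Hypothetical helper for illustration only.
    """
    # Reporters often omit the scheme; urlsplit needs one to locate the host.
    if "://" not in url:
        url = "http://" + url
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host
```

The returned string could then serve as the key for a rank lookup.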

zoepage · 2017-05-03T11:05:13Z

@karlcow ^ ?

karlcow · 2017-05-03T12:20:37Z

@MDTsai this is a good idea. Interesting project. We have reached the point where we face a flood of tests that we need to handle every day, so we really need to prioritize. This will require unit tests and probably performance tests.

First - Evaluation

As a first test, we should grab the list of all individual domains we currently have, write a one-time script that queries Alexa for all of these domains, and define how we would meaningfully set the priority for each domain. Then we will have a better idea of whether it really helps us prioritize.
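The one-time evaluation could be sketched roughly as follows. Everything here is an assumption for illustration: `rank_coverage` is a hypothetical name, `ranks` stands in for whatever rank data gets fetched from Alexa (here a plain dict of domain to rank), and the cutoffs are placeholders.

```python
from collections import Counter
from urllib.parse import urlsplit


def rank_coverage(issue_urls, ranks, cutoffs=(100, 1000, 10000)):
    """Count unique domains per rank bucket (hypothetical helper).

    `ranks` maps domain -> rank; domains missing from it count as unranked.
    """
    domains = {urlsplit(u).hostname for u in issue_urls}
    domains.discard(None)  # URLs without a host
    counts = Counter()
    for domain in domains:
        rank = ranks.get(domain)
        if rank is None:
            counts["unranked"] += 1
            continue
        for cutoff in cutoffs:
            if rank <= cutoff:
                counts["top-%d" % cutoff] += 1
                break
        else:
            # Ranked, but beyond the largest cutoff.
            counts["beyond-%d" % cutoffs[-1]] += 1
    return dict(counts)
```

If most domains end up in the unranked bucket, that would suggest the rank alone does not help much with triage.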

Implementation ideas

There are probably two ways of doing this.
These services might require request queue management.
It has to be async so we don't wait on the Alexa answer before publishing the information.

WebHooks

A new issue is created
Webhook listening on IssueEvent for the action: opened.
Grab the URI of the issue and the payload (both are in the event)
Send a request to Alexa for the rank (listen for success and failure). Log failures.
Convert the rank into a priority scale (probably a scale of 2 "It's important./It's not important")
Ask webcompat-bot to add the label.
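The webhook steps above can be sketched as a plain function that a webhook endpoint would call. This is only a sketch under assumptions: `handle_issue_event`, `lookup_rank` and `add_label` are hypothetical names, the `**URL**:` line in the issue body is an assumed format, and the cutoff of 1000 for "important" is a placeholder. The `action` and `issue` keys follow GitHub's `issues` webhook payload.

```python
import logging
import re

log = logging.getLogger(__name__)

# Assumption: the issue body carries a line such as "**URL**: https://example.com/".
URL_LINE = re.compile(r"\*\*URL\*\*:\s*(\S+)")


def handle_issue_event(payload, lookup_rank, add_label):
    """Process one `issues` webhook payload; return the label added, or None."""
    if payload.get("action") != "opened":
        return None
    issue = payload.get("issue", {})
    match = URL_LINE.search(issue.get("body") or "")
    if not match:
        return None
    try:
        rank = lookup_rank(match.group(1))
    except Exception:
        # Log the failure and leave the issue unlabeled.
        log.exception("Alexa lookup failed for issue %s", issue.get("number"))
        return None
    # Two-value scale; the cutoff of 1000 is a placeholder.
    label = "important" if rank is not None and rank <= 1000 else "not-important"
    add_label(issue["number"], label)
    return label
```

Passing the lookup and the labeling call in as arguments keeps the function testable without network access.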

Internally

A new issue is created
We get the URI back from GitHub and we know the payload.
We send a query to an async function in the current Flask app (using threading? async? or maybe Flask Signals; to think about).
Send a request to Alexa for the rank (listen for success and failure). Log failures.
Convert the rank into a priority scale (probably a scale of 2 "It's important./It's not important")
Ask webcompat-bot to add the label.
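For the threading option, one possible shape is a small worker pool from the standard library, so the request that creates the issue does not wait on Alexa. This is a sketch, not a decision between threading, async or Flask Signals; `submit_logged` is a hypothetical name and the pool size is arbitrary.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger(__name__)
executor = ThreadPoolExecutor(max_workers=2)  # arbitrary pool size


def submit_logged(func, *args):
    """Run func(*args) on a worker thread and log any exception it raises.

    Returns the Future immediately, so the caller never blocks.
    """
    future = executor.submit(func, *args)

    def report(done):
        exc = done.exception()
        if exc is not None:
            log.error("background task failed: %r", exc)

    future.add_done_callback(report)
    return future
```

The function that queries the rank and asks the bot to add the label would be the `func` passed in.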

miketaylr · 2017-05-04T20:55:40Z

I think in helpers.py, I can extend parse_and_set_label to get the base URL from URL, then query the alexa ranking, set to a new label (?)

You could do it this way, or just create a new webhook endpoint and operate only on the URL.

miketaylr · 2017-05-04T20:56:52Z

Convert the rank into a priority scale (probably a scale of 2 "It's important./It's not important")

I think we're getting ahead of ourselves. First step is just to leave a comment, or add a label that reflects what the Alexa ranking is. I think we need a human process to figure out what the priorities are, and once that's understood, teach robots how to help.

karlcow · 2017-05-05T00:14:55Z

I think we're getting ahead of ourselves. First step is just to leave a comment, or add a label that reflects what the Alexa ranking is.

But to get there you are already moving too fast in developing the mechanics. A label is not practical. Let's say you get alexa-000001, alexa-000002, etc. A comment would be the sensible thing to do. But as I said in my initial comment, it's already too soon. Small steps. 👣

@miketaylr About

I think we need a human process to figure out what the priorities are, and once that's understood, teach robots how to help.

And that's why I was saying:

As a first test, we should grab the list of all individual domains we currently have, write a one-time script that queries Alexa for all of these domains, and define how we would meaningfully set the priority for each domain. Then we will have a better idea of whether it really helps us prioritize.

This means: create a script, with nothing wired into our mechanics, and test it with the current list of domains so we can learn something. I don't even think we should go ahead without an idea of what our data are and whether having an Alexa rank helps. :)

miketaylr · 2017-05-05T02:02:52Z

Let's say you get an alexa-000001, alexa-000002, etc. A comment would be the sensible thing to do.

Yeah, that kind of label wouldn't be interesting; it's too granular. Something more like alexa-top-100, alexa-top-1000, alexa-top-10000, alexa-top-100-mexico, or whatever.
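To illustrate the coarser labels, here is a small sketch that maps a rank to one of these bucket names. `bucket_label` and `CUTOFFS` are hypothetical, and the cutoffs simply mirror the examples in this comment.

```python
from bisect import bisect_left

# Cutoffs taken from the example label names; not a decided list.
CUTOFFS = [100, 1000, 10000]


def bucket_label(rank, country=None):
    """Return e.g. 'alexa-top-1000', or None when rank is beyond the last cutoff."""
    # Index of the first cutoff that is >= rank.
    i = bisect_left(CUTOFFS, rank)
    if i == len(CUTOFFS):
        return None
    label = "alexa-top-%d" % CUTOFFS[i]
    return "%s-%s" % (label, country) if country else label
```

For example, `bucket_label(57, "mexico")` gives `alexa-top-100-mexico`, while a rank of 101 falls into `alexa-top-1000`.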

That said, I think we should let @MDTsai have the freedom to experiment and do research (some script like you're describing could be useful). Let's discuss f2f in our next team meeting about priority triage, I have some other thoughts on how it might be done.

MDTsai · 2017-05-05T10:11:04Z

Amazon provides two Alexa APIs. The first is Alexa Top Sites. It returns a list on request; we can specify the start rank, the count, and a country code. Each URL returned costs $0.0025.

The second API is the Alexa Web Information Service. It provides detailed information, as mentioned here. I don't think it's a good idea to query each website and then cache the result; we don't need that much detail. This API has no minimum fee, and the first 1,000 requests are free.

The purpose of this idea is to save time handling issues, so my idea is to give these priorities:

Critical: Alexa top 100 worldwide
Important: Alexa top 101-1000 worldwide, or Alexa top 100 in tier 1 countries/regions
Normal: Alexa top 1001-10000 worldwide, or Alexa top 101-1000 in tier 1 countries/regions
Others: everything else
The numbers are not fixed; this is just a concept and we can change them as we wish. We can cache the sites in the first 3 priorities and update the cache every week or month to improve response time.
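The four tiers could be expressed as a small function, sketched here with the thresholds from this comment. The names `priority`, `global_rank` and `tier1_rank` are hypothetical; `tier1_rank` is assumed to be the site's best rank in any tier 1 country/region, and the tier 1 list itself is not defined yet.

```python
def within(rank, limit):
    """True when a rank is known and no worse than `limit`."""
    return rank is not None and rank <= limit


def priority(global_rank=None, tier1_rank=None):
    """Map ranks to the proposed tiers; the thresholds are not final."""
    if within(global_rank, 100):
        return "critical"
    if within(global_rank, 1000) or within(tier1_rank, 100):
        return "important"
    if within(global_rank, 10000) or within(tier1_rank, 1000):
        return "normal"
    return "others"
```

A site ranked 50000 globally but 80 in a tier 1 country would come out as `important` under these placeholder numbers.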

karlcow · 2017-05-05T13:37:39Z

@MDTsai thanks for the clarification. The first API does indeed make caching possible.
Currently we have close to 6,000 bugs, maybe 10,000 with Tech Evangelism/Mozilla, with some duplicates.

softvision-sergiulogigan · 2017-05-09T13:45:08Z

A question from our meeting on May 9th:
How will the Alexa thing work? Sites may be on a low position on a global scale, but very high in a country list.

MDTsai · 2017-05-10T08:32:25Z

@softvision-sergiulogigan thanks for the question. As for how the Alexa thing will work: when an issue is opened, the webhook will add a label or leave a comment with the Alexa ranking. It's not decided yet. I prefer labels, which are easy to filter on when doing diagnosis.
For the second question, as in my previous comment: if a site is in the top 100 in tier 1 countries/regions, it's also important for us and could be handled faster than others. This reminds me to find the tier 1 list.

karlcow · 2017-05-11T04:38:33Z

Related to this discussion, here are the minutes of this week's meeting.
https://wiki.mozilla.org/Compatibility/Meetings/2017-05-09#webcompat_Priority_triage_.28miketaylr.29

Mike: we still need a bit more information. Do you think we should add the labels today, or should we wait until we have stuff like the alexa bot? (Team agrees that we can start now)
Sergiu: I have a question: How does the Alexa thing work? Sites may be on a low position on a global scale, but very high in a country list.
Mike: I think it's unknown. We have a specific GitHub issue (https://github.com/webcompat/webcompat.com/issues/1533), can you raise that question in the issue?
Sergiu: Sure.

karlcow · 2017-06-14T23:40:15Z

@MDTsai I arranged a list of issues in your first comment, using the ones you opened. It will be easier to see the progress and whether we missed anything.

MDTsai · 2017-06-15T07:40:16Z

Thanks @karlcow !

miketaylr · 2018-08-08T19:18:10Z

This seems done!

MDTsai self-assigned this Apr 26, 2017

karlcow added lang: Python type: feature request labels May 3, 2017

karlcow changed the title ~~[meta] alexa webhook~~ [meta] alexa webhook to prioritize the issues based on their domain name May 3, 2017

MDTsai mentioned this issue Jun 2, 2017

Implement a python script to dump alexa top sites to db #1579

Closed

This was referenced Jun 13, 2017

Create 3 labels for issue priority from alexa ranking #1592

Closed

Implement alexa webhook for priority labels #1594

Closed

softvision-oana-arbuzov mentioned this issue Sep 14, 2017

Priority Labeling Criteria #1824

Closed

miketaylr unassigned MDTsai Jan 30, 2018

miketaylr closed this as completed Aug 8, 2018

ksy36 mentioned this issue Mar 21, 2022

Alexa sunset. Priority rethinking. #3656

Closed
