
Choosing filter lists for AdGuard Home #1325

Closed
3 tasks done
DandelionSprout opened this issue Jan 2, 2020 · 51 comments


@DandelionSprout
Member

DandelionSprout commented Jan 2, 2020

Prerequisites

Please answer the following questions for yourself before submitting an issue. YOU MAY DELETE THE PREREQUISITES SECTION.

  • I am running the latest version
  • I checked the documentation and found no answer
  • I checked to make sure that this issue has not already been filed

Problem Description

As I was testing with a stock AGH install on AMD64 Windows, I noticed that the available selection of lists on default installations is still pretty weak:
image

So I think it's time to consider trusting some more lists enough for them to be included by default as well. I'd recommend one or more of the following (in mostly random order):

—— Non-regional ——

—— Regional ——

And that's even without getting into more proof-of-concept-esque list purposes like censorship evasion lists, Nintendo DS online servers, software update blockers, anti-gambling, and so on. I also chose to exclude unofficial domains versions of ABP-formatted lists.

Proposed Solution

Add some more lists to default AdGuard Home installations.

Alternatives Considered

Adding a link to FilterLists.com on the filter settings page would've also been great, although I (who've worked very extensively on its list cataloguing) will be the first to admit that it's become a daunting site for newcomers to browse through, and is almost unusable on phones.

Additional Information

These list additions would be done independently of you guys' plans to create an official script to convert adblocker lists to AGH lists, and these list suggestions should hopefully not interfere with those plans of yours for the immediate time being.

@ameshkov
Member

ameshkov commented Jan 6, 2020

Yeah, it's time indeed.

Marked as "help wanted" -- please suggest other lists here.

@Slipi089

The different lists from https://energized.pro/ would be great

@DandelionSprout
Member Author

DandelionSprout commented Jan 16, 2020

Since the Energized lists are compiled almost entirely from other lists (except for ~20,000 of the entries that come from Energized Core), I personally would have chosen to prioritise adding the best ones among the source lists instead.

@Slipi089

The best ones will always be different for different people, and you will never be able to please everyone.

@DandelionSprout
Member Author

Fair point.

@ameshkov
Member

ameshkov commented Jan 16, 2020

I'd surely prioritize lists compiled using more advanced syntax over the old-school hosts files. People tend to turn on every available filter list, and bloated many-megabytes hosts lists aren't really good for performance.

@WildByDesign

@ameshkov I'm not sure if this resource was listed yet or not: https://github.com/mmotti/adguard-home-filters

There is a fantastic REGEX list and also a separate filter list. The filter list appears to pull in from 9 or 10 other reputable source lists and actually parses those lists into a single AGH-compatible list which utilizes the more advanced syntax specifically just for AGH.

Cheers!

@gentlyxu

For Chinese users, anti-AD is a good choice. The filter list uses Adblock-style syntax, see: https://raw.githubusercontent.com/privacy-protection-tools/anti-AD/master/anti-ad-easylist.txt (you can press Command+F and search for the "/" symbol).

Project URL: https://github.com/privacy-protection-tools/anti-AD/

This project is still being updated and will get better and better.

@adworacz

adworacz commented Feb 8, 2020

Another list recommendation: https://github.com/notracking/hosts-blocklists/blob/master/adblock/adblock.txt

I opened an issue with the repo, and after some discussion the author was willing to produce a list that's compatible with Adguard Home (and other browser plugins, to be honest).

The nice thing about this list is that it is optimized to take advantage of wildcard/subdomain matching, which saves a LOT of space.

It's also updated regularly, and includes a multitude of sources that are checked regularly for updates, dead domains, and more.
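
A minimal sketch of why wildcard/subdomain matching saves space, assuming each output rule is written as `||domain^` and covers the domain plus all of its subdomains. The function names and sample domains are hypothetical; this is not how the notracking list is actually generated.

```python
def collapse_subdomains(domains):
    """Drop every domain that is already covered by a listed parent domain."""
    listed = {d.strip().lower().rstrip(".") for d in domains if d.strip()}
    kept = []
    for domain in sorted(listed):
        labels = domain.split(".")
        # Every proper suffix of the name is a possible parent ("b.c" and "c" for "a.b.c").
        parents = (".".join(labels[i:]) for i in range(1, len(labels)))
        if not any(parent in listed for parent in parents):
            kept.append(domain)
    return kept


def to_adblock_rules(domains):
    """Render the surviving domains as ||domain^ rules."""
    return [f"||{d}^" for d in collapse_subdomains(domains)]


if __name__ == "__main__":
    # Hypothetical sample input in plain-domain form.
    hosts = ["ads.example.com", "example.com", "tracker.example.net", "a.ads.example.com"]
    print(to_adblock_rules(hosts))
```

In this sample, four hosts-style entries shrink to two rules, because `example.com` already covers both of its listed subdomains.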

@notracking

@gentlyxu be aware that these lists contain quite a few filters that a lot of people would consider false positives.

0.0.0.0 scribol.com
0.0.0.0 tracking.epicgames.com
0.0.0.0 logrocket.com
0.0.0.0 loggly.com
0.0.0.0 om.cbsi.com
0.0.0.0 ipinfo.io
0.0.0.0 v.shopify.com
0.0.0.0 adobedtm.com
0.0.0.0 c.evidon.com
0.0.0.0 ereg.wip3.adobe.com
0.0.0.0 csi.gstatic.com
0.0.0.0 g.msn.com
0.0.0.0 sascdn.com
0.0.0.0 duckdns.org
0.0.0.0 dl.360safe.com
0.0.0.0 prf.hn
0.0.0.0 placehold.it
0.0.0.0 digg.com
0.0.0.0 feedburner.com
0.0.0.0 rambler.ru
0.0.0.0 jiathis.com
0.0.0.0 uol.com.br
0.0.0.0 rs6.net
0.0.0.0 com.com
0.0.0.0 s0.2mdn.net
0.0.0.0 pr0gramm.com
0.0.0.0 consent.cmp.oath.com
0.0.0.0 s.youtube.com
0.0.0.0 purch.com
0.0.0.0 fpdownload.macromedia.com
0.0.0.0 dynatrace.com
0.0.0.0 om.cbsi.com
0.0.0.0 auditude.com
0.0.0.0 om.cbsi.com
0.0.0.0 app.link

@gentlyxu

@notracking Thank you. I will make a white list to include these host names you mentioned and more...
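
A minimal sketch of how such a whitelist could be generated from hosts-style lines like the ones quoted above, assuming the AdGuard-style exception syntax `@@||domain^` and one hostname per line. The function name is hypothetical.

```python
def hosts_to_allowlist(hosts_text):
    """Turn hosts-style lines ("0.0.0.0 domain") into @@||domain^ exception rules."""
    rules = []
    seen = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        domain = line.split()[-1].lower()  # assumption: the last field is the hostname
        if domain in seen:
            continue  # skip repeats (the quoted sample lists om.cbsi.com more than once)
        seen.add(domain)
        rules.append(f"@@||{domain}^")
    return rules


if __name__ == "__main__":
    sample = "0.0.0.0 ipinfo.io\n0.0.0.0 om.cbsi.com\n0.0.0.0 om.cbsi.com\n0.0.0.0 duckdns.org\n"
    print("\n".join(hosts_to_allowlist(sample)))
```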

@ameshkov
Member

Vietnamese blocklist:
AdguardTeam/AdguardForiOS#1298 (comment)

@ameshkov
Member

ameshkov commented Mar 1, 2020

Meanwhile, we made a simple helper tool for anyone making filter lists:
https://urlfilter.adtidy.org/

With the help of it, you'll be able to check if the domain is blocked by any of the existing filter lists.

@notracking

> I'd surely prioritize lists compiled using more advanced syntax over the old-school hosts files. People tend to turn on every available filter list, and bloated many-megabytes hosts lists aren't really good for performance.

Do not forget that all network-based filters (||ads.google.com^ as well as normal hosts files) are extremely resource efficient. A single regex (or 'dynamic') filter is many times slower than 1k network filter rules.
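
A rough illustration of the point above, not AdGuard Home's actual engine: plain domain rules can be stored in a hash set, so a lookup costs one probe per label of the queried hostname regardless of how many rules are loaded, whereas regex rules have to be tried one by one. All function names here are hypothetical.

```python
import re


def build_domain_set(rules):
    """Extract the domain from each ||domain^ rule into a set."""
    return {r[2:-1].lower() for r in rules if r.startswith("||") and r.endswith("^")}


def is_blocked(hostname, blocked_domains):
    """True if the hostname or any of its parent domains is in the set."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


def is_blocked_by_regex(hostname, patterns):
    """Linear scan: every compiled pattern is tested until one matches."""
    return any(p.search(hostname) for p in patterns)


if __name__ == "__main__":
    blocked = build_domain_set(["||ads.google.com^", "||gigatribe.com^"])
    print(is_blocked("login.gigatribe.com", blocked))
    print(is_blocked("google.com", blocked))
    print(is_blocked_by_regex("ads1.example.com", [re.compile(r"^ads\d+\.")]))
```

The set lookup also shows why a rule for a whole domain covers its subdomains: the parent name is one of the suffixes that gets probed.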

@ameshkov changed the title from "Is it perhaps time to have more than 4 lists in stock AGH installations?" to "Choosing filter lists for AdGuard Home -- please suggest" on Mar 4, 2020
@Krizzii

Krizzii commented Mar 13, 2020

I use 1 list called dbl.oisd.nl. Works great so far.

@vager88

vager88 commented Mar 13, 2020

Hi. I don't believe this is something that anyone has considered. We need a way to filter out proxies/redirectors. There are lots of them out there, and people use them to bypass DNS filters.
I found the following website that has a list of categories:
http://dsi.ut-capitole.fr/blacklists/index_en.php

The one called redirectors would be the list of proxies that can be incorporated. Personally, I've copied that file and I'm loading it on my server locally. However, it would be great if it could be added as an option.

@DandelionSprout
Member Author

DandelionSprout commented Mar 13, 2020

Université Toulouse 1's lists are in TAR.GZ-format only, which no adblockers known to ever have existed are able to unpack and use on their own without user interaction. It's also the reason why virtually none of their lists are on FilterLists.com at the time of writing.

An alternative could be https://blocklist.site/app/dl/redirect, but its licence system is hard to make heads or tails of.
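
The TAR.GZ limitation mentioned above could be worked around with a preprocessing step outside the adblocker. A minimal sketch, assuming the archive contains plain-text members with one domain per line and that their names end in `domains` (that member name, and the function names, are assumptions for illustration):

```python
import io
import tarfile


def domains_from_tar_gz(archive_bytes, member_suffix="domains"):
    """Collect domains from every regular file whose name ends with member_suffix."""
    domains = set()
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile() or not member.name.endswith(member_suffix):
                continue
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            text = extracted.read().decode("utf-8", errors="replace")
            for raw in text.splitlines():
                line = raw.strip().lower()
                if line and not line.startswith("#"):
                    domains.add(line)
    return sorted(domains)


def to_rules(domains):
    """Render domains as ||domain^ rules that a DNS-level blocker could load as plain text."""
    return [f"||{d}^" for d in domains]
```

The resulting plain-text file would then have to be hosted somewhere the blocker can fetch it, which is the user interaction the comment refers to.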

@imTHAI

imTHAI commented Apr 17, 2020

I switched from https://github.com/notracking/hosts-blocklists/ to http://abp.oisd.nl/ one month ago. I've had only one false positive since then. I also recommend this list.

@ammnt

ammnt commented Apr 17, 2020

Unified hosts from Steven Black is a really good choice!👍

@pedrolamas

pedrolamas commented Apr 28, 2020

> So I think it's time to consider trusting some more lists enough for them to be included by default as well. I'd recommend one or more of the following (in mostly random order):
>
> —— Non-regional ——
>
> ...

I strongly recommend that you do not use the StreamingAds list indicated on the first post...

I manually added this list and used it for a few days, only to find out that Spotify stopped working!

This seems to affect only the Spotify Android client, and was reported back in 2018 to the repo owner several times (here), only to have it marked as "will not fix".

/cc @DandelionSprout

@DandelionSprout
Member Author

DandelionSprout commented Apr 28, 2020

Looking into the matter and seeing that FadeMind/hosts.extras#25 and FadeMind/hosts.extras#33 were both closed without even citing a reason for it, that does indeed cast major doubts on the seriousness of FadeMind, especially seeing as I presume that ≥80% of those that'd need a streaming-ad blocklist would be phone or smart-TV users. Thanks for the heads-up.

@notracking

notracking commented Apr 28, 2020

> I switched from https://github.com/notracking/hosts-blocklists/ to http://abp.oisd.nl/ one month ago. I've only one false positive since then. I also recommend this list.

Would you mind sharing the false positives that you encountered?

@imTHAI

imTHAI commented Apr 29, 2020

> Would you mind sharing the false positives that you encountered?

I've reset my personal rules on my AdGuard Home server, as I do from time to time, so I can't tell which ones I have faced before (maybe 1 or 2 now).
But yesterday it was gigatribe.com. It's like a direct connect network. The list blocks the entire domain with ||gigatribe.com^ (so it includes login and next.gigatribe.com, and we can't connect to the network).
It's not a big deal and, again, I strongly recommend this list.

@notracking

@imTHAI thanks for your reply! Domain is queued for whitelisting, will take until the next auto update to show up in the repository.

@JOduMonT

JOduMonT commented May 1, 2020

A few more lists with descriptions: https://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists
Then an example of how they could be listed: https://iplists.firehol.org/

@imTHAI

imTHAI commented May 1, 2020

> than an example on how it could be listed with : https://iplists.firehol.org/

I don't understand what you mean about FireHOL. FireHOL, which I've been using for 15+ years btw, is a firewall, and those predefined lists are IP lists. (For example, I'm using the firehol+ipset tools to block entire countries from titillating my port 22.) I don't see how it could be used at the DNS level?

@DandelionSprout
Member Author

AdGuard Home has actually been able to filter by the domains' IP addresses for ~6 months now.

That being said, IP lists tend to block very broadly, so it'd require a truly exceptionally and outstandingly good IP list to even be considered for AGH inclusion. It's also worth mentioning that almost every single list in https://en.wikipedia.org/wiki/Comparison_of_DNS_blacklists was either several years out of date or lacked raw free versions.
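
A generic sketch of what filtering by resolved IP address means at the DNS level, and why it blocks broadly. This is not AdGuard Home's implementation or rule syntax; the function names are hypothetical and the addresses come from the reserved documentation ranges.

```python
import ipaddress


def parse_networks(entries):
    """Parse IP-list entries (single addresses or CIDR ranges) into network objects."""
    return [ipaddress.ip_network(e.strip(), strict=False) for e in entries if e.strip()]


def answer_is_blocked(resolved_ips, blocked_networks):
    """True if any address in a DNS answer falls inside a blocked network."""
    for ip_text in resolved_ips:
        ip = ipaddress.ip_address(ip_text)
        for net in blocked_networks:
            # Only compare addresses and networks of the same IP version.
            if net.version == ip.version and ip in net:
                return True
    return False


if __name__ == "__main__":
    networks = parse_networks(["203.0.113.0/24", "2001:db8::/32"])
    print(answer_is_blocked(["203.0.113.9"], networks))
    print(answer_is_blocked(["198.51.100.7"], networks))
```

One range entry blocks every domain that resolves into it, including unrelated sites on shared hosting, which is the broadness concern raised above.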

@mupkoo

mupkoo commented May 2, 2020

https://www.reddit.com/r/oisd_blocklist/comments/dwxgld/dbloisdnl_internets_1_domain_blocklist/

This is probably the best DNSBL list compilation, quite curated.

https://dbl.oisd.nl/

I agree. I have it as my first list and it catches everything before there is a need to check the rest. And as an added benefit, it is being updated quite often and there are a lot of people reporting false positives if there are any

@JOduMonT

JOduMonT commented May 3, 2020

@imTHAI

> than an example on how it could be listed with : https://iplists.firehol.org/

Yes, of course, what I had in mind was not clear.

I thought that instead of compiling them inside a GitHub issue, it could be worth taking inspiration from FireHOL and having statistics on every list. The best part of FireHOL is being able to see which lists overlap.

Here's another list of lists: https://firebog.net

Personally I use both

  • DNSBlock to protect my users (AdGuard/Pi-hole)
  • IPSet to protect my services (FireHOL)
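
The per-list overlap statistics suggested above could be computed along these lines. This is a minimal sketch assuming each list has already been reduced to plain domains; the list names and the choice of "share of the smaller list" as the metric are illustrative assumptions, not how FireHOL computes its figures.

```python
from itertools import combinations


def overlap_report(lists):
    """For each pair of lists, return (name_a, name_b, shared_count, share_of_smaller_list)."""
    sets = {
        name: {d.strip().lower() for d in domains if d.strip()}
        for name, domains in lists.items()
    }
    report = []
    for a, b in combinations(sorted(sets), 2):
        shared = len(sets[a] & sets[b])
        smaller = min(len(sets[a]), len(sets[b]))
        ratio = shared / smaller if smaller else 0.0
        report.append((a, b, shared, ratio))
    return report


if __name__ == "__main__":
    # Hypothetical sample data.
    sample = {
        "list_a": ["a.test", "b.test", "c.test"],
        "list_b": ["b.test", "c.test"],
        "list_c": ["z.test"],
    }
    for row in overlap_report(sample):
        print(row)
```

A ratio close to 1.0 means the smaller list is almost entirely contained in the larger one, so enabling both adds little.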

@ameshkov
Member

ameshkov commented May 13, 2020

@ArtemBaskal here are some implementation details:

  1. "Add blocklist" shows a modal dialog with two options:

    • "Choose from a list" -- shows a new modal dialog with a selection of blocklists
    • "Add a custom list" -- opens the old "New blocklist" dialog
  2. The "Known blocklists" dialog contains a list of blocklists grouped by category. Mockup: https://uploads.adguard.com/up04_AdGuard_DNS_-_Filter_lists__Moqups_bbfh9.png. The home icon leads to the blocklist homepage; the "source code" icon opens the raw list URL. Please note that this is just a mockup; use our standard styles for that modal dialog.

  3. Categories and lists should be configurable via a single JSON file (so that it is easier for people to submit pull requests for new lists).
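
To make item 3 concrete, here is one possible shape for such a file and a loader that groups lists by category for the dialog. Every key name (`categories`, `filters`, `categoryId`, `homepage`, `source`) and every URL is a hypothetical placeholder, not the schema that was eventually implemented.

```python
import json
from collections import defaultdict

# Hypothetical schema: all keys and URLs below are assumptions for illustration.
EXAMPLE_CONFIG = """
{
  "categories": {
    "general": "General",
    "regional": "Regional"
  },
  "filters": [
    {
      "name": "Example general list",
      "categoryId": "general",
      "homepage": "https://example.org/",
      "source": "https://example.org/list.txt"
    },
    {
      "name": "Example regional list",
      "categoryId": "regional",
      "homepage": "https://example.net/",
      "source": "https://example.net/list.txt"
    }
  ]
}
"""


def load_known_blocklists(config_text):
    """Group blocklist entries by category display name, rejecting unknown categories."""
    config = json.loads(config_text)
    categories = config["categories"]
    grouped = defaultdict(list)
    for entry in config["filters"]:
        category_id = entry["categoryId"]
        if category_id not in categories:
            raise ValueError(f"unknown category: {category_id}")
        grouped[categories[category_id]].append(entry)
    return dict(grouped)


if __name__ == "__main__":
    for category, entries in load_known_blocklists(EXAMPLE_CONFIG).items():
        print(category, [entry["name"] for entry in entries])
```

In this sketch, `homepage` would back the home icon and `source` the "source code" icon, so a pull request adding a list only touches one JSON entry.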

@BugZappr

Here's a French site, Chez Airelle:

http://rlwpx.free.fr/WPFF/hosts.htm

I don't read French, but here's a translation:

http://translate.google.com/translate?hl=fr&langpair=fr|en&u=http://rlwpx.free.fr/WPFF/hosts.htm

I very much support the idea of having lists of segmented interests. I don't mind some ads, if they aren't too flashy or cramping real content. I support capitalism; but only when it's in the public good.
The hpHosts list had a really great breakdown before it got killed by the bean-counters: adware, tracking, malware, exploits, fraud, hijack, misleading marketing, illegal pharma, phishing, potentially unwanted, and warez/piracy. Privacy might be a category, too. Many lists will not break down as cleanly, so fewer categories could be supported.

I think that "Unified hosts from Steven Black" is a poor choice. Half of the lists he offers are "fakenews" sites, many of which I think are more trustworthy than CBS, NBC & ABC. Probably political operatives are involved at some step in the pipeline, offering up blocklists for sites publishing independent news and views as fake news, i.e. false positives, IMHO. Very few of the sites he blocks are actually malware, which is what I'm interested in. The larger these lists are, the more they hurt performance, especially list fetch times and memory usage. I'd rather not bog down my browser with huge lists, so I'd prefer to just block malware/phishing/annoyances, but lots of them.

@liamengland1

Airelle list is notorious for false positives.

If you don't like the "fake news" lists offered by Steven Black, just use the unified ads and tracking offering. See: https://github.com/StevenBlack/hosts/blob/master/readme.md#list-of-all-hosts-file-variants

Ping @StevenBlack

@StevenBlack

@BugZappr

> Half of the lists he offers are "fakenews"...
> Very few of the sites he blocks are actually malware...

WTF? Dude, that's insulting. Educate yourself.

@DandelionSprout
Member Author

Things worth noting:

  1. Airelle's lists are only available in compressed format, which means they generally can't be included or updated in any adblocker tools that I know of.

  2. My personal understanding of Steven Black's lists is that the "fakenews" variants get such entries sourced from https://raw.githubusercontent.com/marktron/fakenews/master/fakenews. The only ones on that list that anyone should even remotely consider going to are rt.com and christwire.org. I'd definitely go to NBC's websites a thousand times before I'd go to something called racerelations.news.

  3. If the Readme of the plain list version is to be believed, the mentioned sources would've meant that around 25,000 of the 57,000 domains are against malware, phishing or scams in some way.

@ameshkov self-assigned this on Jun 22, 2020
@thespad

thespad commented Jun 30, 2020

One of the "problems" with the fakenews list is that it includes a load of satirical news sites as well as actual "fake news" sites, which makes it worthless to me.

@uservictor

Consider adding the NoTrack list for blocking online trackers.
https://gitlab.com/quidsup/notrack-blocklists#other-projects

@jerrac

jerrac commented Jul 5, 2020

So, with all the suggestions of what lists to include, I could see things getting very confusing...

Can I suggest that there be two lists of filters on the /#filters page?

The first would be vetted filters that AdGuard has determined are good and actively monitors for problems. (What that all actually means, I'm not sure. Maybe they subscribe to issue queues or something for the lists? Or put them through some form of automated testing?)

The second would be community suggested filters. Use at your own risk. Maybe there'd be a separate repo for this list that people can submit pull requests to in order to get new filters added?

I'd also suggest adding a "Description" column. Then populate that with however the different lists describe themselves.

@ameshkov
Member

ameshkov commented Jul 5, 2020

We're going to keep a really short list of the default pre-installed filter lists. Maybe it will be just the AdGuard DNS filter.

But when you click "Add blocklist", you'll see a list of blocklists you can choose from.

> The second would be community suggested filters. Use at your own risk. Maybe there'd be a separate repo for this list that people can submit pull requests to to get new filters added?

There's always filterlists.com where one can look for lists, so I am not sure if it makes sense to duplicate it.

@adguard closed this as completed in 49646cf on Jul 6, 2020
@ameshkov changed the title from "Choosing filter lists for AdGuard Home -- please suggest" to "Choosing filter lists for AdGuard Home" on Jul 23, 2020
@YBS-PC

YBS-PC commented Aug 30, 2020

It would be good if you replaced the Spam404 link
from https://raw.githubusercontent.com/Spam404/lists/master/main-blacklist.txt
to https://raw.githubusercontent.com/Spam404/lists/master/adblock-list.txt
