Blacklisting in Zonemaster creates false WARNING #1281

Closed
matsduf opened this issue Aug 24, 2023 · 3 comments · Fixed by #1285

matsduf commented Aug 24, 2023

Problem description

Running a test on festo.press

$ zonemaster-cli festo.press --level notice
Seconds Level    Testcase       Message
======= ======== ============== =======
  28.37 WARNING  CONNECTIVITY03 All authoritative nameservers have their IPv4 addresses in the same AS (203391).
  28.37 WARNING  CONNECTIVITY03 All authoritative nameservers have their IPv6 addresses in the same AS (203391).
  28.38 WARNING  CONNECTIVITY04 The following name server(s) are announced in the same IPv6 prefix (2a06:fb00:1::/48): "ns2.atvirtual.com/2a06:fb00:1::1:210;ns6.atvirtual.com/2a06:fb00:1::2:210;ns7.atvirtual.com/2a06:fb00:1::3:210;ns8.atvirtual.com/2a06:fb00:1::4:210"
  84.37 WARNING  ZONE09         No response on MX query from name servers "185.136.96.210;185.136.97.210;185.136.98.210;185.136.99.210".

It looks as if some name servers fail to respond to MX queries (ZONE09), but that turns out to be a false warning. Running ZONE09 on its own produces no WARNING:

$ zonemaster-cli festo.press --level notice --test zone/zone09
Seconds Level    Testcase       Message
======= ======== ============== =======
Looks OK.

Running zonemaster-cli with --level debug3 and inspecting all queries reveals the issue behind the false warning. Zonemaster does not even send a query to those name servers.

The name servers for the domain have an issue with version.bind CH TXT queries:

$ zonemaster-cli festo.press --level info --test nameserver/nameserver15
Seconds Level    Testcase       Message
======= ======== ============== =======
   0.00 INFO     UNSPECIFIED    Using version v4.7.2 of the Zonemaster engine.
  46.74 INFO     NAMESERVER15   The following name server(s) do not respond to software version queries. Returned from name servers: "ns2.atvirtual.com/185.136.96.210;ns2.atvirtual.com/2a06:fb00:1::1:210;ns6.atvirtual.com/185.136.97.210;ns6.atvirtual.com/2a06:fb00:1::2:210;ns7.atvirtual.com/185.136.98.210;ns7.atvirtual.com/2a06:fb00:1::3:210;ns8.atvirtual.com/185.136.99.210;ns8.atvirtual.com/2a06:fb00:1::4:210"

When there is no response in NAMESERVER15, the name servers get blacklisted by Zonemaster. When Zonemaster comes to ZONE09 it just assumes that those name servers will not respond and acts as if a query was sent and there is no response.
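The interaction described above can be modelled in a few lines. This is only a sketch of the behaviour as reported in this issue, not Zonemaster's actual code: the class `ServerWideBlacklist`, the stub `fake_transport` and all their parameters are hypothetical names made up for illustration.

```python
from typing import Callable, Optional


class ServerWideBlacklist:
    """Hypothetical model: one unanswered query blocks every later query to that server."""

    def __init__(self, send: Callable[[str, str, str, str], Optional[str]]):
        self._send = send          # transport stub, returns a response or None
        self._blacklisted = set()  # server IPs that failed to answer once
        self.sent = []             # queries that actually went out

    def query(self, server_ip: str, qname: str, qtype: str, qclass: str = "IN"):
        if server_ip in self._blacklisted:
            # Nothing is sent, but the caller sees the same result as a timeout.
            return None
        self.sent.append((server_ip, qname, qtype, qclass))
        response = self._send(server_ip, qname, qtype, qclass)
        if response is None:
            self._blacklisted.add(server_ip)
        return response


def fake_transport(server_ip: str, qname: str, qtype: str, qclass: str):
    # Assumption: mimics the observed servers, which drop version.bind/TXT/CH
    # and answer everything else.
    if (qname.lower(), qtype, qclass) == ("version.bind", "TXT", "CH"):
        return None
    return f"answer for {qname}/{qtype}/{qclass}"


resolver = ServerWideBlacklist(fake_transport)
first = resolver.query("185.136.96.210", "version.bind", "TXT", "CH")  # NAMESERVER15
second = resolver.query("185.136.96.210", "festo.press", "MX")         # ZONE09
```

In this model the MX query is never handed to the transport, yet the caller cannot tell that apart from a real timeout, which matches the false ZONE09 warning.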

Log

I ran the following command to collect all messages. The test was limited to BASIC, NAMESERVER15 and ZONE09 to keep down the volume:

zonemaster-cli festo.press  --test basic --test nameserver/nameserver15 --test zone/zone09 --level debug3 > festo.press--basic-nameserver15-zone09--2023-08-24.log

festo.press--basic-nameserver15-zone09--2023-08-24.log.zip is attached to this issue.

The log file shows that the queries in NAMESERVER15 cause blacklisting and that the queries in ZONE09 are never sent.

Discussion

Something has to be done about the blacklisting function in Zonemaster since it obviously can make one test case affect the running and result of another. Blacklisting is there to keep down the number of queries to name servers that do not respond. Only the first query is sent, and then it is assumed that further queries would not receive a response anyway.

But it is not only the first query that can trigger blacklisting, as we can see here, and a query for one thing can block a query for another thing.

Alternative solutions

As I see it, blacklisting must be adjusted. The less aggressive it is, the higher the risk that Zonemaster has to wait for queries that will never be answered anyway.

  1. Only do blacklisting when the query is for ZONE SOA with no EDNS and the same protocol (UDP/TCP/DoT/DoH/DoQ).
  2. Let blacklisting affect identical queries only (identical in all respects).
  3. Manually select what queries should potentially trigger blacklisting.
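To make option 2 concrete, here is a minimal sketch in which a non-response is remembered per query rather than per server. The names `Query`, `PerQueryNonResponseCache` and `drops_version_bind` are hypothetical, and the fields chosen to define "identical" (name, type, class, protocol, EDNS on/off) are an assumption; a real implementation might compare more or fewer properties.

```python
from typing import Callable, NamedTuple, Optional


class Query(NamedTuple):
    """Everything that, by assumption, makes two queries 'identical'."""
    server_ip: str
    qname: str
    qtype: str
    qclass: str = "IN"
    protocol: str = "UDP"
    edns: bool = False


class PerQueryNonResponseCache:
    """Hypothetical model of option 2: only an identical query is suppressed."""

    def __init__(self, send: Callable[[Query], Optional[str]]):
        self._send = send
        self._unanswered = set()  # queries that previously got no response
        self.sent = []            # queries that actually went out

    def query(self, q: Query) -> Optional[str]:
        if q in self._unanswered:
            return None  # skip the resend, we already know the outcome
        self.sent.append(q)
        response = self._send(q)
        if response is None:
            self._unanswered.add(q)
        return response


def drops_version_bind(q: Query) -> Optional[str]:
    # Assumption: the server drops version.bind/TXT/CH and answers the rest.
    if (q.qname.lower(), q.qtype, q.qclass) == ("version.bind", "TXT", "CH"):
        return None
    return f"answer for {q.qname}/{q.qtype}"


cache = PerQueryNonResponseCache(drops_version_bind)
version = Query("185.136.96.210", "version.bind", "TXT", "CH")
mx = Query("185.136.96.210", "festo.press", "MX")

cache.query(version)           # sent, no response, remembered
cache.query(version)           # suppressed, identical to the one above
mx_response = cache.query(mx)  # still sent, because it is a different query
```

With this keying the version.bind query is sent once and the MX query still reaches the server, at the cost of one timeout per distinct unanswered query.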
@matsduf matsduf added the T-Bug Type: Bug in software or error in test case description label Aug 24, 2023
@matsduf matsduf added this to the v2023.2 milestone Aug 24, 2023
@marc-vanderwal

I’m sorry, but I don’t seem to have access to the attachment…

I’d say that it isn’t typical for an authoritative DNS server to blackhole version.bind/TXT/CH queries; I strongly suspect that some kind of middlebox is doing this. Try the same query over TCP and you’ll get a proper response (NODATA, in this case).

We could start by exempting version.bind/TXT/CH queries from blacklisting, as a first stopgap measure.

Among the three solutions, I favor option 2.


marc-vanderwal commented Sep 4, 2023

We may want to substitute a different word for “blacklisting”. Not only because the word may be offensive to some audiences, but also because it does a poor job of describing what the list is for. The word implies that we are putting servers on some kind of naughty list. That’s why the usual suggestion of “blocklisting” won’t work here. We could maybe talk about “caching of non-responses”.

As a general rule, we could start by implementing the non-response cache as a set of pairs of name server IPs and DNS queries in their wire format representations where the transaction ID is set to zero.
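A rough sketch of such a key, assuming the standard DNS message layout in which the transaction ID occupies the first two octets. The helpers `build_query` and `cache_key` are hypothetical and written with the standard library only. The query builder is deliberately minimal (no EDNS, no name compression, no case normalisation of the query name).

```python
import struct

TYPE_MX, TYPE_TXT = 15, 16
CLASS_IN, CLASS_CH = 1, 3


def build_query(qname: str, qtype: int, qclass: int, txid: int) -> bytes:
    """Build a minimal DNS query: 12-octet header plus one question."""
    # ID, flags (plain query, RD clear), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, qclass)
    return header + question


def cache_key(server_ip: str, wire: bytes) -> tuple:
    """Pair the server IP with the wire message, transaction ID zeroed."""
    return (server_ip, b"\x00\x00" + wire[2:])


non_responses = set()

# The same question sent twice with different transaction IDs maps to one key.
q1 = build_query("version.bind", TYPE_TXT, CLASS_CH, txid=0x1234)
q2 = build_query("version.bind", TYPE_TXT, CLASS_CH, txid=0xBEEF)
non_responses.add(cache_key("185.136.96.210", q1))

# A different question to the same server is not covered by that entry.
mx = build_query("festo.press", TYPE_MX, CLASS_IN, txid=0x0001)
```

Keying on the wire format means that any difference in flags, question or additional records (EDNS included) yields a separate cache entry without having to enumerate the fields by hand.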

There might be a special case when dealing with EDNS, though. If a name server does not return any response at all to an EDNS-enabled query (i.e. if, for a given server, Nameserver02 emits BREAKS_ON_EDNS), it may be desirable to forgo sending subsequent queries to the same name server that require EDNS to work, because we know anything involving EDNS has little chance of succeeding.
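One way the EDNS special case could sit next to a per-query non-response cache is sketched below. The function `should_skip` and both sets are hypothetical. How a server ends up in `edns_broken` (for instance after BREAKS_ON_EDNS) is assumed to be handled elsewhere.

```python
def should_skip(server_ip: str, query_key: bytes, uses_edns: bool,
                non_responses: set, edns_broken: set) -> bool:
    """Decide whether a query can be skipped as a known non-response."""
    # Special case: the server is known to ignore EDNS queries entirely.
    if uses_edns and server_ip in edns_broken:
        return True
    # General rule: only an identical earlier query suppresses this one.
    return (server_ip, query_key) in non_responses


edns_broken = {"192.0.2.1"}                     # documentation address
non_responses = {("192.0.2.2", b"version.bind/TXT/CH")}

skip_edns = should_skip("192.0.2.1", b"example/SOA", True, non_responses, edns_broken)
send_plain = should_skip("192.0.2.1", b"example/SOA", False, non_responses, edns_broken)
skip_same = should_skip("192.0.2.2", b"version.bind/TXT/CH", False, non_responses, edns_broken)
send_other = should_skip("192.0.2.2", b"example/MX", False, non_responses, edns_broken)
```

The EDNS rule is the only server-wide shortcut in this sketch, and it applies only to queries that themselves use EDNS.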


tgreenx commented Sep 12, 2023

Fixed by #1285

@tgreenx tgreenx closed this as completed Sep 12, 2023
@tgreenx tgreenx modified the milestones: v2023.2, v2023.1.4 Sep 12, 2023