Try every host when attempting to get the ip to hostname mapping if a failure is hit. #6014

Conversation
@@ -104,7 +105,8 @@ public CassandraService(
         this.blacklist = blacklist;
         this.poolMetrics = poolMetrics;

-        Supplier<Map<String, String>> hostnamesByIpSupplier = new HostnamesByIpSupplier(this::getRandomGoodHost);
+        Supplier<Map<String, String>> hostnamesByIpSupplier =
+                new HostnamesByIpSupplier(this::getAllHostsUnlessBlacklisted);
         this.hostnameByIpSupplier = Suppliers.memoizeWithExpiration(hostnamesByIpSupplier::get, 2, TimeUnit.MINUTES);
How does this 2 minute memoization factor into cassandra nodes coming into and leaving cluster, especially during rolling restarts and upgrades?
I don't have the historical context on why the 2 minutes was picked here, but from what I have seen, it takes roughly 2 (usually 3) minutes for a Cassandra node to leave/re-join a cluster, so this should be safe. Although, maybe we knock this down to 1 minute? Question for the Atlas folks!
No strong opinions on the memoisation length - it was probably picked arbitrarily, honestly. Yeah we can make this 1.
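For context, the behaviour being discussed can be sketched in plain Java. This is a minimal re-implementation of what Guava's Suppliers.memoizeWithExpiration provides (the PR uses the Guava utility directly); the class name is illustrative, not from the codebase:

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Sketch of an expiring memoizer: the delegate is consulted at most
// once per TTL window, so cluster-membership changes surface only
// after the cached value expires.
final class ExpiringSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final long ttlNanos;
    private volatile long expiresAtNanos;
    private volatile T cached;

    ExpiringSupplier(Supplier<T> delegate, long ttl, TimeUnit unit) {
        this.delegate = delegate;
        this.ttlNanos = unit.toNanos(ttl);
    }

    @Override
    public synchronized T get() {
        long now = System.nanoTime();
        if (cached == null || now - expiresAtNanos >= 0) {
            cached = delegate.get();
            expiresAtNanos = now + ttlNanos;
        }
        return cached;
    }
}
```

With a 1-minute TTL, as proposed above, a node leaving or joining the cluster would be reflected in the mapping within roughly a minute of the change becoming visible to the queried hosts.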
...b-cassandra/src/main/java/com/palantir/atlasdb/keyvalue/cassandra/pool/CassandraService.java
                .map(Optional::get)
                .findFirst()
If we're grabbing the first result, should we filter out any empty responses to grab the first non-empty view of hostnamesByIp?
-                .map(Optional::get)
-                .findFirst()
+                .map(Optional::get)
+                .filter(hostnamesByIp -> !hostnamesByIp.isEmpty())
+                .findFirst()
I did this originally, but there's a (subtle) case where an empty map is a valid response: the mapping keyspace doesn't exist, the table doesn't exist, or there are no rows in the map.
As a result, you end up querying every single node, which isn't needed
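The distinction can be sketched in isolation: a host that fails outright yields Optional.empty() and is skipped, while a host that successfully answers with an empty map (keyspace or table absent) short-circuits the scan. Names and types here are illustrative simplifications, not the PR's actual code:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

final class HostnameMappingSketch {
    // Returns the first *successful* response, even if that response
    // is an empty map; only outright failures move on to the next host.
    static Map<String, String> firstSuccessfulMapping(
            List<Supplier<Optional<Map<String, String>>>> hosts) {
        return hosts.stream()
                .map(Supplier::get)
                .filter(Optional::isPresent) // a failed host yields Optional.empty()
                .map(Optional::get)          // an empty map is still a valid answer
                .findFirst()
                .orElseGet(Map::of);
    }
}
```

Adding a .filter(map -> !map.isEmpty()) after the unwrap would instead treat a valid "no mapping configured" answer as a failure and fall through to every remaining host, which is the extra querying described above.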
When does the mapping keyspace or table not exist for a given node (e.g. is it only on bootstrap)? Aside from duration, why wouldn't we want to continue querying nodes to try and find a non-empty view from other nodes?
Related to the duration and the 60 second timeout mentioned above, it feels like we might want something managing cluster state in a background thread, and possibly Refreshable mechanisms for updating and subscribing suppliers.
So the keyspace/table are an optional dependency, and are only needed in the case that AtlasDb is run in a split environment (i.e. in two separate VPCs). As a result, we can expect that sometimes these tables do not exist, and in those cases, the IP addresses should be sufficient for communication.
As for querying other nodes for non-empty views, the read itself is done at quorum, so there should be no difference between querying 1 or 2 nodes, as that is effectively what we're doing under the hood.
Related to the duration & 60 second timeout mentioned above, it feels like we might want to have something managing cluster state in background thread and possibly Refreshable mechanisms for updating & subscribing suppliers
Agreed -- this is already run in a background thread, so it doesn't block any critical read/write paths.
…cassandra/pool/CassandraService.java Co-authored-by: David Schlosnagle <[email protected]>
...sandra/src/main/java/com/palantir/atlasdb/keyvalue/cassandra/pool/HostnamesByIpSupplier.java
Outdated
Show resolved
Hide resolved
            log.warn("Could not get hostnames by ip from Cassandra", e);
            return ImmutableMap.of();
        }
        return hosts.get().stream()
We should unroll this into a loop and track a best-effort timeout here. That is, if this takes over 60s to run, we should just fail, so that we don't prevent an actual update from occurring in cases where hostnames are not needed.
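A hedged sketch of that unrolled loop with a best-effort deadline, under the assumption (from this thread, not the PR diff itself) of a 60 second overall budget; names and types are illustrative simplifications:

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

final class DeadlineQuerySketch {
    // Try hosts one at a time, but stop once the overall budget has
    // elapsed rather than letting per-host timeouts stack up to
    // timeout * N for the whole scan.
    static Optional<Map<String, String>> queryUntilDeadline(
            List<Supplier<Optional<Map<String, String>>>> hosts, Duration budget) {
        long deadlineNanos = System.nanoTime() + budget.toNanos();
        for (Supplier<Optional<Map<String, String>>> host : hosts) {
            if (System.nanoTime() - deadlineNanos >= 0) {
                return Optional.empty(); // give up rather than block the update
            }
            Optional<Map<String, String>> result = host.get();
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }
}
```

The deadline is checked before each host rather than interrupting an in-flight request, so it is best-effort: the loop can still overrun by up to one per-host timeout.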
Also @jeremyk-91 I decided to go through each host rather than a quorum of hosts, as the blacklist filter removes an unknown number of hosts.
Given the potential expense of resolving these, it seems like we might want this all to be updating & heartbeating in a background thread, as opposed to potentially impacting the regular Atlas transaction execution thread.
Good flag - I think this should be on a background thread already though, as following the places where this supplier is called, they all seem to point back to the background task that updates the token ring (unless this PR changed that, of course).
…cassandra/pool/HostnamesByIpSupplier.java Co-authored-by: David Schlosnagle <[email protected]>
@@ -367,6 +369,12 @@ public Optional<CassandraClientPoolingContainer> getRandomGoodHostForPredicate(
         return randomLivingHost.map(pools::get);
     }
This might be a separate consideration, but for com.palantir.atlasdb.keyvalue.cassandra.pool.CassandraService#getReachableProxies, should we be including the inputHost as one of the reachable proxies, as that may be the client-addressable hostname/IP? I was thinking something along the lines of:
    private Set<InetSocketAddress> getReachableProxies(String inputHost) throws UnknownHostException {
        int knownPort = getKnownPort();
        ImmutableSet.Builder<InetSocketAddress> proxies = ImmutableSet.builder();
        proxies.add(new InetSocketAddress(inputHost, knownPort));
        for (InetAddress address : InetAddress.getAllByName(inputHost)) {
            // It is okay to have reachable proxies that do not have a hostname
            proxies.add(new InetSocketAddress(address, knownPort));
        }
        return proxies.build();
    }
That seems fine to me, though not super aware of the network-related implications
...sandra/src/main/java/com/palantir/atlasdb/keyvalue/cassandra/pool/HostnamesByIpSupplier.java
@@ -367,6 +369,14 @@ public Optional<CassandraClientPoolingContainer> getRandomGoodHostForPredicate(
         return randomLivingHost.map(pools::get);
     }

+    public List<PoolingContainer<CassandraClient>> getAllHostsUnlessBlacklisted() {
nit: getAllNonBlacklistedHosts? but yeah, the main part looks fine
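For reference, the shape of such a method might look like the following simplified sketch; the generic names stand in for the actual Cassandra host and pooling-container types, which aren't reproduced here:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class PoolFilterSketch {
    // Hypothetical simplified shape of getAllHostsUnlessBlacklisted:
    // keep the pooling container for every host the blacklist does
    // not reject. An unknown number of hosts may be filtered out,
    // which is why the caller iterates over all of them rather than
    // a fixed quorum.
    static <HostT, PoolT> List<PoolT> allNonBlacklistedHosts(
            Map<HostT, PoolT> pools, Predicate<HostT> isBlacklisted) {
        return pools.entrySet().stream()
                .filter(entry -> !isBlacklisted.test(entry.getKey()))
                .map(Map.Entry::getValue)
                .collect(Collectors.toList());
    }
}
```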
…cassandra/pool/HostnamesByIpSupplier.java Co-authored-by: David Schlosnagle <[email protected]>
👍 from my end (no strong preference on the duration comparisons, though I think I agree with @schlosna on that one)
…cassandra/pool/HostnamesByIpSupplier.java Co-authored-by: David Schlosnagle <[email protected]>
Goals (and why):
==COMMIT_MSG==
Try every host when attempting to get the ip to hostname mapping if a failure is hit.
==COMMIT_MSG==
Implementation Description (bullets):
Testing (What was existing testing like? What have you done to improve it?):
Concerns (what feedback would you like?):
It could take timeout * N for the task to fail to update the addresses. This would be problematic on very large clusters running in Kubernetes (as the IP changes are significant there), restarting an entire rack at a time. However, this failure condition is so unlikely that I'd argue it should be overlooked (for now), as basically N * timeout > time to start all nodes in rack Y, but that assumes we are hitting the timeout exception rather than something else (e.g. connection refused, which is almost instant).

Where should we start reviewing?:
Priority (whenever / two weeks / yesterday):
Yesterday