Improve Sqlite namespace loader implementation #6488

ergo14 · 2023-03-28T14:51:22Z

General

Before this PR:
Namespaces are loaded with a select distinct query
After this PR:
Namespaces are loaded sequentially; each query finds the next-smallest namespace. This forces SQLite to use the more efficient search plan (as opposed to a scan). This is fine because namespace cardinality should be low.
==COMMIT_MSG==
==COMMIT_MSG==

Priority:
P2
Concerns / possible downsides (what feedback would you like?):
Pending testing
Is it okay to replace the implementation as opposed to adding a new endpoint?
Is documentation needed?:
No

Compatibility

Does this PR create any API breaks (e.g. at the Java or HTTP layers) - if so, do we have compatibility?:
No
Does this PR change the persisted format of any data - if so, do we have forward and backward compatibility?:
No
The code in this PR may be part of a blue-green deploy. Can upgrades from previous versions safely coexist? (Consider restarts of blue or green nodes.):
Yes
Does this PR rely on statements being true about other products at a deployment - if so, do we have correct product dependencies on these products (or other ways of verifying that these statements are true)?:
Relies on namespace cardinality to be low
Does this PR need a schema migration?
No

Testing and Correctness

What, if any, assumptions are made about the current state of the world? If they change over time, how will we find out?:
N/A
What was existing testing like? What have you done to improve it?:
Swapped one implementation for another, existing tests should suffice
If this PR contains complex concurrent or asynchronous code, is it correct? The onus is on the PR writer to demonstrate this.:
N/A
If this PR involves acquiring locks or other shared resources, how do we ensure that these are always released?:
N/A

Execution

How would I tell this PR works in production? (Metrics, logs, etc.):
When the request is made on a busy node with lots of rows, TimeLock does not get sad
Has the safety of all log arguments been decided correctly?:
N/A
Will this change significantly affect our spending on metrics or logs?:
No
How would I tell that this PR does not work in production? (monitors, etc.):
When the request is made on a busy node with lots of rows, TimeLock gets sad
If this PR does not work as expected, how do I fix that state? Would rollback be straightforward?:
Recall and rollback
If the above plan is more complex than “recall and rollback”, please tag the support PoC here (if it is the end of the week, tag both the current and next PoC):
@jeremyk-91

Scale

Would this PR be expected to pose a risk at scale? Think of the shopping product at our largest stack.:
No more than the existing
Would this PR be expected to perform a large number of database calls, and/or expensive database calls (e.g., row range scans, concurrent CAS)?:
O(200) database calls instead of one, but the endpoint is called rarely
Would this PR ever, with time and scale, become the wrong thing to do - and if so, how would we know that we need to do something differently?:
N/A

Development Process

Where should we start reviewing?:

If this PR is in excess of 500 lines excluding versions lock-files, why does it not make sense to split it?:

Please tag any other people who should be aware of this PR:
@jeremyk-91
@sverma30

changelog-app · 2023-03-28T14:51:27Z

Generate changelog in `changelog/@unreleased`

Type

Description

Improve Sqlite namespace loader implementation

Check the box to generate changelog(s)

Generate changelog entry

ergo14 · 2023-03-28T14:53:48Z

leader-election-impl/src/main/java/com/palantir/paxos/SqlitePaxosStateLog.java

+        @SqlQuery("SELECT MIN(namespace) FROM paxosLog")
+        Optional<String> getSmallestNamespace();


Two queries, resisted the temptation of removing this and using "" for the first query

I think this is fine

I actually prefer using just one query as that's consistent with other patterns we have in Atlas - the thing you describe is pretty standard!
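For reference, a minimal sketch of the single-query variant discussed in this thread, assuming the empty string can serve as the starting sentinel. The class `SingleQueryNamespaceWalk`, the `nextAfter` function and the `inMemory` helper are hypothetical names for illustration; `nextAfter` stands in for the `namespace > :namespace` query, and the in-memory `TreeSet` only approximates SQLite's string ordering.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical sketch, not the code in this PR.
final class SingleQueryNamespaceWalk {
    private SingleQueryNamespaceWalk() {}

    // nextAfter stands in for:
    //   SELECT MIN(namespace) FROM paxosLog WHERE namespace > :namespace
    static Set<String> loadAll(Function<String, Optional<String>> nextAfter) {
        Set<String> namespaces = new LinkedHashSet<>();
        // Every non-empty string sorts after "", so "" acts as the start sentinel.
        Optional<String> current = nextAfter.apply("");
        while (current.isPresent()) {
            namespaces.add(current.get());
            current = nextAfter.apply(current.get());
        }
        return namespaces;
    }

    // In-memory stand-in for the table (assumption: used only to make the sketch runnable).
    static Function<String, Optional<String>> inMemory(Set<String> rows) {
        TreeSet<String> sorted = new TreeSet<>(rows);
        // higher() returns the least element strictly greater than the argument, or null.
        return last -> Optional.ofNullable(sorted.higher(last));
    }
}
```

One caveat with this shape: a namespace that is itself the empty string would never be returned, because the comparison is strictly greater-than.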

ergo14 · 2023-03-28T14:55:12Z

timelock-impl/src/main/java/com/palantir/atlasdb/timelock/management/SqliteNamespaceLoader.java

+    private Optional<String> getSmallestNamespace() {
+        return jdbi.withExtension(SqlitePaxosStateLog.Queries.class, SqlitePaxosStateLog.Queries::getSmallestNamespace);
+    }
+
+    private Optional<String> getNextSmallestNamespace(String lastReadNamespace) {
+        return jdbi.withExtension(
+                SqlitePaxosStateLog.Queries.class, dao -> dao.getNextSmallestNamespace(lastReadNamespace));


Could reduce this to one call but did not want to use an Optional as a method parameter

ergo14 · 2023-03-28T14:56:49Z

timelock-impl/src/main/java/com/palantir/atlasdb/timelock/management/SqliteNamespaceLoader.java

+        Set<Client> clients = new HashSet<>();
+
+        Optional<String> currentNamespace = getSmallestNamespace();
+        while (currentNamespace.isPresent()) {
+            String namespaceString = currentNamespace.get();
+            clients.add(Client.of(namespaceString));
+            currentNamespace = getNextSmallestNamespace(namespaceString);
+        }
+
+        return clients;


Could have used a custom inner class that implements Supplier while keeping some state

Then the code here would probably be simpler, something like Stream.generate(...).takeWhile(..).collect(..), or something to that effect

But I think this is more human-readable

I don't know if Stream.generate(...) is simpler than what you have! Agree with your approach here
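For comparison, a rough sketch of what a stream-based version might look like. It uses the three-argument `Stream.iterate` (Java 9+) rather than the `Stream.generate(...).takeWhile(...)` shape mentioned above, since `iterate` carries the previous element forward without a stateful Supplier. The class `StreamNamespaceWalk` and the `nextAfter` parameter are hypothetical; `nextAfter` stands in for `getNextSmallestNamespace`.

```java
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not the code in this PR.
final class StreamNamespaceWalk {
    private StreamNamespaceWalk() {}

    static Set<String> loadAll(Optional<String> smallest, Function<String, Optional<String>> nextAfter) {
        // Start from the smallest namespace and keep asking for the next one
        // until a lookup comes back empty.
        return Stream.iterate(smallest, Optional::isPresent, current -> nextAfter.apply(current.get()))
                .map(Optional::get)
                .collect(Collectors.toSet());
    }
}
```

Whether this reads better than the explicit while loop is a matter of taste; the loop makes the one-query-per-iteration behaviour more obvious.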

ergo14 · 2023-03-30T02:44:37Z

leader-election-impl/src/main/java/com/palantir/paxos/SqlitePaxosStateLog.java

+        @SqlQuery("SELECT MIN(namespace) FROM paxosLog WHERE namespace > :namespace")
+        Optional<String> getNextSmallestNamespace(@Bind("namespace") String lastReadNamespace);


Might be contentious, but we should test that the query plan has the word search and not scan

(or even match it directly)

nit: getNext_Lexicographically_SmallestNamespace(...)

Also, making assertions on the query plan is probably a good idea.

So getting the query plan in code is a little involved. Would adding a comment instead of a test be good enough?
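A possible shape for such a check, as a sketch only: run `EXPLAIN QUERY PLAN` over plain JDBC and inspect the `detail` column of each row. The class `QueryPlanCheck` and both method names are hypothetical, the caller is assumed to supply a `Connection` to the SQLite database, and the exact wording of plan details varies between SQLite versions, so the check only looks at the leading `SEARCH` or `SCAN` keyword.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not the code in this PR.
final class QueryPlanCheck {
    private QueryPlanCheck() {}

    // Collects the "detail" column of EXPLAIN QUERY PLAN for the given SQL.
    // Assumption: the SQL is passed with literals in place of bind parameters.
    static List<String> planDetails(Connection connection, String sql) throws SQLException {
        List<String> details = new ArrayList<>();
        try (Statement statement = connection.createStatement();
                ResultSet rows = statement.executeQuery("EXPLAIN QUERY PLAN " + sql)) {
            while (rows.next()) {
                details.add(rows.getString("detail"));
            }
        }
        return details;
    }

    // True only if at least one step is a SEARCH and no step is a SCAN.
    static boolean searchesWithoutScanning(List<String> details) {
        boolean sawSearch = false;
        for (String detail : details) {
            String upper = detail.trim().toUpperCase(Locale.ROOT);
            if (upper.startsWith("SCAN")) {
                return false;
            }
            sawSearch |= upper.startsWith("SEARCH");
        }
        return sawSearch;
    }
}
```

A test could then assert `searchesWithoutScanning(planDetails(connection, "SELECT MIN(namespace) FROM paxosLog WHERE namespace > 'a'"))`, with the literal `'a'` standing in for the bound parameter.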

jeremyk-91

Yep, looks good. I actually prefer just having one query, and the query plan based tests would be a good idea

jeremyk-91 · 2023-03-30T12:20:35Z

leader-election-impl/src/main/java/com/palantir/paxos/SqlitePaxosStateLog.java

+        @SqlQuery("SELECT MIN(namespace) FROM paxosLog WHERE namespace > :namespace")
+        Optional<String> getNextSmallestNamespace(@Bind("namespace") String lastReadNamespace);


nit: getNext_Lexicographically_SmallestNamespace(...)

Also, making assertions on the query plan is probably a good idea.

jeremyk-91 · 2023-03-30T12:21:06Z

leader-election-impl/src/main/java/com/palantir/paxos/SqlitePaxosStateLog.java

+        @SqlQuery("SELECT MIN(namespace) FROM paxosLog")
+        Optional<String> getSmallestNamespace();


I actually prefer using just one query as that's consistent with other patterns we have in Atlas - the thing you describe is pretty standard!

jeremyk-91 · 2023-03-30T18:09:09Z

timelock-impl/src/main/java/com/palantir/atlasdb/timelock/management/SqliteNamespaceLoader.java

+        Set<Client> clients = new HashSet<>();
+
+        Optional<String> currentNamespace = getSmallestNamespace();
+        while (currentNamespace.isPresent()) {
+            String namespaceString = currentNamespace.get();
+            clients.add(Client.of(namespaceString));
+            currentNamespace = getNextSmallestNamespace(namespaceString);
+        }
+
+        return clients;


I don't know if Stream.generate(...) is simpler than what you have! Agree with your approach here

jeremyk-91

👍 Looks good. Thanks!

svc-autorelease · 2023-04-03T14:16:57Z

Released 0.828.0

(cherry picked from commit 1ef52ee)

that it?

b83ae22

ergo14 commented Mar 28, 2023

variable naming

561379d

ergo14 commented Mar 28, 2023

Add generated changelog entries

7d84708

ergo14 added the do not merge label Mar 28, 2023

ergo14 requested a review from jeremyk-91 March 28, 2023 14:58

ergo14 and others added 3 commits March 29, 2023 12:26

api flag

56beb1b

Merge branch 'aagan' of github.com:palantir/atlasdb into aagan

f8c7694

Autorelease 0.822.0-rc1

c6888a6

[skip ci]

ergo14 commented Mar 30, 2023

jeremyk-91 reviewed Mar 30, 2023

ergo14 added 2 commits March 31, 2023 02:31

sad

316464c

Merge branch 'aagan' of github.com:palantir/atlasdb into aagan

e8d912e

jeremyk-91 approved these changes Apr 3, 2023

ergo14 added autorelease merge when ready and removed do not merge labels Apr 3, 2023

bulldozer-bot bot merged commit 1ef52ee into develop Apr 3, 2023

bulldozer-bot bot deleted the aagan branch April 3, 2023 14:16

ergo14 added a commit that referenced this pull request Apr 5, 2023

Improve Sqlite namespace loader implementation (#6488)

1dd28dd

(cherry picked from commit 1ef52ee)

ergo14 mentioned this pull request Apr 5, 2023

[release/0.756.x] Improve Sqlite namespace loader implementation #6509

Merged

ergo14 added a commit that referenced this pull request Apr 6, 2023

Improve Sqlite namespace loader implementation (#6488) (#6509)

97c6673

(cherry picked from commit 1ef52ee)
