Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][client] PIP-302 Introduce refreshAsync API for TableView #21166

Closed
wants to merge 6 commits into from
Closed

[improve][client] PIP-302 Introduce refreshAsync API for TableView #21166

wants to merge 6 commits into from

Conversation

liangyepianzhou
Copy link
Contributor

@liangyepianzhou liangyepianzhou commented Sep 12, 2023

Motivation

Prerequisite: Since messages are constantly being written into the Topic and there is no read-write lock guarantee, we cannot assure the retrieval of the most up-to-date value.
Implementation Goal: Record a checkpoint before reading and ensure the retrieval of the latest value of the key up to this checkpoint.
Use Case: When read and write operations for a certain key do not occur simultaneously, we can refresh the TableView before reading the key to obtain the latest value for this key.

Modification

Introduce a new API refreshAsync.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

In the context of utilizing the TableView component, there are instances where we aspire to consistently retrieve the most up-to-date value associated with a given key. To accomplish this, we can employ an API that allows us to wait until all data has been fully retrieved before accessing the value corresponding to the desired key.
### Modification
Introduce a new API method called `readAllExistingMessages()`
```java
@Override
public CompletableFuture<Void> readAllExistingMessages() {
return reader.thenCompose(this::readAllExistingMessages);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just suddenly realised there's a bug here. In the worst condition, the publisher publishes messages continuously, and the reader.hasMessageAvailableAsync() always returns true. The tableview will stuck for a long time. :/

We should give a checkpoint before starting the infinity loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

@liangyepianzhou liangyepianzhou Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. This is a latent issue that has existed previously. I was initially vacillating on whether to incorporate the fix within this proposal. I am grateful for your input; let us then incorporate this resolution into this proposal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, it might be more beneficial to create a separate PR solely for the purpose of rectifying this bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I raise a PR to fix it. Feel free to give comments.

pip/pip-302.md Outdated

```java
@Override
public CompletableFuture<Void> readAllExistingMessages() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not good to expose this method directly to the TableView interface because the method named read something but nothing returned to the caller.

I think maybe refresh() is better.

table.refresh().whenComplete((__, ex) -> {
       if (ex == null) {
              table.get(key);
       }
})

Add we should also provide a clear description of the behavior for the new method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And for this PR.

It's not an issue of the existing TableView implementation, but it's an issue if you want to expose readAllExistingMessages()method to users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this an issue with the existing implementation? If there are a large number of producers sending messages to a single topic, then the delay in building the TableView for this topic could be very large, and in extreme cases, it may never be completed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maybe refresh() is better.

table.refresh().whenComplete((__, ex) -> {
       if (ex == null) {
              table.get(key);
       }
})

Add we should also provide a clear description of the behavior for the new method.

Great suggestion; I've updated the notes.

@merlimat
Copy link
Contributor

If we want to allow to implement different consistency models, it would be better to have that as part of the TableView configuration builder, instead of requiring a user to read everything that was published.

@liangyepianzhou liangyepianzhou changed the title [improve][client] pip-302 Add new API readAllExistingMessages for TableView [improve][client] PIP-302 Add alwaysRefresh Configuration Option for TableView to Read Latest Values Sep 15, 2023
@liangyepianzhou
Copy link
Contributor Author

liangyepianzhou commented Sep 15, 2023

If we want to allow to implement different consistency models, it would be better to have that as part of the TableView configuration builder, instead of requiring a user to read everything that was published.

Good suggestion. I have made the change as you suggested.

@liangyepianzhou
Copy link
Contributor Author

@merlimat Following our discussion, there might be some potential downsides to modifying the tableViewBuilder as follows:

  1. We would need to add many asynchronous interfaces, such as T getAsync(String key); boolean containsKeyAsync(String key); Collection valuesAsync(); etc.
  2. Even if we refresh the tableView before each data read, we can't guarantee that the data retrieved is the most up-to-date. This configuration might confuse users.

@liangyepianzhou
Copy link
Contributor Author

Prerequisite: Since messages are constantly being written into the Topic and there is no read-write lock guarantee, we cannot assure the retrieval of the most up-to-date value.
Implementation Goal: Record a checkpoint before reading and ensure the retrieval of the latest value of the key up to this checkpoint.
Use Case: When read and write operations for a certain key do not occur simultaneously, we can refresh the TableView before reading the key to obtain the latest value for this key.

Based on the above, I believe it would be more reasonable to introduce an asynchronous API refreshAsync(). There are two reasons for this:

  • It provides more flexibility for the user.
  • The operation of refresh is easier for users to understand.

@liangyepianzhou
Copy link
Contributor Author

@merlimat @codelipenghui, I changed the design from adding alwaysRefresh configuration option for TableViewBuilder to Add refreshAsync method. If necessary, we can add the alwaysRefresh configuration option for TableViewBuilder in the future.
Please help to leave your valuable feedback. Thanks.

@codelipenghui
Copy link
Contributor

@merlimat @codelipenghui, I changed the design from adding alwaysRefresh configuration option for TableViewBuilder to Add refreshAsync method. If necessary, we can add the alwaysRefresh configuration option for TableViewBuilder in the future.
Please help to leave your valuable feedback. Thanks.

@liangyepianzhou +1, I think we can continue the proposal.

pip/pip-302.md Outdated Show resolved Hide resolved
@liangyepianzhou liangyepianzhou changed the title [improve][client] PIP-302 Add alwaysRefresh Configuration Option for TableView to Read Latest Values [improve][client] PIP-302 Introduce refreshAsync API for TableView Sep 25, 2023
@liangyepianzhou liangyepianzhou closed this by deleting the head repository Sep 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs type/PIP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants