[improve][broker] PIP-192 Added ServiceUnitStateCompactionStrategy #19045

Merged

Conversation

heesung-sn
Contributor

Master Issue: #16691

Motivation

This PR adds ServiceUnitStateCompactionStrategy.

Modifications

For PIP-192, this PR adds ServiceUnitStateCompactionStrategy and its unit tests.

Also, this PR updates the related classes to enable ServiceUnitStateCompactionStrategy:

  • Added the strategicCompactor member in PulsarService.
  • Enabled ServiceUnitStateCompactionStrategy for the tableview member in ServiceUnitStateChannelImpl.
  • Added the strategicCompactionMap member in PersistentTopic.
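
For readers new to the idea, here is a rough sketch of strategy-driven compaction in general. It is not the code added in this PR: the `Strategy` interface, the `compact` helper, and the integer "version" values are all hypothetical, and the rule used in `main` is only an assumption chosen for illustration. The point is that a pluggable rule, rather than plain "last message wins", decides which value survives per key.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StrategicCompactionSketch {

    /** Hypothetical strategy: returns true when the retained value should win over the incoming one. */
    interface Strategy<T> {
        boolean shouldKeepPrevious(T previous, T incoming);
    }

    /** Replays (key, value) pairs in order and keeps, per key, the value the strategy accepts. */
    static <T> Map<String, T> compact(List<Map.Entry<String, T>> log, Strategy<T> strategy) {
        Map<String, T> latest = new LinkedHashMap<>();
        for (Map.Entry<String, T> e : log) {
            T prev = latest.get(e.getKey());
            // The first value for a key is always kept; later ones must be accepted by the strategy.
            if (prev == null || !strategy.shouldKeepPrevious(prev, e.getValue())) {
                latest.put(e.getKey(), e.getValue());
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> log = List.of(
                Map.entry("a", 1), Map.entry("a", 3), Map.entry("a", 2), Map.entry("b", 5));
        // Assumed rule for the demo: reject an incoming value that is not newer than the retained one.
        Map<String, Integer> compacted = compact(log, (prev, next) -> next <= prev);
        System.out.println(compacted); // prints {a=3, b=5}
    }
}
```

With plain last-write-wins compaction, key `a` would end up with 2; with the assumed rule the out-of-order value is rejected and 3 is retained.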

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

  • Added unit tests for the new classes.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

We will update the documentation in separate PRs later.

Matching PR in forked repository

PR in forked repository: heesung-sn#18

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Dec 24, 2022
@heesung-sn heesung-sn changed the title [improve][broker] Added ServiceUnitStateCompactionStrategy [improve][broker] PIP-192 Added ServiceUnitStateCompactionStrategy Dec 24, 2022
@heesung-sn heesung-sn force-pushed the pip-192-compact-service-unit-state branch from 271bee2 to e348be3 Compare December 28, 2022 19:10
@Demogorgon314 Demogorgon314 requested a review from shibd January 19, 2023 01:58
@codecov-commenter

codecov-commenter commented Jan 20, 2023

Codecov Report

Merging #19045 (dbbc861) into master (d8569cd) will increase coverage by 13.51%.
The diff coverage is 19.73%.

@@              Coverage Diff              @@
##             master   #19045       +/-   ##
=============================================
+ Coverage     47.20%   60.71%   +13.51%     
- Complexity    10645    26081    +15436     
=============================================
  Files           709     1884     +1175     
  Lines         69421   137392    +67971     
  Branches       7449    15106     +7657     
=============================================
+ Hits          32769    83424    +50655     
- Misses        32984    46262    +13278     
- Partials       3668     7706     +4038     
Flag Coverage Δ
inttests 23.99% <14.47%> (?)
unittests 60.12% <19.73%> (+12.92%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...xtensions/channel/ServiceUnitStateChannelImpl.java 0.97% <0.00%> (+0.97%) ⬆️
...a/org/apache/pulsar/client/impl/TableViewImpl.java 85.83% <0.00%> (+85.83%) ⬆️
...ns/channel/ServiceUnitStateCompactionStrategy.java 13.33% <13.33%> (ø)
...sar/broker/service/persistent/PersistentTopic.java 64.86% <50.00%> (+3.31%) ⬆️
...n/java/org/apache/pulsar/broker/PulsarService.java 72.17% <100.00%> (+14.12%) ⬆️
...adbalance/extensions/channel/ServiceUnitState.java 85.71% <100.00%> (+85.71%) ⬆️
.../pulsar/compaction/StrategicTwoPhaseCompactor.java 76.30% <100.00%> (+0.11%) ⬆️
.../pulsar/broker/service/SharedConsumerAssignor.java 5.55% <0.00%> (-62.97%) ⬇️
...apache/pulsar/broker/service/EntryAndMetadata.java 0.00% <0.00%> (-40.75%) ⬇️
...a/org/apache/bookkeeper/mledger/ManagedCursor.java 46.15% <0.00%> (-39.57%) ⬇️
... and 1512 more

@heesung-sn
Contributor Author

@Technoboy- Hi, can you take a look again?

private boolean checkBrokers = true;

public ServiceUnitStateCompactionStrategy() {
schema = JSONSchema.of(ServiceUnitStateData.class);
Contributor

Please use Schema.JSON().

Contributor Author

Updated

@@ -385,6 +385,7 @@ private <T> void phaseTwoLoop(String topic, Iterator<Message<T>> reader,
promise.completeExceptionally(e);
return;
}
outstanding.release(MAX_OUTSTANDING);
Contributor

Should we move this after promise.complete(null)?

Contributor Author

No, I found a deadlock if we don't release this semaphore before completing the future.

Deadlock example:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import org.junit.Test;

public class SemaphoreTest {

    // This test never finishes by design: it reproduces the deadlock.
    @Test
    public void deadlockTest() {
        Semaphore sm = new Semaphore(2);

        ExecutorService executor = Executors.newSingleThreadExecutor();
        CompletableFuture<String> future = new CompletableFuture<>();
        CompletableFuture.supplyAsync(() -> {
            System.out.println("starting");
            try {
                sm.acquire(2);
                System.out.println("acquired 2");
            } catch (InterruptedException e) {
                // Ignored for this demo.
            }
            try {
                // Give the main thread time to register the thenCompose callback.
                Thread.sleep(1000 * 5);
            } catch (InterruptedException e) {
                // Ignored for this demo.
            }
            System.out.println("return!");
            // Hangs here: the callback below runs synchronously on this thread.
            future.complete("");
            System.out.println("proceeding");
            sm.release(2); // Never reached.
            System.out.println("released 2");
            return "";
        }, executor);
        future.thenCompose(x -> {
            try {
                System.out.println("acquiring one!");
                // Hangs here: both permits are still held by the completing thread.
                sm.acquire();
                System.out.println("acquired 1");
                sm.release();
            } catch (InterruptedException e) {
                // Ignored for this demo.
            }
            return CompletableFuture.completedFuture("");
        }).join();
    }
}
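
For comparison, here is a minimal sketch of the same scenario with the release moved before the completion. The class name, the `run` helper, and the 5-second timeout are illustrative only and not taken from this PR. It shows that once the permits are returned first, the callback that runs inside `complete()` can acquire one and finish.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class ReleaseBeforeCompleteSketch {

    // Returns true when the dependent stage finished within the timeout.
    static boolean run() throws Exception {
        Semaphore sm = new Semaphore(2);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CompletableFuture<String> future = new CompletableFuture<>();
        try {
            // Registered before completion, so it runs on the thread that calls complete().
            CompletableFuture<String> dependent = future.thenCompose(x -> {
                try {
                    sm.acquire();
                    sm.release();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return CompletableFuture.completedFuture("done");
            });

            executor.submit(() -> {
                sm.acquireUninterruptibly(2);
                sm.release(2);       // Release first ...
                future.complete(""); // ... so the callback's acquire() can succeed.
            });

            return "done".equals(dependent.get(5, TimeUnit.SECONDS));
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("dependent stage completed: " + run());
    }
}
```

Swapping the two marked lines back reproduces the hang, which in this sketch surfaces as a `TimeoutException` from `dependent.get(...)` rather than a stuck test.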

compactionScheduler = Executors.newSingleThreadScheduledExecutor(
new ThreadFactoryBuilder().setNameFormat("compaction-%d").setDaemon(true).build());
bk = pulsar.getBookKeeperClientFactory().create(this.conf, null, null, Optional.empty(), null);
schema = JSONSchema.of(ServiceUnitStateData.class);
Contributor

Schema.JSON

Contributor Author

updated.

@heesung-sn heesung-sn force-pushed the pip-192-compact-service-unit-state branch from e0c7cfb to bf665d3 Compare January 26, 2023 05:29
log.error("Failed cleaning the ownership serviceUnit:{}, stateData:{}.",
serviceUnit, stateData);
serviceUnitTombstoneErrorCnt.incrementAndGet();
}
});
serviceUnitTombstoneCnt.incrementAndGet();
Contributor

@Technoboy- Technoboy- Jan 28, 2023

Could you explain why we need to modify this?
We update serviceUnitTombstoneErrorCnt and serviceUnitTombstoneCnt in the async method, so both values could still be 0 and let line 643 pass.

Contributor Author

It relies on producer.flush() to persist all outstanding messages before returning this call.

Yes, because these values could be 0, I updated this code path here. I also tidied up the metrics computation code.
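
To illustrate the general concern (not the code in this PR), here is a small sketch of counters that are updated inside async callbacks. `publishTombstone` is a hypothetical stand-in for an asynchronous publish that fails for odd ids, and the counter names are made up. The counters are only read after every callback has run; reading them earlier could still observe 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

class AsyncCounterSketch {

    // Hypothetical stand-in for an asynchronous tombstone publish; fails for odd ids.
    static CompletableFuture<Void> publishTombstone(int id) {
        return CompletableFuture.runAsync(() -> {
            if (id % 2 == 1) {
                throw new IllegalStateException("simulated failure for " + id);
            }
        });
    }

    // Returns {successCount, errorCount}, read only after every callback has run.
    static int[] cleanAll(int n) {
        AtomicInteger ok = new AtomicInteger();
        AtomicInteger failed = new AtomicInteger();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            // handle() swallows the failure after counting it, so allOf() below does not throw.
            pending.add(publishTombstone(i).<Void>handle((v, e) -> {
                if (e != null) {
                    failed.incrementAndGet();
                } else {
                    ok.incrementAndGet();
                }
                return null;
            }));
        }
        // Without this wait, both counters could still be 0 when they are read.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return new int[] {ok.get(), failed.get()};
    }

    public static void main(String[] args) {
        int[] counts = cleanAll(4);
        System.out.println("ok=" + counts[0] + " failed=" + counts[1]); // prints ok=2 failed=2
    }
}
```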

@heesung-sn heesung-sn force-pushed the pip-192-compact-service-unit-state branch from 3618416 to cb61a15 Compare January 28, 2023 08:03
@heesung-sn heesung-sn force-pushed the pip-192-compact-service-unit-state branch from cb61a15 to c818108 Compare January 28, 2023 08:42
@heesung-sn
Contributor Author

@Technoboy- ping

Contributor

@Technoboy- Technoboy- left a comment

LGTM

@@ -1563,6 +1571,8 @@ public void checkCompaction() {
}

if (backlogEstimate > compactionThreshold) {
log.info("topic:{} backlogEstimate:{} is bigger than compactionThreshold:{}. Triggering compaction",
Contributor

It should be a debug-level log.

Contributor Author

I thought this topic compaction log was useful. Could you share your concerns about it?

Contributor

Suppose you have many topics, say 50k per broker. By default, this would produce 50k log lines per minute. We also already log after the compaction task starts.

Contributor Author

OK, I will make this a debug-level log.

Contributor Author

Updated.

@codelipenghui codelipenghui merged commit 1cd1aef into apache:master Jan 31, 2023
@heesung-sn heesung-sn deleted the pip-192-compact-service-unit-state branch April 2, 2024 17:45