Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eventhub Scaler: add new dapr checkpoint strategy to eventhub scaler #3732

Merged
merged 6 commits into from
Dec 6, 2022

Conversation

christle
Copy link
Contributor

@christle christle commented Oct 8, 2022

Hello,

Dapr uses a prefix for the Checkpoint blob "dapr-hubName-consumerGroup-". So here is another Checkpoint Strategy to support that case.

I will open another PR to update the docs.

Relates to #3022

@christle christle requested a review from a team as a code owner October 8, 2022 16:39
Copy link
Member

@JorTurFer JorTurFer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, but I think this is duplicated #3139
Thanks for the contribution, but let's wait to check if the original PR is active or no

@christle
Copy link
Contributor Author

christle commented Oct 8, 2022

Oh you are right. I didn’t see that parallel activity.

@tomkerkhove
Copy link
Member

Can we close this PR in favor of #3139 @christle?

@christle
Copy link
Contributor Author

christle commented Dec 2, 2022

Yes, currently, I test the e2e test on this base, but I could cherry-pick the e2e over so that we can close this PR

@tomkerkhove
Copy link
Member

What works for you!

@christle
Copy link
Contributor Author

christle commented Dec 2, 2022

@tomkerkhove I'm ready with the e2e test, but before the new test is green, the related image from the PR kedacore/test-tools#99 should be merged. I've tested it locally with a custom image on my registry.

I've added the e2e test to my PR because I have no access rights to the other fork. I hope it is ok.
Tomorrow, I will create the docs PR.

@christle christle force-pushed the master branch 2 times, most recently from 792c639 to f4dc0ca Compare December 2, 2022 22:16
@JorTurFer
Copy link
Member

JorTurFer commented Dec 2, 2022

/run-e2e
Update: You can check the progress here

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

it seems that the required package "ghcr.io/kedacore/tests-azure-eventhub-dapr" isn't pushed

@JorTurFer
Copy link
Member

I'll check, I guess that it's private as default :(

@JorTurFer
Copy link
Member

JorTurFer commented Dec 3, 2022

/run-e2e event*
Update: You can check the progress here

@JorTurFer
Copy link
Member

It was private, I have already changed it+

@JorTurFer
Copy link
Member

It seems like scaling in doesn't work, maybe the timeout is too small or the worker doesn't consume the messages correctly

@zroubalik
Copy link
Member

zroubalik commented Dec 3, 2022

/run-e2e event*
Update: You can check the progress here

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

On my private setup, based on the "test-infrastructure" repo, I can run all event* tests without problems. The "scale in" needs around 10+ seconds. Currently, I don't see the difference. I see inside the pipeline run a lot of errors around missing eventhub infrastructure and some BlobNotFound exceptions regarding the storageAccount. The BlobNotFound could be the reason for the missing "scale in" because there is no checkpoint. But I don't get this on my private azure, and I have no further instruments for the underlying pipeline infrastructure or the container logs. Could it be that are some parallel e2e tests that cause the deletion of the eventhubs and Storage Blobs?

@JorTurFer
Copy link
Member

On my private setup, based on the "test-infrastructure" repo, I can run all event* tests without problems. The "scale in" needs around 10+ seconds. Currently, I don't see the difference. I see inside the pipeline run a lot of errors around missing eventhub infrastructure and some BlobNotFound exceptions regarding the storageAccount. The BlobNotFound could be the reason for the missing "scale in" because there is no checkpoint. But I don't get this on my private azure, and I have no further instruments for the underlying pipeline infrastructure or the container logs. Could it be that are some parallel e2e tests that cause the deletion of the eventhubs and Storage Blobs?

We use the random names to avoid conflicts between tests, so it shouldn't happen. Let me check the logs

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

ok thank you, I will check another possible reason regarding different dapr versions

@JorTurFer
Copy link
Member

I'm checking in depth the e2e test, and you aren't installing dapr, so basically the checkpoint probably isn't created, and the messages aren't consumed. The e2e cluster doesn't have any infra as default, all needed infrastructure should be installed/removed during the tests. I can't see either any dapr manifest in your e2e test.
I guess you have deployed those things manually into your testing cluster, could this be the issue?

@JorTurFer
Copy link
Member

you have examples about how you can install dapr "globally" here, where we install Workload Identity and AWS Role Assumptions, and you also have other samples about "test scoped" installation in other scalers e2e tests like Datadog or NewRelic, where we install their against in the scope of the test, and we remove them at the end

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

i deploy dapr as single edge container within the deployment template

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

I have created a cluster with the kedacore/test-infrastructure repo, and there it runs as expected, but I will do some further checks. I think as a second container within the deployment, it is very easy to set up and to clean up

@JorTurFer
Copy link
Member

i deploy dapr as single edge container within the deployment template

true, I didn't see it, sorry

@JorTurFer
Copy link
Member

JorTurFer commented Dec 3, 2022

/run-e2e azure_event_hub_dapr_test*
Update: You can check the progress here

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

I've found the issue. I have an older dapr image. I put "ImagePullAllways" inside the template and now I could reproduce the pipeline issue. Ok, i will check that.

@JorTurFer
Copy link
Member

let me trigger them again and check the container logs

@JorTurFer
Copy link
Member

wow, nice catch! 👏
I have triggered it again

Copy link
Member

@JorTurFer JorTurFer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the improvement 🙇

@JorTurFer
Copy link
Member

Could you create another PR to docs repo adding this new checkpoint?

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

Yes, I will do it

@christle
Copy link
Contributor Author

christle commented Dec 3, 2022

btw, I like the whole test setup very much, it was very easy to set up a test environment with the test-tools and test-infrastructure repo.

@christle
Copy link
Contributor Author

christle commented Dec 5, 2022

I've created a pull request for the docs repository kedacore/keda-docs#996

@tomkerkhove
Copy link
Member

Thank you!

@tomkerkhove
Copy link
Member

tomkerkhove commented Dec 5, 2022

/run-e2e azure_event_hub_dapr_test*

Update: You can check the progress here

@tomkerkhove
Copy link
Member

@christle Thanks a ton! Should we abandon the other PR then?

@christle
Copy link
Contributor Author

christle commented Dec 5, 2022

If it is ok for you, then yes. It was easier for me to test & develop this PR.

CHANGELOG.md Outdated Show resolved Hide resolved
@zroubalik
Copy link
Member

zroubalik commented Dec 5, 2022

/run-e2e azure_event_hub_dapr_test*
Update: You can check the progress here

CHANGELOG.md Outdated Show resolved Hide resolved
Signed-off-by: Tom Kerkhove <[email protected]>
Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zroubalik
Copy link
Member

zroubalik commented Dec 6, 2022

/run-e2e azure_event_hub_dapr_test*
Update: You can check the progress here

@tomkerkhove tomkerkhove enabled auto-merge (squash) December 6, 2022 12:11
@tomkerkhove tomkerkhove merged commit 9dde386 into kedacore:main Dec 6, 2022
josephangbc pushed a commit to josephangbc/keda that referenced this pull request Dec 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants