[@azure/event-hubs] Trying to close a subscription for which there is no Container for checkpoints, gives error #11316
Comments
Hi @marcodalessandro, let me explain what's happening first and then we can talk about a path forward. The reason that closing the subscription touches the checkpoint store is that the subscription, when stopped, updates the remote blob store so any other consumers will be aware that the partitions are now abandoned (i.e., no longer being read from). Without this, the consumers would all have to keep checking until they consider the partition ownership to be expired, which is time based. This seems like the right behavior: when we are closed, we attempt to do so gracefully. Reading your issue, I'm not sure I understand what you are trying to accomplish, or that the current behavior is incorrect. Can you help me understand why you manually delete checkpoints? The primary design feature behind checkpoints is that they are an implementation detail managed by the checkpoint store, which means manual modifications are not accounted for. Are checkpoints becoming corrupted?
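To make the abandon-versus-expiry point concrete, here is a minimal in-memory sketch. It is only a model under stated assumptions, not the SDK's implementation: `InMemoryOwnershipStore`, `tryClaim`, `abandon`, the empty-string "abandoned" marker, and the 60-second expiry are all hypothetical names and values chosen for illustration.

```typescript
// Simplified model of partition ownership. NOT the SDK's implementation;
// every name and value here is an assumption made for illustration.
interface Ownership {
  ownerId: string; // empty string means "abandoned" in this model
  lastModifiedMs: number;
}

class InMemoryOwnershipStore {
  private readonly owners = new Map<string, Ownership>();
  private readonly expiryMs: number;

  constructor(expiryMs: number) {
    this.expiryMs = expiryMs;
  }

  // Claims the partition if it is unowned, abandoned, already ours, or expired.
  tryClaim(partitionId: string, consumerId: string, nowMs: number): boolean {
    const current = this.owners.get(partitionId);
    const claimable =
      current === undefined ||
      current.ownerId === "" ||
      current.ownerId === consumerId ||
      nowMs - current.lastModifiedMs >= this.expiryMs;
    if (claimable) {
      this.owners.set(partitionId, { ownerId: consumerId, lastModifiedMs: nowMs });
    }
    return claimable;
  }

  // What a graceful close does in this model: mark the partition as abandoned.
  abandon(partitionId: string, consumerId: string, nowMs: number): void {
    const current = this.owners.get(partitionId);
    if (current !== undefined && current.ownerId === consumerId) {
      this.owners.set(partitionId, { ownerId: "", lastModifiedMs: nowMs });
    }
  }
}

const store = new InMemoryOwnershipStore(60000);
store.tryClaim("0", "consumer-a", 0);
store.tryClaim("1", "consumer-a", 0);

// consumer-a shuts down gracefully for partition "0" only.
store.abandon("0", "consumer-a", 1000);

// Abandoned partition: another consumer can take it right away.
const claimedAbandoned = store.tryClaim("0", "consumer-b", 2000);
// Partition that was never abandoned: the claim fails until ownership expires.
const claimedTooEarly = store.tryClaim("1", "consumer-b", 2000);
const claimedAfterExpiry = store.tryClaim("1", "consumer-b", 60000);
```

In this model the abandoned partition is picked up on the next claim attempt, while the other one stays blocked until the expiry interval has passed, which is the wait that the update on close is meant to avoid.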
Hey @richardpark-msft, thanks for the answer :) We delete the checkpoints mainly to clean them up when we don't need them anymore (i.e., the eventhub to which they refer has been deleted).
Ah, I see :) In this case the eventprocessor.stop() is doing the abandon as the last step. It's the last critical piece, so it's done after all of the message retrieval code is deactivated. You can see that here. Given that you specifically do not care about updates to the blob store at that point, this error is ignorable for you when tearing down.
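If it helps, ignoring the error during teardown could look roughly like the sketch below. The error text is not shown in this thread, so the check is an assumption: it presumes the failure carries an HTTP `statusCode` of 404 or a `code` of "ContainerNotFound". `ClosableSubscription`, `isMissingContainerError`, and `closeDuringTeardown` are hypothetical names, not part of @azure/event-hubs.

```typescript
// Hypothetical minimal shape; the real subscription object comes from the SDK.
interface ClosableSubscription {
  close(): Promise<void>;
}

// Assumption: the storage failure exposes an HTTP status code and/or a
// service error code. Adjust this check to the error you actually observe.
function isMissingContainerError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) {
    return false;
  }
  const e = err as { statusCode?: unknown; code?: unknown };
  return e.statusCode === 404 || e.code === "ContainerNotFound";
}

// Closes the subscription and swallows only the "container is gone" failure.
// Any other error is rethrown so real problems stay visible.
async function closeDuringTeardown(subscription: ClosableSubscription): Promise<void> {
  try {
    await subscription.close();
  } catch (err) {
    if (!isMissingContainerError(err)) {
      throw err;
    }
    // The container was already deleted, so there is no ownership left to abandon.
  }
}
```

Swallowing only that one failure, rather than every error from close(), keeps unrelated problems such as network or auth errors from being hidden.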
I'm going to resolve this issue with a doc update. It seems worth mentioning that calling .close() on the subscription does update remote state to indicate that it's no longer consuming from a partition, which can be useful to know.
… when the subscription is closed. (#11345) As part of investigating #11316 we found it was non-obvious that we would be using the checkpoint store when closing. The reason we use it is when we close the subscription we also mark, in the checkpoint store, that the partitions are abandoned. This lets other consumers more quickly pick up abandoned partitions rather than forcing them to wait for an expiration interval and discover it. Fixes #11316
Describe the bug
If you try to invoke subscription.close() on an eventhub subscription for which the container (keeping track of the checkpoints) has already been deleted, it gives this error:
To Reproduce
Steps to reproduce the behavior:
The easiest way to repro is to delete the container that stores the checkpoints and then invoke subscription.close().
PseudoCode:
The subscription, even though it throws this error, seems not to be dangling and not to be polling the eventhub anymore.
Notes:
The error thrown by subscription.close() is not propagated to the processError handler.
[…] subscription.close()), at any polling (which I believe happens every 10 seconds by default), the error is instead propagated to the processError handler.
Expected behavior
It seems that subscription.close() lists all the checkpoint containers before closing the subscription. I would expect it to ignore the list of current checkpoint containers.
If instead this behavior has some design motivation, the documentation could explicitly note that all the subscriptions need to be closed before the checkpoints are deleted (though this probably cannot be achieved in all scenarios).
Additional context
Our application runs on several Kubernetes clusters and uses multiple instances of different (stateless) microservices. One of these microservices (running with multiple instances) opens a subscription on the eventhub and uses @azure/[email protected] to keep track of checkpoints. Under some conditions we need to close the subscription, and in our scenario each running instance tries to:
[…] 409 or 404, since the operation is executed from all the instances of the same microservice.