Update the amqp layer to inspect the inner exception for an amqp error code #2053

Merged 7 commits, Jun 29, 2021

@abhipsaMisra abhipsaMisra commented Jun 25, 2021

When the amqp library on the service side initiates a link detach, the in-flight operation is cancelled ungracefully, and an OperationCanceledException is thrown. The actual amqp error is still present, though, as the inner exception.

So far we have not been inspecting these inner exceptions for an amqp error code. There are three options for adding this inspection, listed in order of increasing scope:

  • If the exception thrown is OperationCanceledException and the inner amqp error code is AmqpErrorCode.MessageSizeExceeded -> throw a MessageTooLargeException.

    • Previously - OperationCanceledException was thrown. It is a transient exception, so the operation was retried. This is a bug.
    • Now - MessageTooLargeException will be thrown. It is a terminal exception.
      This is technically a breaking change, since customers who previously received an OperationCanceledException will now receive a MessageTooLargeException. Behavior-wise, however, the client would never have succeeded before, nor will it succeed now. The difference is that it previously retried needlessly, while now it fails fast.
  • If the exception thrown is OperationCanceledException and the inner exception is an AmqpException -> throw an exception based on the inner exception.

    • Previously - OperationCanceledException was thrown. It is a transient exception, so the operation was retried. This could be the correct behavior for some scenarios and incorrect for others.
    • Now - whether the operation is retried depends on the inner AmqpException.
      This is a wider breaking change, since customers who previously received an OperationCanceledException will now receive an exception that depends on the inner exception.
      • Inner exception is not an AmqpException -> sdk retries, no behavior change.
      • Inner exception is an AmqpException and is transient -> sdk retries, no behavior change.
      • Inner exception is an AmqpException and is terminal -> sdk will not retry. This would always have failed; however, customers previously received an OperationCanceledException, while now they will receive a different exception based on the inner exception.
  • For all non-AmqpException exceptions, if the inner exception is an AmqpException -> throw an exception based on the inner exception.

    • Previously - whatever exception was caught would have been retried or rethrown.
    • Now - whether the operation is retried depends on the inner AmqpException.
      Ideally we should have done this from the beginning, but doing it now is potentially the most breaking change of the three. Depending on how many scenarios actually nest the amqp error in an inner exception, all of those cases might now throw a different exception. Even though I think the end result is desirable (the sdk will make its retry decision based on more of the information provided to it), the extent of the change in behavior might be too large.

This PR implements option (2), the second item in the list above. I think its cost-benefit trade-off is the most balanced, since it is not specific to the single message-size-exceeded scenario.
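
For illustration only, option (2) can be sketched roughly as below. This is not the code in the PR. Every type here (`StubAmqpException`, `StubMessageTooLargeException`, `StubTransientException`, `InnerAmqpErrorInspector`) is a hypothetical stand-in, so the snippet compiles without the amqp or IoT Hub packages. The real AmqpException carries a richer error object than a single string. The sketch also assumes that message-size-exceeded is the only terminal condition, which the real error mapping does not.

```csharp
using System;

// Hypothetical stand-in for the amqp library's exception type.
// Assumption: the error condition is modelled as a plain string.
public sealed class StubAmqpException : Exception
{
    public StubAmqpException(string condition) : base(condition)
    {
        Condition = condition;
    }

    public string Condition { get; }
}

// Hypothetical stand-in for the sdk's terminal "message too large" exception.
public sealed class StubMessageTooLargeException : Exception
{
    public StubMessageTooLargeException(Exception inner)
        : base("Message size exceeded.", inner) { }
}

// Hypothetical stand-in for any exception the retry policy treats as transient.
public sealed class StubTransientException : Exception
{
    public StubTransientException(Exception inner)
        : base("Transient amqp failure.", inner) { }
}

public static class InnerAmqpErrorInspector
{
    // AMQP 1.0 link error condition for an oversized message.
    public const string MessageSizeExceeded = "amqp:link:message-size-exceeded";

    // Option (2): unwrap only when the outer exception is OperationCanceledException
    // and the inner exception is an amqp exception. Everything else passes through.
    public static Exception Translate(Exception caught)
    {
        if (caught is OperationCanceledException
            && caught.InnerException is StubAmqpException amqp)
        {
            if (amqp.Condition == MessageSizeExceeded)
            {
                return new StubMessageTooLargeException(amqp);
            }

            // Assumption: every other amqp condition is transient in this sketch.
            return new StubTransientException(amqp);
        }

        return caught;
    }

    // Hypothetical retry decision: only the terminal stand-in stops the retry loop.
    public static bool ShouldRetry(Exception translated)
    {
        return !(translated is StubMessageTooLargeException);
    }
}
```

With these stand-ins, an OperationCanceledException wrapping a message-size-exceeded error translates to the terminal exception and is not retried. An OperationCanceledException with no amqp inner exception passes through unchanged and is still retried, which matches the "no behavior change" case above.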

Fix for #2055

@abhipsaMisra abhipsaMisra marked this pull request as ready for review June 28, 2021 23:57
abhipsaMisra and others added 3 commits June 29, 2021 08:42
Co-authored-by: David R. Williamson
@abhipsaMisra

@barustum FYI

@abhipsaMisra

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@abhipsaMisra abhipsaMisra enabled auto-merge (squash) June 29, 2021 17:04
@abhipsaMisra abhipsaMisra merged commit a2a23d6 into master Jun 29, 2021
@abhipsaMisra abhipsaMisra deleted the abmisr/messageTest branch June 29, 2021 18:50
abhipsaMisra added a commit that referenced this pull request Oct 8, 2021
timstewartm pushed a commit to timstewartm/azure-iot-sdk-csharp that referenced this pull request May 30, 2024