Test AlwaysEncrypted.ConversionTests.ConversionSmallerToLarger... fails #1230

Closed
Wraith2 opened this issue Aug 25, 2021 · 11 comments · Fixed by #1251

@Wraith2
Contributor

Wraith2 commented Aug 25, 2021

The test Microsoft.Data.SqlClient.ManualTesting.Tests.AlwaysEncrypted.ConversionTests.ConversionSmallerToLargerInsertAndSelectBulk is failing over multiple PRs with the error:

System.AggregateException : One or more errors occurred. (Cannot access destination table 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30'.) (Cannot drop column encryption key 'AE_CEK_0762a097_513a_4c7d_8442_f280a9212167' because the key is referenced by column 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30.Column1'.)
---- System.InvalidOperationException : Cannot access destination table 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30'.
-------- Microsoft.Data.SqlClient.SqlException : Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
The request failed to run because the batch is aborted, this can be caused by abort signal sent from client, or another request is running in the same session, which makes the session busy.
Operation cancelled by user.
------------ System.ComponentModel.Win32Exception : The wait operation timed out
---- Microsoft.Data.SqlClient.SqlException : Cannot drop column encryption key 'AE_CEK_0762a097_513a_4c7d_8442_f280a9212167' because the key is referenced by column 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30.Column1'.

which looks like it's a test setup issue.
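
The second error in the trace is a dependency-order failure: a key cannot be dropped while a table column still references it. Below is a minimal sketch of that behaviour. It uses a hypothetical in-memory `FakeServer` (not SQL Server and not the test suite's actual fixture code) and assumes cleanup steps run one after another with their errors collected, much as the `AggregateException` above suggests.

```python
class FakeServer:
    """Tiny in-memory stand-in for a database (hypothetical, for illustration only)."""

    def __init__(self):
        self.keys = set()
        self.tables = {}  # table name -> name of the key its column references

    def create_key(self, key):
        self.keys.add(key)

    def create_table(self, table, key):
        self.tables[table] = key

    def drop_table(self, table):
        self.tables.pop(table, None)

    def drop_key(self, key):
        # Refuse to drop a key that any remaining table still references.
        users = sorted(t for t, k in self.tables.items() if k == key)
        if users:
            raise RuntimeError(
                f"Cannot drop key '{key}' because it is referenced by table '{users[0]}'."
            )
        self.keys.discard(key)


def cleanup(steps):
    """Run every cleanup step, collecting errors instead of stopping at the first."""
    errors = []
    for step in steps:
        try:
            step()
        except RuntimeError as exc:
            errors.append(str(exc))
    return errors


def demo(table_first):
    """Create one key and one dependent table, then clean up in the given order."""
    server = FakeServer()
    server.create_key("cek")
    server.create_table("t", "cek")
    steps = [lambda: server.drop_table("t"), lambda: server.drop_key("cek")]
    if not table_first:
        steps.reverse()
    return cleanup(steps), server
```

Calling `demo(False)` drops the key first, so one error is collected and the key is left behind. Calling `demo(True)` drops the table first and cleans up without errors.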

@cheenamalhotra
Member

@johnnypham could you check this, please?

Thanks!

@johnnypham
Contributor

@Wraith2 could you point me to a PR where it failed? I went through the test but didn't see anything that could cause this.

@Wraith2
Contributor Author

Wraith2 commented Aug 27, 2021

@johnnypham
Contributor

Test is passing now after doing some cleanup on the agent. Let us know if you see it happening again.

@Wraith2
Contributor Author

Wraith2 commented Aug 31, 2021

Thanks. Any chance you could re-run the tests on that PR? I'd have to commit to it to cause it to run and I don't have any changes to make.

@johnnypham
Contributor

It's still failing, this time on a managed SNI pipeline with a different underlying exception:

System.AggregateException : One or more errors occurred. (Cannot access destination table 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30'.) (Cannot drop column encryption key 'AE_CEK_0762a097_513a_4c7d_8442_f280a9212167' because the key is referenced by column 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30.Column1'.)
---- System.InvalidOperationException : Cannot access destination table 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30'.
-------- Microsoft.Data.SqlClient.SqlException : Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
The request failed to run because the batch is aborted, this can be caused by abort signal sent from client, or another request is running in the same session, which makes the session busy.
Operation cancelled by user.
------------ System.ComponentModel.Win32Exception : The wait operation timed out
---- Microsoft.Data.SqlClient.SqlException : Cannot drop column encryption key 'AE_CEK_0762a097_513a_4c7d_8442_f280a9212167' because the key is referenced by column 'AE_large_type_enc_112eff3e_f052_464d_8943_170b2b2d3f30.Column1'.

This test has not been problematic in the past, even when the AE machines were slow. Recent PRs are not failing either. Could this be related to changes in your PR?

@Wraith2
Contributor Author

Wraith2 commented Aug 31, 2021

There's a remote chance that it could be, but it's very unlikely. I've changed how packets are received, but this operation fits entirely in a single packet, so that change won't have kicked in. If the failure were caused by an error in the library, it wouldn't carry that very specific message about the constraint that prevented the drop; that message is coming from the server.

@johnnypham
Contributor

I've been testing some changes that ensure the table is dropped before attempting to drop the CEK. After a few runs, I'm not seeing the original error anymore but a couple of new intermittent errors have surfaced, which I think might have been the true cause of the original error:

Microsoft.Data.SqlClient.SqlException : Failed to decrypt column 'Column1'.
Decryption failed. The last 10 bytes of the encrypted column encryption key are: '13-23-B0-50-40-19-44-72-49-57'. The first 10 bytes of ciphertext are: '01-4B-8F-50-F9-BD-51-0B-FF-70'.
Specified ciphertext has an invalid authentication tag.
Parameter name: cipherText
---- System.ArgumentException : Specified ciphertext has an invalid authentication tag.

Error Message:
The values read for row '65' column 'Column1' are not identical
Expected: True
Actual:   False

These look driver related. The first one is caused by an incorrect ciphertext received from the server. Any thoughts?

Some of the failing runs:
https://dev.azure.com/sqlclientdrivers-ci/sqlclient/_build/results?buildId=37803&view=results
https://dev.azure.com/sqlclientdrivers-ci/sqlclient/_build/results?buildId=37802&view=results
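
For context on the first error: an "invalid authentication tag" is what an authenticated-encryption check reports when the bytes it receives do not match the tag stored with them. The sketch below illustrates that check in isolation. It assumes a plain HMAC-SHA256 tag prepended to the ciphertext and omits the encryption step entirely; `seal` and `open_sealed` are hypothetical names, and this is not the driver's actual implementation or blob layout.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of a SHA-256 digest in bytes


def seal(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Prepend an HMAC-SHA256 tag to already-encrypted bytes (layout is an assumption)."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return tag + ciphertext


def open_sealed(mac_key: bytes, blob: bytes) -> bytes:
    """Verify the tag before the ciphertext would be handed on for decryption."""
    tag, ciphertext = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    # Constant-time comparison; any mismatch means the bytes or the key are wrong.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("Specified ciphertext has an invalid authentication tag.")
    return ciphertext


def fails_verification(mac_key: bytes, blob: bytes) -> bool:
    """Return True when the blob is rejected by the tag check."""
    try:
        open_sealed(mac_key, blob)
    except ValueError:
        return True
    return False
```

In this sketch, flipping a single bit of the blob or verifying with a different key both make `fails_verification` return `True`. The tag check alone therefore cannot distinguish corrupted bytes from a mismatched key.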

@Wraith2
Contributor Author

Wraith2 commented Sep 3, 2021

If the original error is gone then that's the job done. Thanks for sorting it.

Any real errors like the new one will need to be worked through as usual. I opened the PR to try tests I couldn't run locally, and given the scope of the changes, some small problems aren't surprising.

@johnnypham
Contributor

Right, but as you pointed out, the netfx pipeline was failing, so I'm not sure I've gotten to the root cause.

@Wraith2
Contributor Author

Wraith2 commented Sep 3, 2021

Any failures on netfx definitely aren't me. It's probably worth opening an issue and letting it be prioritised for the future.
