Support vpinsrq in delocater #1543

Merged: 1 commit, Apr 23, 2024

torben-hansen (Contributor) commented Apr 20, 2024

Issues:

P125414272

Description of changes:

Instead of generating portable code, GCC can be invoked with e.g. -march=cpu-type, which allows it to emit instructions from any instruction set that cpu-type supports.

In one such configuration, the build failed with:

[ 62%] Generating bcm-delocated.S
error while processing "\tvpinsrq\t$1, EVP_PKEY_CTX_dup@GOTPCREL(%rip), %xmm1, %xmm0\n" on line 27676: "GOT access must be source operand, instrOther"

Here the compiler loads the addresses of EVP_PKEY_CTX_free and EVP_PKEY_CTX_dup from the GOT directly into a vector register in order to populate EVP_MD_pctx_ops, e.g.:

    vmovq   EVP_PKEY_CTX_free@GOTPCREL(%rip), %xmm1
    vpinsrq $1, EVP_PKEY_CTX_dup@GOTPCREL(%rip), %xmm1, %xmm0
    vmovdqa %xmm0, -16(%rbp)

This fails because vpinsrq is not supported in the delocater.

This PR adds support for vpinsrq, an instruction from the AVX512DQ set. I don't think there are other 4-argument instructions for which a GOT relocation can be emitted that we need to support right now. This also makes the implementation a bit simpler, because we do not need to handle the relocation being either the first or second argument; it can only be the second.

Otherwise, the implementation follows the one for the type instrThreeArg.

Testing:

Tested this in an environment whose configuration triggers the missing vpinsrq support. Before this PR the build failed; with this PR applied, the build and tests succeed.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

torben-hansen requested a review from a team as a code owner April 20, 2024 21:30

dkostic (Contributor) commented Apr 22, 2024

nit: vpinsrq was introduced in the AVX ISA.

torben-hansen merged commit d940162 into aws:main on Apr 23, 2024
46 checks passed
skmcgrail pushed a commit to skmcgrail/aws-lc that referenced this pull request May 1, 2024
(cherry picked from commit d940162)
skmcgrail added a commit that referenced this pull request May 2, 2024
d52018b Minor functions to build with Ruby's cipher module (#1564)
364d28b Changed SSL_client_hello_get0_ciphers to align with OpenSSL
behavior (#1542)
e8eb7de ppc64le: EVP_has_aes_hardware is false w/ no-asm (#1566)
d726d06 OpenBSD 7.4 and 7.5 Support (#1437)
a66c66e Remove comments about overread for entropy generation (#1551)
f8a575f Migrate from __FreeBSD__ to __FreeBSD_version (#1562)
c31d1ce Centralize handling of s2n-bignum alt/non-alt function
selection (#1547)
00f3c45 CI for other MacOS versions (#1558)
0541314 Cleanup remaing duplicate symbol definitions and turn
Wredundant-decls on (#1561)
4d280eb Fix ec2 CI testing framework (#1541)
9a4b43e Update x25519_test.cc array initialization to avoid a bug with
a GCC 13 warning (#1555)
388cbe7 Remove duplicate X509_OBJECT_new and X509_OBJECT_free
declarations (#1560)
2ea6706 Avoid 'z' format with MSVCRT (#1559)
c25dc2a Add dependency to python3-six in github action grpc (#1554)
2bdcba3 Link porting guide table to header documentation (#1540)
311ca38 Basic GH CI build/test with full range of gcc/clang (#1546)
1f19717 Add SHA3-256 KAT to FIPS self-test (#1549)
0f3548a Add EC point add/dbl to speed.cc (#1545)
d7ddfc4 Fix the NTP integration test (NTP website changed) (#1548)
8ccd85b Fix skipped tests in Mariadb integration CI (#1533)
d940162 Support vpinsrq in delocater (#1543)
4cd6d21 Remove redundant test exec libraries (#1544)
56f3569 [ML-KEM] Add experimental support for ML-KEM-512-IPD (#1516)
c295aef Upstream merge 2024 04 16 (#1535)
2e51629 Re-add function
0aebf17 Define OPENSSL_NO_TLS_PHA, typedef PSK callback signatures
(#1526)
46056cf Pull the string-based extensions APIs into their own section
960ea42 Unexport X509_VERIFY_PARAM_lookup
3c597b1 Remove X509_VERIFY_PARAM_get0_peername
9c399e5 Document some key usage accessors
2fe70b5 Simplify and document X509_supported_extension
2e04897 Const-correct X509_LOOKUP_METHOD
9826568 Replace X509_LOOKUP_ctrl with real functions
e47c056 Tidy up x509_lu.c functions a little
62e019f Clean up the by_file_ctrl x509 code to be slightly less obtuse
45c46c2 Use relative links in markdown files

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.