Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for OOT (out-of-tree) SGX driver #2061

Merged
merged 1 commit into from
Nov 22, 2024
Merged

Remove support for OOT (out-of-tree) SGX driver #2061

merged 1 commit into from
Nov 22, 2024

Conversation

woju
Copy link
Member

@woju woju commented Nov 15, 2024

Description of the changes

See commit message.

How to test this PR?

CI.


This change is Reviewable

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 49 of 49 files at r1, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @woju)


debian/control line 13 at r1 (raw file):

 libprotobuf-c-dev,
 libsgx-dcap-quote-verify-dev,
 linux-libc-dev (>= 5.11) | linux-headers-amd64 (>= 5.11),

Don't we need linux-headers for sgx.h? Where does sgx.h come from?


pal/src/host/linux-sgx/host_framework.c line 551 at r1 (raw file):

int init_enclave(sgx_arch_secs_t* secs, sgx_sigstruct_t* sigstruct, sgx_arch_token_t* token) {
#ifndef CONFIG_SGX_DRIVER_OOT
    __UNUSED(token);

Now we can just remove this argument.


pal/src/host/linux-sgx/host_main.c line 224 at r1 (raw file):

    log_debug("Token file: %s", token_path);

    ret = read_enclave_token(token_fd, enclave_token);

read_enclave_token() is unused now

Copy link
Contributor

@efu39 efu39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 49 of 49 files at r1, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @woju)


pal/src/host/linux-sgx/host_main.c line 204 at r1 (raw file):

static int get_enclave_token(sgx_arch_token_t* enclave_token, sgx_sigstruct_t* enclave_sigstruct) {
    return create_dummy_enclave_token(enclave_sigstruct, enclave_token);
}

Since the get_enclave_token function is only creating a dummy token, wondering if the function and the code related to the generated dummy token can all be removed.

Code quote:

static int get_enclave_token(sgx_arch_token_t* enclave_token, sgx_sigstruct_t* enclave_sigstruct) {
    return create_dummy_enclave_token(enclave_sigstruct, enclave_token);
}

pal/src/host/linux-sgx/host_main.c line 274 at r1 (raw file):

        log_error("Reading enclave token failed: %s", unix_strerror(ret));
        goto out;
    }

The error message should be updated (or removed?), the function is no longer reading token file. Also create_dummy_enclave_token() never returns ret < 0

Code quote:

    ret = get_enclave_token(&enclave_token, &enclave_sigstruct);
    if (ret < 0) {
        log_error("Reading enclave token failed: %s", unix_strerror(ret));
        goto out;
    }

pal/src/host/linux-sgx/host_main.c line 278 at r1 (raw file):

#ifdef DEBUG
    if (enclave->profile_enable) {
        if (!(enclave_token.body.attributes.flags & SGX_FLAGS_DEBUG)) {

Maybe consider checkingenclave_sigstruct.body instead since the dummy token value was derived from it?

Copy link
Member Author

@woju woju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 47 of 51 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @efu39 and @mkow)


debian/control line 13 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Don't we need linux-headers for sgx.h? Where does sgx.h come from?

/usr/include/x86_64-linux-gnu/asm/sgx.h comes from linux-libc-dev (the one that gets included by #include <asm/sgx.h> without -I to gcc's commandline) . linux-headers (particularly linux-headers-$(uname -r)-common package) also contains sgx.h, and you can set up -I so that it also works as <asm/sgx.h>, however it's not technically meant to be used like that, because linux-headers is meant for compiling kernel modules, not normal software, and you include it as #include <uapi/asm/sgx.h> (the other one, #include <asm/sgx.h>, is not the one that you want).

The problem is (was), for each distro release there's only one linux-libc-dev matching the stable kernel version that does not get updated during the lifetime of a particular release (for 2 years it only gets backports) and on focal (Ubuntu 20.04) it was 5.4.0. So even if you installed newer kernel, you still got linux-libc-dev 5.4-something, so you can't #include <asm/sgx.h>. So we had that machinery to make up for it and steal sgx.h from linux-headers. Now that all supported distros run their base kernel >= 5.11, we can just remove it all.

Technically you should have reworked meson.build in the PR that removed 20.04 to always use /usr/include for upstream (and then change this dependency here), but since I knew we were going to make bigger change to meson.build and #ifdefs in that area when removing OOT, I didn't bother mentioning it then.


pal/src/host/linux-sgx/host_framework.c line 551 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Now we can just remove this argument.

Done.


pal/src/host/linux-sgx/host_main.c line 204 at r1 (raw file):

Previously, efu39 (Erica Fu) wrote…

Since the get_enclave_token function is only creating a dummy token, wondering if the function and the code related to the generated dummy token can all be removed.

Done. Yes, create_dummy_enclave_token we can remove, but I don't know anything about get_optional_sgx_features, so I'll leave it, unless you're sure it's unneeded?


pal/src/host/linux-sgx/host_main.c line 224 at r1 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

read_enclave_token() is unused now

Done.


pal/src/host/linux-sgx/host_main.c line 274 at r1 (raw file):

Previously, efu39 (Erica Fu) wrote…

The error message should be updated (or removed?), the function is no longer reading token file. Also create_dummy_enclave_token() never returns ret < 0

Done. All true, I removed it, and I took liberty to make get_optional_sgx_features void when refactoring.


pal/src/host/linux-sgx/host_main.c line 278 at r1 (raw file):

Previously, efu39 (Erica Fu) wrote…

Maybe consider checkingenclave_sigstruct.body instead since the dummy token value was derived from it?

enclave_sigstruct does not have .body, but yes, we should read from SIGSTRUCT.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @efu39 and @woju)


debian/control line 13 at r1 (raw file):

Previously, woju (Wojtek Porczyk) wrote…

/usr/include/x86_64-linux-gnu/asm/sgx.h comes from linux-libc-dev (the one that gets included by #include <asm/sgx.h> without -I to gcc's commandline) . linux-headers (particularly linux-headers-$(uname -r)-common package) also contains sgx.h, and you can set up -I so that it also works as <asm/sgx.h>, however it's not technically meant to be used like that, because linux-headers is meant for compiling kernel modules, not normal software, and you include it as #include <uapi/asm/sgx.h> (the other one, #include <asm/sgx.h>, is not the one that you want).

The problem is (was), for each distro release there's only one linux-libc-dev matching the stable kernel version that does not get updated during the lifetime of a particular release (for 2 years it only gets backports) and on focal (Ubuntu 20.04) it was 5.4.0. So even if you installed newer kernel, you still got linux-libc-dev 5.4-something, so you can't #include <asm/sgx.h>. So we had that machinery to make up for it and steal sgx.h from linux-headers. Now that all supported distros run their base kernel >= 5.11, we can just remove it all.

Technically you should have reworked meson.build in the PR that removed 20.04 to always use /usr/include for upstream (and then change this dependency here), but since I knew we were going to make bigger change to meson.build and #ifdefs in that area when removing OOT, I didn't bother mentioning it then.

I see, thanks for the explanation!


pal/src/host/linux-sgx/host_framework.c line 124 at r2 (raw file):

    secs->ssa_frame_size = SSA_FRAME_SIZE / g_page_size; /* SECS expects SSA frame size in pages */
    secs->misc_select = sig->misc_select;
    memcpy(&secs->attributes, &sig->attributes, sizeof(sgx_attributes_t));

Now this memcpy doesn't make much sense, you're copying 2 fields and then right afterwards overwrite one of them.

Copy link
Member Author

@woju woju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 50 of 51 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @efu39 and @mkow)


pal/src/host/linux-sgx/host_framework.c line 124 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Now this memcpy doesn't make much sense, you're copying 2 fields and then right afterwards overwrite one of them.

I can do it like that, but I'm unconvinced this is cleaner. It starts to rely that no-one will remove *out_xfrm = line in get_optional_sgx_features.

I also thought about moving this initialisation to host_main.c just before create_enclave() invocation, where two secs fields, base and size, are initialised, because they're copied from struct pal_enclave, not passed here. Or even refactoring to a dedicated constructor, probably create_secs(struct pal_enclave*, sgx_sigstruct_t*). I could use a suggestion either way.

Copy link
Contributor

@efu39 efu39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @mkow)

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)


pal/src/host/linux-sgx/host_framework.c line 124 at r2 (raw file):

It starts to rely that no-one will remove *out_xfrm = line in get_optional_sgx_features.

I don't understand this argument... That's literally what get_optional_sgx_features() does - it returns some value via argument (out_xfrm). It's like saying "what if someone removes return sum; from sum_some_numbers function?".

I also thought about moving this initialisation to host_main.c just before create_enclave() invocation, where two secs fields, base and size, are initialised, because they're copied from struct pal_enclave, not passed here. Or even refactoring to a dedicated constructor, probably create_secs(struct pal_enclave*, sgx_sigstruct_t*). I could use a suggestion either way.

I like the create_secs() idea, but plz not in this PR, it's already large.

Copy link
Member Author

@woju woju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)


pal/src/host/linux-sgx/host_framework.c line 124 at r2 (raw file):
Well, previous version would work if that line was removed, because get_optional_sgx_features() would just light optional bits (and only those optional bits) without providing base value (which was provided by memset). Which would also be a correct behaviour, as the documentation (a lack of it) provides no clue which was intended, and the function name is ambiguous.

plz not in this PR, it's already large.

Sure.

@woju
Copy link
Member Author

woju commented Nov 19, 2024

Jenkins, test Jenkins-Direct-24.04 please (fdatasync01 timed out)

mkow
mkow previously approved these changes Nov 19, 2024
Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)


pal/src/host/linux-sgx/host_framework.c line 124 at r2 (raw file):

the function name is ambiguous

But the argument is called out_xfrm - that means it's used as a return value which is written but not read by the function (that's how you "return a tuple" in C...).

Copy link
Contributor

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 47 of 49 files at r1, 3 of 4 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @woju)


a discussion (no related file):
What about

Because upstream support for SGX was only merged relatively recently and in
non-longterm release, together with the fact that EPID attestation is
and
# plaintext format imitates the legacy output of `gramine-sgx-get-token` tool
?


Documentation/devel/building.rst line 87 at r3 (raw file):

- Install out-of-tree driver and use our provided patches to the Linux kernel
  version 5.4. See section :ref:`legacy-kernel-and-hardware` for the exact
  steps.

What about here?

Code quote:

- Install out-of-tree driver and use our provided patches to the Linux kernel
  version 5.4. See section :ref:`legacy-kernel-and-hardware` for the exact
  steps.

Documentation/devel/building.rst line 154 at r3 (raw file):

built both).

Since Gramine 1.9, we only support upstream, in-kernel driver and the

-> v1.9

Code quote:

1.9

Documentation/devel/building.rst line 350 at r3 (raw file):

Although we recommend kernel version 5.11 or later, Gramine can be run on older
kernels with out-of-tree SGX driver. OOT driver is also the only possibility to
run Gramine on non-FLC hardware. In this configuration, we require kernel at

What about here and all related places below?

Code quote:

Although we recommend kernel version 5.11 or later, Gramine can be run on older
kernels with out-of-tree SGX driver. OOT driver is also the only possibility to
run Gramine on non-FLC hardware. In this configuration, we require kernel at

pal/src/host/linux-sgx/host_framework.c line 40 at r3 (raw file):

static void get_optional_sgx_features(uint64_t xfrm, uint64_t xfrm_mask, uint64_t* out_xfrm) {
    /* see also sgx_get_token.py:get_optional_sgx_features(), used for legacy non-FLC machines */

We don't really have this file now?

Code quote:

sgx_get_token.py

pal/src/host/linux-sgx/host_framework.c line 40 at r3 (raw file):

static void get_optional_sgx_features(uint64_t xfrm, uint64_t xfrm_mask, uint64_t* out_xfrm) {
    /* see also sgx_get_token.py:get_optional_sgx_features(), used for legacy non-FLC machines */

But it's said that "OOT driver is also the only possibility to run Gramine on non-FLC hardware"?

Code quote:

used for legacy non-FLC machines

Copy link
Member Author

@woju woju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 49 of 52 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @efu39, @kailun-qin, and @mkow)


a discussion (no related file):

What about

Because upstream support for SGX was only merged relatively recently and in
non-longterm release, together with the fact that EPID attestation is

I'll remove whole section. All distros listed above this section have kernel >= 5.11, or backported SGX driver.

and

# plaintext format imitates the legacy output of `gramine-sgx-get-token` tool

?

This is OK to retain I think, the output still resembles the non-existing tool. I can remove if you insist.


Documentation/devel/building.rst line 87 at r3 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

What about here?

Done. I've rewritten to mention sgx.h, which changed in this PR (see my other discussion with @mkow).


Documentation/devel/building.rst line 154 at r3 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

-> v1.9

Done.


Documentation/devel/building.rst line 350 at r3 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

What about here and all related places below?

Done. Removed almost all of it, save for a comment about RHEL kernels. I'leave the section with minimal comment and the anchor, for people who linked to it.


pal/src/host/linux-sgx/host_framework.c line 40 at r3 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

We don't really have this file now?

Done. Removed whole comment.


pal/src/host/linux-sgx/host_framework.c line 40 at r3 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

But it's said that "OOT driver is also the only possibility to run Gramine on non-FLC hardware"?

Done, removed whole comment.


Documentation/devel/building.rst line 169 at r4 (raw file):

   meson compile -C build/
   sudo meson compile -C build/ install

Thinking about it more, I changed this to match meson compile below. This way, not the other way around, because we don't explicitly apt-get install ninja anymore.

Copy link
Contributor

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r4, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@woju
Copy link
Member Author

woju commented Nov 21, 2024

Jenkins, test Jenkins-Direct-22.04-Debug please (creat01 timed out, looks random)

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r4, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved


Documentation/devel/building.rst line 360 at r1 (raw file):

check kernel's ``.config`` and consult your distro documentation.

1. Install Linux kernel with patched FSGSBASE

Finally gone!

@woju woju merged commit 5789620 into master Nov 22, 2024
26 checks passed
@woju woju deleted the woju/drop-oot branch November 22, 2024 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants