-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add symlink files to the trusted files list #120
Add symlink files to the trusted files list #120
Conversation
1c68bcf
to
2bbc3b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jkr0103)
-- commits
line 3 at r1:
I would also add an explanation as the commit body message:
Since `os.walk(..., followlinks=True)` can lead to infinite recursion
due to symlinks pointing to each other, we need to keep track of
already-visited directories via the `dirs` set.
finalize_manifest.py
line 64 at r1 (raw file):
continue st = os.stat(dir) dirkey = st.st_dev, st.st_ino, dir
I know we discussed this offline, but I still don't understand why we need to use dev
and ino
as additional identifiers of "is it the path we already traversed?"...
What happens if we remove st_dev
and st_ino
? Wouldn't just dir
(full path name) be enough?
Then the code will also look simpler and nicer:
...
for dirname in dirnames:
dir = os.path.join(dirpath, dirname)
if exclude_re.match(dir.decode('UTF-8')):
# exclude special paths from list of trusted files
continue
if dir not in dirs:
dirs.add(dir)
scandirs.append(dirname)
dirnames[:] = scandirs
2bbc3b4
to
199228f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would also add an explanation as the commit body message:
Since `os.walk(..., followlinks=True)` can lead to infinite recursion due to symlinks pointing to each other, we need to keep track of already-visited directories via the `dirs` set.
Done.
finalize_manifest.py
line 64 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I know we discussed this offline, but I still don't understand why we need to use
dev
andino
as additional identifiers of "is it the path we already traversed?"...What happens if we remove
st_dev
andst_ino
? Wouldn't justdir
(full path name) be enough?Then the code will also look simpler and nicer:
... for dirname in dirnames: dir = os.path.join(dirpath, dirname) if exclude_re.match(dir.decode('UTF-8')): # exclude special paths from list of trusted files continue if dir not in dirs: dirs.add(dir) scandirs.append(dirname) dirnames[:] = scandirs
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103)
a discussion (no related file):
@jkr0103 I know you experimented with this PR a bit on different workloads. Could you share your results? Like, how many more files are added to sgx.trusted_files
with this PR, and whether it requires greater enclave_size
in Gramine manifest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
@jkr0103 I know you experimented with this PR a bit on different workloads. Could you share your results? Like, how many more files are added to
sgx.trusted_files
with this PR, and whether it requires greaterenclave_size
in Gramine manifest?
I have attached the report in PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners
a discussion (no related file):
Previously, jkr0103 (Jitender Kumar) wrote…
I have attached the report in PR description.
Thanks, looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @jkr0103)
a discussion (no related file):
@jkr0103: Please don't paste screenshots / attach excel files if you just want to include a simple table. Use Markdown for this.
a discussion (no related file):
Do I understand correctly, that this PR fixes a few workloads, but breaks PyTorch? If that's right, then please debug and fix it.
finalize_manifest.py
line 63 at r2 (raw file):
# exclude special paths from list of trusted files continue if dir not in dirs:
@woju: Could you verify if this logic is correct for traversing directory tree with symlinks?
finalize_manifest.py
line 66 at r2 (raw file):
dirs.add(dir) scandirs.append(dirname) dirnames[:] = scandirs
How does this work?
Code quote:
dirnames[:] = scandirs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
@jkr0103: Please don't paste screenshots / attach excel files if you just want to include a simple table. Use Markdown for this.
done
a discussion (no related file):
If that's right, then please debug and fix it.
Pytorch workload, only in debug breaks on gramine 1.3.1 but works on gramine master master. Do we really need to root cause that in gramine 1.3.1 which is not seen in gramine latest master, may be as we don't have to addpal_internal_memory
as it's depricated. Hence I don't see the issue is blocker here.
finalize_manifest.py
line 66 at r2 (raw file):
it just replaces the list with new one, example below:
>>> p = ['P','y','t','h','o','n']
>>> p
['P', 'y', 't', 'h', 'o', 'n']
>>> p[:] = [1, 2, 3]
>>> p
[1, 2, 3]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
a discussion (no related file):
Previously, jkr0103 (Jitender Kumar) wrote…
If that's right, then please debug and fix it.
Pytorch workload, only in debug breaks on gramine 1.3.1 but works on gramine master master. Do we really need to root cause that in gramine 1.3.1 which is not seen in gramine latest master, may be as we don't have to addpal_internal_memory
as it's depricated. Hence I don't see the issue is blocker here.
@mkow Yes, there is indeed some hang in PyTorch in the debug config and with this PR on Gramine v1.3.1.
Why this happens? This PR increases the number of sgx.trusted_files
, and thus increases the TOML representation in enclave memory. Something deep inside Gramine's memory allocation breaks (the old-style one, from the time when PAL memory alloc was separate from LibOS memory alloc). We didn't root cause the exact reason, because this problem disappeared in the latest master branch.
If you insist, we can debug this issue a bit more, to have a more concrete understanding of what exactly went wrong in the old memory allocation logic.
I felt that it was just wasting time, because the only thing this PR changes is the number of sgx.trusted_files
in the final manifest.sgx file.
finalize_manifest.py
line 66 at r2 (raw file):
@mkow This is slice assignment. See https://stackoverflow.com/questions/10623302/how-does-assignment-work-with-list-slices
Basically, this modified the list dirnames
in place, instead of creating a new list.
Why we do it instead of dirnames = scandirs
? Because we want to change dirnames
in os.walk()
. Please read this: https://docs.python.org/3/library/os.html#os.walk
An important exerpt:
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
Sometimes Python is really unintuitive...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 59 at r2 (raw file):
scandirs = [] for dirname in dirnames: dir = os.path.join(dirpath, dirname)
dir()
is a builtin (even reviewable highlighted it), please rename this variable
finalize_manifest.py
line 63 at r2 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
@woju: Could you verify if this logic is correct for traversing directory tree with symlinks?
Depending on what you're attempting to do? Do you want to deduplicate them or copy? os.walk
is naive, will traverse a directory twice if there are two links pointing to it (or one subdirectory and a symlink pointing to it). dirs = set()
and this if
will not prevent it, because the path is what os.walk
went through, not the realpath
. This if
is noop (always true) if that's what you're asking, and the logic of replacing dirnames
looks pointless.
But it goes worse, os.walk
will happily traverse link loops and will turn around the DFS when one scandir
throws OSError
for ELOOP
(40), in which case os.walk
will catch this exception and silently (!) ignore it and just won't follow further.
Try it yourself:
mkdir -p a/b
ln -s ../.. a/b/c
python3 -c "for d, _, _ in __import__('os').walk('a', followlinks=True): print(d)"
In the case from PR description (/bin/
in usrmerged system) it will probably have 2 copies of trusted files (same hashes for 2 different paths), but it's very hard to say if there aren't any unintended consequences in other FS tree layouts. In absolute sense, this is not a "correct logic" from POSIX pov, but I don't know enough about handling symlinks in Gramine and GSC to say if this should be rejected altogether. This if
is certainly to be either removed, or corrected with os.path.realpath()
or pathlib.Path.resolve()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
In the case from PR description (
/bin/
in usrmerged system) it will probably have 2 copies of trusted files (same hashes for 2 different paths)
@woju This is exactly what we want to achieve.
it's very hard to say if there aren't any unintended consequences in other FS tree layouts
I'm unsure what you mean by this? From the Gramine perspective, whatever happens on the host is irrelevant, as long as Gramine can access the host file using the path.
os.walk
is naive, will traverse a directory twice if there are two links pointing to it (or one subdirectory and a symlink pointing to it).
Yes, and this is exactly what we want here. We want to traverse the same directory twice, if there are two links (two "path names") leading to it.
This
if
is noop (always true) if that's what you're asking
I did some quick tests, Woju is right.
So looks like the if
should be like this:
# instead of `if dir not in dirs:`
st = os.stat(dir)
dirkey = st.st_dev, st.st_ino
if dirkey not in dirs:
dirs.add(dirkey)
scandirs.append(dirname)
@woju Does this check make more sense to you?
Basically, the whole point of this PR is to this scenario:
/bin/bash
is an executable, and/bin/
is the "real" directory./usr/bin/
is a symlink to/bin/
; so/usr/bin/bash
is the same executable as/bin/bash
.
Gramine needs to add both absolute paths to sgx.trusted_files
in the manifest:
sgx.trusted_files = [
"file:/bin/bash",
"file:/usr/bin/bash",
]
This is literally the only thing we try to achieve with this PR. We don't care that both paths lead to the same file. We don't care that a prefix of one path is a symlink to the other path. We rely on the host to follow symlinks. (And Gramine treats these two as just normal files, Gramine has no idea/concept of symlinks.)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Well, my proposal doesn't work too. Here is my quick test, that contains both the scenario we want to achieve ( mkdir -p usr/bin
touch usr/bin/bash
ln -s usr/bin bin
ln -s ../.. usr/bin/inflink This gives us the following tree: $ tree
.
├── bin -> usr/bin
└── usr
└── bin
├── bash
└── inflink -> ../.. This is the Python snippet that I used for testing, resembling closely what GSC wants to do: import os
dirs = set()
for dirpath, dirnames, files in os.walk('.', followlinks=True):
scandirs = []
for dirname in dirnames:
dir = os.path.join(dirpath, dirname)
st = os.stat(dir)
dirkey = st.st_dev, st.st_ino
if dirkey not in dirs:
dirs.add(dirkey)
scandirs.append(dirname)
dirnames[:] = scandirs
for file in files:
filename = os.path.join(dirpath, file)
if not os.path.isfile(filename):
# only regular files are added as trusted files
continue
print(filename) This is the result of running this script: ./bin/bash
# there is no ./usr/bin/bash !!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Well, my proposal doesn't work too.
Here is my quick test, that contains both the scenario we want to achieve (
bash
accessible through two paths) and an infinite link:mkdir -p usr/bin touch usr/bin/bash ln -s usr/bin bin ln -s ../.. usr/bin/inflinkThis gives us the following tree:
$ tree . ├── bin -> usr/bin └── usr └── bin ├── bash └── inflink -> ../..This is the Python snippet that I used for testing, resembling closely what GSC wants to do:
import os dirs = set() for dirpath, dirnames, files in os.walk('.', followlinks=True): scandirs = [] for dirname in dirnames: dir = os.path.join(dirpath, dirname) st = os.stat(dir) dirkey = st.st_dev, st.st_ino if dirkey not in dirs: dirs.add(dirkey) scandirs.append(dirname) dirnames[:] = scandirs for file in files: filename = os.path.join(dirpath, file) if not os.path.isfile(filename): # only regular files are added as trusted files continue print(filename)This is the result of running this script:
./bin/bash # there is no ./usr/bin/bash !!!
This is a really hard problem :( Or I'm that stupid.
The only hacky solution that I could come up is to allow X repetitions of the same path (same inode). For example, in the code below I assume 2 repetitions:
import os
dirs = set()
for dirpath, dirnames, files in os.walk('.', followlinks=True):
scandirs = []
for dirname in dirnames:
dir = os.path.join(dirpath, dirname)
st = os.stat(dir)
dirkey1 = st.st_dev, st.st_ino, 1 # first repetition
dirkey2 = st.st_dev, st.st_ino, 2 # second repetition
if dirkey1 not in dirs:
dirs.add(dirkey1)
scandirs.append(dirname)
elif dirkey2 not in dirs:
dirs.add(dirkey2)
scandirs.append(dirname)
dirnames[:] = scandirs
for file in files:
filename = os.path.join(dirpath, file)
if not os.path.isfile(filename):
# only regular files are added as trusted files
continue
print(filename)
This gives me correct output to the example I posted earlier:
./usr/bin/bash
./bin/bash
But of course if we'd have one more symlink (e.g. ln -s usr/bin one-more-bin
), then my script would fail to register the abs path one-more-bin/bash
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @jkr0103, and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This is a really hard problem :( Or I'm that stupid.
The only hacky solution that I could come up is to allow X repetitions of the same path (same inode). For example, in the code below I assume 2 repetitions:
import os dirs = set() for dirpath, dirnames, files in os.walk('.', followlinks=True): scandirs = [] for dirname in dirnames: dir = os.path.join(dirpath, dirname) st = os.stat(dir) dirkey1 = st.st_dev, st.st_ino, 1 # first repetition dirkey2 = st.st_dev, st.st_ino, 2 # second repetition if dirkey1 not in dirs: dirs.add(dirkey1) scandirs.append(dirname) elif dirkey2 not in dirs: dirs.add(dirkey2) scandirs.append(dirname) dirnames[:] = scandirs for file in files: filename = os.path.join(dirpath, file) if not os.path.isfile(filename): # only regular files are added as trusted files continue print(filename)This gives me correct output to the example I posted earlier:
./usr/bin/bash ./bin/bashBut of course if we'd have one more symlink (e.g.
ln -s usr/bin one-more-bin
), then my script would fail to register the abs pathone-more-bin/bash
.
If you're after expansion of all the symlinks, then just drop the resolving logic? Let the walk, like, walk all the files incl. symlinks? Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
If you're after expansion of all the symlinks, then just drop the resolving logic? Let the walk, like, walk all the files incl. symlinks? Or am I missing something?
@woju But you're suggesting basically your snippet that goes bananas on loop-links? This one:
python3 -c "for d, _, _ in __import__('os').walk('a', followlinks=True): print(d)"
This approach is probably working (I think the current state of this PR boils down to this?), but then we'll add ~40x more files than necessary. Sounds like something nobody wants to discover in their final sgx.trusted_files
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @jkr0103, and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
@woju But you're suggesting basically your snippet that goes bananas on loop-links? This one:
python3 -c "for d, _, _ in __import__('os').walk('a', followlinks=True): print(d)"
This approach is probably working (I think the current state of this PR boils down to this?), but then we'll add ~40x more files than necessary. Sounds like something nobody wants to discover in their final
sgx.trusted_files
.
Either I'm gravely misunderstanding something, or this is the only correct approach. If the expectation is that you can access the underlying file by any of those paths, they should be included, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
If the expectation is that you can access the underlying file by any of those paths, they should be included, right?
If we're talking about the purist approach, then you're right, this is the only correct approach.
However, no sane application should open files through these nested symlinks (a/b/c/a/b/c/a/b/c/a
). So ideally, I'd like an approach that removes back edges (removes cycles in the graph of files). Unfortunately, at least some popular Docker images (iirc, PyTorch) have such nested symlinks.
But writing an algo in Python for detecting cycles in traversed files graph... sounds like a total overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @jkr0103, and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
If the expectation is that you can access the underlying file by any of those paths, they should be included, right?
If we're talking about the purist approach, then you're right, this is the only correct approach.
However, no sane application should open files through these nested symlinks (
a/b/c/a/b/c/a/b/c/a
). So ideally, I'd like an approach that removes back edges (removes cycles in the graph of files). Unfortunately, at least some popular Docker images (iirc, PyTorch) have such nested symlinks.But writing an algo in Python for detecting cycles in traversed files graph... sounds like a total overkill.
I think yes, we are talking about purist approaches and POSIX correctness. The GSC is a general tool which should work with any docker doing whatever it likes, isn't it?
I think if you want to optimize this 40x, then we should think about adding symlink support to trusted files. We'd add another type of trusted file that would not only expose a symlink inside, but would also affect the resolution, i.e. we'd have some equivalent of realpath(3)
in trusted PAL that would resolve the path to a canonical version and verify the checksum from that.
Either way, I think GSC is a wrong place to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
I think if you want to optimize this 40x, then we should think about adding symlink support to trusted files. We'd add another type of trusted file that would not only expose a symlink inside, but would also affect the resolution, i.e. we'd have some equivalent of realpath(3) in trusted PAL that would resolve the path to a canonical version and verify the checksum from that.
Yes, we've been thinking about this. But adding links in Gramine is a non-trivial task, requiring a lot of thought and implementation effort.
But I agree in general that GSC should just work with arbitrary Docker images, without any specific hacks.
@jkr0103 So actually what happens if we simply flip followlinks=False
to followlinks=True
in normal GSC (without this PR)? Why won't it just work (though it may take much more time and have much bigger manifest file size)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I think if you want to optimize this 40x, then we should think about adding symlink support to trusted files. We'd add another type of trusted file that would not only expose a symlink inside, but would also affect the resolution, i.e. we'd have some equivalent of realpath(3) in trusted PAL that would resolve the path to a canonical version and verify the checksum from that.
Yes, we've been thinking about this. But adding links in Gramine is a non-trivial task, requiring a lot of thought and implementation effort.
But I agree in general that GSC should just work with arbitrary Docker images, without any specific hacks.
@jkr0103 So actually what happens if we simply flip
followlinks=False
tofollowlinks=True
in normal GSC (without this PR)? Why won't it just work (though it may take much more time and have much bigger manifest file size)?
That was the first thing I tried, and os.walk
keep running and files processed were 125,705,000
after 1 hour and still not over. Saw multiple loops in in /sys
and one of them was
/sys/kernel/software_nodes/node0/dmi-ipmi-si.0
. With the current changes, we exclude /sys
etc., hence we don't see that happening. I tested this is bash example.
Ideal approach wouold be to removes cycles in the graph of files
. Shell I try that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Ideal approach wouold be to removes cycles in the graph of files. Shell I try that?
Can you try it in another PR?
I think at this point we should do the following:
- Create a new PR that simply flips
followlinks=True
and test it extensively (just like you did with this PR) - Create a new PR with item 1 plus "remove-the-cycles" logic (but honestly, this feels like a total overkill)
- Mark this PR as blocked, we'll most probably close it in favor of solution 1 or 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @jkr0103, and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ideal approach wouold be to removes cycles in the graph of files. Shell I try that?
Can you try it in another PR?
I think at this point we should do the following:
- Create a new PR that simply flips
followlinks=True
and test it extensively (just like you did with this PR)- Create a new PR with item 1 plus "remove-the-cycles" logic (but honestly, this feels like a total overkill)
- Mark this PR as blocked, we'll most probably close it in favor of solution 1 or 2
I think you should start with a list of directories that you certainly don't want measured, like /proc
, /sys
, /tmp
, /run
, /var/run
etc. See if it solves the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
I think you should start with a list of directories that you certainly don't want measured, like
/proc
,/sys
,/tmp
,/run
,/var/run
etc. See if it solves the problem.
Current approach in PR exclude the directories which we don't want to measure, but it doesn't avoid loops (cycles). Do we want to fix that as part of this PR?
Another change left is renaming dir
keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, jkr0103 (Jitender Kumar) wrote…
Current approach in PR exclude the directories which we don't want to measure, but it doesn't avoid loops (cycles). Do we want to fix that as part of this PR?
Another change left is renaming
dir
keyword.
I agree with @woju. If the simple approach works, then let's just use it.
Avoiding loops (a) requires non-trivial Python code and a lot of testing, and (b) is significantly slower than the simple approach.
That's why I suggested to create a new PR with the logic that avoids loops (if you really want to have this functionality), so we can have two approaches and can compare them in terms of number of trusted files and performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I agree with @woju. If the simple approach works, then let's just use it.
Avoiding loops (a) requires non-trivial Python code and a lot of testing, and (b) is significantly slower than the simple approach.
That's why I suggested to create a new PR with the logic that avoids loops (if you really want to have this functionality), so we can have two approaches and can compare them in terms of number of trusted files and performance.
Code here
Line 69 in 7d94c58
if exclude_re.match(filename): |
dirnames
and exclude from this list those dirs that matches.
Only case it doesn't fix is infinite loop created using below commands, hope such a loop doesn't appear in real-world workloads.
Code snippet:
mkdir -p usr/bin
touch usr/bin/bash
ln -s usr/bin bin
ln -s ../.. usr/bin/inflink
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @woju)
finalize_manifest.py
line 59 at r2 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
dir()
is a builtin (even reviewable highlighted it), please rename this variable
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 59 at r2 (raw file):
Previously, jkr0103 (Jitender Kumar) wrote…
done
Resolving, but this is not really abs_
: the path will be either relative or absolute, depending on what was root_dir
. Or did you intend this to be absolute? Then you need to explicitly call
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Yes, I agree with @jkr0103, we still need to loop over the sub-directories' list and filter out the "not needed ones". The current solution looks good to me.
If we're going to settle on this solution, I'd like a comment in the source about this behaviour.
Also, this won't really be an infinite loop, it will end with ELOOP
, that is, when path length hits some arbitrary limit (around 256 bytes), kernel will error out any operation on such path, so no risk it will be stuck forever.
finalize_manifest.py
line 50 at r3 (raw file):
r'|proc/.*' r'|sys/.*' r'|var/.*)$')
WRT this list:
/var/.*
is wrong, what about/var/lib
,/var/db
etc.?/run
(and/var/run
, one of them is probably a symlink to the other) is missing.
And this is just from the top of my head, it needs to be checked with FHS and some variety of real distros, as different as possible.
Previously, woju (Wojtek Porczyk) wrote…
... call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow and @woju)
finalize_manifest.py
line 59 at r2 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
... call
os.path.abspath()
orpathlib.Path.resolve
or show thatroot_dir
is always absolute.
No, this was not supposed to be absolute. You're right, I'd rename to to_be_walked_dir
.
finalize_manifest.py
line 63 at r2 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
If we're going to settle on this solution, I'd like a comment in the source about this behaviour.
Also, this won't really be an infinite loop, it will end with
ELOOP
, that is, when path length hits some arbitrary limit (around 256 bytes), kernel will error out any operation on such path, so no risk it will be stuck forever.
@jkr0103 Could you add a comment right-before the line with os.walk()
like this:
# WARNING: below loop does not remove cycles (which can happen due to symlinks), but we assume
# that Docker images (excluding paths/files listed in `excluded_paths_regex`) don't have cycles.
finalize_manifest.py
line 50 at r3 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
WRT this list:
/var/.*
is wrong, what about/var/lib
,/var/db
etc.?/run
(and/var/run
, one of them is probably a symlink to the other) is missing.And this is just from the top of my head, it needs to be checked with FHS and some variety of real distros, as different as possible.
We should create a separate PR to improve this list. @woju Any ideas what would be good references/links to populate this list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @woju)
finalize_manifest.py
line 59 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
No, this was not supposed to be absolute. You're right, I'd rename to
to_be_walked_dir
.
renamed.
finalize_manifest.py
line 63 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
@jkr0103 Could you add a comment right-before the line with
os.walk()
like this:# WARNING: below loop does not remove cycles (which can happen due to symlinks), but we assume # that Docker images (excluding paths/files listed in `excluded_paths_regex`) don't have cycles.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow and @woju)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4.
Reviewable status: all files reviewed, 3 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @mkow)
finalize_manifest.py
line 50 at r3 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We should create a separate PR to improve this list. @woju Any ideas what would be good references/links to populate this list?
#128 with references for you (or reassign if you want).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @mkow)
finalize_manifest.py
line 50 at r3 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
#128 with references for you (or reassign if you want).
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the table after changing the implementation? Just to be sure that nothing new broke because of it.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @jkr0103)
a discussion (no related file):
We didn't root cause the exact reason, because this problem disappeared in the latest master branch.
So, it works on v1.4?
finalize_manifest.py
line 66 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
@mkow This is slice assignment. See https://stackoverflow.com/questions/10623302/how-does-assignment-work-with-list-slices
Basically, this modified the list
dirnames
in place, instead of creating a new list.Why we do it instead of
dirnames = scandirs
? Because we want to changedirnames
inos.walk()
. Please read this: https://docs.python.org/3/library/os.html#os.walkAn important exerpt:
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
Sometimes Python is really unintuitive...
Ouch, this is really underhanded, I don't know any other standard Python APIs working this way...
Maybe pathlib have something more sane?
finalize_manifest.py
line 57 at r4 (raw file):
# WARNING: below loop does not remove cycles (which can happen due to symlinks), but we assume # that Docker images (excluding paths/files listed in `excluded_paths_regex`) don't have cycles.
This comment doesn't match the commit message? Which one is true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @jkr0103, @mkow, and @woju)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
We didn't root cause the exact reason, because this problem disappeared in the latest master branch.
So, it works on v1.4?
Yes, it works on v1.4. Actually, now that we have the release v1.4
, would be good to put this version in the config.yaml
of GSC and try with this (instead of master
).
-- commits
line 7 at r4:
We need to modify the commit body message:
This is done via `os.walk(..., followlinks=True)`. This does not remove cycles
(which can happen due to symlinks), but we assume that Docker images
(excluding paths/files listed in `excluded_paths_regex`) don't have cycles.
finalize_manifest.py
line 66 at r2 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Ouch, this is really underhanded, I don't know any other standard Python APIs working this way...
Maybe pathlib have something more sane?
Maybe @woju knows?
finalize_manifest.py
line 57 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
This comment doesn't match the commit message? Which one is true?
See my other comment, I proposed the new commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested the PR with gramine v1.4 in config.yaml
with new implementation and updated the table.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @mkow and @woju)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Yes, it works on v1.4. Actually, now that we have the release
v1.4
, would be good to put this version in theconfig.yaml
of GSC and try with this (instead ofmaster
).
I used gramine v1.4 in config.yaml
and tested the Curated-Apps workloads in release and debug modes and have updated the table. All works fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @jkr0103, @mkow, and @woju)
a discussion (no related file):
Previously, jkr0103 (Jitender Kumar) wrote…
I used gramine v1.4 in
config.yaml
and tested the Curated-Apps workloads in release and debug modes and have updated the table. All works fine.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 66 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Maybe @woju knows?
This is not insane, there are many places where you're supposed to modify an object that you're handed from within stdlib. The difference here is that you're handed a list
, not some custom class instance to modify attributes.
pathlib
doesn't have anything better. There's https://docs.python.org/3/library/pathlib.html#pathlib.Path.rglob, but you can't tell it to not descend into one subtree like we do here.
If you're after readability and you dislike slice assignment, you can do it like this:
for i in range(len(dirnames), -1, -1):
if that_big_regex.fullmatch(os.path.join(dirpath, dirnames[i])):
dirnames.pop(i)
But beware: if you instead do it like this, it's unsound:
for dirname in dirnames:
if that_big_regex.fullmatch(os.path.join(dirpath, dirname)):
dirnames.remove(dirname)
Because after first .remove()
iterator is invalid (might have glitched). So in the future you need to be careful with reviews, to make sure that no-one ever makes it "more pythonic".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 66 at r2 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
This is not insane, there are many places where you're supposed to modify an object that you're handed from within stdlib. The difference here is that you're handed a
list
, not some custom class instance to modify attributes.
pathlib
doesn't have anything better. There's https://docs.python.org/3/library/pathlib.html#pathlib.Path.rglob, but you can't tell it to not descend into one subtree like we do here.If you're after readability and you dislike slice assignment, you can do it like this:
for i in range(len(dirnames), -1, -1): if that_big_regex.fullmatch(os.path.join(dirpath, dirnames[i])): dirnames.pop(i)But beware: if you instead do it like this, it's unsound:
for dirname in dirnames: if that_big_regex.fullmatch(os.path.join(dirpath, dirname)): dirnames.remove(dirname)Because after first
.remove()
iterator is invalid (might have glitched). So in the future you need to be careful with reviews, to make sure that no-one ever makes it "more pythonic".
Based on what @woju explained, I would keep the current code as-is. We may add a quick comment to this line, like:
# slice assignment, to modify the list `dirnames` in place
WDYT @mkow ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @jkr0103, and @woju)
finalize_manifest.py
line 66 at r2 (raw file):
The difference here is that you're handed a
list
, not some custom class instance to modify attributes.
Yes, that's what I mean. My problem is not about slice assignment, but about library returning a list which you then modify during iteration to influence the future iterations.
A comment like Dmitrii suggest would be good, but should rather say that this changes how os.walk()
will continue iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @woju)
finalize_manifest.py
line 66 at r2 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
The difference here is that you're handed a
list
, not some custom class instance to modify attributes.Yes, that's what I mean. My problem is not about slice assignment, but about library returning a list which you then modify during iteration to influence the future iterations.
A comment like Dmitrii suggest would be good, but should rather say that this changes how
os.walk()
will continue iteration.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @jkr0103 and @mkow)
finalize_manifest.py
line 66 at r2 (raw file):
Previously, jkr0103 (Jitender Kumar) wrote…
Done.
The comment looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We need to modify the commit body message:
This is done via `os.walk(..., followlinks=True)`. This does not remove cycles (which can happen due to symlinks), but we assume that Docker images (excluding paths/files listed in `excluded_paths_regex`) don't have cycles.
I concur.
finalize_manifest.py
line 66 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
The comment looks good to me.
yeah, LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from different teams (1 more required, approved so far: ITL), "fixup! " found in commit messages' one-liners (waiting on @dimakuv)
This is done via `os.walk(..., followlinks=True)`. This does not remove cycles (which can happen due to symlinks), but we assume that Docker images (excluding paths/files listed in `excluded_paths_regex`) don't have cycles. Signed-off-by: jkr0103 <[email protected]>
1aacb9d
to
c374329
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @mkow)
a discussion (no related file):
I did a final rebase and tested that it works correctly.
Previously, woju (Wojtek Porczyk) wrote…
I concur.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved
GSC doesn't add any symlink file to trusted list. In all the ubuntu:20.04 based docker images,
/bin
is symlink to/usr/bin
and because of this GSC doesn't add any/bin
file to trusted files.Any bash script executed at run time having shebang
#!/bin/bash
is not able to access/bin/bash
as it's not part of trusted files list, reports error:/bin/bash: bad interpreter: Permission denied
.As part of the fix, GSC will add all symlink files.
Fixes #116, fixes #25
I tested Curated-Apps (Redis, PyTorch, Sklearn and Tensorflow-serving) with this PR using gramine v1.3.1 and v1.4.
All the Curated-Apps on gramine v1.4 works without requiring any change in
enclave_size
andpal_internal_memory
.On gramine 1.3.1, Pytorch needed extra 64M
pal_internal_memory
in release mode and it did not work in debug mode even after increasingpal_internal_memory
upto 2G. This seems to an issue with gramine 1.3.1 which doesn't show up in gramine latest master. Please refer below table for more details.pal_internal_memory
in gramine 1.3.1 / 1.4How to test this PR?
This change is