-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove broken symlinks when updating deps #2842
Conversation
fix: knative#2778 Change-Id: Id249c00fa8a506fbbd1d21956b8bfd62c12a9958
a0d05bd
to
eb4b60b
Compare
run update-codegen.sh as well run update-codegen.sh as well run update-codegen.sh as well run update-codegen.sh as well run update-codegen.sh as well run update-codegen.sh as well run update-codegen.sh as well run update-codegen.sh as well
/retest |
...it is a symlink to /usr/bin/vim ...
/retest |
/assign @jonjohnsonjr : Hello Jon could you help take a look this PR? |
How did you resolve the @adrcunha do you have any ideas for that besides installing vim in the prow images? Context here: https://knative.slack.com/archives/C93E33SN8/p1546899936149900 |
That sounds right to me, just requires someone with more bash-fu than I have 😄 we definitely shouldn't depend on anything outside the repo. |
Assuming
|
The existence of https://github.com/harto/realpath-osx would suggest it isn't available 😂 |
Bummer... |
@adrcunha could you elaborate more on The solution in this PR is taken from this StackOverflow answer. There is another answer in the same question uses |
Ugh. Ok, just steal my solution for readlink from here: https://github.com/knative/test-infra/blob/master/scripts/e2e-tests.sh#L311 |
Discussed with @adrcunha offline:
Thanks for the help and will have a new commit soon. |
Tested the latest commit and it works on both Linux and Mac, and also the symlinks pointing outside of the source tree can be detected and deleted. @adrcunha Please help review again? Below was a test run of the script on
|
This might be useful enough to go into https://github.com/knative/test-infra, since I assume we'll want to do that for every project. |
hack/update-deps.sh
Outdated
# 2. Symlinks pointing outside of the source tree | ||
# We use recursion to find the canonical path of a symlink | ||
# because both `realpath` and `readlink -m` are not available on Mac. | ||
function remove_invalid_symlink() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is overkill. You don't need to traverse the directory tree, only check the target. As we discussed offline, parsing ls should be simple and do the trick.
Once this is merged, definitely deserves being moved to test-infra, together with a helper for the major features of update-deps.sh.
/hold
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't quite understand that you meant (sorry for not being a shell expert). The command you reference gives a canonical path, however, for symlink even if we don't follow it all the way, we still need to follow it once by readlink
then generate canonical path for it.
By over overkill, could you hint me an oneliner?
Also I agree once this is merged I will go ahead with test-infra and refactor here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Remove symlinks in /vendor that are broken or lead outside the repo.
function remove_broken_symlinks() {
for link in $(find ./vendor -type l); do
local target="$(ls -l ${link})"
target="${target##* -> }"
# Remove broken symlinks
if [[ ! -e ${target} ]]; then
rm -f ${link}
continue
fi
# Get canonical path to target, remove if outside the repo
target="$(cd ${target%/*} && echo $PWD/${target##*/})"
if [[ ${target} != *github.com/knative/* ]]; then
rm -f ${link}
continue
fi
done
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @adrcunha for the hint!
On the other hand however, after using your changes there are some valid symlinks got deleted, see the following log:
$ ./hack/update-deps.sh
unlinking broken: ./vendor/github.com/knative/build/test/panic/kodata/LICENSE
unlinking broken: ./vendor/github.com/knative/build/test/workingdir/kodata/LICENSE
unlinking broken: ./vendor/github.com/knative/build/cmd/controller/kodata/LICENSE
unlinking broken: ./vendor/github.com/knative/build/cmd/logs/kodata/LICENSE
unlinking broken: ./vendor/github.com/knative/build/cmd/nop/kodata/LICENSE
unlinking broken: ./vendor/github.com/knative/build/cmd/git-init/kodata/LICENSE
unlinking broken: ./vendor/github.com/knative/build/cmd/creds-init/kodata/LICENSE
unlinking broken: ./vendor/github.com/knative/build/cmd/webhook/kodata/LICENSE
unlinking broken: ./vendor/github.com/prometheus/procfs/fixtures/self
However, e.g. https://github.com/knative/serving/blob/master/vendor/github.com/knative/build/test/panic/kodata/LICENSE is actually a valid symlink.
Any reason why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a bug in my code, -e ${target}
must be -e ${link}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revised code. The real path resolver was buggy as well (I should have tested it when I wrote it the first time). The output now correctly doesn't include kodata/LICENSE
.
# Remove symlinks in /vendor that are broken or lead outside the repo.
function remove_broken_symlinks() {
for link in $(find ./vendor -type l); do
local target="$(ls -l ${link})"
target="${target##* -> }"
# Remove broken symlinks
if [[ ! -e ${link} ]]; then
echo "[DOESN'T EXIST] rm -f ${link}"
continue
fi
# Get canonical path to target, remove if outside the repo
[[ ${target} == /* ]] || target="./${target}"
target="$(cd `dirname ${link}` && cd ${target%/*} && echo $PWD/${target##*/})"
if [[ ${target} != *github.com/knative/* ]]; then
echo "[OUTSIDE TREE] rm -f ${link}"
continue
fi
done
}
/hold cancel |
LGTM, thanks! |
Two types of symlinks are considered broken: 1. Broken symlinks. 2. Symlinks pointing to a path outside of the source tree. This function was impleneted in knative/serving#2842 and should be useful for all knative modules.
/assign @mdemirhan Missing approval. |
Oops /approve |
It looks that this PR is still not approved? Could someone help? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to yank you around a bit on the approach; feel free to ignore my bash-fu in favor of Adriano's if we're not worried about the chains of links.
/approve
/hold
hack/update-deps.sh
Outdated
# Remove symlinks in /vendor that are broken or lead outside the repo. | ||
function remove_broken_symlinks() { | ||
for link in $(find ./vendor -type l); do | ||
local target="$(ls -l ${link})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest using readlink
for this:
local target="$(readlink ${link})"
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't work on Macs. That was the major reason for this PR taking so long to be finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think vanilla readlink
without -f
works on Macs. At least one test on Dan's laptop suggested this was installed in /bin
, which is pretty likely to be a core utility.
continue | ||
fi | ||
# Get canonical path to target, remove if outside the repo | ||
[[ ${target} == /* ]] || target="./${target}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what's going on here, and why this is needed to find a canonical path.
I'd have this follow the links recursively. I understand we want this to work with MacOS X/BSD, so we only have a simple readlink
and no realpath
(which we can emulate with $(cd $(dirname $FOO); pwd)
):
function remove_external_links() { # Removes a chain of links pointing out of tree
if ! [[ -e $1 ]]; then
return 1 # This link doesn't point anywhere.
fi
if ! [[ -L $1 ]]; then
return 0 # This is a real file/directory which is not a link.
fi
# This determines if it points outside the tree, either absolutely, or outside vendor.
target="$(readlink "$1")"
if [[ $target == /* ]] || [[ "$(cd $(dirname "$target"; pwd)/" == ${PWD}/vendor/* ]]; then
rm "$1" # Remove the errant link, but not its target (which is outside our tree)
return 1
fi
if ! remove_external_links "$target"; then
rm "$1" # Recursive examination found a bad target, remove this link, too.
return 1
fi
}
remove_external_links "${link}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I tested this as a set of one-liners, so you may want to re-test if you decide to use it.
Note that the current script does not handle any of the following:
- chained links (
ln -s b c; ln -s a b
) - Intermediate links that point outside the current directory (
ln -s go/src/github.com/knative/serving/vendor/foo a; ln -s ~/a go/src/github.com/knative/serving/vendor/bar
) - Link paths which exit and re-enter the working directory. (
ln -s ../../../../../../go/src/github.com/knative/serving/vendor/foo bar
, which assumesGOPATH=~/go
or the like. I didn't fix this one, either.)
Unfortuantely, for the third item, I think you have to process each segment of the $target
independently and do a check as to whether it is under PWD at each step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a) readlink
is not available on Mac, so this solution can't be used.
b) You don't need to check the beyond the first target. If it leads outside the repo tree, it'll be removed. If it leads to some other link inside the tree, the target link will be checked at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For b)
If you don't check beyond the first target, there may still be dangling symlinks after running N passes of this, depending on the order of directory entries returned by the OS. Consider the following:
$ cd /tmp/x
$ ls -l
b -> ../a
c -> b
d -> c
$ for x in $(find . -type l ); do [[ -L $x ]] || continue; t=$(readlink $x); [[ $t == /* ]] || t="./$t"; t2="$(cd $(dirname $x) && cd ${t%/*} && echo $PWD/${t##*/})"; echo $x $t $t2; if [[ ${t2}/ != $PWD/* ]]; then echo remove $x; continue; fi; done
./d ./c /tmp/x/c
./c ./b /tmp/x/b
./b ./../a /tmp/a
remove ./b
At the end of this, d
and c
will still be present and dangling (pointing at b
, which was removed from the directory).
Unfortunately, I just discovered a different problem with my solution, which is that a symlink loop will cause it to recurse until stack space is exhausted. 🤷♂️
I'm okay with this solution, but wanted to point out that it might miss symlink chains where the last element in the chain points outside the tree (and that order will depend on the order that directory entries are added and returned, which is OS-dependent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually "following the symlink all the way by simple readlink
was implemented" in a previous commit:
1adb644
And that approach can solve the problem you mentioned, since each symlink will be solved to the canonical path of a final target (and be deleted if it is outside of the source tree).
Let me know if you want me to revert to that approach, or do the minor improvement (as your first comment) on this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version in 1adb644 doesn't check that the intermediate symlinks are inside the tree, allowing for e.g.
vendor/foo --> /home/evan/foo --> /home/evan/go/src/github.com/knative/serving/vendor/bar
With that said, I'm happy with a simpler solution for now and then having dep
do the right thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(And I've approved, so if people want this version, they can just /hold cancel
.)
I propose we write a utility to do this in go :) Edit: Ideally we won't have to do this soon: golang/dep#1564 |
@jonjohnsonjr Solving this problem directly in |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adrcunha, erain, evankanderson The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel |
* A helper funcion for removing broken symlinks Two types of symlinks are considered broken: 1. Broken symlinks. 2. Symlinks pointing to a path outside of the source tree. This function was impleneted in knative/serving#2842 and should be useful for all knative modules. * Readd LF at end of file after merge
Fixes #2778
Proposed Changes
Release Note
NONE.