external/Makefile race-condition #1543

Closed
jsarenik opened this issue Jun 4, 2018 · 2 comments

Comments

Collaborator

jsarenik commented Jun 4, 2018

Issue and Steps to Reproduce

The issue was already mentioned in #1069, and I have noticed it a couple of times in the past. I now have a 100% reproduction on both Alpine Linux and Ubuntu 18.04. Below is a full log of the failure, followed by a full log of the same steps with --recurse-submodules added to the git clone line. That extra parameter fixes everything (it is the same as cloning without it, then changing into the cloned directory and running git submodule update --init):

Error full log

https://gist.github.com/jsarenik/98636bd46547a47dda959c4cdf8748d8#file-failure-txt

Success full log

https://gist.github.com/jsarenik/98636bd46547a47dda959c4cdf8748d8#file-success-txt
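
For reference, the two clone variants compared above can be sketched as shell functions. The function names are hypothetical and only illustrate the commands named in this report. One caveat: git documents --recurse-submodules as equivalent to git submodule update --init --recursive, so the two variants match exactly only when there are no nested submodules.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the project), wrapping the two
# clone variants described in the report.

# Variant 1: let git clone fetch the submodules itself.
clone_with_submodules() {
    git clone --recurse-submodules "$1" "$2"
}

# Variant 2: plain clone, then initialize the submodules by hand
# from inside the cloned directory.
clone_then_init() {
    git clone "$1" "$2" && (cd "$2" && git submodule update --init)
}

# Example call (URL taken from the bisect steps in this thread):
#   clone_with_submodules https://github.com/ElementsProject/lightning lightning
```

Either way the submodules are in place before make starts, so make should not need to run git in parallel jobs.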

Proposed fix

Get rid of git commands called by make: #1542

getinfo output

Not related :-) We're not there yet.

Collaborator Author

jsarenik commented Jun 5, 2018

So I did a git-bisect on Ubuntu 18.04. Here is one of the results:

cc1: all warnings being treated as errors
<builtin>: recipe for target 'gossipd/gossip.o' failed
make: *** [gossipd/gossip.o] Error 1
make: *** Waiting for unfinished jobs....
make: *** wait: No child processes.  Stop.
94581c6e0725575e490ed84c4a028112d263b6fb is the first bad commit
commit 94581c6e0725575e490ed84c4a028112d263b6fb
Author: Rusty Russell <[email protected]>
Date:   Mon Jun 4 13:52:25 2018 +0930

    gossipd: wire up infrastructure to generate query_short_channel_ids msg.
    
    Signed-off-by: Rusty Russell <[email protected]>

:040000 040000 09de552302c29346a6e6793a36af5a449055c8f5 18bb780f5ec650eea4d65eed3c170fa3807fcbe1 M      gossipd
:040000 040000 e7b93ef2648d0a8aff6f3f65a3f137d36a403cfc aefd9250c4b30a0225b314388fe75264ed6ab900 M      lightningd
bisect run success

I did it like this:

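cat > ~/bisect.sh <<'EOF'
#!/bin/sh
# Build from a clean tree. Exit 0 marks the commit good, exit 1 marks it bad.

{
git clean -xfd                  # remove all untracked and ignored files
git submodule deinit --all -f   # unregister submodules and clear their working trees
./configure || true             # tolerate commits where configure is missing or fails
make -j4
} >/dev/null 2>&1 && echo Success || { echo FAIL; exit 1; }
EOF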
chmod a+x ~/bisect.sh
git clone https://github.com/ElementsProject/lightning
cd lightning
git remote add rusty https://github.com/rustyrussell/lightning
git fetch rusty
git checkout guilt/configure
git bisect start HEAD 27a186b --
git bisect run ~/bisect.sh
git bisect reset

Your mileage may vary. I still think this is a race condition, so the bisect does not really show when it was introduced and will oscillate between multiple commits. But one thing is clear to me: the latest commit on rustyrussell/lightning@guilt/configure fails a lot. Please try it.
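
Because the failure is intermittent, a single passing build does not prove a commit is good. One way to make a bisect less noisy is to build several times per commit and mark the commit bad on the first failure. A minimal sketch follows, assuming a hypothetical helper named repeat_until_fail; it is not part of the script above, and the repeat count is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times and stop at the first
# failure. Returns 0 only if every run succeeded, 1 otherwise.
repeat_until_fail() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" || return 1
        i=$((i + 1))
    done
    return 0
}

# Possible use inside a bisect script (assumption: ~/bisect.sh is the
# single-build script shown earlier in this thread):
#   repeat_until_fail 5 ~/bisect.sh || exit 1
```

git bisect run treats exit status 0 as good and 1 to 127 (except 125, which means skip) as bad, so a wrapper like this plugs in directly. It lowers the chance of a bad commit being marked good, but cannot remove it.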

I used debootstrap to set up a fresh Ubuntu 18.04 chroot directory and the chsys script to enter it.

mkdir ubuntu18
debootstrap --variant=buildd --arch=amd64 \
  bionic ./ubuntu18 http://archive.ubuntu.com/ubuntu/
./chsys ubuntu18 /bin/bash
# then as root in chroot
apt install autoconf automake build-essential git libtool libgmp-dev libsqlite3-dev python python3 net-tools zlib1g-dev

rustyrussell added a commit to rustyrussell/lightning that referenced this issue Jun 6, 2018
…hanges.

If we change an upstream URL, all submodules break.  Users would need
to run 'git submodule sync'.  Note that the libbacktrace fix was merged
upstream so this is no longer necessary, but it's good for future changes.

Also, stress-testing reveals that git submodule fails locking
'.git/config' when run in parallel.  It also segfaults and has other
problems.

This is my final attempt to fix submodules; I've wasted far too many
days on obscure problems it creates: I've already lost one copy of my
repo to apparently unfixable submodule problems.  The next "fix" will
be to simply import the source code so it works properly.

Reported-by: @jsarenik
Fixes: ElementsProject#1543
Signed-off-by: Rusty Russell <[email protected]>
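
The commit's actual changes are not shown in this thread. As an illustration only, one generic way to stop parallel jobs from contending for '.git/config' is to serialize the git submodule calls behind a lock. The sketch below assumes a hypothetical with_lock helper built on mkdir, which creates a directory atomically; the lock path in the usage comment is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper: run a command while holding a directory-based lock.
# mkdir either creates the directory or fails, atomically, so only one
# caller at a time gets past the loop.
with_lock() {
    lockdir=$1
    shift
    until mkdir "$lockdir" 2>/dev/null; do
        sleep 1                 # another job holds the lock; poll again
    done
    status=0
    "$@" || status=$?           # run the command, remember its exit status
    rmdir "$lockdir"            # release the lock
    return "$status"
}

# Possible use from a parallel build (lock path is an assumption):
#   with_lock .git/submodule.lock git submodule sync
#   with_lock .git/submodule.lock git submodule update --init
```

A lock like this is left behind if the job is killed while holding it, so a real build would need cleanup on interrupt. Running the submodule step once, before any parallel work starts, avoids the problem altogether.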
Collaborator Author

jsarenik commented Jun 6, 2018

rustyrussell@5d77c6a fixes the issue. Thank you! Closing.

@jsarenik jsarenik closed this as completed Jun 6, 2018
rustyrussell added a commit to rustyrussell/lightning that referenced this issue Jun 8, 2018
cdecker pushed a commit that referenced this issue Jun 8, 2018