agd fails with "Cannot find dependency ..." in systemd due to lack of file descriptors #7817

Closed
dckc opened this issue May 20, 2023 · 8 comments
Labels: agd (Agoric (Golang) Daemon), bug (Something isn't working), vaults_triage (DO NOT USE)

Comments

@dckc
Member

dckc commented May 20, 2023

Describe the bug

When agd is started as a systemd service, it fails with Error#1: config.bundles.coreProposal2_5: Cannot find dependency picomatch ...

To Reproduce

NodesGuru reports:

# Build software
CI_GIT_NAME=Agoric
CI_GIT_FOLDER=agoric-sdk
CI_BIN_VER=ea8c1c64911b4c58fb43635b25e17e3d50d0cf2a
CI_BIN_NAME=agd

cd $HOME
git clone https://github.com/${CI_GIT_NAME}/${CI_GIT_FOLDER}.git
cd $HOME/${CI_GIT_FOLDER}
git fetch --all
git checkout ${CI_BIN_VER}

git submodule update
find . -name node_modules |xargs rm -rf
sudo apt update
curl https://deb.nodesource.com/setup_16.x | sudo bash
sudo apt install -y nodejs gcc g++ make < "/dev/null"
curl -sL https://dl.yarnpkg.com/debian/pubkey.gpg | gpg --dearmor | sudo tee /usr/share/keyrings/yarnkey.gpg >/dev/null
echo "deb [signed-by=/usr/share/keyrings/yarnkey.gpg] https://dl.yarnpkg.com/debian stable main" | sudo tee /etc/apt/sources.list.d/yarn.list
sudo apt update && sudo apt install yarn
yarn install --force
yarn build
(cd $HOME/${CI_GIT_FOLDER}/packages/cosmic-swingset && make)

sudo cp $HOME/go/bin/${CI_BIN_NAME} /usr/local/bin/${CI_BIN_NAME}

${CI_BIN_NAME} version

Then in a systemd unit:

$ cat /etc/systemd/system/agd.service
[Unit]
Description=Agoric Node
After=network-online.target
[Service]
User=ubuntu
ExecStart=/home/ubuntu/agoric-sdk/bin/agd start --address tcp://0.0.0.0:55658 --grpc-web.address 0.0.0.0:12091 --grpc.address 0.0.0.0:12090 --p2p.laddr tcp://0.0.0.0:53956 --rpc.laddr tcp://127.0.0.1:56657 --home /home/ubuntu/.agoric
Environment=PATH="/home/ubuntu/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin:/usr/local/go/bin:/home/ubuntu/go/bin"
Restart=always
RestartSec=3
LimitNOFILE=4096
[Install]
WantedBy=multi-user.target

Expected behavior

agd works as a systemd service

Platform Environment

  • what OS are you using? what version of Node.js?
  • is there anything special/unusual about your platform?
  • what version of the Agoric-SDK are you using? (run git describe --tags --always)

ea8c1c6

Additional context

agoricdev-18

See also the Discord #devnet thread.

Screenshots

stack trace from 0xAN | Nodes.Guru:

portHandler threw (Error#1)
Error#1: config.bundles.coreProposal2_5: Cannot find dependency picomatch for file:///home/ubuntu/agoric-sdk/node_modules/ava/
  at packages/SwingSet/src/controller/initializeSwingset.js:538:15
  at async Promise.all (index 18)
  at async processGroup (packages/SwingSet/src/controller/initializeSwingset.js:541:27)
  at async initializeSwingset (packages/SwingSet/src/controller/initializeSwingset.js:570:43)
  at async ensureSwingsetInitialized (packages/cosmic-swingset/src/launch-chain.js:160:5)
  at async buildSwingset (packages/cosmic-swingset/src/launch-chain.js:165:3)
  at async launch (packages/cosmic-swingset/src/launch-chain.js:307:52)
  at async launchAndInitializeSwingSet (packages/cosmic-swingset/src/chain-main.js:453:15)
  at async toSwingSet (packages/cosmic-swingset/src/chain-main.js:670:20)
Cannot initialize Controller Error: config.bundles.coreProposal2_5: Cannot find dependency picomatch for file:///home/ubuntu/agoric-sdk/node_modules/ava/
agd.service: Main process exited, code=exited, status=1/FAILURE

logs from Syd | FR Staking Community:

May 20 1359 ubuntu-8gb-hel1-1 systemd[1]: Started Agoric Cosmos daemon.
May 20 1300 ubuntu-8gb-hel1-1 agd[2190252]: 2023/05/20 1300 Running SwingSet until bootstrap is ready
May 20 1300 ubuntu-8gb-hel1-1 agd[2190252]: Loading slog sender modules: @agoric/telemetry/src/flight-recorder.js
May 20 1300 ubuntu-8gb-hel1-1 agd[2190252]: 2023-05-20T1300.491Z launch-chain: Launching SwingSet kernel
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]: portHandler threw (Error#1)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]: Error#1: config.bundles.coreProposal2_5: Cannot find dependency picomatch for file:///home/ubuntu/agoric-sdk/node_modules/ava/
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at packages/SwingSet/src/controller/initializeSwingset.js:538:15
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async Promise.all (index 18)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async processGroup (packages/SwingSet/src/controller/initializeSwingset.js:541:27)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async initializeSwingset (packages/SwingSet/src/controller/initializeSwingset.js:570:43)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async ensureSwingsetInitialized (packages/cosmic-swingset/src/launch-chain.js:160:5)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async buildSwingset (packages/cosmic-swingset/src/launch-chain.js:165:3)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async launch (packages/cosmic-swingset/src/launch-chain.js:307:52)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async launchAndInitializeSwingSet (packages/cosmic-swingset/src/chain-main.js:453:15)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]:   at async toSwingSet (packages/cosmic-swingset/src/chain-main.js:670:20)
May 20 1317 ubuntu-8gb-hel1-1 agd[2190252]: Cannot initialize Controller Error: config.bundles.coreProposal2_5: Cannot find dependency picomatch for file:///home/ubuntu/agoric-sdk/node_modules/ava/
May 20 1317 ubuntu-8gb-hel1-1 systemd[1]: agd.service: Main process exited, code=exited, status=1/FAILURE
May 20 1317 ubuntu-8gb-hel1-1 systemd[1]: agd.service: Failed with result 'exit-code'.
May 20 1320 ubuntu-8gb-hel1-1 systemd[1]: agd.service: Scheduled restart job, restart counter is at 34452.
May 20 1320 ubuntu-8gb-hel1-1 systemd[1]: Stopped Agoric Cosmos daemon.
@dckc dckc added bug Something isn't working agd Agoric (Golang) Daemon labels May 20, 2023
@dckc dckc added this to the Vaults Go Live milestone May 20, 2023
@dckc
Member Author

dckc commented May 20, 2023

reported work-around: run agd outside systemd

@dckc
Member Author

dckc commented May 20, 2023

@ivanlei ivanlei added the vaults_triage DO NOT USE label May 20, 2023
@dckc dckc self-assigned this May 22, 2023
@dckc
Member Author

dckc commented May 23, 2023

diagnosis: bundling ran into file descriptor limit

At start-up, agd bundles a large number of JavaScript modules, which keeps many files open at once. Outside of systemd, Ubuntu has a file descriptor limit of around 1 million. In the reported configuration, we see:

LimitNOFILE=4096

We should reduce the required number of simultaneous file descriptors in due course. In the meantime:

work-around: increase file descriptor limit

Using 64K file descriptors seems to relieve the symptoms:

LimitNOFILE=65536
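
To confirm which limit a process actually inherited from its systemd unit, something like the following sketch may help. It is Linux-only and assumes the usual `/proc/<pid>/limits` layout; `parseMaxOpenFiles` and `readOwnLimits` are hypothetical helper names, not part of agd or agoric-sdk.

```javascript
const fs = require('node:fs');

// Hypothetical helper: extract the soft and hard "Max open files" limits
// from the text of /proc/<pid>/limits.
function parseMaxOpenFiles(limitsText) {
  const label = 'Max open files';
  const line = limitsText.split('\n').find(l => l.startsWith(label));
  if (!line) return undefined;
  // The remaining columns are: soft limit, hard limit, units.
  const [soft, hard] = line.slice(label.length).trim().split(/\s+/);
  const toNumber = v => (v === 'unlimited' ? Infinity : Number(v));
  return { soft: toNumber(soft), hard: toNumber(hard) };
}

// Hypothetical helper: report the limits of the current process (Linux only).
function readOwnLimits() {
  return parseMaxOpenFiles(fs.readFileSync('/proc/self/limits', 'utf8'));
}
```

Under the unit file reported above, `readOwnLimits()` called from inside the service would be expected to report the 4096 value rather than the limit seen in an interactive shell.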

@dckc dckc changed the title agd fails with "Cannot find dependency picomatch" in systemd agd fails with "Cannot find dependency ..." in systemd due to lack of file descriptors May 23, 2023
@ivanlei ivanlei removed this from the Vaults Go Live milestone Jun 5, 2023
@warner
Member

warner commented Jun 22, 2023

endojs/endo#1593 is the long-term fix for this: limit the bundle-source parallelism, and/or react to EMFILE by deferring the open() until some other FD has been closed.
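
For illustration only, here is a rough sketch of the two ideas. It is not the code from endojs/endo#1593: `makeLimiter` and `retryOnEmfile` are hypothetical names, the limit of 64 is arbitrary, and the retry uses a simple timed backoff as a stand-in for "wait until some other FD has been closed".

```javascript
const fs = require('node:fs/promises');

// Hypothetical helper: run at most `limit` async tasks at a time.
function makeLimiter(limit) {
  let active = 0;
  const waiting = [];
  const next = () => {
    if (active >= limit || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // start the next queued task, if any
      });
  };
  return task =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}

// Hypothetical helper: retry an operation that fails with EMFILE.
// Assumption: a short, growing delay gives other descriptors time to close.
async function retryOnEmfile(operation, { retries = 5, delayMs = 10 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      if (err.code !== 'EMFILE' || attempt >= retries) throw err;
      await new Promise(r => setTimeout(r, delayMs * (attempt + 1)));
    }
  }
}

// Usage: cap concurrent reads at 64 and retry each read on EMFILE.
const limit = makeLimiter(64);
const readModule = path =>
  limit(() => retryOnEmfile(() => fs.readFile(path, 'utf8')));
```

Either measure alone bounds the number of descriptors in flight; combining them also covers the case where other parts of the process are holding descriptors.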

@kriskowal
Member

I’ve landed a fix for endojs/endo#1593 in Endo and this should be a thing of the past when we next sync Endo releases with Agoric SDK.

@michaelfig michaelfig assigned kriskowal and unassigned dckc and michaelfig Aug 3, 2023
@dckc
Member Author

dckc commented Aug 31, 2023

@kriskowal is this now a thing of the past? It's in the upgrade-11 release notes.

@kriskowal
Member

No, I have not yet successfully synced Endo with Agoric SDK. This is more likely to land in upgrade-12.

@kriskowal
Member

I believe this is now a thing of the past. Please reöpen if symptoms persist.
