roachtest: set up machines and env to collect core dumps #34680

Closed
tbg opened this issue Feb 6, 2019 · 1 comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

tbg commented Feb 6, 2019

Relevant for random crashes as well as #34241. This random script might be helpful; I remember spending way too much time getting it to work.
Additionally, we want GOTRACEBACK=crash for all roachprod run/start invocations.

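#!/usr/bin/env bash
# Configure a remote machine to write core dumps to /tmp instead of handing them to apport.
CMD_USAGE=""
# config.sh presumably provides the ssh_task helper used below.
source "$(dirname "$0")/config.sh"

# core(5) specifiers: %e executable name, %p PID, %h hostname, %t dump time (seconds since the epoch).
CORE_PATTERN="/tmp/core.%e.%p.%h.%t"

# Remote steps, in order:
#   1. lift the soft and hard core-size limits (root is listed separately because
#      the * wildcard in limits.conf does not apply to root),
#   2. set the core pattern for the running kernel,
#   3. disable apport,
#   4. clear any existing kernel.core_pattern line in /etc/sysctl.conf and append the new one.
# $CORE_PATTERN is expanded locally because the remote script is passed in double quotes.
ssh_task "
  set -euo pipefail
  echo -e '
  * soft core unlimited
  * hard core unlimited
  root soft core unlimited
  root hard core unlimited
  ' | sudo tee /etc/security/limits.d/core_unlimited.conf > /dev/null
  echo '$CORE_PATTERN' | sudo tee /proc/sys/kernel/core_pattern > /dev/null
  sudo sed -i'~' 's/enabled=1/enabled=0/' /etc/default/apport
  sudo sed -i'~' '/.*kernel\\.core_pattern.*/c\\' /etc/sysctl.conf
  echo 'kernel.core_pattern=$CORE_PATTERN' | sudo tee -a /etc/sysctl.conf > /dev/null
  echo 'Done. Make sure cockroach is started through ulimit -c unlimited, or it will run with a soft limit of zero and still not create core dumps.'
"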

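The script's closing reminder (start cockroach with `ulimit -c unlimited`) and the GOTRACEBACK=crash request can be combined in a small launch wrapper. The following is only a sketch under assumptions: `run_with_cores` is a hypothetical name, not something roachprod provides, and the fallback to the hard limit is an illustrative choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of roachprod): run a command with core dumps
# enabled for that command only.
run_with_cores() {
  (
    # Raise the soft core-size limit inside this subshell. If "unlimited"
    # exceeds the hard limit, fall back to whatever the hard limit allows.
    ulimit -S -c unlimited 2>/dev/null || ulimit -S -c "$(ulimit -H -c)"
    # Ask the Go runtime to crash with SIGABRT after printing the traceback,
    # so the kernel gets a chance to write a core dump.
    export GOTRACEBACK=crash
    "$@"
  )
}
```

Usage would look like `run_with_cores ./cockroach start --insecure`. Because the limit and the environment variable are set inside a subshell, the calling shell is left unchanged.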

tbg added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Feb 6, 2019

tbg commented Feb 12, 2019

I just checked whether this maybe "just works", and it doesn't. By default, core dumps are handled by apport, which does the following with them:

ERROR: apport (pid 3149) Tue Feb 12 11:32:17 2019: called for pid 3073, signal 6, core limit 0, dump mode 1
ERROR: apport (pid 3149) Tue Feb 12 11:32:17 2019: executable: /home/tschottdorf/cockroach-v2.1.4.linux-amd64/cockroach (command line "./cockroach start --insecure")
ERROR: apport (pid 3149) Tue Feb 12 11:32:17 2019: executable does not belong to a package, ignoring

So I assume that we'll end up reusing most of the above script (which came into existence precisely because of this apport behavior).
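
To check quickly whether a machine still routes core dumps to a handler such as apport, one can inspect `/proc/sys/kernel/core_pattern`: per core(5), a pattern that starts with `|` is piped to a program rather than written to a file. A minimal sketch, where `describe_core_pattern` is a hypothetical helper name and the wording of its output is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a kernel core_pattern string.
describe_core_pattern() {
  case "$1" in
    '|'*)
      # Leading pipe: the kernel pipes the dump to this program.
      # ${1#?} drops the first character (the pipe).
      echo "piped to handler: ${1#?}" ;;
    /*)
      # Absolute path template, such as the /tmp/core.%e.%p.%h.%t pattern.
      echo "absolute path template: $1" ;;
    *)
      # Anything else is resolved relative to the crashing process's
      # working directory.
      echo "relative to process cwd: $1" ;;
  esac
}

# Report the current setting when running on Linux.
if [ -r /proc/sys/kernel/core_pattern ]; then
  describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
fi
```

Note that the log above also shows `core limit 0`, so a file-based pattern alone would not have been enough: the process's soft core limit has to be raised as well.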

petermattis added a commit to petermattis/cockroach that referenced this issue Mar 1, 2019
Automatically configure roachprod machines to generate core dumps, and
specify `GOTRACEBACK=crash` when running cockroach and other binaries so
that the Go runtime generates a core dump when panicking.

Fixes cockroachdb#34680

Release note: None
craig bot pushed a commit that referenced this issue Mar 4, 2019
35311: cmd/roachprod,cmd/roachtest: automatic core dump generation and collection r=andreimatei a=petermattis

Fixes #32921
Fixes #34680

Co-authored-by: Peter Mattis <[email protected]>
@craig craig bot closed this as completed in #35311 Mar 4, 2019