Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ubuntu packaging test: dpkg status database is locked by another process #33762

Closed
DaveCTurner opened this issue Sep 17, 2018 · 7 comments
Closed
Assignees
Labels
:Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts Team:Delivery Meta label for Delivery team >test-failure Triaged test failures from CI v7.0.0-beta1

Comments

@DaveCTurner
Copy link
Contributor

DaveCTurner commented Sep 17, 2018

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+packaging-tests/5/console failed as follows

1) test10Install(org.elasticsearch.packaging.test.DefaultDebPreservationTests)
java.lang.RuntimeException: Installing distribution DEFAULT_DEB version 7.0.0-alpha1-SNAPSHOT failed: <org.elasticsearch.packaging.util.Shell$Result exitCode = [2] stdout = [] stderr = [dpkg: error: dpkg status database is locked by another process
INFO: [test10Install]: before test
]>
	at __randomizedtesting.SeedInfo.seed([58FAC93BEED0AEB3:B4A0C5C3B24011E9]:0)
Sep 16, 2018 6:02:05 AM org.elasticsearch.packaging.test.OssRpmPreservationTests logTestNameBefore
INFO: [test20Remove]: before test
Sep 16, 2018 6:02:05 AM org.elasticsearch.packaging.test.OssRpmPreservationTests logTestNameBefore
INFO: [test30PreserveConfig]: before test
	at org.elasticsearch.packaging.util.Packages.install(Packages.java:98)
	at org.elasticsearch.packaging.util.Packages.install(Packages.java:92)
	at org.elasticsearch.packaging.test.DebPreservationTestCase.test10Install(DebPreservationTestCase.java:71)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$2.evaluate(ThreadLeakControl.java:404)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:708)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$200(RandomizedRunner.java:138)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:629)
2) test10Install(org.elasticsearch.packaging.test.OssDebPreservationTests)
java.lang.RuntimeException: Installing distribution OSS_DEB version 7.0.0-alpha1-SNAPSHOT failed: <org.elasticsearch.packaging.util.Shell$Result exitCode = [2] stdout = [] stderr = [dpkg: error: dpkg status database is locked by another process
]>
	at __randomizedtesting.SeedInfo.seed([47968767ADB22FA0:ABCC8B9FF12290FA]:0)
	at org.elasticsearch.packaging.util.Packages.install(Packages.java:98)
	at org.elasticsearch.packaging.util.Packages.install(Packages.java:92)
	at org.elasticsearch.packaging.test.DebPreservationTestCase.test10Install(DebPreservationTestCase.java:71)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$2.evaluate(ThreadLeakControl.java:404)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:708)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$200(RandomizedRunner.java:138)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:629)

Possibly related to #32309 (comment), but I can't see any other occurrences.

Allegedly this works, but not for me:

REPRODUCE WITH: ./gradlew :qa:vagrant:vagrantUbuntu1604#javaPackagingTest \
  -Dtests.seed=141141AB8AA5B368 
@DaveCTurner DaveCTurner added :Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts >test-failure Triaged test failures from CI v7.0.0 labels Sep 17, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@original-brownbear
Copy link
Member

Another one:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+packaging-tests/63/console

Execution failed for task ':qa:vagrant:vagrantUbuntu1604#javaPackagingTest'.

REPRODUCE WITH: ./gradlew :qa:vagrant:vagrantUbuntu1604#javaPackagingTest \
  -Dtests.seed=4FE89E07FDDE167E 

@colings86
Copy link
Contributor

nik9000 added a commit to nik9000/elasticsearch that referenced this issue Nov 14, 2018
If `dpkg` fails, try and look for who has `/var/lib/dpkg/lock` open. If
it exists and is open then return a failure with information about who
has file open. This should help us debug elastic#33762.

Closes elastic#34309
nik9000 added a commit that referenced this issue Nov 27, 2018
If `dpkg` fails, try and look for who has `/var/lib/dpkg/lock` open. If
it exists and is open then return a failure with information about who
has file open. This should help us debug #33762.

Closes #34309
nik9000 added a commit that referenced this issue Nov 27, 2018
If `dpkg` fails, try and look for who has `/var/lib/dpkg/lock` open. If
it exists and is open then return a failure with information about who
has file open. This should help us debug #33762.

Closes #34309
@nik9000
Copy link
Member

nik9000 commented Nov 29, 2018

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+packaging-tests/154/consoleFull got us more debugging information:

01:01:05 not ok 132 [DEB PLUGINS] install a sample plugin with different logging modes and check output
01:01:05 # (from function `install_package' in file /project/build/packaging/bats/utils/packages.bash, line 86,
01:01:05 # (from function `install_package' in file /project/build/packaging/bats/utils/packages.bash, line 86,
01:01:05 #  from function `install' in file /project/build/packaging/bats/tests/50_modules_and_plugins.bats, line 88,
01:01:05 #  from function `install' in file /project/build/packaging/bats/tests/50_modules_and_plugins.bats, line 88,
01:01:05 #  from function `setup' in test file /project/build/packaging/bats/tests/50_modules_and_plugins.bats, line 63)
01:01:05 #  from function `setup' in test file /project/build/packaging/bats/tests/50_modules_and_plugins.bats, line 63)
01:01:05 #   `install' failed
01:01:05 #   `install' failed
01:01:05 # dpkg failed:
01:01:05 # dpkg: error: dpkg frontend is locked by another process
01:01:05 # lsof /var/lib/dpkg/lock:
01:01:05 # COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
01:01:05 # apt-get 24459 root    5uW  REG    8,3        0  938 /var/lib/dpkg/lock

I'll do a little more digging.

@nik9000
Copy link
Member

nik9000 commented Nov 29, 2018

I just checked on our ubuntu 1604 image. There is a file in /etc/cron.daily/apt-compat that execs /usr/lib/apt/apt.systemd.daily. cron.daily looks like it runs at 6:25am local time:

grep daily /etc/crontab 
25 6    * * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )

When I shift Jenkins into the server's local time zone it looks like the failure occurs right after 6am:

06:01:05 not ok 132 [DEB PLUGINS] install a sample plugin with different logging modes and check output

That doesn't quite line up! More digging!

@nik9000
Copy link
Member

nik9000 commented Nov 29, 2018

That doesn't quite line up! More digging!

I think I figured it out! Ubuntu 16.04 uses systemd and /etc/cron.daily/apt-compat bails if systemd is around. Why the file is on disk at all when the whole distro is sytemd, I dunno. But whatever.

In systemd Ubuntu 16.04 has a timer at /etc/systemd/system/timers.target.wants/apt-daily-upgrade.timer which looks like:

[Timer]
OnCalendar=*-*-* 6:00
RandomizedDelaySec=60m
Persistent=true

Which lines up well with the error. The unit file that is tied to the timer kicks off an apt-get update. That script runs unattended-upgrade which looks like it at least runs dpkg. I expect apt-get is in there too.

@dliappis
Copy link
Contributor

New vagrant boxes have been pushed that have various upgrade services disabled.

Details in: #37181 (comment)

Last failure occurred on Thursday, i.e. 1 day before the updated packages were pushed:

image

I am closing this issue now, will monitor build stats for any new occurrences of:

  • "dpkg failed and the lockfile still exists"

or

  • "dpkg frontend is locked by another process"

and reopen this issue if it happens again.

@mark-vieira mark-vieira added the Team:Delivery Meta label for Delivery team label Nov 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts Team:Delivery Meta label for Delivery team >test-failure Triaged test failures from CI v7.0.0-beta1
Projects
None yet
Development

No branches or pull requests

7 participants