pcp: some services won't start (x86-64, systemd, kirkstone) #592

Open
b1czu opened this issue Aug 9, 2022 · 1 comment
b1czu commented Aug 9, 2022

Hi,
The pcp recipe appears to be broken, at least on kirkstone with systemd enabled. The recipe builds correctly, but on the target system some pcp services fail to start.
Log:

# systemctl --failed
  UNIT                  LOAD   ACTIVE SUB    DESCRIPTION                         
● pmcd.service          loaded failed failed Performance Metrics Collector Daemon
● pmie.service          loaded failed failed Performance Metrics Inference Engine
● pmie_farm.service     loaded failed failed pmie farm service
● pmlogger.service      loaded failed failed Performance Metrics Archive Logger
● pmlogger_farm.service loaded failed failed pmlogger farm service

NOTE: I'm building for x86-64, and this log was taken directly on the target device. When I launch the image in QEMU, systemctl --failed does not list these services as failed, but they still do not start. It seems that each restart takes longer in QEMU, so systemd's start rate limit is never hit and the services simply restart endlessly.
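
One way to check the restart-loop theory is to count how often systemd begins starting the unit. The following is only a sketch, assuming the journal lines look like the excerpts in this report; `count_start_attempts` is a hypothetical helper name, not something shipped by pcp or systemd.

```shell
# Hypothetical helper: read journal text on stdin and print how many times
# systemd logged that it began starting a unit ("Starting ...", not "Started ...").
count_start_attempts() {
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c 'systemd\[[0-9]*\]: Starting ' || true
}
```

It could be used as `journalctl -b -u pmcd.service | count_start_attempts`. A count that keeps growing in QEMU while `systemctl --failed` stays empty would be consistent with the endless-restart explanation. systemd also keeps its own counter, which `systemctl show pmcd.service -p NRestarts` prints.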

pmcd.service:

# journalctl -xu pmcd
Aug 09 09:54:00 XXXX systemd[1]: Starting Performance Metrics Collector Daemon...
░░ Subject: A start job for unit pmcd.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit pmcd.service has begun execution.
░░ 
░░ The job identifier is 84.
Aug 09 09:54:00 XXXX pmcd[303]: Rebuilding PMNS ...
Aug 09 09:54:01 XXXX systemd[1]: pmcd.service: Failed to parse MAINPID= field in notification message, ignoring: 
Aug 09 09:54:01 XXXX systemd[1]: Started Performance Metrics Collector Daemon.
░░ Subject: A start job for unit pmcd.service has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit pmcd.service has finished successfully.
░░ 
░░ The job identifier is 84.
Aug 09 09:54:01 XXXX pmcd[691]: PMCD process ... 561
Aug 09 09:54:01 XXXX pmcd[691]: /usr/libexec/pcp/lib/pmcd:
Aug 09 09:54:01 XXXX pmcd[691]: Warning: process ID in /var/run/pmcd.pid () is different.
Aug 09 09:54:01 XXXX pmcd[691]:          Check logfile /var/log/pmcd/pmcd.log. When you are ready to proceed,
Aug 09 09:54:01 XXXX pmcd[691]:          remove /var/run/pmcd.pid before retrying.
Aug 09 09:54:01 XXXX systemd[1]: pmcd.service: Control process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ An ExecStop= process belonging to unit pmcd.service has exited.
░░ 
░░ The process' exit code is 'exited' and its exit status is 1.
Aug 09 09:54:01 XXXX systemd[1]: pmcd.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ The unit pmcd.service has entered the 'failed' state with result 'exit-code'.
# cat /var/log/pmcd/pmcd.log
Log for pmcd on XXXX started Tue Aug  9 09:54:02 2022

Cannot find pmcd DSO at "/var/lib/pcp/pmdas/pmcd/pmda_pmcd.so"

Cannot find pmproxy DSO at "/var/lib/pcp/pmdas/mmv/pmda_mmv.so"

Cannot find mmv DSO at "/var/lib/pcp/pmdas/mmv/pmda_mmv.so"

Cannot find jbd2 DSO at "/var/lib/pcp/pmdas/jbd2/pmda_jbd2.so"

pmcd: unexpected end-of-file at initial exchange with kvm PMDA

active agent dom   pid  in out ver protocol parameters
============ === ===== === === === ======== ==========
root           1 %5 2375   6   7 bin pipe cmd=/var/lib/pcp/pmdas/root/pmdaroot
proc           3 %5 2376   9  10 bin pipe cmd=/var/lib/pcp/pmdas/proc/pmdaproc -d 3
xfs           11 %5 2377  11  12 bin pipe cmd=/var/lib/pcp/pmdas/xfs/pmdaxfs -d 11
linux         60 %5 2378  13  14 bin pipe cmd=/var/lib/pcp/pmdas/linux/pmdalinux

Host access list:
00 01 Cur/MaxCons host-spec                               host-mask                               lvl host-name
== == =========== ======================================= ======================================= === ==============
 y  y     0     0 127.0.1.1                               255.255.255.255                           0 localhost
 y  y     0     0 /                                       /                                         1 unix:
    n     0     0 0.0.0.0                                 0.0.0.0                                   4 .*
    n     0     0 ::                                      ::                                        8 :*
User access list empty: user-based access control turned off
Group access list empty: group-based access control turned off


pmcd: PID = , PDU version = 2
pmcd request port(s):
  sts fd   port  family address
  === ==== ===== ====== =======
  ok     4       unix   /var/run/pmcd.socket
  ok     0 44321 inet   INADDR_ANY
  ok     3 44321 ipv6   INADDR_ANY

pmie.service:

# journalctl -xu pmie
Aug 09 09:54:01 XXXX systemd[1]: Starting Performance Metrics Inference Engine...
░░ Subject: A start job for unit pmie.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit pmie.service has begun execution.
░░ 
░░ The job identifier is 101.
Aug 09 09:54:01 XXXX systemd[1]: pmie.service: Failed with result 'protocol'.

pmlogger.service:

# journalctl -xu pmlogger
Aug 09 09:54:01 XXXX systemd[1]: Starting Performance Metrics Archive Logger...
░░ Subject: A start job for unit pmlogger.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit pmlogger.service has begun execution.
░░ 
░░ The job identifier is 107.
Aug 09 09:54:01 XXXX pmlogger[598]: /usr/libexec/pcp/lib/pmlogger: Warning: Performance Co-Pilot archive logger(s) not permanently enabled.
Aug 09 09:54:01 XXXX pmlogger[598]:     To enable pmlogger, run the following as root:
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc2.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc2.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc3.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc3.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc4.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc4.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc5.d/S94pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]:     # ln -sf ../init.d/pmlogger /etc/rc.d/rc5.d/K06pmlogger
Aug 09 09:54:01 XXXX pmlogger[598]: /usr/libexec/pcp/lib/pmlogger:
Aug 09 09:54:01 XXXX pmlogger[598]: Warning: Performance Co-Pilot installation is incomplete (at least the
Aug 09 09:54:01 XXXX pmlogger[598]:          script "pmlogger_check" is missing) and the PCP archive logger(s)
Aug 09 09:54:01 XXXX pmlogger[598]:          cannot be started.
Aug 09 09:54:01 XXXX systemd[1]: pmlogger.service: Failed with result 'protocol'.

I can see two possible issues here. First, it seems that the PID of pmcd is not written to the /var/run/pmcd.pid file. On pmcd startup the PID file is created with the following permissions:

-r--r--r--  1 root   root       0 Aug  9 10:29 pmcd.pid

but it is empty. pmcd complains that its PID differs from the one in the (empty) PID file, and it exits. In addition, the systemd unit file for pmcd.service points to this file as the service's PID file, so systemd may also be unhappy about the invalid PID file.
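
To tell an empty PID file apart from a stale or missing one, a small check like the following can help. It is a sketch only: `pidfile_state` is a hypothetical helper, and the path passed to it is the one quoted in the log above.

```shell
# Hypothetical helper: classify the state of a daemon PID file.
# Prints one of: missing, empty, invalid, stale, ok
pidfile_state() {
    pidfile=$1
    [ -e "$pidfile" ] || { echo missing; return; }
    # -s is true only if the file exists and has a size greater than zero.
    [ -s "$pidfile" ] || { echo empty; return; }
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*) echo invalid; return ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled
    # (it can also fail for a live process owned by another user).
    if kill -0 "$pid" 2>/dev/null; then
        echo ok
    else
        echo stale
    fi
}
```

On the affected target, `pidfile_state /var/run/pmcd.pid` would be expected to print `empty`, matching the zero-byte file in the listing above.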

The second issue could be that the systemd unit files for these services declare:

[Service]
Type=notify

but systemd logs messages such as:
systemd[1]: pmcd.service: Failed to parse MAINPID= field in notification message, ignoring:
or:
systemd[1]: pmlogger.service: Failed with result 'protocol'.
Maybe pcp lacks systemd support, so notify just won't work in that case?
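
Nothing is printed after "ignoring:" in the MAINPID message, which suggests the notification carried an empty PID value. The sketch below imitates that kind of validation in shell to show why an empty value would be rejected. It is not systemd's actual code, and `mainpid_from_notify` is a hypothetical helper.

```shell
# Hypothetical helper: take a notification message (newline-separated KEY=VALUE
# pairs) and print the MAINPID value only if it is a positive integer.
# Returns non-zero when the field is missing, empty, or not a valid PID.
mainpid_from_notify() {
    value=$(printf '%s\n' "$1" | sed -n 's/^MAINPID=//p' | head -n 1)
    case $value in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$value" -gt 0 ] || return 1
    printf '%s\n' "$value"
}
```

A message containing `MAINPID=561` is accepted, while one containing a bare `MAINPID=` is rejected. That would fit a start script sending a PID it read from the empty PID file, although this report does not confirm that. The unit's effective settings can be inspected with `systemctl show pmcd.service -p Type -p NotifyAccess -p PIDFile`.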


b1czu commented Aug 9, 2022

@kraj

kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Sep 20, 2023
Update the inherit is use the poetry

Changelog
=========
What's Changed
Make cycle_time serialisation more consistent for DBC files by @mon in openembedded#592
User f-strings instead of str.format() by @zariiii9003 in openembedded#599
Add prog option to argparse for help messages by @jack-champagne in openembedded#600

NOTE: This is a major release change with the following API changes:
The initial attribute of Signal objects now always holds the initial signal value as a scaled quantity, unifying its semantics with that of Signal.minimum and Signal.maximum. Previously, initial used raw values for databases loaded from DBC files, while using scaled ones for databases loaded from ARXML. (The loaders for other file formats do not currently set the initial attribute.)
The machinery for storing decimal numbers without rounding errors (*.decimal attributes) has been removed. In its place small rounding errors in load-store-load cycles are now accepted. To remediate this, the resulting database objects can now be compared approximately using the Database.is_similar() method.

Signed-off-by: Derek Straka <[email protected]>
Signed-off-by: Khem Raj <[email protected]>
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Sep 21, 2023
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Jan 15, 2024
Version 1.78.2
--------------

- Closed bugs and merge requests:
  * Uninitialized memory in float out values can lead to crashes in mozjs gc
    code later on [openembedded#591, !902, Philip Chimento]
  * Garbage collection of Gdk surfaces [openembedded#592, !905, Philip Chimento]
  * gi/gerror: Fix version of the GIRepository typelib import [!906, Jordan
    Petridis]

Signed-off-by: Markus Volk <[email protected]>
Signed-off-by: Khem Raj <[email protected]>
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Jan 16, 2024
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Jan 18, 2024
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Jan 19, 2024