how to add aiida daemon system service? #1434

Closed
ltalirz opened this issue Apr 16, 2018 · 13 comments · Fixed by #2849
Labels
topic/documentation type/question may redirect to mailinglist

@ltalirz
Member

ltalirz commented Apr 16, 2018

For our quantum mobile machine, we'd like to add the aiida daemon as a system service so that it is always running when a user starts the machine.

I'm not familiar with daemonization, so perhaps some of the folks working on this can give a few pointers.

Here is an example aiida-daemon.service file for systemd. It doesn't work, however; I guess the reason will be obvious to people who know how this is supposed to work.

[Unit]
Description=AiiDA daemon service
After=network.target


[Service]
ExecStart={{ aiida_venv }}/bin/verdi daemon start
ExecStop={{ aiida_venv }}/bin/verdi daemon stop
User={{ ansible_user }}
Group={{ ansible_user }}
Restart=always
RestartSec=60                       # Restart service 60 seconds if it crashes
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=aiida-daemon

[Install]
WantedBy=multi-user.target
@ltalirz ltalirz added the type/question may redirect to mailinglist label Apr 16, 2018
@ltalirz
Member Author

ltalirz commented Apr 16, 2018

I guess the problem is that the verdi daemon start command exits after the daemon has started, which leads systemd to believe that the service is down...

Apr 16 13:08:22 qmobile systemd[1]: Started AiiDA daemon service.
Apr 16 13:08:22 qmobile aiida-daemon[29158]: Clearing all locks ...
Apr 16 13:08:22 qmobile aiida-daemon[29158]: Starting AiiDA Daemon (log file: /home/max/.aiida/daemon/log/celery.log)...
Apr 16 13:08:22 qmobile aiida-daemon[29158]: Daemon started
Apr 16 13:08:22 qmobile aiida-daemon[29169]: Daemon not running (cannot find the PID for it)
Apr 16 13:08:22 qmobile aiida-daemon[29169]: AiiDA Daemon shut down correctly.
Apr 16 13:08:23 qmobile systemd[1]: aiida-daemon.service: Service hold-off time over, scheduling restart.
Apr 16 13:08:23 qmobile systemd[1]: Stopped AiiDA daemon service.
Apr 16 13:08:23 qmobile systemd[1]: Started AiiDA daemon service.
Apr 16 13:08:23 qmobile aiida-daemon[29174]: Clearing all locks ...
Apr 16 13:08:23 qmobile aiida-daemon[29174]: Starting AiiDA Daemon (log file: /home/max/.aiida/daemon/log/celery.log)...
Apr 16 13:08:23 qmobile aiida-daemon[29174]: Daemon started
Apr 16 13:08:23 qmobile aiida-daemon[29184]: Daemon not running (cannot find the PID for it)
Apr 16 13:08:23 qmobile aiida-daemon[29184]: AiiDA Daemon shut down correctly.
Apr 16 13:08:24 qmobile systemd[1]: aiida-daemon.service: Service hold-off time over, scheduling restart.
...

@ltalirz
Member Author

ltalirz commented Apr 16, 2018

After reading the systemd docs I've made some progress, but for some reason the daemon still doesn't start properly when called from systemd:

Apr 16 13:28:29 qmobile systemd[1]: Started AiiDA daemon service.
Apr 16 13:28:30 qmobile aiida-daemon[979]: Clearing all locks ...
Apr 16 13:28:30 qmobile aiida-daemon[979]: Starting AiiDA Daemon (log file: /home/max/.aiida/daemon/log/celery.log)...
Apr 16 13:28:30 qmobile aiida-daemon[979]: Daemon started
Apr 16 13:28:30 qmobile aiida-daemon[993]: Daemon not running (cannot find the PID for it)
Apr 16 13:28:30 qmobile aiida-daemon[993]: AiiDA Daemon shut down correctly.

The same command works just fine if I call it from the command line:

 max@qmobile:~/.aiida/daemon/log$ /home/max/.virtualenvs/aiida/bin/verdi daemon start
Clearing all locks ...
Starting AiiDA Daemon (log file: /home/max/.aiida/daemon/log/celery.log)...
Daemon started

Here is the updated .service file:

[Unit]
Description=AiiDA daemon service
After=network.target

[Service]
type=forking
PIDFile={{ ansible_env.HOME }}/.aiida/daemon/log/celery.pid
ExecStart={{ aiida_venv }}/bin/verdi daemon start
ExecStop={{ aiida_venv }}/bin/verdi daemon stop
ExecReload={{ aiida_venv }}/bin/verdi daemon restart
User={{ ansible_user }}
Group={{ ansible_user }}
Restart=on-failure
RestartSec=60                       # Restart service after X seconds if crashes
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=aiida-daemon

[Install]
WantedBy=multi-user.target

@ltalirz
Member Author

ltalirz commented Apr 17, 2018

In essence it was just a typo: the directive must be spelled Type, not type (systemd directive names are case-sensitive).

Will add this to the docs now.

[Unit]
Description=AiiDA daemon service
After=network.target

[Service]
Type=forking
ExecStart={{ aiida_venv }}/bin/verdi daemon start
PIDFile={{ ansible_env.HOME }}/.aiida/daemon/log/celery.pid
# 2s delay to prevent read error on PID file
ExecStartPost=/bin/sleep 2

ExecStop={{ aiida_venv }}/bin/verdi daemon stop
ExecReload={{ aiida_venv }}/bin/verdi daemon restart

User={{ ansible_user }}
Group={{ ansible_user }}
Restart=on-failure
RestartSec=60       # Restart daemon after 1 min if crashes
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=aiida-daemon

[Install]
WantedBy=multi-user.target
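
A typo like type instead of Type is easy to miss, because systemd only logs a warning for a key it does not recognise and then ignores it. The following is a minimal sketch of a check for such case mismatches. The function name and the list of directives are assumptions made for illustration (the list is only the subset used in this thread), not something AiiDA or systemd ships.

```python
# Assumed subset of directive names, for illustration only;
# see systemd.service(5) and systemd.exec(5) for the full list.
KNOWN_DIRECTIVES = {
    "Type", "PIDFile", "ExecStart", "ExecStartPost", "ExecStop", "ExecReload",
    "User", "Group", "Restart", "RestartSec",
    "StandardOutput", "StandardError", "SyslogIdentifier",
}


def find_miscased_directives(unit_text):
    """Return (line_number, found, expected) for keys that differ
    from a known directive only by upper/lower case."""
    by_lower = {name.lower(): name for name in KNOWN_DIRECTIVES}
    problems = []
    for number, line in enumerate(unit_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines, comments and section headers such as [Service].
        if not stripped or stripped.startswith(("#", ";", "[")):
            continue
        key = stripped.split("=", 1)[0].strip()
        expected = by_lower.get(key.lower())
        if expected is not None and expected != key:
            problems.append((number, key, expected))
    return problems


# Hypothetical excerpt reproducing the typo from the earlier comment.
example = """[Service]
type=forking
ExecStart=/path/to/venv/bin/verdi daemon start
"""
print(find_miscased_directives(example))
```

On systems that provide it, systemd-analyze verify run on the unit file should report such unknown keys as well.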

@ltalirz
Member Author

ltalirz commented Apr 17, 2018

@giovannipizzi asks about how to generalize this for a per-user service

There is such a thing as user services.
It seems to me that the AiiDA daemon would probably be better run as a user service (although this means it will be killed when the user logs out).

It seems like the service should be put in ~/.config/systemd/user/aiida-daemon.service and enabled using systemctl --user enable aiida-daemon

@giovannipizzi
Member

Are you sure it will be killed at logout? The note in "How it works" at your link says it's per-user and not per-session, which makes me think the opposite, but I didn't test it.
I think that if it is killed at user logout, it's not a good idea to use it (unless the user prefers that). Anyway, your configuration runs as the correct user, so it should be OK. The only remaining point is to allow multiple users to configure their own service or, even more importantly, to allow a given user to run two or more daemons for different profiles (this is now possible in the new developments happening in the workflows branch).

@ltalirz
Member Author

ltalirz commented Apr 18, 2018

> Are you sure it will be killed at logout? The note in "How it works" at your link says it's per-user and not per-session, which makes me think the opposite

The second sentence in "How it works" literally reads: "This process will survive as long as there is some session for that user, and will be killed as soon as the last session for the user is closed."

> Anyway your configuration runs as the correct user, so it should be ok.

Well, it's still a system-wide service, meaning for example you can't run two identical ones on the same machine.
If there are multiple users on a server who want to use AiiDA, one might want to run things on a per-user basis instead of having a system-wide service for everyone's daemons.
But I do also see the value of having the daemon running without the user being logged in...
so for the moment let's not worry about the multi-user scenario.

There is something called "unit templates" (see also here), which allow you to run one instance of a service per argument (units named like name@argument.service) and should solve the issue of running one daemon per profile.

We'll take care of this once the workflows branch has been merged (and then, this issue can be closed).

@giovannipizzi
Member

> meaning for example you can't run two identical ones on the same machine.
Yes exactly, but if you give two different names, you can have two services (e.g. aiida-user1-profile1 and aiida-user1-profile2)

Moreover, I think it is very useful to have AiiDA running when not logged in (I can log out and the daemon will still take care of my simulations). Imagine, for example, that AiiDA runs on a non-graphical server where nobody is logged in, but you still want AiiDA to check your calculations and submit new workflow steps.

@giovannipizzi
Member

This is a unit file that works for the AiiDA 1.0.0 series (there are some variables to change, since this is taken from a template in an Ansible script):

[Unit]
Description=AiiDA daemon service
After=network.target

[Service]
Type=forking
ExecStart={{ aiida_venv }}/bin/verdi daemon start
PIDFile={{ ansible_env.HOME }}/.aiida/daemon/circus-default.pid
# 2s delay to prevent read error on PID file
ExecStartPost=/bin/sleep 2

ExecStop={{ aiida_venv }}/bin/verdi daemon stop
ExecReload={{ aiida_venv }}/bin/verdi daemon restart

User={{ ansible_user }}
Group={{ ansible_user }}
Restart=on-failure
RestartSec=60       # Restart daemon after 1 min if crashes
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=aiida-daemon

[Install]
WantedBy=multi-user.target

This just needs to be updated in the docs.
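
For readers who want to try the template above without Ansible, here is a minimal sketch that fills in the {{ ... }} placeholders with plain string substitution. It is not Jinja2 and handles nothing beyond simple names. The function name is hypothetical, and the values are examples based on the paths shown earlier in this thread.

```python
import re


def render_placeholders(template_text, values):
    """Replace each '{{ name }}' with values[name].

    Raises KeyError if the template uses a name that is not in values.
    """
    def substitute(match):
        return values[match.group(1)]

    # Names may contain dots, e.g. 'ansible_env.HOME'.
    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template_text)


# Excerpt of the unit file above.
template = (
    "ExecStart={{ aiida_venv }}/bin/verdi daemon start\n"
    "PIDFile={{ ansible_env.HOME }}/.aiida/daemon/circus-default.pid\n"
    "User={{ ansible_user }}\n"
)

# Example values (assumptions); adapt them to your own machine.
values = {
    "aiida_venv": "/home/max/.virtualenvs/aiida",
    "ansible_env.HOME": "/home/max",
    "ansible_user": "max",
}

rendered = render_placeholders(template, values)
print(rendered)
```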

@giovannipizzi giovannipizzi self-assigned this May 23, 2018
@ltalirz
Member Author

ltalirz commented May 23, 2018

In order to run the daemon per profile using unit templates, is it like this?

[Unit]
Description=AiiDA daemon service for profile %I
After=network.target

[Service]
Type=forking
ExecStart={{ aiida_venv }}/bin/verdi -p %i daemon start
PIDFile={{ ansible_env.HOME }}/.aiida/daemon/circus-%i.pid
# 2s delay to prevent read error on PID file
ExecStartPost=/bin/sleep 2

ExecStop={{ aiida_venv }}/bin/verdi -p %i daemon stop
ExecReload={{ aiida_venv }}/bin/verdi -p %i daemon restart

User={{ ansible_user }}
Group={{ ansible_user }}
Restart=on-failure
RestartSec=60       # Restart daemon after 1 min if crashes
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=aiida-daemon

[Install]
WantedBy=multi-user.target

Feel free to edit.

@giovannipizzi
Member

Update this for 1.0

@giovannipizzi giovannipizzi removed their assignment Dec 3, 2018
@giovannipizzi
Member

As soon as the beta is out, this can be fixed in the Quantum Mobile, and this issue closed. We have already prepared the correct script for the tutorial last May (and I think it is the code I pasted above).

@ltalirz
Member Author

ltalirz commented Jan 10, 2019

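Here is the updated config for the AiiDA 0.12 series, which fixes two issues: the daemon starting before PostgreSQL (now ordered via After=), and the parsing of RestartSec (systemd does not treat a trailing # on a setting line as a comment, so the comment now sits on its own line).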

[Unit]
Description=AiiDA daemon service
After=network.target postgresql.service

[Service]
Type=forking
ExecStart={{ daemon_aiida_venv }}/bin/verdi daemon start
PIDFile={{ daemon_user_home }}/.aiida/daemon/log/celery.pid
# 2s delay to prevent read error on PID file
ExecStartPost=/bin/sleep 2

ExecStop={{ daemon_aiida_venv }}/bin/verdi daemon stop
ExecReload={{ daemon_aiida_venv }}/bin/verdi daemon restart

User={{ daemon_user }}
Group={{ daemon_user }}
Restart=on-failure
# Restart daemon after 1 min if crashes
RestartSec=60
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=aiida-daemon

[Install]
WantedBy=multi-user.target

This system service configuration is used in Quantum Mobile
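
The RestartSec issue comes from the fact that systemd only treats lines that start with # or ; as comments. Text after the value on the same line becomes part of the value. Below is a minimal sketch of a check that flags such lines. The function name is hypothetical and the test is only a heuristic, since a # can legitimately appear inside a value.

```python
import re


def find_trailing_comments(unit_text):
    """Return (line_number, line) for settings whose value contains
    whitespace followed by '#', which looks like an intended comment."""
    flagged = []
    for number, line in enumerate(unit_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines, real comment lines and section headers.
        if not stripped or stripped.startswith(("#", ";", "[")):
            continue
        if "=" not in stripped:
            continue
        value = stripped.split("=", 1)[1]
        if re.search(r"\s#", value):
            flagged.append((number, stripped))
    return flagged


# First the form used in the earlier comments, then the corrected form.
old_style = "[Service]\nRestartSec=60       # Restart daemon after 1 min if crashes\n"
new_style = "[Service]\n# Restart daemon after 1 min if crashes\nRestartSec=60\n"

print(find_trailing_comments(old_style))
print(find_trailing_comments(new_style))
```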

@ltalirz
Member Author

ltalirz commented Mar 23, 2019

And here is the equivalent configuration for AiiDA 1.0:

[Unit]
Description=AiiDA daemon service for profile %I
After=network.target postgresql.service rabbitmq-server.service

[Service]
Type=forking
ExecStart={{ daemon_aiida_venv }}/bin/verdi -p %i daemon start
PIDFile={{ daemon_user_home }}/.aiida/daemon/circus-%i.pid
# 2s delay to prevent read error on PID file
ExecStartPost=/bin/sleep 2

ExecStop={{ daemon_aiida_venv }}/bin/verdi -p %i daemon stop
ExecReload={{ daemon_aiida_venv }}/bin/verdi -p %i daemon restart

User={{ daemon_user }}
Group={{ daemon_user }}
Restart=on-failure
# Restart daemon after 1 min if crashes
RestartSec=60
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=aiida-daemon-%i

[Install]
WantedBy=multi-user.target

Name this unit template "aiida-daemon@.service" and start the daemon for profile myprofile with:
sudo service aiida-daemon@myprofile start

This is tested on Ubuntu 16.04 and 18.04 here
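
To illustrate how the template relates to a concrete profile, here is a minimal sketch that builds the instance unit name and expands the %i and %I specifiers. It assumes the template file is called aiida-daemon@.service (inferred from the start command above). Both function names are hypothetical. The expansion is simplified: systemd applies escaping rules to %i and supports other specifiers, which this sketch ignores. For a plain profile name such as myprofile, %i and %I give the same result.

```python
def instance_unit_name(template_name, instance):
    """Build 'prefix@instance.suffix' from a template name 'prefix@.suffix'."""
    prefix, separator, suffix = template_name.partition("@.")
    if not separator:
        raise ValueError(f"not a template unit name: {template_name!r}")
    return f"{prefix}@{instance}.{suffix}"


def expand_instance_specifiers(unit_text, instance):
    """Replace %i and %I with the instance name (simplified, no escaping)."""
    return unit_text.replace("%i", instance).replace("%I", instance)


unit_name = instance_unit_name("aiida-daemon@.service", "myprofile")
pid_line = expand_instance_specifiers(
    "PIDFile=/home/max/.aiida/daemon/circus-%i.pid", "myprofile"
)
print(unit_name)
print(pid_line)
```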

All that remains to be done is to add links to the docs

@sphuber sphuber modified the milestones: v1.0.0, v1.1.0 Apr 3, 2019
@sphuber sphuber removed this from the v1.1.0 milestone May 8, 2019
@sphuber sphuber added this to the v1.0.0 milestone May 8, 2019
3 participants