Tunneldigger process management broken (?): can end up with multiple tunneldigger running #148

Open
RalfJung opened this issue Jul 21, 2022 · 5 comments

Comments

@RalfJung

The latest lead in a long-standing issue seems to indicate that tunneldigger process management sometimes goes wrong, and we can end up with 2 instances of tunneldigger running (ps showing 6 tunneldigger processes, rather than the usual 3). This then leads to those 2 instances interrupting each other all the time, which is essentially a DoS attack on the gateway.

I don't know how to reproduce this, and have not actually seen these 6 tunneldigger processes myself (I never managed to get SSH onto an affected device), but this is the best lead so far. So I wonder... how could a Gluon device end up in a situation where tunneldigger runs twice?
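For anyone who does get a shell on an affected device, a rough sketch of how the process count could be checked. It assumes the 3 processes mentioned above are the healthy baseline, and `count_tunneldigger` is a hypothetical helper name, not something shipped with Gluon:

```shell
#!/bin/sh
# Count process-list lines that belong to tunneldigger itself.
# Reads `ps` output on stdin, so it can be fed saved output as well.
# Lines for tunneldigger-watchdog and for grep are not counted.
count_tunneldigger() {
	awk '/tunneldigger/ && !/tunneldigger-watchdog/ && !/grep/ { n++ }
	     END { print n + 0 }'
}

# Assumption: 3 processes is the normal state (taken from this issue).
EXPECTED_PROCS=3

# Succeeds (exit 0) when more processes than expected are in the list.
too_many_tunneldigger() {
	[ "$(count_tunneldigger)" -gt "$EXPECTED_PROCS" ]
}
```

On a device this could be used as `ps | too_many_tunneldigger && logger -t tunneldigger-check "duplicate tunneldigger instance?"`.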

@T-X

T-X commented Jul 21, 2022

Can you maybe reproduce something like this if you try to do multiple tunneldigger restarts at the same time? Something like:

for i in `seq 1 1000`; do
  /etc/init.d/tunneldigger restart &
  /etc/init.d/tunneldigger restart &
  /etc/init.d/tunneldigger restart &
  /etc/init.d/tunneldigger restart &
  wait
# sleep 1
done

I'm wondering if the tunneldigger-watchdog micron job can sometimes result in multiple restarts being run in parallel? Just some wild guesses.
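If parallel restarts do turn out to be the cause, one way to rule them out would be to serialize restarts behind a lock. The following is only a sketch under that assumption: `run_exclusive`, the lock path and the exit code 75 are hypothetical choices, not part of the existing init script or watchdog.

```shell
#!/bin/sh
# Hypothetical lock location. mkdir is atomic, so only one caller can
# create the directory at a time.
LOCKDIR="${LOCKDIR:-/tmp/tunneldigger-restart.lock}"

# Run the given command unless another guarded command is in progress.
# Returns the command's exit status, or 75 if the lock was already held.
run_exclusive() {
	if ! mkdir "$LOCKDIR" 2>/dev/null; then
		return 75
	fi
	"$@"
	status=$?
	rmdir "$LOCKDIR"
	return "$status"
}
```

Usage would be `run_exclusive /etc/init.d/tunneldigger restart`. A lock left behind by a killed process would block later restarts until it is removed, so this is not a complete solution.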

@RalfJung
Author

RalfJung commented Aug 7, 2022

Hm, when I tried this even just with a loop count of 20, my device just reboots after a bit... nothing it prints via SSH shows any indication why.

It's a pretty weak device with very little RAM, so it's probably not good for such tests. It's the only one I have though...

@T-X

T-X commented Aug 19, 2022

You should be able to find out if it's an out-of-memory or other crash via /sys/kernel/debug/crashlog after the device rebooted, as long as you don't power cycle it. Or via a serial console, of course. Not sure if that'd help for this issue, but maybe there could be some unexpected hints in there?
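A minimal sketch of that check, assuming the crashlog path above is available on the device. The helper names are hypothetical, and the search strings are only the usual kernel OOM messages; other crash causes would need a manual look at the log.

```shell
#!/bin/sh
# Print the crash log left by the previous boot, if there is one.
# The path can be overridden by passing a file name as first argument.
show_crashlog() {
	crashlog="${1:-/sys/kernel/debug/crashlog}"
	if [ -r "$crashlog" ]; then
		cat "$crashlog"
	else
		echo "no crashlog at $crashlog" >&2
		return 1
	fi
}

# Succeeds if the crash log contains a typical out-of-memory message.
crashlog_mentions_oom() {
	grep -q -i -E 'out of memory|oom-killer' \
		"${1:-/sys/kernel/debug/crashlog}" 2>/dev/null
}
```

For example: `show_crashlog | tail -n 40`, or `crashlog_mentions_oom && echo "looks like OOM"`.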

@valcryst

We had this issue on a lot of routers, and it seems that it occurs after a reboot (daily reboots).
A lot of our refugee routers were also affected, so we needed a quick solution to overcome this
issue and adapted some of the old tunneldigger-watchdog code by @lcb01a

Patched this function

https://github.com/freifunk-gluon/gluon/blob/master/package/gluon-mesh-vpn-tunneldigger/luasrc/usr/bin/tunneldigger-watchdog#L5

to

local function restart_tunneldigger()
	os.execute('logger -t tunneldigger-watchdog "Restarting Tunneldigger."')
	os.execute('/etc/init.d/tunneldigger stop')
	os.execute('sleep 1')
	-- Force-kill any instance that survived the regular stop
	os.execute('killall -KILL tunneldigger')
	-- Remove the possibly stale PID file before starting again
	os.execute('rm -f /var/run/tunneldigger.mesh-vpn.pid')
	os.execute('sleep 5')
	os.execute('/etc/init.d/tunneldigger start')
end

With this change we don't have this issue anymore, but I still can't tell how the routers end up
running multiple Tunneldiggers without a PID, which seems to happen here after reboots.
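To narrow down the "without a PID" state, a small check like the following could be run on an affected router. It is a sketch only: `pidfile_state` is a hypothetical helper, and it assumes it runs as root, since `kill -0` also fails for processes owned by another user.

```shell
#!/bin/sh
# Report whether a PID file points at a live process.
# Prints one of: missing, stale, alive.
pidfile_state() {
	pidfile="${1:-/var/run/tunneldigger.mesh-vpn.pid}"
	if [ ! -f "$pidfile" ]; then
		echo missing
		return
	fi
	pid=$(cat "$pidfile")
	# kill -0 sends no signal; it only tests whether the PID can be signalled.
	if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
		echo alive
	else
		echo stale
	fi
}
```

If this prints `missing` or `stale` while `ps` still lists tunneldigger processes, that would match the situation described above.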

@rotanid
Member

rotanid commented Dec 23, 2024

tunneldigger has been deprecated in Gluon and removed from the main branch: freifunk-gluon/gluon#3109
It is now part of the community packages repo: https://github.com/freifunk-gluon/community-packages/tree/master/ff-mesh-vpn-tunneldigger

@rotanid rotanid transferred this issue from freifunk-gluon/gluon Dec 23, 2024