[BUG] dashboard occasionally unresponsive #2586

Open
dirkpetersen opened this issue May 20, 2024 · 1 comment
Labels: bug (Something isn't working)

dirkpetersen (Contributor) commented May 20, 2024

After 5-6 days of operation, my dashboard (2.4, installed with nvflare dashboard --cloud aws docker install) became unresponsive (HTTPS connection timeout). Unfortunately I could not find anything in the logs, so I went ahead and created a monitoring script that restarts the container. Not ideal, but it works:

https://github.com/dirkpetersen/nvflare-cancer#about-2-restart-on-error

#!/bin/bash

url="https://myproject.mydomain.edu"   # Set the website URL
search_string='name="viewport"'        # Set the search string in the HTML source
timeout_duration=15                    # Set the timeout duration in seconds
date=$(date)                           # Get the current date for logging

# Check if the search string exists in the HTML source code
if curl -k -s -m "$timeout_duration" "$url" | grep -q "$search_string"; then
    echo "${date}: OK ! The search string '$search_string' was found in the HTML source code of $url"
else
    echo "${date}: Error ! The search string '$search_string' was not found in the HTML source code of $url or the connection timed out after $timeout_duration seconds"
    # Run the commands if the search string is not found or the connection times out
    echo "${date}: Restarting NVFlare dashboard"
    "$HOME/.local/bin/nvflare" dashboard --stop
    sleep 3
    "$HOME/.local/bin/nvflare" dashboard --start -f ~
fi
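
One possible refinement of the check above, offered only as a sketch: retry the probe a few times before restarting, so that a single slow response does not bounce the dashboard. The function names page_is_healthy, check_with_retries and probe are hypothetical helpers introduced for illustration; they are not part of NVFlare, and the retry count and delay are arbitrary assumptions.

```shell
#!/bin/bash
# Sketch only: helper names, retry count and delay are assumptions.

# page_is_healthy MARKER: reads HTML on stdin, succeeds if MARKER is present.
page_is_healthy() {
    grep -q -- "$1"
}

# check_with_retries ATTEMPTS DELAY CMD...: runs CMD up to ATTEMPTS times,
# sleeping DELAY seconds between failed attempts. Succeeds as soon as CMD does.
check_with_retries() {
    local attempts=$1 delay=$2
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# probe: one health check, using the same variables as the script above.
probe() {
    curl -k -s -m "$timeout_duration" "$url" | page_is_healthy "$search_string"
}

# Possible use in place of the single curl test (not executed here):
#   if check_with_retries 3 10 probe; then
#       echo "$(date): OK"
#   else
#       echo "$(date): Restarting NVFlare dashboard"
#       "$HOME/.local/bin/nvflare" dashboard --stop
#       sleep 3
#       "$HOME/.local/bin/nvflare" dashboard --start -f ~
#   fi
```

With three attempts and a 10 second delay, a restart would only happen after roughly 30 to 75 seconds of consecutive failures, depending on how long each curl call takes to time out.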

I then added an hourly cron job:

(crontab -l 2>/dev/null; echo "59 * * * * \$HOME/monitor.sh >> /var/tmp/nvflare-monitor.log 2>&1") | crontab -
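
Note that the one-liner above appends the entry every time it is run, so running it twice leaves two identical cron lines. A minimal sketch of an idempotent variant follows; merge_cron_entry is a hypothetical helper name, not an existing tool.

```shell
#!/bin/bash
# Sketch only: merge_cron_entry is a hypothetical helper.

# merge_cron_entry ENTRY: reads an existing crontab on stdin and prints it
# with ENTRY present exactly once (existing identical lines are dropped,
# then ENTRY is appended at the end).
merge_cron_entry() {
    local entry=$1
    # -v invert, -x whole-line match, -F fixed string (so '*' is literal).
    # grep exits non-zero when it prints nothing; that is not an error here.
    grep -vxF -- "$entry" || true
    printf '%s\n' "$entry"
}

# Possible use (not executed here):
#   entry='59 * * * * $HOME/monitor.sh >> /var/tmp/nvflare-monitor.log 2>&1'
#   crontab -l 2>/dev/null | merge_cron_entry "$entry" | crontab -
```

The single quotes around the entry keep $HOME unexpanded, matching the escaped \$HOME in the original command, so cron resolves it at run time.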
dirkpetersen added the bug label May 20, 2024
chesterxgchen (Collaborator) commented

@dirkpetersen Thanks for reporting this. We will take a look.
