Webserver and yagna autostart #69

approxit · 2023-09-20T09:49:00Z

What I've done:

Upgraded yagna autostart/stop in the webserver - yagna is checked and, if needed, started at webserver start, and stopped at webserver stop (only if yagna was managed by the webserver in the first place)
Enabled automatic yagna payment fund, as it's required each time after a yagna restart, due to the use of goerli instead of rinkeby
Made the logs for a few things more explicit
Upgraded webserver autostart/stop in the node provider - the webserver is checked before any webserver call, started if needed, and stopped after all nodes are terminated
Reformatted the code

Notable remarks:

Auto yagna payment fund should be tested / changed while using mainnet, as there is no reason to force the user to wait to collect testnet funds
As the node provider is used by ray only for atomic cluster actions, we have no control over the up and down commands, hence we don't know the true intent. This shows up in webserver stopping: node termination is not the last command that is run at down. I've assumed that 10s delay from

shadeofblue · 2023-09-20T10:22:52Z

ray_on_golem/server/services/yagna.py

        if await self._check_if_yagna_is_running():
            logger.info("Yagna service is already running")
        else:
            await self._run_yagna_service()
+            await self._run_yagna_payment_fund()  # FIXME


yagna payment fund is needed only when yagna is launched the first time - or - when it depletes testnet fund (be it ETH or tGLM)

on a regular restart, only yagna payment init is currently required (and, actually, could be done without if we ported the mechanism currently used in yapapi)

Changed to payment init.
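
The distinction drawn above could be sketched roughly as follows. This is only an illustration: `startup_payment_commands` and its `first_launch` / `funds_depleted` flags are hypothetical names, not code from this PR, command arguments are omitted, and it assumes that init is run on every start.

```python
from typing import List


def startup_payment_commands(first_launch: bool, funds_depleted: bool) -> List[List[str]]:
    """Return the yagna payment commands to run after yagna is (re)started.

    Hypothetical helper: both flags are assumptions made for illustration.
    """
    commands: List[List[str]] = []
    # Funding is only needed on the first launch or when testnet funds run out.
    if first_launch or funds_depleted:
        commands.append(["yagna", "payment", "fund"])
    # Assumption: init is run on every start, including regular restarts.
    commands.append(["yagna", "payment", "init"])
    return commands
```

On a regular restart this yields only the init command, which matches the change described in the reply.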

shadeofblue · 2023-09-20T10:36:49Z

ray_on_golem/client/client.py

@@ -19,6 +25,9 @@ def __init__(self, base_url: URL) -> None:

        self._session = requests.Session()

+        if not is_running_on_golem_network():
+            self._start_webserver()


wondering aloud... is it sensible to put such long-blocking operations into a constructor? my intuition suggests this should be a separate step...

plus, it sounds somewhat unintuitive in general to start a server when you instantiate a client ...

Well... It sounded better at first, but good point. Moving this code to node provider fixes the problem of ray logs while starting webserver.
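
A minimal sketch of the separation discussed here, assuming a simplified stand-in class: `WebserverClient`, `ensure_webserver` and the `started` flag are hypothetical and do not reproduce the real client or node provider code.

```python
class WebserverClient:
    """Hypothetical client whose constructor only stores configuration."""

    def __init__(self, base_url: str) -> None:
        # Cheap and non-blocking: no process is started here.
        self.base_url = base_url
        self.started = False

    def ensure_webserver(self) -> None:
        """Explicit, potentially long-blocking step called by the owner of the client."""
        if self.started:
            return
        # Placeholder for spawning the webserver and waiting until it is healthy.
        self.started = True


# The caller (for example a node provider) decides when the blocking step happens.
client = WebserverClient("http://127.0.0.1:8080")
client.ensure_webserver()
```

Keeping the constructor free of side effects also makes the client easier to instantiate in tests.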

shadeofblue · 2023-09-20T10:37:34Z

ray_on_golem/client/client.py

@@ -19,6 +25,9 @@ def __init__(self, base_url: URL) -> None:

        self._session = requests.Session()

+        if not is_running_on_golem_network():
+            self._start_webserver()


on a separate note - question - what if the webserver startup fails when it's required? wouldn't we want to raise an explicit exception?

_start_webserver() already raises an explicit exception for that case.

shadeofblue · 2023-09-20T10:56:49Z

ray_on_golem/client/client.py

+                start_new_session=True,
+            )
+
+            for _ in range(10):


nitpicking but possibly such magic values as the number of retries and the interval between them should be extracted to module-level constants ...

Good point, added configurable, time-based waiting.
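
Time-based waiting with module-level constants might look something like the sketch below. The constant names, their values and the `wait_until` helper are assumptions made for illustration, not the code that was actually added.

```python
import time
from typing import Callable

# Hypothetical module-level constants; the values are placeholders.
WEBSERVER_START_TIMEOUT_SECONDS = 10.0
WEBSERVER_CHECK_INTERVAL_SECONDS = 0.5


def wait_until(
    predicate: Callable[[], bool],
    timeout: float = WEBSERVER_START_TIMEOUT_SECONDS,
    interval: float = WEBSERVER_CHECK_INTERVAL_SECONDS,
) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Compared with a fixed `for _ in range(10)` loop, the deadline keeps the total wait bounded even when each check itself takes a variable amount of time.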

shadeofblue · 2023-09-20T10:57:38Z

ray_on_golem/client/client.py

@@ -192,3 +203,62 @@ def _make_request(
            raise RayOnGolemClientValidationError(
                "Couldn't validate response data",
            ) from e
+
+    def _start_webserver(self) -> None:
+        with cli_logger.group("Ray On Golem webserver"):


the name of the logging group should most likely be a constant defined somewhere...

shadeofblue · 2023-09-20T11:07:28Z

ray_on_golem/client/client.py

+                url=settings.URL_SELF_SHUTDOWN,
+                request_data=models.SelfShutdownRequestData(),
+                response_model=models.SelfShutdownResponseData,
+                error_message="Couldn't send self shutdown request",


Suggested change

error_message="Couldn't send self shutdown request",

error_message="Couldn't send a self-shutdown request",

shadeofblue · 2023-09-20T11:08:02Z

ray_on_golem/client/client.py

+            )
+
+            if response.shutdown_state == ShutdownState.NOT_ENABLED:
+                cli_logger.print("No need to stop webserver, as it was ran externally")


Suggested change

cli_logger.print("No need to stop webserver, as it was ran externally")

cli_logger.print("No need to stop the webserver, as it was started externally")

shadeofblue · 2023-09-20T11:08:14Z

ray_on_golem/client/client.py

+                cli_logger.print("No need to stop webserver, as it was ran externally")
+                return
+            elif response.shutdown_state == ShutdownState.CLUSTER_NOT_EMPTY:
+                cli_logger.print("No need to stop webserver, as cluster is not empty")


Suggested change

cli_logger.print("No need to stop webserver, as cluster is not empty")

cli_logger.print("No need to stop the webserver, as the cluster is not empty")

shadeofblue · 2023-09-20T11:09:52Z

ray_on_golem/client/client.py

+                cli_logger.print("No need to stop webserver, as cluster is not empty")
+                return
+
+            cli_logger.print("Requesting webserver done, will stop soon")


maybe this way?

Suggested change

cli_logger.print("Requesting webserver done, will stop soon")

cli_logger.print("Webserver shutdown request done, will stop soon")

There is a missing word, but I'm keeping the log text symmetric with the line "Requesting webserver shutdown..." to keep the logs more readable.

shadeofblue · 2023-09-20T11:10:59Z

ray_on_golem/client/client.py

+    def _is_webserver_running(self) -> bool:
+        try:
+            response = self._session.get(
+                str(self._base_url / URL_HEALTH_CHECK.lstrip("/")), timeout=2


I'd define this timeout also as some constant on a module level...

(or, we could have them as parameters provided to the class's constructor)

shadeofblue · 2023-09-20T11:40:24Z

ray_on_golem/server/models.py

@@ -14,6 +14,12 @@ class NodeState(Enum):
    stopping = "stopping"


+class ShutdownState(Enum):


maybe not now, but if, in the future, we want to support various transitions of the server/cluster state, this might benefit from being a state machine...
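
As a rough idea of what such a state machine could look like: the `ServerState` members and the transition table below are hypothetical and deliberately separate from the `ShutdownState` enum introduced in this PR.

```python
from enum import Enum


class ServerState(Enum):
    # Hypothetical lifecycle states, not the ShutdownState members from the PR.
    RUNNING = "running"
    SHUTTING_DOWN = "shutting_down"
    STOPPED = "stopped"


# Each state maps to the set of states it may move to.
ALLOWED_TRANSITIONS = {
    ServerState.RUNNING: {ServerState.SHUTTING_DOWN},
    ServerState.SHUTTING_DOWN: {ServerState.RUNNING, ServerState.STOPPED},
    ServerState.STOPPED: set(),
}


def transition(current: ServerState, target: ServerState) -> ServerState:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    return target
```

An explicit table like this makes illegal moves (for instance restarting a stopped server in place) fail loudly instead of silently.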

shadeofblue · 2023-09-20T11:51:03Z

ray_on_golem/server/run.py

+    app = create_application(args.port, args.self_shutdown)
+    web.run_app(app, port=args.port, print=None)


hmm, if we're already using the app's setitem/getitem interface to store arbitrary additional properties, we could store the port there as well ... and then, there would be just one place where we define the server's port...

Suggested change

app = create_application(args.port, args.self_shutdown)

web.run_app(app, port=args.port, print=None)

app = create_application(args.port, args.self_shutdown)

web.run_app(app, port=app["port"], print=None)

plus, (not now, just a thought) -> if we're allowing the port to be chosen, maybe we should also allow override of the host argument?

Well, good point, added.

But regarding host - I've never changed this from 0.0.0.0 to anything else in my whole career. What's more, every time the host was not 0.0.0.0 it caused some problems. Keeping the host that way is enough for my taste.

shadeofblue · 2023-09-20T11:52:34Z

ray_on_golem/server/services/yagna.py

 from ray_on_golem.exceptions import RayOnGolemError
 from ray_on_golem.server.settings import YAGNA_APPKEY
 from ray_on_golem.utils import run_subprocess

 logger = logging.getLogger(__name__)

 YAGNA_APPNAME = "ray-on-golem"
+YAGNA_API_URL = URL("http://127.0.0.1:7465")


don't we want to allow this to be overridden? by an environment variable perhaps?
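
An environment-variable override could be sketched like this. The variable name `YAGNA_API_URL` and the `get_yagna_api_url` helper are assumptions for illustration, and a plain string is used instead of the `URL` type to keep the example dependency-free.

```python
import os
from typing import Mapping, Optional

DEFAULT_YAGNA_API_URL = "http://127.0.0.1:7465"


def get_yagna_api_url(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the yagna API URL, preferring an environment override when present."""
    env = os.environ if environ is None else environ
    # Assumption: the override is read from a variable named YAGNA_API_URL.
    return env.get("YAGNA_API_URL", DEFAULT_YAGNA_API_URL)
```

Passing the mapping explicitly keeps the function easy to test without touching the real process environment.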

shadeofblue · 2023-09-20T11:54:30Z

ray_on_golem/server/services/yagna.py

+
+    async def _stop_yagna_service(self):
+        if self._yagna_process is None:
+            logger.info("No need to stop Yagna service, as it was ran externally")


Suggested change

logger.info("No need to stop Yagna service, as it was ran externally")

logger.info("No need to stop Yagna service, as it was started externally")

shadeofblue · 2023-09-20T11:55:39Z

ray_on_golem/server/views.py

+        shutdown_state = ShutdownState.WILL_SHUTDOWN
+
+    if shutdown_state == ShutdownState.WILL_SHUTDOWN:
+        logger.info("Received self shutdown request, exiting in 10 seconds...")


again, a twice-hardcoded magic number...
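
One possible way to avoid repeating the value is to derive both the log text and the actual delay from a single constant. The name `SELF_SHUTDOWN_DELAY_SECONDS` and the helper below are hypothetical.

```python
# Hypothetical constant; the value mirrors the 10 seconds in the log message above.
SELF_SHUTDOWN_DELAY_SECONDS = 10


def shutdown_message(delay: int = SELF_SHUTDOWN_DELAY_SECONDS) -> str:
    """Build the log message from the same constant that drives the delay."""
    return f"Received self shutdown request, exiting in {delay} seconds..."
```

The code that schedules the shutdown would then use the same constant, so the message and the behaviour cannot drift apart.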

shadeofblue

added my comments

shadeofblue

I believe it looks good now

approxit added 2 commits September 18, 2023 19:30

Better autorun of yagna

9353d26

Webserver autostart

4e7839e

approxit requested review from shadeofblue and lucekdudek September 20, 2023 09:49

shadeofblue reviewed Sep 20, 2023

review changes

237a8f5

approxit requested a review from shadeofblue September 20, 2023 13:06

shadeofblue approved these changes Sep 20, 2023

approxit merged commit ed39eb0 into main Sep 20, 2023

approxit deleted the approxit/webserver-and-yagna-autostart branch September 20, 2023 14:07
