-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add HTTP webserver interface #6
Conversation
thank you for this @15532th. I think it would be greatly appreciated if you could integrate it inside the main script. this way we can keep the logic common between the server web API and the one shot script. |
Updated the main script. Output on successful token retrieval should be the same, however it might print different messages on errors. I also fixed a few issues I came across while updating it, see individual commit messages for comments on the changes. |
@WardPearce Could you help me review? Thank you :) |
more then happy to, I'll have a look over tmrw |
@unixfox @15532th Wouldn't updating Po tokens be better done using a in memory cache, what will expire after X seconds. Meaning that after every hour or so the request is slightly slower or possibly using aiocron (https://github.com/gawel/aiocron) if we must schedule in the background. |
@WardPearce The second reason is that the typical environment I expect myself to end up using this token-retrieving functionality in, which is the end user machine, often running Windows. I don't want to ask my users to deal with complicated dependencies installation or setting up and running a docker image. In the current state of the token generator it only has a single pure Python dependency, which makes it easy to publish on pypi or package as an executable without running into obscure cross-platform compatibility issues (I am leaving out the biggest dependency, the Chrome itself, because a user machine is quite likely going to have a fresh version installed). Regarding the "run update inside the Nth requests" vs "do updates in background" choice, my reasons for deciding on the latter were the following: while the usual delay on a fast machine would be only a few seconds, the worst case is currently a 30 seconds timeout, which some clients might not be prepared to handle, since it is not a delay you would normally expect from a web-service running on localhost. And if the issue that caused the timeout was only temporary, it means client has received an unnecessarily error while it could have been operating fine if it was served the old cached version of the token until a new update succeeds. |
The point in having an API is to fetch the po_token from an external tool that is not designed to do it by itself. A po_token currently works for an indefinite amount of time, it's not a token that will expire after 1 hour. So serving a po_token generated from one hour ago is perfectly fine. I do share the same vision as @15532th, this API won't be used at high scale and won't be accessible from the whole internet. Worst case scenario, someone runs 10 Invidious processes, it's just 10 requests done per hour, it's meaningless. |
Turns out TemporaryDirectory initiates cleanup on interpreter exit even when it is used without context manager, which might interfere with nodriver trying to delete the directory at the same time.
It might happen if loaded page does not contain a video player or fails to load at all. Explicitly specify 10 seconds timeout for the selector, wrap it in a try-catch block, add separate error messages for the selector and handler timeouts.
It allows client code to control logging format and level when the module is used as a library and not a standalone script.
If connection to the launched browser gets interrupted or the browser itself exits abruptly before the video tab got loaded, the update procedure might hang up indefinitely.
Remove unnecessarily f-strings and double quotes, fix a few other issues.
It leaves less time for client to hit 503 error on startup, but more importantly, it deals with race condition in startup sequence: if extractor_task gets interrupted after webserver startup was initiated but before make_server() returned, then potoken_server.stop() becomes a no-op, leaving background task with webserver to run indefinitely.
This PR implements enhancements discussed in #1, namely serving token as a JSON endpoint. It is meant to be a starting point for discussing the HTTP interface, specifically endpoints names and responses formats, so all the new code is placed into a separate file to avoid affecting any existing functionality.
Token generation runs independently from the webserver process, launching an update every 12 hours by default. Webserver uses port
8080
(configurable), with the latest token value being available on the/token
endpoint:Additionally, it is possible to initiate immediate unscheduled token generation by accessing
/update
endpoint. While handy for testing, this feature seems to add more complications than it's worth, so it probably should be removed, unless it finds some practical use.