Allow launcher to track processes across restarts #521
Comments
Having now read the code, the protocol stores a deferred, such that when the subprocess returns, it runs the _process_finished callback. Attaching this callback can be done on restart. The harder part is re-establishing whatever calls the protocol's outReceived, errReceived and processEnded methods (maybe originally done by ...). We would need to change the model significantly to resolve this issue. So, closing unless there are new ideas on how to resolve it.

Edit: This was also discussed starting from this comment #12 (comment), and one idea was to somehow use Scrapy's JOBDIR logic - I haven't looked into it. Anyway, I'll at least re-open in case anyone has new ideas, but for now this seems hard to resolve.
Okay, so I did a little experimenting and wrote this small app that can run test files. The test files are:

```python
import time

while True:
    print(1)
    time.sleep(1)
```

and:

```python
print(1)
```

The app is:

```python
import sys

from twisted.application.service import Application, Service
from twisted.internet import protocol, reactor
from twisted.python.runtime import platformType
from twisted.python import log
from twisted.scripts import twistd

from scrapyd.launcher import ScrapyProcessProtocol

if platformType == "win32":
    from twisted.internet._dumbwin32proc import Process
else:
    from twisted.internet.process import Process


def finish(proto):
    print("done")


class Launcher(Service):
    def startService(self):
        args = [sys.executable, "print1.py"]
        env = {}
        process = ScrapyProcessProtocol("p1", "s1", "j1", env, args)
        process.deferred.addBoth(finish)
        # reactor.spawnProcess(process, sys.executable, args=args, env=env)
        Process(reactor, sys.executable, args, env, path=None, proto=process)


def app():
    app = Application("test")
    launcher = Launcher()
    launcher.setServiceParent(app)
    return app


application = app()
```

And the app is run with:

```
twistd -n -y spawn.py
```

To explain: Importing twisted.internet.reactor calls twisted.internet.default.install(), which in turn calls one of the platform-specific reactor installers, which all inherit from twisted.internet.posixbase.PosixReactorBase. This class implements spawnProcess, which, for Scrapyd, instantiates twisted.internet.process.Process if ... Anyway, ...
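The experiment shows a protocol object can be rebuilt and handed to a new Process, but for a child that was already running before the restart, nothing will call outReceived, errReceived or processEnded again. About the most a restarted daemon can cheaply do is check whether a recorded pid is still alive. A minimal POSIX-only sketch (pid_alive is a hypothetical helper, not Scrapyd code):

```python
import os


def pid_alive(pid):
    """Return True if a process with this pid currently exists (POSIX).
    Sending signal 0 performs error checking without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the pid exists but belongs to another user
    return True


print(pid_alive(os.getpid()))  # prints True
```

Even this is only a liveness check; it recovers none of the child's stdout/stderr stream or its exit status, which is exactly the part the comment above calls hard.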
The default Launcher stores self.processes as a dict in memory, so when Scrapyd restarts, it loses track of the processes. The processes are stored as ScrapyProcessProtocol instances.

A new Launcher could perhaps store these in a backend that it loads at startup.

Scrapyd only needs to restart rarely (e.g. version upgrade, configuration change, or server restart). If #519 is done, then configuration changes won't be a concern here.
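The "backend" idea could be sketched as a registry that persists only the restart-survivable facts about each job (not the live protocol object) and reloads them at startup. Everything below (PersistentProcessRegistry, the file name, the record schema) is hypothetical, not an existing Scrapyd API:

```python
import json
import os
import tempfile


class PersistentProcessRegistry:
    """Hypothetical sketch: mirror the launcher's process table into a
    JSON file so a restarted Scrapyd could find its jobs again."""

    def __init__(self, path):
        self.path = path
        self.processes = self._load()

    def _load(self):
        # On startup, reload whatever was recorded before the restart.
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)
        return {}

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self.processes, f)

    def add(self, slot, project, spider, job, pid):
        self.processes[str(slot)] = {
            "project": project, "spider": spider, "job": job, "pid": pid,
        }
        self._save()

    def remove(self, slot):
        self.processes.pop(str(slot), None)
        self._save()


# Demo with a throwaway file; a real backend would live in Scrapyd's data dir.
path = os.path.join(tempfile.mkdtemp(), "processes.json")
registry = PersistentProcessRegistry(path)
registry.add(0, "myproject", "myspider", "j1", 12345)
```

This only solves the bookkeeping half of the issue; reconnecting each reloaded record to a live process protocol remains the hard part discussed above.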