Network issue #125

Open
carlosribas opened this issue Feb 21, 2024 · 0 comments

A network problem in the middle of a search can not only interrupt the search, but also leave a VM stuck even after the network recovers.

The log below shows a VM that could not connect to the database and therefore could not update the search status. In the database, this VM remained in the "busy" status until it was changed manually.

This is another example where transactions can help.

DEBUG:root:Nhmmer job chunk timeout out: job_id = f033ed19-42e4-46af-8985-691ef8ba8730, database = all-except-rrna-12.fasta
ERROR:asyncio:Job processing failed
job: <Job coro=<<coroutine object nhmmer at 0x7f9ddcee25f0>>>
Traceback (most recent call last):
  File "/srv/sequence_search/consumer/views/submit_job.py", line 68, in nhmmer
    await asyncio.wait_for(task, MAX_RUN_TIME)
  File "/usr/local/lib/python3.7/asyncio/tasks.py", line 449, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/sequence_search/db/job_chunks.py", line 124, in set_job_chunk_status
    async with engine.acquire() as connection:
  File "/usr/local/lib/python3.7/site-packages/aiopg/utils.py", line 94, in __aenter__
    self._obj = await self._coro
  File "/usr/local/lib/python3.7/site-packages/aiopg/sa/engine.py", line 165, in _acquire
    raw = await self._pool.acquire()
  File "/usr/local/lib/python3.7/site-packages/aiopg/pool.py", line 164, in _acquire
    await self._fill_free_pool(True)
  File "/usr/local/lib/python3.7/site-packages/aiopg/pool.py", line 199, in _fill_free_pool
    **self._conn_kwargs)
  File "/usr/local/lib/python3.7/site-packages/aiopg/connection.py", line 43, in connect
    **kwargs
  File "/usr/local/lib/python3.7/site-packages/aiopg/connection.py", line 78, in __init__
    self._conn = psycopg2.connect(dsn, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "192.168.0.6", port 5432 failed: Network is unreachable
	Is the server running on that host and accepting TCP/IP connections?


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/srv/sequence_search/consumer/views/submit_job.py", line 81, in nhmmer
    await set_job_chunk_status(engine, job_id, database, status=JOB_CHUNK_STATUS_CHOICES.timeout)
  File "/srv/sequence_search/db/job_chunks.py", line 169, in set_job_chunk_status
    "set_job_chunk_status, job_id = %s, database = %s" % (job_id, database)) from e
sequence_search.db.DatabaseConnectionError: Failed to open connection to the database in set_job_chunk_status, job_id = f033ed19-42e4-46af-8985-691ef8ba8730, database = all-except-rrna-12.fasta
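
In the traceback above, the single call to `set_job_chunk_status` fails while the network is down and is never attempted again, so the status is never written. One possible mitigation, separate from the transaction-based fix mentioned earlier, is to retry that update with exponential backoff. The sketch below is only an illustration: `retry_async` and its parameters are hypothetical names, and the `DatabaseConnectionError` class is a local stand-in for `sequence_search.db.DatabaseConnectionError`.

```python
import asyncio


class DatabaseConnectionError(Exception):
    """Local stand-in for sequence_search.db.DatabaseConnectionError."""


async def retry_async(operation, attempts=5, base_delay=1.0, sleep=asyncio.sleep):
    """Await operation() until it succeeds or `attempts` calls have failed.

    `operation` is a zero-argument callable returning an awaitable.
    The wait between attempts doubles each time:
    base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(attempts):
        try:
            return await operation()
        except DatabaseConnectionError:
            if attempt == attempts - 1:
                # Out of attempts: re-raise so the caller can decide what to do.
                raise
            await sleep(base_delay * 2 ** attempt)


# Hypothetical usage inside the consumer's timeout handler:
#
#     await retry_async(
#         lambda: set_job_chunk_status(
#             engine, job_id, database,
#             status=JOB_CHUNK_STATUS_CHOICES.timeout,
#         )
#     )
```

Retrying only covers short outages. If the network stays down longer than the retry window, the VM would still be left as "busy", so a separate cleanup of stale "busy" entries would likely still be needed.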