Remove shared memory from MPFuture, fix minor bugs #317

Merged
justheuristic merged 26 commits into master from make-mpfuture-great-again
Jul 13, 2021

justheuristic
Member

@justheuristic justheuristic commented Jul 13, 2021

  • rewrote MPFuture to use pipe-only communication instead of SharedMemory
    • rationale: each shared memory object is a file, so using thousands of them exhausts the open files limit
    • one alternative was to group multiple shared memory objects into one file at different offsets, but that would require heavier engineering
  • fixed a bug where the coroutine set_event_threadsafe was never awaited

# Conflicts:
#	hivemind/moe/server/task_pool.py
#	hivemind/utils/mpfuture.py
#	tests/test_util_modules.py
@justheuristic
Member Author

justheuristic commented Jul 13, 2021

[not black-compatible yet] converted below

@codecov

codecov bot commented Jul 13, 2021

Codecov Report

Merging #317 (f147d18) into master (d97fede) will decrease coverage by 0.02%.
The diff coverage is 88.27%.

@@            Coverage Diff             @@
##           master     #317      +/-   ##
==========================================
- Coverage   81.97%   81.95%   -0.03%     
==========================================
  Files          66       66              
  Lines        5898     5918      +20     
==========================================
+ Hits         4835     4850      +15     
- Misses       1063     1068       +5     
Impacted Files                       Coverage Δ
hivemind/moe/server/task_pool.py     43.16% <0.00%> (ø)
hivemind/utils/mpfuture.py           89.83% <86.86%> (-4.86%) ⬇️
hivemind/p2p/p2p_daemon.py           92.40% <100.00%> (+1.11%) ⬆️
hivemind/p2p/servicer.py             92.45% <100.00%> (-0.14%) ⬇️
hivemind/dht/__init__.py             89.38% <0.00%> (+1.76%) ⬆️
hivemind/averaging/key_manager.py    97.72% <0.00%> (+2.27%) ⬆️

@justheuristic justheuristic linked an issue Jul 13, 2021 that may be closed by this pull request
@mryab mryab changed the title Remove sharedmemory from MPFuture, fix minor bugs Remove shared memory from MPFuture, fix minor bugs Jul 13, 2021

-   def _set_event_threadsafe(self):
+   def _set_event_if_necessary(self):
        if self._aio_event is None or self._aio_event.is_set():
Member

@borzunov borzunov Jul 13, 2021

nit: self._aio_event.is_set() is not guaranteed to be thread-safe. This check can probably be removed.

Member Author

decided to keep is_set for performance reasons

Member

@borzunov borzunov Jul 13, 2021

[It's thread-safe in the current implementation, and we've agreed that it's hard to imagine that it will stop being thread-safe.]
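
To illustrate the trade-off discussed in this thread, here is a small sketch of setting an `asyncio.Event` from a non-loop thread. It is not the PR's code: the function name `set_event_if_necessary` is borrowed from the diff above, but its signature and body here are assumptions. The `is_set()` check only saves scheduling a redundant callback; the actual `set()` always goes through `loop.call_soon_threadsafe`, since asyncio primitives are documented as not thread-safe.

```python
import asyncio
import threading


def set_event_if_necessary(loop, event):
    """Called from a non-loop thread: make sure `event` ends up set."""
    # Reading is_set() from another thread relies on it being a plain attribute
    # read in CPython; dropping this check would only cost an extra callback.
    if event.is_set():
        return
    loop.call_soon_threadsafe(event.set)


async def main():
    loop = asyncio.get_running_loop()
    event = asyncio.Event()
    thread = threading.Thread(target=set_event_if_necessary, args=(loop, event))
    thread.start()
    await asyncio.wait_for(event.wait(), timeout=5)
    thread.join()
    return event.is_set()


was_set = asyncio.run(main())
```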

                name=f"{__name__}.BACKEND",
                daemon=True,
            )
            cls._pipe_waiter_thread.start()

    @classmethod
    def _process_updates_in_background(cls, receiver_pipe: mp.connection.Connection):
        pid = os.getpid()
        while True:
Member

nit: This loop never stops gracefully. Ideally, it would be cool to stop it via a threading.Event in the __del__ finalizer of the last active future in the current process.

However, graceful shutdown may be out of scope of this PR.
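
A rough sketch of what such a stoppable loop could look like, assuming the loop polls the pipe with a timeout so it can re-check a `threading.Event` between messages. The names `process_updates`, `stop_event`, and `poll_interval` are hypothetical, and the event is set explicitly here rather than from a `__del__` finalizer.

```python
import multiprocessing as mp
import threading

stop_event = threading.Event()
handled = threading.Event()
received = []


def process_updates(receiver, stop_event, poll_interval=0.1):
    """Handle incoming messages until stop_event is set."""
    while not stop_event.is_set():
        # poll() with a timeout lets the loop re-check stop_event periodically
        # instead of blocking forever in recv().
        if receiver.poll(poll_interval):
            received.append(receiver.recv())
            handled.set()


receiver, sender = mp.Pipe(duplex=False)
thread = threading.Thread(
    target=process_updates, args=(receiver, stop_event), daemon=True
)
thread.start()

sender.send("update")
handled.wait(timeout=5)  # wait until the loop has processed the message

stop_event.set()  # request shutdown; the loop exits within poll_interval
thread.join(timeout=5)
```

The cost of this approach is a wake-up every `poll_interval` seconds even when the pipe is idle, which is one reason a blocking `recv()` loop is simpler to keep.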

@justheuristic justheuristic merged commit 197666c into master Jul 13, 2021
@justheuristic justheuristic deleted the make-mpfuture-great-again branch July 13, 2021 22:37
            self._state_cache[self._state], self._result = base.FINISHED, result
-           self._send_update(UpdateType.RESULT, result)
+           self._send_update(MessageType.RESULT, result)
            super().set_result(result)
Member

nit: In Python 3.7, this may raise RuntimeError (instead of InvalidStateError) if set_result is called concurrently inside one process.

This is a minor issue and we've agreed it won't be fixed.
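
For callers who want to be robust to either exception type, a defensive wrapper along these lines is one option. This is only a sketch using the standard `concurrent.futures.Future`, not MPFuture itself; `try_set_result` is a hypothetical helper, and the `ImportError` fallback assumes an interpreter older than Python 3.8, where `InvalidStateError` is not available.

```python
from concurrent.futures import Future

try:
    from concurrent.futures import InvalidStateError  # available since Python 3.8
except ImportError:
    InvalidStateError = RuntimeError  # fallback assumption for older interpreters


def try_set_result(future, result):
    """Set the result; return False if the future rejected it as already finished."""
    try:
        future.set_result(result)
        return True
    except (InvalidStateError, RuntimeError):
        return False


future = Future()
first = try_set_result(future, 1)
second = try_set_result(future, 2)
```

On Python 3.8 and later, the second call returns False because the standard `Future.set_result` raises `InvalidStateError` once the future is done, and the stored result stays 1.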

borzunov added a commit that referenced this pull request Jul 23, 2021
@borzunov borzunov mentioned this pull request Jul 23, 2021
4 tasks
Successfully merging this pull request may close these issues.

[BUG] Tests freeze without increasing ulimit -n
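
As a side note on the linked issue, the open files limit that `ulimit -n` reports can also be inspected from Python on Unix-like systems with the standard `resource` module. The sketch below is illustrative only; `can_open` and its `reserve` headroom parameter are hypothetical and not part of hivemind.

```python
import resource

# Soft and hard limits on open file descriptors for the current process.
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)


def can_open(num_descriptors, reserve=64):
    """Rough check: does this many descriptors fit under the soft limit with headroom?"""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return num_descriptors + reserve <= soft_limit


print(f"soft limit: {soft_limit}, hard limit: {hard_limit}")
print("room for 5000 descriptors:", can_open(5000))
```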
3 participants