Bbpp134 2298/distributed bluenaas #45
base: develop
Conversation
update: extend protection of a worker based on queue depth
Signed-off-by: Dinika Saxena <[email protected]>
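For illustration only, a hedged sketch of what a queue-depth guard for worker protection could look like; the queue name, threshold and redis URL below are assumptions, not the actual implementation in this PR.

```python
# Hypothetical sketch of a queue-depth guard for worker protection.
# Queue name, threshold and redis URL are assumptions.
import redis

QUEUE_NAME = "celery"      # assumed broker queue key in redis
MAX_QUEUE_DEPTH = 10       # arbitrary threshold for illustration

broker = redis.Redis.from_url("redis://localhost:6379/0")


def worker_is_overloaded() -> bool:
    """Return True when the pending-task backlog exceeds the threshold."""
    return broker.llen(QUEUE_NAME) > MAX_QUEUE_DEPTH
```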
… and non-realtime sims, then update with results on finish

The idea was to create a distribution in S3 for a simulation as soon as we receive a request to launch it. This distribution would later be updated by the celery task to also include the simulation result. However, it looks like it's not possible to update a file stored in S3 (see https://bluebrainproject.slack.com/archives/G013PKBUHT2/p1728567806810799), so I will instead create the distribution at the end, when the simulation is complete. It will contain the simulation config, the simulation result, as well as the stimulus plot data.

Response:
- id: str
- status: SimulationStatus
- results: Any
- name: str
- description: str
- created_by: str
- simulation_config: Optional[SingleNeuronSimulationConfig]
- type: SimulationType
- me_model_self: str
- synaptome_model_self: Optional[str]

Signed-off-by: Dinika Saxena <[email protected]>
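For reference, a sketch of how this response shape could be written as a Pydantic model; the field names and annotations come from the list above, while the class name and the placeholder enum/config definitions are assumptions so the snippet runs stand-alone.

```python
# Sketch only: field names/annotations are from the PR description above;
# the class name and placeholder types are assumptions, not the actual code.
from enum import Enum
from typing import Any, Optional

from pydantic import BaseModel


class SimulationStatus(str, Enum):  # placeholder values, assumed
    pending = "pending"
    started = "started"
    success = "success"
    failure = "failure"


class SimulationType(str, Enum):  # placeholder values, assumed
    single_neuron_simulation = "single-neuron-simulation"
    synaptome_simulation = "synaptome-simulation"


class SingleNeuronSimulationConfig(BaseModel):  # placeholder, assumed
    ...


class SimulationResponse(BaseModel):
    id: str
    status: SimulationStatus
    results: Any
    name: str
    description: str
    created_by: str
    simulation_config: Optional[SingleNeuronSimulationConfig]
    type: SimulationType
    me_model_self: str
    synaptome_model_self: Optional[str]
```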
Signed-off-by: Dinika Saxena <[email protected]>
update: add filter by creation date for list simulations
force-pushed from 9d75052 to 53ebaf8
force-pushed from 53ebaf8 to 1a41fea
Hi Bilal, this is very impressive. I've added some comments/suggestions.
Feel free to decide for yourself what's important to address and when.
volumes:
  nfs_share:
    driver_opts:
      type: "nfs"
What's the benefit of having NFS for local development?
From the application's perspective, the model cache is just another folder in the filesystem that the app can use.
In the AWS deployment it is implemented via NFS, but locally we can use a normal docker volume/mount. Maybe there is something I'm missing.
Just mimicking the same architecture as prod (the next step is to use Swarm or another technology for replicating workers in local dev).
Imho, the idea of using normal docker volumes instead of NFS locally is worth considering.
It keeps the local setup simple (no need to set up an NFS server locally) and consistent across developers and operating systems, and whether the filesystem is local or remote doesn't change anything from the docker or application perspective, so we don't gain much from a more complicated local setup.
I used normal docker volumes when working on the new endpoints for the ML team and it seemed to work without any issues.
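For illustration, a minimal compose sketch of that suggestion; keeping the volume name nfs_share is an assumption so that the service mounts would not need to change.

```yaml
# Sketch: a plain named docker volume for local development,
# replacing the NFS driver options. Volume name kept as nfs_share (assumption).
volumes:
  nfs_share: {}
```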
Thanks for the details, Bilal.
I'm not sure how we can benefit from this (closer-to-production) implementation, especially given that the NFS server platform/version/configuration will still differ between our local machines and AWS.
The current implementation has a few limitations:
- It works only on macOS.
- It relies on a local NFS server and updates its global configuration.
A better approach would be to run the NFS server in another container, with a docker mount for the local filesystem, so that everything is containerised and easy to manage.
At the same time, I think, as Dinika mentioned, it's best to leave NFS out of scope just to keep everything simple.
[tool.poetry.group.types.dependencies]
types-requests = "^2.32.0.20240712"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

[tool.pyright]
venvPath = "/Users/meddah/Library/Caches/pypoetry/virtualenvs"
Just noticed something that we probably should not have there.
What is it?
Ah yeah, it's the absolute path from your machine: "/Users/meddah/Library/Caches/pypoetry/virtualenvs".
There should be a way not to hardcode it for everybody.
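One possible way to avoid the machine-specific path, as a sketch only: it assumes the virtualenv is created inside the project (for example via `poetry config virtualenvs.in-project true`), which is not necessarily how this repo is set up.

```toml
[tool.pyright]
# Sketch: assumes an in-project virtualenv at ./.venv
# (e.g. created after `poetry config virtualenvs.in-project true`),
# so no user-specific cache path needs to be hardcoded.
venvPath = "."
venv = ".venv"
```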
force-pushed from e95decb to fd79114
- /core/stimulation
- /core/stimulation/utils
- /core/stimulation/runners
- /core/stimulation/common
- ./scripts folder contains the entrypoints for the main application and celery worker containers.
- redis-insight, flower and redis containers:
  - redis as broker and backend (for transmitting state); see the sketch after this list.
  - redis-insight for debugging the queue/state of the application.
  - flower: better to use it to track the broker queue and task statuses.
- monitor container to track events of the workers (DEV only).
- nfs storage to share the models between the main app and workers.
- init/saving/update simulation in nexus.
- start/stop simulations.
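As an illustration of the broker/backend wiring described above, here is a minimal, hypothetical Celery sketch; the app name, redis URLs and the example task are assumptions rather than the project's actual code.

```python
# Sketch only: app name, redis URLs and the example task are assumptions.
from celery import Celery

celery_app = Celery(
    "bluenaas",
    broker="redis://redis:6379/0",   # redis as the broker queue
    backend="redis://redis:6379/1",  # redis as the result backend (task state)
)


@celery_app.task
def run_simulation(simulation_id: str) -> dict:
    # Placeholder body: a real worker would run the simulation and report
    # progress/results through the redis backend configured above.
    return {"id": simulation_id, "status": "success"}
```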