⚡ Improve performance of ln.connect(), lamin connect, and lamin load for a notebook
#84
Roughly 4s is what we've seen for a while. https://laminlabs.slack.com/archives/C03P6D8U1PC/p1727270298402879?thread_ts=1727186286.045239&cid=C03P6D8U1PC
Using the code here: Now profiling.
From the profiling run below, it's clear that loading bionty sources and loading all schema modules add up to ~3s. It doesn't seem necessary to import all schema modules right away. Hence, I'll try to load only lamindb.
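The profiling code itself is not reproduced in this thread. As a minimal sketch of how such a run could be set up with the standard library, assuming `cProfile` is an acceptable profiler here: `slow_connect` is a hypothetical stand-in for `ln.connect()`, and `profile_call` is a helper name introduced only for illustration.

```python
import cProfile
import io
import pstats
import time


def slow_connect() -> None:
    """Hypothetical stand-in for ln.connect(); swap in the real call."""
    time.sleep(0.05)  # simulates waiting on imports or a remote database


def profile_call(func, top: int = 10) -> str:
    """Run func under cProfile and return the `top` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_call(slow_connect))
```

Sorting by cumulative time puts call paths such as module imports or source loading at the top of the report, which is the kind of breakdown described above.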
ln.connect() is rather slow and affects lamin load for a notebook or script laminlabs/lamindb#1997
Refactored considerably, both in the lamindb-setup PR and in bionty.
Will now run profiling again, hoping that bionty isn't reloaded upon merely importing lamindb.
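One way to check whether importing one module pulls in another is to import it in a fresh interpreter and inspect `sys.modules`. The sketch below is illustrative only: `imports_transitively` is a hypothetical helper, and it is demonstrated on standard-library modules so that it runs without lamindb installed.

```python
import subprocess
import sys


def imports_transitively(module: str, dependency: str) -> bool:
    """Return True if importing `module` in a fresh interpreter also loads `dependency`."""
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print({dependency!r} in sys.modules)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case the imported module prints something itself.
    return result.stdout.strip().splitlines()[-1] == "True"


if __name__ == "__main__":
    # Standard-library demonstration: importing json also loads json.decoder.
    print(imports_transitively("json", "json.decoder"))
```

With lamindb installed, the analogous call would be `imports_transitively("lamindb", "bionty")`. A fresh subprocess matters because a module already imported in the current session would hide the effect.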
With the below repo states checked out, we're now at ~2s for Within One way to escape this would be to no longer fully import lamindb but only use Django + a few helper functions.
lamin connect 2.1s
lamin connect 2.8s
lamin load ~4.9s
lamin load 4.2s
With this refactor, we're now down to ~2.6s for lamin load. Let's now move the bionty source load into bionty.
lamin load 2.58s
lamin load 2.67s
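Wall-clock numbers like the ones above can be collected by timing the CLI as a subprocess. This is a sketch under assumptions, not the method used in this PR: `time_command` is a hypothetical helper, and the demo times a trivial Python invocation as a stand-in for the real command.

```python
import subprocess
import sys
import time


def time_command(cmd: list[str], repeats: int = 3) -> list[float]:
    """Run `cmd` `repeats` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in command; with the CLI installed and configured, one could pass
    # e.g. ["lamin", "connect", "laminlabs/lamindata"] instead.
    for seconds in time_command([sys.executable, "-c", "pass"]):
        print(f"{seconds:.2f}s")
```

Running each command more than once, as in the paired measurements above, gives a rough sense of run-to-run variation.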
Moving the loading of bionty sources into bionty brings us to ~2.2s.
This is because we previously paid the price for establishing the very first database connection through Django during The two biggest attack points are now:
lamin connect 1.2s
lamin connect 1.35s
lamin load 2.2s
lamin load 2.43s
Now removing profiling for 3 functions that are no longer bottlenecks and adding new functions. The first question is how long it takes to connect with or without the edge function. The edge function seems to slow things down by ~1s. I will work with the non-edge-function implementation for now.
Query through edge function: ~2s
Query through postgrest directly: ~1s
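A comparison like this one is less noisy when each path is called several times and the median is reported. The following is only a sketch: `via_edge_function` and `via_postgrest` are hypothetical stand-ins that sleep instead of issuing real requests, and `median_latency` is a helper name introduced here.

```python
import statistics
import time
from typing import Callable


def median_latency(call: Callable[[], object], repeats: int = 5) -> float:
    """Call `call` `repeats` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def via_edge_function() -> None:
    """Hypothetical stand-in for the query routed through the edge function."""
    time.sleep(0.05)


def via_postgrest() -> None:
    """Hypothetical stand-in for the query sent to postgrest directly."""
    time.sleep(0.005)


if __name__ == "__main__":
    for name, call in [("edge function", via_edge_function), ("postgrest", via_postgrest)]:
        print(f"{name}: {median_latency(call):.3f}s")
```

Replacing the two stand-ins with the real request functions would give a like-for-like comparison from the same machine and network.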
Looking at these new data, it's clear that not relying on Django at all for this will spare us
There is another 0.4s for getting instance metadata, though.
Hence, within the Django framework, we're close to what's possible now. I'll end the current optimization here and make another PR to refactor via LaminHub REST.
This PR improves the CLI and documents profiling for the main PR below:
ln.connect(), lamin connect, and lamin load for a notebook lamindb#1998
The initial profiling run below shows that lamin load for a notebook takes 4.7s for an instance (laminlabs/lamindata) that's in us-east-1 when called from a laptop in Munich.
In there, ln.connect() contributes almost all waiting time with 4.3s.