More REST-server call-avoidance #223

Merged
merged 2 commits into from
Jul 19, 2022

johnkerl
Member

@johnkerl johnkerl commented Jul 16, 2022

Summary

More performance optimization along the lines of #222.

Details

Re the call to exists() in _open():

  • This was introduced in #126 (Better group-member handling on non-existent paths) to avoid a very confusing exception. That exception only occurred before the with-open-as syntax was added to TileDB-Py in TileDB-Inc/TileDB-Py#1124 (Add __enter__/__exit__ for TileDB Groups).
  • Now this exists() call can be omitted entirely with no negative effects: the exception wording is clear and appropriate.
  • The check was a double whammy, since exists() (no longer called here) calls tiledb.object_type(), which makes two REST-server calls: open-array to return is-array (a 404 for us) and open-group to return is-group (a 200 for us). Now we avoid both the 404 and the 200, each of which had server-side time costs in auth.
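To make the round-trip arithmetic concrete, here is a minimal sketch that uses a stand-in backend rather than a real REST server. `CountingBackend`, `open_with_precheck`, and `open_directly` are hypothetical names, not tiledbsc or TileDB-Py APIs. The two-request cost of the type check (one 404, one 200) comes from the description above; that the open itself costs exactly one request is an assumption made for illustration.

```python
class CountingBackend:
    """Hypothetical stand-in for a REST server that records each request."""

    def __init__(self, existing_groups):
        self.existing_groups = set(existing_groups)
        self.requests = []  # (endpoint, uri, status) tuples, in call order

    def _record(self, endpoint, uri, status):
        self.requests.append((endpoint, uri, status))
        return status

    def object_type(self, uri):
        # Modeled on the description above: probe as array, then as group.
        self._record("open-array", uri, 404)  # a group is not an array
        is_group = uri in self.existing_groups
        status = self._record("open-group", uri, 200 if is_group else 404)
        return "group" if status == 200 else None

    def open_group(self, uri):
        # Assumption: opening a group costs one request.
        is_group = uri in self.existing_groups
        status = self._record("open-group", uri, 200 if is_group else 404)
        if status != 200:
            raise LookupError(f"group does not exist: {uri}")
        return uri


def open_with_precheck(backend, uri):
    """Old shape: exists-style check first, then open."""
    if backend.object_type(uri) != "group":
        raise LookupError(f"group does not exist: {uri}")
    return backend.open_group(uri)


def open_directly(backend, uri):
    """New shape: just open, and let the open report a missing group."""
    return backend.open_group(uri)


before = CountingBackend(["tiledb://ns/HSCs"])
open_with_precheck(before, "tiledb://ns/HSCs")

after = CountingBackend(["tiledb://ns/HSCs"])
open_directly(after, "tiledb://ns/HSCs")

print(len(before.requests), "requests with the pre-check")
print(len(after.requests), "request without it")
```

In this model the pre-check adds two requests per open, which is the saving described in the last bullet.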

Re _get_child_uris():

  • We still need some kind of exists-or-not handling: we need post-creation member URIs when the group exists, and pre-creation member URIs when it does not, so that we can populate it.
  • But we can fold the exists-checking into group-open, as long as we appropriately check the exception type that comes back on group-does-not-exist-yet.
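The pattern in these two bullets can be sketched as follows. This is not the code in this PR: `GroupDoesNotExistError`, `fake_open_group`, and `get_child_uris` are hypothetical stand-ins, and a plain dict plays the role of the storage backend. The real exception type and how it is matched depend on TileDB-Py.

```python
class GroupDoesNotExistError(Exception):
    """Hypothetical stand-in for the error raised when opening a missing group."""


def fake_open_group(store, uri):
    """Hypothetical group-open: returns {member_name: member_uri} or raises."""
    if uri not in store:
        raise GroupDoesNotExistError(uri)
    return store[uri]


def get_child_uris(store, group_uri, member_names):
    """Return member URIs, using the open itself as the existence check."""
    try:
        members = fake_open_group(store, group_uri)
    except GroupDoesNotExistError:
        # Pre-creation: the group is not there yet, so build the URIs we
        # intend to create under it.
        return {name: f"{group_uri}/{name}" for name in member_names}
    # Post-creation: prefer the URIs the group recorded, falling back to
    # the path-built URI for members that have not been added yet.
    return {name: members.get(name, f"{group_uri}/{name}") for name in member_names}


# Made-up store contents for illustration only.
store = {"tiledb://ns/soco3": {"HSCs": "tiledb://ns/uuid-1"}}

print(get_child_uris(store, "tiledb://ns/soco3", ["HSCs", "Krasnow"]))
print(get_child_uris(store, "tiledb://ns/new-soco", ["HSCs"]))
```

Only the does-not-exist error is caught. Any other exception from the open propagates, which is the point of checking the exception type rather than swallowing everything.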

Profiling

time-soma-ctor.py

#!/usr/bin/env python

import tiledbsc
import tiledb
import time

uris = [
  '/Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs',
  '/Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow',
  's3://tiledb-singlecell-data/soco/soco3/HSCs',
  's3://tiledb-singlecell-data/soco/soco3/Krasnow',
  'tiledb://johnkerl-tiledb/HSCs',
  'tiledb://johnkerl-tiledb/Krasnow',
]

for uri in uris:
    t1 = time.time()
    soma = tiledbsc.SOMA(uri)
    t2 = time.time()
    print("SOMA CTOR %6.3f URI %s" % (t2-t1, uri))

Before:

$ python time-soma-ctor.py
SOMA CTOR  0.012 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs
SOMA CTOR  0.003 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow
SOMA CTOR  0.594 URI s3://tiledb-singlecell-data/soco/soco3/HSCs
SOMA CTOR  0.554 URI s3://tiledb-singlecell-data/soco/soco3/Krasnow
SOMA CTOR  2.415 URI tiledb://johnkerl-tiledb/HSCs
SOMA CTOR  2.969 URI tiledb://johnkerl-tiledb/Krasnow

After:

$ python time-soma-ctor.py
SOMA CTOR  0.004 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs
SOMA CTOR  0.001 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow
SOMA CTOR  0.223 URI s3://tiledb-singlecell-data/soco/soco3/HSCs
SOMA CTOR  0.108 URI s3://tiledb-singlecell-data/soco/soco3/Krasnow
SOMA CTOR  1.096 URI tiledb://johnkerl-tiledb/HSCs
SOMA CTOR  1.070 URI tiledb://johnkerl-tiledb/Krasnow

time-print-uris.py

#!/usr/bin/env python

import tiledbsc
import tiledb
import time

uris = [
  '/Users/johnkerl/tiledb-singlecell-data/soco/soco3',
  's3://tiledb-singlecell-data/soco/soco3',
  'tiledb://johnkerl-tiledb/soco3',
]

for uri in uris:
    t1 = time.time()
    soco = tiledbsc.SOMACollection(uri)
    for soma in soco:
        print(soma.uri)
    t2 = time.time()
    print("MEMBER-URI PRINT %6.3f URI %s" % (t2-t1, uri))

Before:

$ python time-print-uris.py
file:///Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow
file:///Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs
file:///Users/johnkerl/tiledb-singlecell-data/soco/soco3/STPericytes
MEMBER-URI PRINT  0.014 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3
s3://tiledb-singlecell-data/soco/soco3/Krasnow
s3://tiledb-singlecell-data/soco/soco3/HSCs
s3://tiledb-singlecell-data/soco/soco3/STPericytes
MEMBER-URI PRINT  2.142 URI s3://tiledb-singlecell-data/soco/soco3
tiledb://johnkerl-tiledb/d1291d0b-2e6c-40eb-87bd-9cf32401441c
tiledb://johnkerl-tiledb/71f67bc8-d498-4dd3-bd39-453c9262bd89
tiledb://johnkerl-tiledb/43d4a215-1a9c-4e5f-9fa9-d99a3bed93f7
MEMBER-URI PRINT  7.215 URI tiledb://johnkerl-tiledb/soco3

After:

$ python time-print-uris.py
file:///Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow
file:///Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs
file:///Users/johnkerl/tiledb-singlecell-data/soco/soco3/STPericytes
MEMBER-URI PRINT  0.008 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3
s3://tiledb-singlecell-data/soco/soco3/Krasnow
s3://tiledb-singlecell-data/soco/soco3/HSCs
s3://tiledb-singlecell-data/soco/soco3/STPericytes
MEMBER-URI PRINT  0.547 URI s3://tiledb-singlecell-data/soco/soco3
tiledb://johnkerl-tiledb/d1291d0b-2e6c-40eb-87bd-9cf32401441c
tiledb://johnkerl-tiledb/71f67bc8-d498-4dd3-bd39-453c9262bd89
tiledb://johnkerl-tiledb/43d4a215-1a9c-4e5f-9fa9-d99a3bed93f7
MEMBER-URI PRINT  3.635 URI tiledb://johnkerl-tiledb/soco3

time-obs-shape.py

#!/usr/bin/env python

import tiledbsc
import tiledb
import time

uris = [
  '/Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs',
  '/Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow',
  's3://tiledb-singlecell-data/soco/soco3/HSCs',
  's3://tiledb-singlecell-data/soco/soco3/Krasnow',
  'tiledb://johnkerl-tiledb/HSCs',
  'tiledb://johnkerl-tiledb/Krasnow',
]

for uri in uris:
    soma = tiledbsc.SOMA(uri)
    t1 = time.time()
    shape = soma.obs.shape()
    t2 = time.time()
    print("OBS SHAPE %6.3f URI %s" % (t2-t1, uri))

Before:

$ python time-obs-shape.py
OBS SHAPE  0.104 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs
OBS SHAPE  0.004 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow
OBS SHAPE  0.358 URI s3://tiledb-singlecell-data/soco/soco3/HSCs
OBS SHAPE  0.342 URI s3://tiledb-singlecell-data/soco/soco3/Krasnow
OBS SHAPE  3.288 URI tiledb://johnkerl-tiledb/HSCs
OBS SHAPE  3.437 URI tiledb://johnkerl-tiledb/Krasnow

After:

$ python time-obs-shape.py
OBS SHAPE  0.093 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/HSCs
OBS SHAPE  0.003 URI /Users/johnkerl/tiledb-singlecell-data/soco/soco3/Krasnow
OBS SHAPE  0.392 URI s3://tiledb-singlecell-data/soco/soco3/HSCs
OBS SHAPE  0.306 URI s3://tiledb-singlecell-data/soco/soco3/Krasnow
OBS SHAPE  3.121 URI tiledb://johnkerl-tiledb/HSCs
OBS SHAPE  3.065 URI tiledb://johnkerl-tiledb/Krasnow
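
For a quick read of the tiledb:// numbers, the before/after ratios can be computed as in the sketch below. The timings are copied from the outputs above; the `speedup` helper and the labels are only illustrative, and the ratios are plain arithmetic on single-run measurements.

```python
# (before, after) wall-clock seconds for the tiledb:// URIs, copied from the
# three benchmark outputs above.
timings = {
    "SOMA CTOR HSCs": (2.415, 1.096),
    "SOMA CTOR Krasnow": (2.969, 1.070),
    "MEMBER-URI PRINT soco3": (7.215, 3.635),
    "OBS SHAPE HSCs": (3.288, 3.121),
    "OBS SHAPE Krasnow": (3.437, 3.065),
}


def speedup(before, after):
    """Ratio of old time to new time; values above 1 mean the new code is faster."""
    return before / after


for label, (before, after) in timings.items():
    ratio = speedup(before, after)
    print(f"{label:24s} {before:6.3f} -> {after:6.3f}  {ratio:4.2f}x")
```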

@johnkerl johnkerl force-pushed the kerl/rest-avoidance branch from cfa59c7 to 79928b7 July 16, 2022 15:07
@johnkerl johnkerl changed the title Some REST-server call-avoidance [WIP] More REST-server call-avoidance Jul 16, 2022
@johnkerl johnkerl requested review from Shelnutt2 and aaronwolen July 16, 2022 19:51
@johnkerl johnkerl marked this pull request as ready for review July 16, 2022 19:56
@johnkerl johnkerl requested review from gspowley and ihnorton July 18, 2022 21:59
Member

@aaronwolen aaronwolen left a comment

Nice improvement!

@johnkerl johnkerl merged commit 9c3fb2d into main Jul 19, 2022
@johnkerl johnkerl deleted the kerl/rest-avoidance branch July 19, 2022 13:53
2 participants