Implement SOMA-level dimension-slicing #102

Merged 2 commits on May 20, 2022

@johnkerl johnkerl commented May 19, 2022

Context: #95

Examples (see also test cases on this PR):

$ ./tools/ingestor anndata/pbmc-small.h5ad ./tiledb-data/pbmc-small
...
>>> import tiledbsc

>>> soma = tiledbsc.SOMA('./tiledb-data/pbmc-small')

>>> soma.obs.ids()
[b'AAATTCGAATCACG', b'AAGCAAGAGCTTAG', b'AAGCGACTTTGACG', b'AATGCGTGGACGGA', [snip] b'TTGCATTGAGCTAC', b'TTGGTACTGAATCC', b'TTTAGCTGTACTCT']

>>> soma.var.ids()
[b'AKR1C3', b'CA2', b'CD1C', b'GNLY', b'HLA-DPB1', b'HLA-DQA1', b'IGLL5', b'MYL9', b'PARVB', b'PF4', b'PGRMC1', b'PPBP', b'RP11-290F20.3', b'RUFY1', b'S100A8', b'S100A9', b'SDPR', b'TREML1', b'TUBB1', b'VDAC3']

>>> soma.obs.dim_select([b'AAGCGACTTTGACG', b'AATGCGTGGACGGA'])
                orig.ident  nCount_RNA  nFeature_RNA  RNA_snn_res.0.8  letter.idents groups  RNA_snn_res.1
obs_id
AAGCGACTTTGACG           0       443.0            77                1              1     g1              1
AATGCGTGGACGGA           0       389.0            73                1              1     g1              1

>>> soma.var.dim_select([b'AKR1C3', b'MYL9'])
        vst.mean  vst.variance  vst.variance.expected  vst.variance.standardized  vst.variable
var_id
AKR1C3    0.2625      1.132753               0.553424                   2.021191             1
MYL9      0.2875      1.321361               0.614125                   1.938228             1

>>> soma.obsm.keys()
['X_tsne', 'X_pca']

>>> soma.obsm['X_tsne'].dim_select([b'AAGCGACTTTGACG', b'AATGCGTGGACGGA'])
           obs_id   X_tsne_1   X_tsne_2
0  AAGCGACTTTGACG -31.919623  -0.864304
1  AATGCGTGGACGGA  -1.839384  22.204006

>>> soma.obsm['X_pca'].dim_select([b'AAGCGACTTTGACG', b'AATGCGTGGACGGA'])
           obs_id   X_pca_1   X_pca_2   X_pca_3   X_pca_4  ...  X_pca_15  X_pca_16  X_pca_17  X_pca_18  X_pca_19
0  AAGCGACTTTGACG -1.380380  1.284101  1.918055  1.247647  ...  0.117602  0.169661  0.130810 -0.041855  0.027550
1  AATGCGTGGACGGA -1.494413  1.783583  0.661433 -0.584200  ... -0.163601 -0.082049 -0.145453 -0.045108  0.055206

[2 rows x 20 columns]

>>> soma.X.data.dim_select([b'AAGCGACTTTGACG'], [b'AKR1C3'])
           obs_id  var_id     value
0  AAGCGACTTTGACG  AKR1C3 -0.325888

>>> soma.X.data.dim_select([b'AAGCGACTTTGACG'], None)
            obs_id         var_id     value
0   AAGCGACTTTGACG         AKR1C3 -0.325888
1   AAGCGACTTTGACG            CA2 -0.346938
2   AAGCGACTTTGACG           CD1C -0.253005
3   AAGCGACTTTGACG           GNLY -0.528670
4   AAGCGACTTTGACG       HLA-DPB1  0.412720
5   AAGCGACTTTGACG       HLA-DQA1 -0.671707
6   AAGCGACTTTGACG          IGLL5 -0.193495
7   AAGCGACTTTGACG           MYL9 -0.301435
8   AAGCGACTTTGACG          PARVB -0.346179
9   AAGCGACTTTGACG            PF4 -0.370707
10  AAGCGACTTTGACG         PGRMC1 -0.329147
11  AAGCGACTTTGACG           PPBP -0.422769
12  AAGCGACTTTGACG  RP11-290F20.3  2.782381
13  AAGCGACTTTGACG          RUFY1 -0.409833
14  AAGCGACTTTGACG         S100A8  1.007046
15  AAGCGACTTTGACG         S100A9  1.127458
16  AAGCGACTTTGACG           SDPR -0.388176
17  AAGCGACTTTGACG         TREML1 -0.328625
18  AAGCGACTTTGACG          TUBB1 -0.350375
19  AAGCGACTTTGACG          VDAC3 -0.524551

>>> soma.X.data.dim_select(None, [b'AKR1C3'])
            obs_id  var_id     value
0   AAATTCGAATCACG  AKR1C3 -0.325888
1   AAGCAAGAGCTTAG  AKR1C3 -0.325888
2   AAGCGACTTTGACG  AKR1C3 -0.325888
3   AATGCGTGGACGGA  AKR1C3 -0.325888
4   AATGTTGACAGTCA  AKR1C3 -0.325888
..             ...     ...       ...
75  TTACGTACGTTCAG  AKR1C3 -0.325888
76  TTGAGGACTACGCA  AKR1C3 -0.325888
77  TTGCATTGAGCTAC  AKR1C3 -0.325888
78  TTGGTACTGAATCC  AKR1C3 -0.325888
79  TTTAGCTGTACTCT  AKR1C3 -0.325888

[80 rows x 3 columns]
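
The `X` examples above suggest that passing `None` for a dimension leaves that dimension unconstrained, while a list of IDs restricts the result to matching cells. A minimal pure-Python sketch of that selection rule follows. It is not the tiledbsc implementation: the function name `dim_select_sketch` and the list-of-tuples layout are hypothetical, chosen only to illustrate the semantics, and the three sample cells are copied from the outputs above.

```python
from typing import List, Optional, Sequence, Tuple

# One sparse cell: (obs_id, var_id, value). Hypothetical layout for illustration.
Cell = Tuple[bytes, bytes, float]


def dim_select_sketch(
    cells: Sequence[Cell],
    obs_ids: Optional[Sequence[bytes]] = None,
    var_ids: Optional[Sequence[bytes]] = None,
) -> List[Cell]:
    """Return cells whose IDs match; None means 'do not filter this dimension'."""
    obs_filter = None if obs_ids is None else set(obs_ids)
    var_filter = None if var_ids is None else set(var_ids)
    return [
        (obs_id, var_id, value)
        for (obs_id, var_id, value) in cells
        if (obs_filter is None or obs_id in obs_filter)
        and (var_filter is None or var_id in var_filter)
    ]


# Three cells taken from the example outputs in this PR description.
cells: List[Cell] = [
    (b"AAGCGACTTTGACG", b"AKR1C3", -0.325888),
    (b"AAGCGACTTTGACG", b"CA2", -0.346938),
    (b"AATGCGTGGACGGA", b"AKR1C3", -0.325888),
]

both = dim_select_sketch(cells, [b"AAGCGACTTTGACG"], [b"AKR1C3"])  # one cell
one_obs = dim_select_sketch(cells, [b"AAGCGACTTTGACG"], None)      # all vars for one obs
one_var = dim_select_sketch(cells, None, [b"AKR1C3"])              # all obs for one var
```

A real TileDB-backed array would push these ID lists down into the storage engine's query rather than scan every cell; the linear scan here is only meant to make the `None` convention concrete.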

@johnkerl johnkerl force-pushed the kerl/dimension-select branch 3 times, most recently from 55409de to a42c345 Compare May 19, 2022 18:58
@johnkerl johnkerl force-pushed the kerl/dimension-select branch from a42c345 to ef8ed02 Compare May 19, 2022 18:59
@johnkerl johnkerl requested review from aaronwolen and Shelnutt2 May 19, 2022 19:26
@johnkerl johnkerl force-pushed the kerl/dimension-select branch 2 times, most recently from 4605fbf to 40c012b Compare May 19, 2022 19:37
@johnkerl johnkerl force-pushed the kerl/dimension-select branch from 40c012b to 62b5068 Compare May 19, 2022 19:37
@johnkerl johnkerl marked this pull request as ready for review May 19, 2022 19:37
@johnkerl johnkerl changed the title Implement SOMA_level dimension-slicing [WIP] Implement SOMA_level dimension-slicing May 19, 2022
@johnkerl johnkerl changed the title Implement SOMA_level dimension-slicing Implement SOMA-level dimension-slicing May 19, 2022
@johnkerl johnkerl merged commit be67297 into main May 20, 2022
@johnkerl johnkerl mentioned this pull request May 24, 2022
61 tasks
@johnkerl johnkerl deleted the kerl/dimension-select branch June 1, 2022 13:25