[r] Port blockwise sparse iterator from Python to R #1853

Closed
johnkerl opened this issue Nov 3, 2023 · 1 comment · Fixed by #2152

Comments

johnkerl commented Nov 3, 2023

Port #1792

@johnkerl

@mojaveazure @eddelbuettel my understanding is this should be sequenced before #1819 -- my apologies for not having articulated this sooner

mojaveazure added a commit that referenced this issue Feb 17, 2024
Initial support for blockwise reader/iteration
New classes:
- `CoordsStrider`: new class to iterate through coordinates, similar to Python's `_coords_strider`
- `SOMASparseNDArrayReadBase`: base class for sparse array reads
- `SOMASparseNDArrayBlockwiseRead`: new reader class for blockwise iterated reads

New SOMA methods:
- `SOMASparseNDArrayRead$blockwise()`: perform a blockwise read

addresses #1853
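
For readers unfamiliar with the idea of a coordinate strider, here is a minimal Python sketch of the general technique: walking a set of coordinates in fixed-size chunks. The function names `stride_coords` and `stride_range` are hypothetical and chosen for illustration only; this is not the signature or implementation of tiledbsoma's `_coords_strider` or of the R `CoordsStrider` class.

```python
from typing import Iterator, List, Sequence


def stride_coords(coords: Sequence[int], stride: int) -> Iterator[List[int]]:
    """Yield successive chunks of at most `stride` explicit coordinates."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    for offset in range(0, len(coords), stride):
        yield list(coords[offset:offset + stride])


def stride_range(start: int, end: int, stride: int) -> Iterator[range]:
    """Yield contiguous sub-ranges covering the inclusive range [start, end]."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    for lo in range(start, end + 1, stride):
        # Clamp the final chunk so it never runs past `end`.
        yield range(lo, min(lo + stride, end + 1))
```

Both helpers return the last chunk short when the stride does not divide the coordinate count evenly, which is the behaviour a blockwise reader needs in order to cover every coordinate exactly once.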
mojaveazure added a commit that referenced this issue Feb 19, 2024
Initial support for blockwise reader/iteration
mojaveazure added a commit that referenced this issue Mar 1, 2024
Initial support for blockwise reader/iteration
mojaveazure added a commit that referenced this issue Mar 7, 2024
Initial support for blockwise reader/iteration
mojaveazure added a commit that referenced this issue Mar 28, 2024
Initial support for blockwise reader/iteration
mojaveazure added a commit that referenced this issue Apr 3, 2024
Initial support for blockwise reader/iteration
mojaveazure added a commit that referenced this issue Apr 4, 2024
* [r] [WIP] Blockwise Reader
Initial support for blockwise reader/iteration
New classes:
- `CoordsStrider`: new class to iterate through coordinates, similar to Python's `_coords_strider`
- `SOMASparseNDArrayReadBase`: base class for sparse array reads
- `SOMASparseNDArrayBlockwiseRead`: new reader class for blockwise iterated reads

New SOMA methods:
- `SOMASparseNDArrayRead$blockwise()`: perform a blockwise read

addresses #1853

* Plumb through iterated coordinate strider

* Better support for `bit64::integer64`

* Add test for iterated coords

* Fix creating blockwise reader from `SOMASparseNDArrayRead$blockwise()`

* Update docs

* Better handling of resetting the stride of a `CoordsStrider`
New strides are always cast to `integer64`
`CoordsStrider` in a `SOMASparseNDArrayRead` or
`SOMASparseNDArrayBlockwiseRead` can have their strides reset in-place

* Correct stride display via format() inside cat()

* Propagate stride size

* Expose reset() and set_dim_points() for use by blockwise()

* Simplify `ReadIter$read_next()`
Add private helper method to throw read complete warnings

* Add blockwise iterators
New classes:
- `BlockwiseReadIterBase`: base blockwise iterator, handles coordinate
  fetching and `sr` resetting
- `BlockwiseTableReadIter`: blockwise iterator to return arrow tables
- `BlockwiseSparseReadIter`: blockwise iterator to return sparse
  matrices

* Plumb blockwise iterators into blockwise reader
Update docs

* Expose SOMAArray object resetter and 're-constrainer' for blockwise

* Iterating, WIP

* Fix warning with `warningCondition()`

* Fix typo

* Add new utility function for `TableIter$concat()` and `BlockwiseTableIter$concat()`
Plumb through `BlockwiseTableIter$concat()` and `BlockwiseTableIter$private$soma_reader_transform()`
Slight rejiggering of `read_next()` to avoid multiple `$read_complete()` checks
Improve `BlockwiseReadIterBase$read_next()` checks

* Use new utility function shared w/ blockwise table iter

* Update docs

* Fix typo
Amazingly, I can't spell 'array' properly 🤦

* Move `$reset()` and `$set_dim_points()` to `BlockwiseReadIterBase$private`
Update docs

* Add `$length()` method for `CoordsStrider`

* sparse (via new length()) and concat for blockwise read iters

* Move iterators and itertools to Suggests
Delay registration of `nextElem.CoordsStrider()` and `hasNext.CoordsStrider()`

* New tests

* Update docs

* Switch to snake-case for `CoordsStrider$has_next()` and `CoordsStrider$next_element()`

* Make tests more explicit with `expect_s3_class()`/`expect_s4_class()`

* Clean up `soma_array_to_sparse_matrix_concat()`
Have `SparseReadIter$concat()` use new helper function

* Fix typo

* Add error messages to `stopifnot()` calls

* Correct typo spotted by @mlin

* Fix rebase errors

* Update changelog
Bump develop version

---------

Co-authored-by: Dirk Eddelbuettel
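
The commit message above mentions a strider with `has_next()`, `next_element()`, and `length()` methods, plus blockwise iterators that return one block of a sparse array at a time. The Python sketch below illustrates how those pieces could fit together. It is an assumption-laden illustration only: the `Strider` class, the `blockwise` function, and the COO triple layout are hypothetical and do not reproduce the R classes or the tiledbsoma API.

```python
from typing import Iterator, List, Tuple

Triple = Tuple[int, int, float]  # hypothetical (row, column, value) COO entry


class Strider:
    """Hypothetical strider over the inclusive coordinate range [start, end]."""

    def __init__(self, start: int, end: int, stride: int) -> None:
        if stride < 1 or end < start:
            raise ValueError("need stride >= 1 and end >= start")
        self.start, self.end, self.stride = start, end, stride
        self._next = start

    def length(self) -> int:
        """Total number of chunks, computed with integer ceiling division."""
        n_coords = self.end - self.start + 1
        return (n_coords + self.stride - 1) // self.stride

    def has_next(self) -> bool:
        return self._next <= self.end

    def next_element(self) -> range:
        if not self.has_next():
            raise RuntimeError("strider exhausted")
        lo = self._next
        hi = min(lo + self.stride, self.end + 1)  # clamp the final chunk
        self._next = hi
        return range(lo, hi)


def blockwise(triples: List[Triple], n_rows: int, size: int) -> Iterator[List[Triple]]:
    """Yield the stored entries of each block of `size` rows, in row-block order."""
    strider = Strider(0, n_rows - 1, size)
    while strider.has_next():
        rows = strider.next_element()
        yield [t for t in triples if t[0] in rows]
```

Knowing the chunk count up front via `length()` lets a caller preallocate or report progress. The linear scan over `triples` for every block is for clarity only; a real reader would push the row range down to the storage layer rather than filter in memory.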