Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP/Proposal] Add Stream/Iterator based versions of methods with paginated results #166

Merged
merged 26 commits into from
Apr 19, 2021

Conversation

icewind1991
Copy link
Contributor

@icewind1991 icewind1991 commented Dec 16, 2020

Note, this currently only implements the streaming version for one method to avoid wasting work until we're certain of the design.

Description

Adds an alternative version of methods that return paginated results, providing instead an Iterator or Stream (for sync/async respectively)

Motivation and Context

Manually iterating over paginated results is pretty annoying to do, by providing an streaming alternative we can make this a lot easier for consumers of the api.

Dependencies

New dependencies:

  • for async:
    • async-stream for creating the stream from async method calls
    • futures provides the Stream trait

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Added examples using the new methods and running them

@marioortizmanero
Copy link
Collaborator

Hi! Thanks a lot for the contribution.

I definitely like the idea. It's a big step towards #124. Here are a few questions:

  • We've been trying to avoid the itertools dependency for a while to try to reduce the number of dependencies. Would it be acceptable to not use it? At this point it might be worth importing, but I'd rather not.
  • AFAIK async-stream (or futures-util, not sure), just like async-trait, is something that may be replaced in the future by the standard library, right? All I've found about this is a blog post from withoutboats: https://boats.gitlab.io/blog/post/for-await-i/ and the GAT tracking issue: 🔬 Tracking issue for generic associated types (GAT) rust-lang/rust#44265. I would prefer a more native approach if possible.
  • Shouldn't the 50 used to continue the pagination be more flexible? Some endpoints return more results than other so we should be able to tweak it.
  • Should we straight up remove the manual methods for paginating? IMO there's no point in using them after this.

@icewind1991
Copy link
Contributor Author

  • removed itertools
  • replaced futures-util with futures since that is the actual source of the Stream trait, futures-util remains a dev-dependency for using the stream in the example
  • I don't think GAT makes async-stream irrelevant, afaik, the alternative would be to manually implement Stream::poll_next, either way since the api only specifies impl Stream the specific stream implementation can be replaced later if a better approach is found without breaking compatibility.
  • removing the paged api makes sense to me, only reason to keep them would be for backwards compatibility

@kstep
Copy link
Contributor

kstep commented Dec 17, 2020

I'd prefer to have paginated API reachable. Sometimes it's good to have more low-level control over API calls than have an abstraction. For example, I want to keep a local database of track metadata and update it in batches from time to time. It's easier to do it if I'm able to ask API for a single page of tracks, the exact slice I want to update locally. With streaming API only it would be a little inconvenient and hacky.

@icewind1991
Copy link
Contributor Author

  • Switched to using a trait "alias" to prevent having to write duplicate async/sync versions of the streaming methods

@ramsayleung
Copy link
Owner

It's a great idea to have some new ergonomics API, but I have it's reasonable to keep the paginated API, since some developers use rspotify to develop Web Application, they probably need to store the track and album information into database, and the paginated information is like an anchor for those Web Application to tell them where it's start point.

Maybe, we could add a new feature named streamify(or other names) to enable stream/iterator based version of methods with paginated result. We could keep them both, and offer different choices for different developers.

@marioortizmanero
Copy link
Collaborator

marioortizmanero commented Dec 28, 2020

I agree. Maybe this could be feature-gated. That way we can avoid the _stream prefix, kinda like what I want to do with the cache files at #135. How about a streams feature, disabled by default?

@marioortizmanero
Copy link
Collaborator

marioortizmanero commented Mar 9, 2021

I've updated this PR to include the latest changes in master. We should work on merging it before continuing with others to avoid having stale PRs.

Also, I think the module/feature name should be pagination, if anything. Otherwise streams is confusing because that's the name of the async implementation. pagination is broader and works for both streams and iterators. (I think?). If you prefer streams let me know and I'll change it back.

So, what's missing:

  • Figure out how to include this in the library:
    • Toggleable feature
    • Feature that enables the methods with _stream/_paginated/_auto suffixes
    • Same as the previous, but without a feature
  • Implement the rest of the paginated endpoints (should be easy, and I can take care of that).

In my opinion, we should would only about the user experience. Using a feature for pagination doesn't really matter for compilation time/size; the important part is that they're easy to use. And we can't rely on default features either because users are expected to disable them when configuring a different client. I suggest we just create methods with _auto suffixes. Or perhaps have automatically paginated endpoints be the default and add a _manual suffix to the originals, which I think is a saner default and the best choice.


I also tried to generalize the manual/iterator/stream behavior with types but we need GATs for that, so it's currently impossible. But it would have been great to just pass a Paginator trait object to the endpoint, which can be ManualPaginator, IterPaginator or StreamPaginator. The thing is that they would return ManualWrapper<T>, Iterator<T> and Stream<T>, which is currently impossible with stable Rust. Here's how it would work with nightly Rust, and you can see how switching to stable throws an error.

Maybe I can open an issue to implement that in the future, once we have GATs? In that case we wouldn't even need the _auto suffixes and it would be quite nice to use and configure. For example:

endpoint(ManualPaginator::new(50));  // Returns a Page over the return type of `endpoint`
endpoint(ManualPaginator::with_offset(50, 150));  // Same, but it can be configured to start at 150
endpoint(IterPaginator::new());  // Returns an iterator over the return type of `endpoint`
endpoint(StreamPaginator::new());  // Returns a stream over the return type of `endpoint`

@marioortizmanero marioortizmanero mentioned this pull request Apr 13, 2021
4 tasks
src/pagination/iter.rs Outdated Show resolved Hide resolved
src/pagination/stream.rs Show resolved Hide resolved
Cargo.toml Show resolved Hide resolved
src/client.rs Outdated
/// library.
///
/// [Reference](https://developer.spotify.com/web-api/get-users-saved-tracks/)
pub fn current_user_saved_tracks_stream(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this PR merged, what's the next step? Planing to add a new method with _stream suffix for each endpoint except current_user_saved_tracks?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I was explaining and questioning in this comment - #166 (comment)

Seeing as there isn't anyone else giving their opinion I would just come to an agreement between you and I and just use that.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer _stream suffix to _auto suffix, since the latter one is not that self-explanatory as _stream suffix, and stream is defined as a sequence of elements supporting sequential and parallel aggregate operations in a we-all-know way, such as Java's Stream and Rust's Future stream.

And let me know what you think now :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but the thing is that _stream might be confusing, as it returns an Iterator when sync mode is enabled, and it returns a Stream when async mode is enabled. _auto works for both, and so does Paginator, IMO. Perhaps there's a better alternative than _auto, but my concern was that we were mixing up the async implementation with the generalized one.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I get you point! It makes sense. Perhaps there's indeed a better choice than _auto. But in this moment, no clue comes to my mind

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I am thinking about is it possible to generate the new method with _stream or _auto suffix by macro, since the code pattern is similar.

This is non-trivial to implement because of generic types in functions. A macro that takes these into account is quite complicated and not worth the effort in this case, unfortunately.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another suffix name came to my mind, _range suffix, inspiration from C++ Range library :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's another idea I added to my comment, which I modified: what if the endpoints are automatically paginated by default, and you can use the _manual-suffixed endpoints in case you want more control over them? It's a saner default, as most times you just want an iterator, and we solve the naming issue pretty easily :)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it makes much sense. We could make the endpoints paginated by default, since the developers want iterator at most times as you point out. My concern is that if we change all manual endpoints to be paginated and then add a _manual suffix for the original one, then we will change the function signature of all endpoints, it does break the compatibility.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol we're completely breaking compatibility with this release anyway haha

@ramsayleung ramsayleung merged commit 3bcad4f into ramsayleung:master Apr 19, 2021
@marioortizmanero
Copy link
Collaborator

marioortizmanero commented Apr 19, 2021

Nooo @ramsayleung don't merge this! It was only a showcase of how it should work. I still had to implement the rest of the endpoints and use _manual. It was marked as a draft and had [WIP/Proposal] in the title

@marioortizmanero
Copy link
Collaborator

Anyway, I'll just open a new PR.

@ramsayleung
Copy link
Owner

ramsayleung commented Apr 19, 2021

Ooooh, I though this PR is a prototype, and it's ready to merge, my fault :(

It seems the works has been moved to this #201 PR, then we need to switch to this PR.

@marioortizmanero
Copy link
Collaborator

Yup, don't worry. I'll finish everything there and after we're done with that we can finally work on the auth rewrite!

@icewind1991 icewind1991 deleted the streaming branch August 27, 2021 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants