Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adopt pagination standard with a default page size for methods which support it #343

Closed
3 tasks done
hparadiz opened this issue May 11, 2017 · 4 comments
Closed
3 tasks done
Assignees
Labels
enhancement M-T: A feature request for new functionality good first issue semver:minor

Comments

@hparadiz
Copy link

  • I've read and understood the Contributing guidelines and have done my best effort to follow them.
  • I've read and agree to the Code of Conduct.
  • I've searched for any related issues and avoided creating a duplicate issue.

Description

Large slack organizations with many thousands of users cause the slack API to throw an HTTP 500 error when running users.list through the WebAPI.
The API runs with no options by default and other slack OSS projects use the api call with default settings.

Reproducible in:

hubot-slack version: latest
OS version(s): any
Device(s): any

Steps to reproduce:

  1. Run users.list against an organization with thousands of users
  2. See users.list throw a 500 http error
  3. See users.list timeout without handling the error

Expected result:

e.g. Users.list to add a limit field to it's options by default and detect the output containing offset in the json return. Then hit the api again until the user list does not return offset signifying that you have reached the last page.

Actual result:

e.g. Does not handle http 500 and times out at 300 seconds

Notes

I reported this issue in the hubot-slack repo but after reading the source code I feel that the fix is more appropriate in this repo instead.

hubot-slack loads the entire users list into memory on load which seems insane to me when our organization has thousands of users and my bot is only in one or two channels with maybe 20 users. A user list by channel that loads on channel join would be way better.

@aoberoi
Copy link
Contributor

aoberoi commented May 17, 2017

thanks for the issue report @Akujin. i agree that 500 errors on large teams is quite unacceptable. unfortunately, without a pagination solution as you recommended at the API Platform level, there's nothing we can do in the SDK to mitigate this situation. but be aware that we are working on a pagination solution at Slack so we can prevent these types of issues from happening.

i also agree that the fix should be made here, and not at the hubot-slack adapter. i'll keep this issue open until we have a pagination solution and as a place to update users about where the implementation is.

as far as the data list being loaded into memory, there are two issues here:

  1. this SDK uses a concept of a DataStore, of which the default implementation is the MemoryDataStore. this abstraction has become less and less useful, especially for those using large teams. we've started a discussion about this issue here in Future of the DataStore API #330. you can get all the user data out of memory by simply setting the dataStore option to null. this is available to you in hubot-slack too by setting the SLACK_CLIENT_OPTIONS environment variable.
  2. hubot itself offers on a "brain" abstraction and the users are being stored there too. we used to opt out of populating the users into the brain, but that meant that users of the hubot-slack adapter ended up with a deficient hubot and they couldn't take advantage of all its features. recently we made a release that restored that functionality. you can get all that user data out of memory by using one of hubot's brain persistence adapters (there are a few here: https://www.npmjs.com/browse/keyword/hubot-brain, and hubot-redis-brain is quite popular too).

@aoberoi aoberoi changed the title users.list hangs when server response is http 500 Adopt fourthcoming pagination to prevent web client requests from timing out (e.g. user.list) May 17, 2017
@hparadiz
Copy link
Author

hparadiz commented Jun 2, 2017

Thank you for the detailed response.

So when I submitted a support ticket initially I was actually told of undocumented API variables for users.list.

  1. Pass a limit argument to restrict the number of entries returned. This should just be a number — "1000" is probably a good one to start with.
  2. When a subsequent page exists, we will return an offset
    field in the response. You can then call the next set of results with the
    offset from the previous response to get to the next page. If no offset is provided, you have reached the last page. This way, you can recursively call the API for as long as an
    offset exists in our response.

I know this isn't true pagination since you can't just call page N but it would allow users.list to handle large orgs and at least in this use case you're trying to pull the entire list anyway. And it can be done now with no changes to the slack api.


I'm a little perplexed by SLACK_CLIENT_OPTIONS. I can't find where it reads process.env.SLACK_CLIENT_OPTIONS.

I tried searching for it in https://github.com/slackapi/hubot-slack/ but only found it as a regular variable and not an environmental variable.

Right now I'm just commenting out these three lines of code to prevent users.list from being called
https://github.com/slackapi/hubot-slack/blob/master/src/bot.coffee#L56-L58

I want to avoid forking hubot-slack just to edit SLACK_CLIENT_OPTIONS so if there's some example usage of SLACK_CLIENT_OPTIONS as an environmental variable that'd be great.

@aoberoi
Copy link
Contributor

aoberoi commented Jun 19, 2017

@hparadiz that's my fault actually, i always thought it was read from an environment variable and now i see that it's not. i've created slackapi/hubot-slack#421 to track getting this implemented. if you have the ability and would like to help, i'd be happy to assign the task to you. otherwise it might take a few weeks to get this out given current priorities.

the undocumented pagination technique is not something i'm willing to implement in the SDK because it will break and the new pagination technique is right around the corner. in the meantime, i'm thinking the following changes would comprise support for this feature:

  • change the implementation of paginated endpoints to use a reasonable default page size (e.g. 1000) and only invoke the callback or resolve their promise once all pages are iterated.
  • if pagination options are passed into the individual method calls (e.g. { limit: 20 }), then invoke the callback or resolve the promise after the first call. this should allow the user to iterate with custom logic.

feedback on this feature is welcome!

@aoberoi aoberoi changed the title Adopt fourthcoming pagination to prevent web client requests from timing out (e.g. user.list) Adopt pagination standard with a default page size for methods which support it Sep 12, 2017
@aoberoi
Copy link
Contributor

aoberoi commented Sep 12, 2017

This has become more relevant since the pagination spec is now standard across API methods. See https://slack.engineering/evolving-api-pagination-at-slack-1c1f644f8e12.

Here are a few of the methods that could benefit:

  • conversations.history
  • conversations.list
  • conversations.members
  • conversations.replies

@aoberoi aoberoi added enhancement M-T: A feature request for new functionality good first issue and removed feature labels Sep 12, 2017
@aoberoi aoberoi added this to the v3.15.0 milestone Oct 26, 2017
@aoberoi aoberoi modified the milestones: v3.15.0, v4.0.0 Dec 18, 2017
@aoberoi aoberoi removed this from the v4.0.0 milestone Mar 8, 2018
@aoberoi aoberoi self-assigned this Jun 1, 2018
@aoberoi aoberoi mentioned this issue Jul 25, 2018
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement M-T: A feature request for new functionality good first issue semver:minor
Projects
None yet
Development

No branches or pull requests

2 participants