Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Future of the DataStore API #330

Closed
3 tasks done
aoberoi opened this issue Mar 30, 2017 · 7 comments
Closed
3 tasks done

Future of the DataStore API #330

aoberoi opened this issue Mar 30, 2017 · 7 comments
Labels
discussion M-T: An issue where more input is needed to reach a decision

Comments

@aoberoi
Copy link
Contributor

aoberoi commented Mar 30, 2017

Description

The dataStore API in this SDK is great at aggregating a model that applications can count on for state, as if the application was connected as any Slack client would be. It's super convenient.

But it has some problems.

  1. The SlackDataStore interface is cumbersome to implement. There's very little to suggest that any developer use anything besides the default SlackMemoryDataStore implementation that ships with the SDK (an exception: Support for a data store that is implemented via promises #246). Potentially, very simple CRUD operations are all that should be needed.

  2. It's only useful for apps that use an RTM client. In general (but not 100% of the time) apps would work better, faster, and more secure if they used a combination of the Events API and the Web API. So essentially the dataStore is built for a very small and specific population of this SDK's users, but all apps using this SDK are paying the cost in terms of memory and performance.

  3. The SlackDataStore interface is not asynchronous. This means either the data must be in memory or you must block the node event loop to use it. Both bad options for apps that need scale. In Support for a data store that is implemented via promises #246 a developer asks for a Promise-based async interface.

  4. On Enterprise workspaces and workspaces with shared channels, you cannot depend on the dataStore to know about all users. This is because the client is connected to one workspace at a time, and it might see events in shared channels or on files that do not originate from users in that workspace (see: Hubot Slack cannot connect due to "Cannot set property 'room' of undefined" hubot-slack#404).

  5. On Enterprise workspaces and workspaces with shared channels, its difficult (maybe impossible) to de-duplicate events to consistently update the state in the data store. Slack has added an event_id property (when they are delivered using the Events API) that could be used to determine duplication, but as stated in point 2, the data store is mostly used with RTM. Besides, the Events API is already deduplicated.

This ticket is meant to be a home to discussion around what should be done about this next. I've got a few options I'd like to lay out now:

Plan A Deprecate the current dataStore API, and replace it with something better in the next major version of this SDK. We could get an easier to implement interface, with async operations out of this. At a minimum we should provide a migration guide for current users to help them move to the next major version.

Plan B Make some larger sweeping architectural changes that break apart the Web client, RTM client, and the Incoming Webhook parts of this SDK into more modular packages. Leave this package around as-is. Provide a new umbrella package that helps you avoid all the wiring up of the packages as an 80/20 solution with sane defaults. Try to get the community to migrate to the new umbrella package and/or the modular packages its composed from.

IMHO, I'd like to see Plan B happen, because I think it's more aligned with the Node Way™. We already have a decent start with the Events API module. I do want to acknowledge that it is more development work and it's no easy feat to move the community from this package to a new alternative.

  • I've read and understood the Contributing guidelines and have done my best effort to follow them.
  • I've read and agree to the Code of Conduct.
  • I've searched for any related issues and avoided creating a duplicate issue.
@aoberoi aoberoi added the question M-T: User needs support to use the project label Mar 30, 2017
@clavin
Copy link
Contributor

clavin commented Mar 30, 2017

As someone who's been around through all of the JS era (RIP Leah) of this package and a little bit of the CoffeeScript era, I'm going to say Plan B® is the way to go.


Maybe its the overdosing of babel (presets) and webpack (loaders/plugins) I'm accustomed to, but I very much favor that second option for a few reasons:

  1. If you only need the RTM package or the web + events packages, you can get just that. No need for the other 3/5ths of the pie taking up storage space in your poor, already-crowded node_modules.
  2. Most of the code here doesn't really warrant being in the same repository. Sure, the RTM and web clients share the inheritance of a lovely BaseAPIClient class, but there's no reason behind it, especially after you consider both clients only use one method from it (_makeAPICall) and, rather abstractly, one property/option (transport).

I do see one issue, though: splitting up a partially-inactive repository is like chopping up a partially-dying person; it'll just kill it. If you're only planning on working on the datastore implementation (which, from what you said, isn't necessarily the case: performance concerns), then plan B might be more work for both you and users than is warranted.

Still, in the long run B is the fad to go with, and I hope that it's what happens.

(side note/plea: please check usage statistics and/or consider the feasibility of migrating to es6. please.)


Other food for thought relating to Plan B®:

  • What do we call and version it?
    • If we keep the name, what do we do with the versioning?
    • If we change the name, should we reset the version? (yes. the answer is yes.)
  • What is being split? Obviously the different clients are separated, but where do the shared models and helpers go?
  • Where does documentation live? Per-package or in its own package (for general concepts like choosing packages/clients)?

@aoberoi
Copy link
Contributor Author

aoberoi commented Apr 7, 2017

@everesty thanks so much for your thorough and well structured thoughts. i tend to agree with what you've said.

my biggest fear with Plan B is that we will build it and they won't come. but i'm optimistic because users get so many advantages beyond getting a better Data Store API from Plan B. also, since Slack is involved here, there's a decent chance that we can get the community to migrate: we can add links to the official documentation site (api.slack.com) to drive new users to the preferred package, we can hopefully get more people to write tutorials and blog posts that use those modules, and maybe since the lower level modules will mostly be utilities and opinion-less, we can get 3rd party tools to build on top and depend on them.

this discussion is definitely still open to more opinions and i can be persuaded otherwise.


my thoughts on some of the points you raised:

  • migrating to ES6: the @slack/events-api module is already written in ES6. i already live in an ES6 world, so i'm not going to start new packages that don't hit that minimum bar. that being said, until node core transitions the ES Modules EPS to ACCEPTED status, and even for some time after, we'll need to continue to transpile down to ES5. i'd like to adopt the minimum target of node 4.x across all newer modules, and if your app has to run on something less, then this module might be the answer for you. there's actually one more build change i'm considering and that's authoring in TypeScript. its become apparent to me that the community wants types and doing them out of band of the authoring of the package is a pain from a maintenance point of view, but we might need to save that for another discussion.
  • what do we call it? how about @slack/sdk? also, @brianleroux owns the slack package on npm and has previously offered for us to use it for another purpose, so that could be possible. if we use a new name, we'll start the version number over.
  • what's being split? here's a hypothetical list:
    • RTM client (potentially use slack implementation, which i think is great)
    • Web client (potentially use slack implementation, which i think is great)
    • Events API
    • Interactive Messages (TBD, but similar to Events API)
    • Incoming Webhook (for migratability, could prob just use code from here)
    • Slash Command + Outgoing Webhook (prob a ~30 line helper, plugs in like Events API)
    • OAuth helpers (@aoberoi/passport-slack can be cleaned up and reused)
    • Slack Store (inspired by express-session with pluggable backends and options to turn on/off certain models, behave as a team cache versus an org cache)
    • Message helpers (similar to a templating library specific to outputting JSON suitable for Slack messages)
  • where do docs live? i think each module needs reference docs and a good README at a minimum. the umbrella module should get a little more in depth treatment with guides for different kinds of usage (similar to the gh-pages site for this module). the umbrella module could also have a listing of its individual deps and a short description about what they are used for.

@aoberoi
Copy link
Contributor Author

aoberoi commented Apr 7, 2017

maybe instead of an umbrella package, there's a generator for yeoman. or, maybe its something more like create-react-app.

@brianleroux
Copy link

cc'ing @ryanblock for context and further deep thots

Going to spend some time collecting some (not too deep) thoughts in this conversation myself this weekend. Quickly tho…

Definitely open to, and would like to encourage, the basis of https://github.com/smallwins/slack for the lower level layers. Its already built in a modular fashion enabling opt-in to specific bits and that should probably continue. Having some extra eyeballs would be great tho we have found it really easy to maintain since its 99% generated code. (I've had a long desire to make that 100% by way of a bot just because it'd be fucking awesome and meta.)

WRT to esmodules. I regret supporting them but feel that lib should continue to do so. I hesitate to introduce a breaking change for aesthetics which in itself is ironic because esmodules of the future will break current transpiled semantics. Very unfortunate for all of JavaScript. Good news is require works now with billions of packages, and will continue to do so for the foreseeable future so transpile to ES5 really isn't that bad of a thing. Its boring. And boring is good when it comes to module systems! ;) Besides all this banter there is no minification tooling for esnext yet so a transpile is required to ES5 for any app serious about performance (which…should be all of them).

That leads me to another high level goal I'd like to propose: small payload!

We are based on AWS Lambda and keeping our cold execution startup times reasonable means we need to keep the JS lean. I had to rip out request a while back because it bloated our lib up to 10mb, making our cold start on lambda 3s, making Slack Events API retry our endpoint, making our bot respond to everything three times. Not awesome. Replacing it was easy though and now the lib has only two dependencies. Super tiny: and it would be awesome to keep it that way; not just for Lambda but browser based apps too.

More soon! But have a rad weekend all. 💕

@clavin
Copy link
Contributor

clavin commented May 18, 2017

one more build change i'm considering and that's authoring in TypeScript

I was reading the other discussion on TS and I just remembered that this issue exists.

I honestly also agree with making a change like this--in the scope of this issue specifically. If we're planning on transpiling down to ES5 (because of writing in ES6), we may as well throw in TypeScript to help ease development; both add virtually the same build step. Even more, when working on a large number of packages, I feel TypeScript would help with avoiding type issues between the packages.

This would also be consistent with the fact that Slack uses TypeScript internally, regardless of the fact that this package isn't (and likely won't be) used for the desktop client.


One other issue that comes to mind is where the packages'/modules' sources will be. Will each module get its own repository on this account? If so, does that not somewhat undermine the other projects this organization has to show? Or is that not really important? I'm leaning to separate repositories, especially since pinning repositories is a feature that exists, but I feel that having all modules in one repository may have its own other benefits, and I feel it should be thought about.

Also, assuming Plan B® happens, how will initial development work? I believe it's important to have community input and help while working on the overwhelming task that is Plan B®, so my question is more along the lines of where will the development occur?

My best guess and recommendation is to host a repository/repositories on this organization (@slackapi), or wherever the modules will end up living. I don't really see a reason behind repeating the same approach that Leah did when working on this SDK originally in which the repository, while in infancy, was on her account, and was eventually migrated here.

@aoberoi
Copy link
Contributor Author

aoberoi commented Oct 3, 2017

i wanted to leave some updates here regarding how some of these thoughts have evolved.

migrating the community: i now believe this is a larger challenge than initially anticipated. i'd like to use a stronger hand to guide users to the latest and greatest. i still prefer Plan B, but when the umbrella package is ready, i'd like to officially deprecate @slack/client (so that npm warns you on install).

repository migration : i'd like to reuse this GitHub repository for the umbrella package so we don't end up losing the 100+ watchers and collaborators. the npm name would change, but the github repo would stay the same, we would just leave the old sources in a branch that isn't master. this way all the links on the internet (first party and third party) will bring users to the latest and greatest.

modular repositories: there's a non-trivial amount of overhead in the tooling that comes with each new repository. we're in agreement that transpiling (with TypeScript) will be done. there is some efficiency we can gain by only doing this once and publishing the individual modular npm packages and the umbrella package from the same github repo.


now to bring back to the initial topic: we should begin designing the Slack Store module mentioned above. here are some questions:

  • how do we deal with relational and key-value type backends with one interface? should it actually be two backend interfaces?
  • which backends are most important to address at first? initial thoughts: mysql, postgresql, redis, mongodb.

@aoberoi aoberoi added discussion M-T: An issue where more input is needed to reach a decision and removed question M-T: User needs support to use the project labels Oct 3, 2017
@aoberoi aoberoi changed the title Problems with the DataStore API - Do we Deprecate, Remove, etc.? Future of the DataStore API Oct 3, 2017
@aoberoi aoberoi mentioned this issue Oct 27, 2017
6 tasks
@aoberoi
Copy link
Contributor Author

aoberoi commented Mar 10, 2018

it seems appropriate to close this issue and redirect any conversation about a new API into #410.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion M-T: An issue where more input is needed to reach a decision
Projects
None yet
Development

No branches or pull requests

3 participants