Plugin Capabilities #379
Comments
Thinking this through, this is how things currently look: We need to figure out two things:
How to represent capabilities
Capabilities ought to be a property of We do want to search and sort plugins by capabilities, but at the same time, given that 1,000 is a reasonable upper limit on the number of plugins, searching via a linear scan is OK, which means we don't necessarily need an indexed column to filter by capability in a DB query. (a.k.a. capabilities can a An alternative is a new table that links to
How to infer capabilities
Now, let's say we've chosen a model to represent capabilities. Given a plugin, how do we figure out what capabilities it has?
I think long term, (2) makes sense. However, I'm not yet sure how much we're going to use the capabilities information, so I'm in favour of doing (3) now: seeing how much we use this, and eventually, once we have source testing as well, and capabilities seem useful enough to not put developers through the pain of the extra config item, we can integrate them into (2). Thoughts? cc: @yakkomajuri @macobo @Twixes |
Thanks for writing this out @neilkakkar! I need to think about this a bit more before I can give you a strong opinion on approach for inferring capabilities. I am quite averse to asking plugin creators to do this though - I think it'd give a bad impression: "how come you don't know what this can do? you run these functions!" Regarding representing them, I also think one extra column to the plugins table should do it. |
My 2c:
One scheme that might work and gets us closer would be:
pubSub.on('message', async (channel: string, message) => {
    if (channel === server!.PLUGINS_RELOAD_PUBSUB_CHANNEL) {
        status.info('⚡', 'Reloading plugins!')
        await piscina?.runTask({ task: 'calculatePluginCapabilities' })
        await piscina?.broadcastTask({ task: 'reloadPlugins' })
        await scheduleControl?.reloadSchedule()
    }
})
It would be slightly racy in that it would trigger in all plugin server instances but should be correct/good enough to get started. With #165 we can then refactor for this to happen once/outside the main flow completely. |
This is an interesting idea, too! Following up on both your points, not pushing it onto developers makes sense, especially with regard to the extra debugging required when mistakes happen: yeah, that's not great. This made me think of another interesting middle ground: We let things run as is, and on loading the plugin, if there are no capabilities, they get populated. It's implicit on setup, which means it happens inside the The only issue here is with partial upgrades: during migration to this new version, if you don't restart the plugin server, some running plugins won't have their capabilities updated. But that's easily resolvable. I think this gives us the best of everything so far? |
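For reference, the "populate on load if missing" idea above could be sketched roughly like this. It is only a sketch: `LoadedPlugin`, `Capabilities`, `populateCapabilitiesIfMissing`, and the `infer` / `save` callbacks are hypothetical names, not existing plugin-server code, and the real persistence call would presumably be async.

```typescript
// Hypothetical capability shape; the thread has not settled on a schema.
interface Capabilities {
    methods: string[]
}

// Hypothetical view of a plugin as seen right after it is loaded.
interface LoadedPlugin {
    id: number
    capabilities: Capabilities | null // null means "never recorded"
}

// Populates capabilities only when none are recorded yet.
// `infer` and `save` are injected so the expensive part (loading the VM)
// is skipped entirely for plugins that already have capabilities.
function populateCapabilitiesIfMissing(
    plugin: LoadedPlugin,
    infer: () => Capabilities,
    save: (pluginId: number, capabilities: Capabilities) => void
): boolean {
    if (plugin.capabilities !== null) {
        return false // already recorded, nothing to do
    }
    const inferred = infer()
    save(plugin.id, inferred)
    plugin.capabilities = inferred
    return true
}
```

The boolean return value is just a convenience so the caller can tell whether anything was written and decide if a follow-up reload is worth it.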
Realised during implementation: one thing I missed above is lazy loading. Inferring capabilities on reload means the VMs would be loaded on all threads where the plugin has changed or a new plugin has been introduced :/ This wouldn't be a problem in a test worker setup, but it may become one on the regular workers. Separating it out into a task solves that problem. Another problem we have is that if the task runs before the reload, we're dealing with stale data, which isn't good. If it runs after the reload, the plugins object becomes stale, so we need to reload again (or dynamically modify the plugins object and defer the reload to the next time something changes via the posthog API - this is messy, and doesn't percolate up to the other threads). The crux: These issues arise because we're trying to set plugin data not on creation but on load, which means there is a load -> setCap -> load again loop, which would exist until we can somehow move setCap back to PostHog (with #165). Hmmm, hard to say which way is better for now, since it's unclear which route we'll take for plugin testing. So, I'm choosing the one that takes less work while keeping things as separate from the rest of the code as possible, i.e. the non-task approach. |
Here's my pragmatic view: Let's get the ball rolling and iterate as we go - the best thing right now is probably to get a PR out and we can go from there. Will also put the discussion into context quite nicely. Either way we'll tackle the capabilities representation "problem", even if the inference occurs "too late". We can then iterate to get the inference to happen at an earlier stage/with a better setup, probably integrating with some #165 work. |
Indeed. I'm being verbose here because I like keeping a log of what decisions we make, and how we make them. Helps me revisit when we get around to refactoring to figure out the mistakes I made. |
Oh yeah I'm all for that! It helps me (and everyone else) get context too. |
Agree with the above, and great work digging into all of it @neilkakkar. Like @yakkomajuri said, I think the first step should be to just capture the capabilities of a plugin when we can, probably after the plugin loads if no capabilities have been recorded... or if they differ from reality. Later we'll address #165 and do this capability recording in a separate test or installation step. As for storing it, a simple JSON object (with some structure) on Plugin makes sense for now. It lets us move fast. |
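To make the "simple JSON object on Plugin" option a bit more concrete, here is a minimal sketch. The field names (`methods`, `scheduledTasks`, `jobs`), the `PluginRow` type, and both helper functions are assumptions for illustration, not an agreed schema. It shows the linear-scan filter discussed earlier and a check for whether recorded capabilities differ from what was just inferred.

```typescript
// Hypothetical structure for the JSON stored on the plugin record.
interface Capabilities {
    methods: string[]
    scheduledTasks: string[]
    jobs: string[]
}

// Hypothetical plugin row with the extra JSON column.
interface PluginRow {
    id: number
    name: string
    capabilities: Capabilities | null
}

// Linear scan over all plugins; fine for roughly 1,000 rows, no index needed.
function pluginsWithMethod(plugins: PluginRow[], method: string): PluginRow[] {
    return plugins.filter(
        (p) => p.capabilities !== null && p.capabilities.methods.indexOf(method) !== -1
    )
}

// True when nothing is recorded yet or the recorded value no longer matches.
// Arrays are sorted before comparison so ordering differences are ignored.
function capabilitiesDiffer(recorded: Capabilities | null, actual: Capabilities): boolean {
    if (recorded === null) {
        return true
    }
    const normalise = (c: Capabilities): string =>
        JSON.stringify([
            c.methods.slice().sort(),
            c.scheduledTasks.slice().sort(),
            c.jobs.slice().sort(),
        ])
    return normalise(recorded) !== normalise(actual)
}
```

If filtering by capability ever needs to happen inside a DB query, this shape would have to be revisited; the sketch only covers the in-memory case.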
Splitting this from #165
It would help with distributing and optimising worker pools inside the plugin server if we knew what each plugin can do. Basically, what it exports. This knowledge can be used to make a nicer interface (no need to reorder export plugins that run at the same time), and it'll help us split processEvent/onEvent tasks (which need to finish very fast) from the rest (scheduled jobs and other async tasks).
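As a rough illustration of "what it exports", capabilities could be derived by inspecting a plugin's exports object. This is a sketch under assumptions: `inferCapabilities` and `needsFastPool` are hypothetical helpers, the scheduled task names (`runEveryMinute`, `runEveryHour`, `runEveryDay`) and the `jobs` export are assumed here rather than taken from this issue, and only `processEvent` / `onEvent` are named above.

```typescript
// Hypothetical capability shape, grouped by how the work would be scheduled.
interface PluginCapabilities {
    methods: string[] // event-path functions that need to finish very fast
    scheduledTasks: string[] // assumed names for periodic tasks
    jobs: string[] // assumed: keys of an exported `jobs` object
}

const EVENT_METHODS = ['processEvent', 'onEvent']
// Assumption: these task names are illustrative, not confirmed by this issue.
const SCHEDULED_TASKS = ['runEveryMinute', 'runEveryHour', 'runEveryDay']

// Looks only at which known names are exported; it never calls plugin code.
function inferCapabilities(pluginExports: Record<string, unknown>): PluginCapabilities {
    const isFunction = (name: string): boolean => typeof pluginExports[name] === 'function'
    const jobs = pluginExports['jobs']
    return {
        methods: EVENT_METHODS.filter(isFunction),
        scheduledTasks: SCHEDULED_TASKS.filter(isFunction),
        jobs: jobs !== null && typeof jobs === 'object' ? Object.keys(jobs as object) : [],
    }
}

// A plugin belongs on the latency-sensitive pool if it handles events at all.
function needsFastPool(capabilities: PluginCapabilities): boolean {
    return capabilities.methods.length > 0
}
```

With something like this, a plugin that only exports scheduled tasks would never need to occupy a worker that serves processEvent/onEvent.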