Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bsky: Add caller did in grpc calls to dataplane #2828

Open
wants to merge 43 commits into
base: main
Choose a base branch
from

Conversation

matthieusieben
Copy link
Contributor

@matthieusieben matthieusieben commented Sep 24, 2024

Adds a bsky-caller-did header to the GRPC calls to the dataplane that contains the did of the authenticated user making the inbound HTTP request.

Does so by creating a new HydrateCtx per HTTP request that contains a wrapped dataplane instance (that will add the header), and that is used from controllers (including pipelines).

@matthieusieben matthieusieben changed the title wip Add caller did in grpc calls to dataplane Sep 24, 2024
@matthieusieben matthieusieben marked this pull request as draft September 24, 2024 20:19
@matthieusieben matthieusieben marked this pull request as ready for review September 26, 2024 14:30
@devinivy
Copy link
Collaborator

devinivy commented Oct 3, 2024

This is a much simpler change to getLists and unrelated to the rest of the work— could we peel it out of this PR so that we can ship it separately? e21ada7

Comment on lines 185 to 200
createHandler<
ReqCtx extends HandlerRequestContext<unknown>,
Output extends void | HandlerOutput<unknown>,
>(
view: (ctx: HydrateCtx, reqCtx: ReqCtx) => Awaitable<Output>,
options: CreateHandlerOptions = {},
) {
const { authVerifier } = this
const {
// Store as const to allow code path optimizations by compiler
allowInclude3pBlocks = false,
allowIncludeTakedowns = false,
canPerformTakedownRequired = false,
exposeRepoRev = false,
} = options

Copy link
Collaborator

@devinivy devinivy Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a slight concern around getting rid of all handler function bodies and replacing them with this method, which is that it becomes awkward to add a custom behaviors to a given route. Instead, all the custom behavior ends-up needing to get expressed within this method, which seems like it will end-up accumulating many special cases over time.

A good example is the canPerformTakedownRequired option— this clearly stands out as a behavior of some specific endpoint that performs a takedown, but it ends-up folded into this method next to other unrelated behaviors from all other routes. I find that admin.updateSubjectStatus was easier to reason about and to maintain in its previous state.

Overall it seems more natural to me to place the special behaviors directly where they're need, and instead create utilities if necessary to make the handlers more concise. By leaving function bodies around for each route's handler, they'd all remain open to extension. I am interested to hear your thoughts on all of this!

Copy link
Contributor Author

@matthieusieben matthieusieben Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree with your point of view. Special behaviors should be visible where they are needed. And ideally, those special code paths should not impact places where they are not needed.

  • The case of canPerformTakedownRequired is quite isolated and was easily reverted back to the way it was.

  • allowInclude3pBlocks and allowIncludeTakedowns are options needed to create the HydrateCtx object, so it is quite natural to put them here IMO. We can rename them if you think there might be a better name.

  • inputHeaders and ouputHeaders were removed.

  • Regarding exposeRepoRev, there are lots of alternative solution, some of which are detailed below. In my opinion, it comes down to two things:

    1. This is needed in about 50% of the methods. Having an explicit boolean makes it easier to distinguish omissions from explicit opt-in.
    2. Most options that make this available as a function make the code more verbose & significantly increase repetition.
alternatives to the exposeRepoRev option

we could easily re-write it as a withExposedRepoRev wrapper function. With this, the following:

  server.app.bsky.feed.getAuthorFeed({
    auth: ctx.authVerifier.optionalStandardOrRole,
    handler: ctx.createPipelineHandler(
      skeleton,
      hydration,
      noBlocksOrMutedReposts,
      presentation,
      {
        allowIncludeTakedowns: true,
        exposeRepoRev: true
      },
    ),
  })

would be replaced by:

  const getAuthorFeed = createPipeline(
    skeleton,
    hydration,
    noBlocksOrMutedReposts,
    presentation,
  )

  server.app.bsky.feed.getAuthorFeed({
    auth: ctx.authVerifier.optionalStandardOrRole,
    handler: ctx.createHandler(withExposedRepoRev(getAuthorFeed), {
      allowIncludeTakedowns: true,
    }),
  })

I personally think this is not really better in terms of readability. It increases repetition (ctx.createHandler + createPipeline instead of ctx.createPipelineHandler) & makes the code either more "dense" (like above) or more repetitive (if introducing intermediary variables).

A more readable alternative, that would be more aligned with the current implementation, could look like this:

  const getAuthorFeed = createPipeline(
    skeleton,
    hydration,
    noBlocksOrMutedReposts,
    presentation,
  )

  server.app.bsky.feed.getAuthorFeed({
    auth: ctx.authVerifier.optionalStandardOrRole,
    handler: async (reqCtx) => {
      const hCtx = await ctx.createHydrateCtx(reqCtx, {
        allowIncludeTakedowns: true,
      })

      hCtx.exposeLabelers()
      await hCtx.exposeRepoRev()

      return getAuthorFeed(hCtx)
    },
  })

I find that this notation has several drawbacks:

  • exposeLabelers() must basically always be called. This yields a lot of repetition that adds un-necessary noise to the function.
  • exposeRepoRev() is async and exposeLabelers() not. could be error prone (forget to await in some instances)
  • As maintainer, it is not clear when a function does not contain exposeRepoRev() whether it is intentional or an omission ({ exposeRepoRev: boolean } makes this explicit)

Copy link
Contributor Author

@matthieusieben matthieusieben Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I refactored that function to be much more lightweight and easier to use in a raw handler function:

return async (
reqCtx: HandlerRequestContext<Params, Auth, Input>,
): Promise<Output> => {
const ctx = await this.createHydrateCtx(reqCtx, options)
// Always expose the labelers that were actually used to process the request
ctx.setLabelersHeader()
// Conditionally expose the repo revision
if (exposeRepoRev) await ctx.setRepoRevHeader()
return view(ctx)
}

I believe that this helper function still has value, and should be used instead of duplicating its logic because:

  1. It avoids having to set the Atproto-Content-Labelers header in every controller
  2. It makes it explicit whether Atproto-Repo-Rev by having to opt-into it through the options.

@matthieusieben matthieusieben changed the title Add caller did in grpc calls to dataplane bsky: Add caller did in grpc calls to dataplane Oct 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants