Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISOM-1006: Optimize verifySiteMember() #1332

Merged
merged 6 commits into from
Apr 25, 2024

Conversation

timotheeg
Copy link
Contributor

@timotheeg timotheeg commented Apr 25, 2024

Problem

verifySiteMember() is a potential cause for the DB load we see. It is very costly for what it does, and we run it all the time as part of the auth flow.

verifySiteMember() is a potential cause for the DB load we see

Some flaws exhibited

  • Multiple consecutive query on the isomeradmin tables for the same users
  • Fetching more data than needed when we actually care for a exists() boolean output
  • getRole() issues an extremely costly query, involving multiple full table inner joins and json processing.

Sample trace
image

The 3 queries are the most costly operations IsomerCMS runs, both in term of number of queries and in term of time spent:
image

Solution

This PR will be fully backward compatible but aims to reduce the cost of verifySiteMember().

Expected outcomes

  • No more duplicate queries for the same isomeradmin user
  • Faster middleware processing for verifySiteMember() (because query in getRole will be 4x+ faster)
  • Reduced DB CPU load
  • Less time spent overall by the app over these queries, visible in this view in DataDog

Note:

  • Through this PR, I noticed that the CollaboratorsService is instantiated twice in the codebase (in server.ts and in common/index.ts), which seems wrong, I have opened a separate linear ticket to have that cleaned up)
  • When actual isomer admins visit a site they are not a member of, there will still be 2 duplicated query. Further refactor and optimization work could be undertaken:
    • Implement local cache for site memberships
    • Add local cache for getUserId()
    • Refactor code: is it right that getRole() for a site can return isomerAdmin? And if the role is admin, perhaps further processing in verifySiteMember()

Verified on staging:
Sample trace here
image

Reference:

Copy link

linear bot commented Apr 25, 2024

@timotheeg timotheeg force-pushed the ISOM-1006_Optimize_verifySiteMember branch 4 times, most recently from 0bc8633 to 34becf8 Compare April 25, 2024 05:20
@timotheeg timotheeg force-pushed the ISOM-1006_Optimize_verifySiteMember branch from 34becf8 to 0fd73a3 Compare April 25, 2024 05:43
@timotheeg
Copy link
Contributor Author

timotheeg commented Apr 25, 2024

I tried 2 possible query optimizations (see details in this commit for details)

Results are as below in this sample trace
image

original: 15.8ms
optimization option 1: 3.49ms
optimization option 2: 2.07ms

Optimization 1 retains the check for soft deletion of users, repos, and sites, while optimization 2 makes an efficient query on just repos and site_members, and assumes both the site and the user are not soft deleted (which may or may not cause trouble).

Optimization 1 is still 4x more performant than the original, so we could start by releasing that and we can keep optimization 2 as a potential improvement for the future.

@harishv7 any comment on the continuous check for soft deletions in every query?

@timotheeg
Copy link
Contributor Author

Query plan reports from postgres on the 3 queries:

original:
image

optimization 1:
image

optimization 2:
image

@timotheeg
Copy link
Contributor Author

timotheeg commented Apr 25, 2024

Also for reference, below is an example of the actual query sequelize sent to the DB (on staging)
(It's easier to read than the normalized version in the DD traces)

SELECT 
    "Site".*,
    "site_members"."id" AS "site_members.id",
    "site_members"."email" AS "site_members.email",
    "site_members"."github_id" AS "site_members.githubId",
    "site_members"."contact_number" AS "site_members.contactNumber",
    "site_members"."last_logged_in" AS "site_members.lastLoggedIn",
    "site_members"."created_at" AS "site_members.createdAt",
    "site_members"."updated_at" AS "site_members.updatedAt",
    "site_members"."deleted_at" AS "site_members.deletedAt",
    "site_members->SiteMember"."id" AS "site_members.SiteMember.id",
    "site_members->SiteMember"."user_id" AS "site_members.SiteMember.userId",
    "site_members->SiteMember"."site_id" AS "site_members.SiteMember.siteId",
    "site_members->SiteMember"."role" AS "site_members.SiteMember.role",
    "site_members->SiteMember"."created_at" AS "site_members.SiteMember.createdAt",
    "site_members->SiteMember"."updated_at" AS "site_members.SiteMember.updatedAt"
FROM
(
    SELECT
        "Site"."id",
        "Site"."name",
        "Site"."site_status" AS "siteStatus",
        "Site"."job_status" AS "jobStatus",
        "Site"."is_private" AS "isPrivate",
        "Site"."created_at" AS "createdAt",
        "Site"."updated_at" AS "updatedAt",
        "Site"."deleted_at" AS "deletedAt",
        "Site"."creator_id" AS "creatorId",
        "repo"."id" AS "repo.id",
        "repo"."name" AS "repo.name",
        "repo"."url" AS "repo.url",
        "repo"."created_at" AS "repo.createdAt",
        "repo"."updated_at" AS "repo.updatedAt",
        "repo"."deleted_at" AS "repo.deletedAt",
        "repo"."site_id" AS "repo.siteId"
    
    FROM
        "sites" AS "Site" INNER JOIN "repos" AS "repo"
            ON
                "Site"."id" = "repo"."site_id"
                AND ("repo"."deleted_at" IS NULL AND "repo"."name" = 'kishore-test')
            
    WHERE
        ("Site"."deleted_at" IS NULL) 
        AND ( 
            SELECT
                "SiteMember"."id"

            FROM "site_members" AS "SiteMember" INNER JOIN "users" AS "User"
                ON 
                    "SiteMember"."user_id" = "User"."id"
                    AND ("User"."deleted_at" IS NULL AND "User"."id" = '119')

            WHERE ("Site"."id" = "SiteMember"."site_id")
            LIMIT 1
        ) IS NOT NULL
    LIMIT 1

)
AS "Site" INNER JOIN
(
    "site_members" AS "site_members->SiteMember" INNER JOIN "users" AS "site_members"
        ON 
            "site_members"."id" = "site_members->SiteMember"."user_id"
)
ON 
    "Site"."id" = "site_members->SiteMember"."site_id"
    AND ("site_members"."deleted_at" IS NULL AND "site_members"."id" = '119');

@timotheeg timotheeg marked this pull request as ready for review April 25, 2024 06:32
@timotheeg timotheeg requested a review from a team April 25, 2024 06:32
@timotheeg timotheeg changed the title ISOM-1006: Optimize verifyASiteMember() ISOM-1006: Optimize verifySiteMember() Apr 25, 2024
Copy link
Contributor

@kishore03109 kishore03109 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can either limit 1 or can just add a comment stating that the if statement is done defensively, practically we only expect to see one or 0 elements in the array

@timotheeg
Copy link
Contributor Author

timotheeg commented Apr 25, 2024

Hey @kishore03109 btw, I'm a litttle confused with something: the original query had a check for site_members.deleted_at, but I do not see that field in prod DB.. How can this be? 🤔

Never mind! OMG sequelize does this INNER JOIN "users" AS "site_members"
which is a very confusing alias 😅

Oh I also note from looking at the query plans that we are missing an index for repos.name. I opened a ticket to add it. @alexanderleegs FYI

@timotheeg timotheeg force-pushed the ISOM-1006_Optimize_verifySiteMember branch from deb58f5 to 218cb12 Compare April 25, 2024 09:50
@timotheeg timotheeg merged commit 3762a9b into develop Apr 25, 2024
23 checks passed
@timotheeg timotheeg deleted the ISOM-1006_Optimize_verifySiteMember branch April 25, 2024 10:44
@alexanderleegs alexanderleegs mentioned this pull request Apr 26, 2024
3 tasks
@timotheeg
Copy link
Contributor Author

I updated the IsomerCMS dashboard graph to show the DB CPU load this week versus last week, so we can see at glance whether we're doing better or worse.

Direct link to widget here

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants