Distribution 3.20 Tracking issue #12836
Comments
This week: Overall not as productive a week as I would've liked, but I'm content. Addressed debt (Alpine 3.12 update in all images, unifying base docker images, adding linters to use it always), investigated baremetal buildkite agents with Dave and settled on final solution for running e2e and Docker tests, discussed next steps on Firecracker code intel, reviewed some RFCs, learned some Dhall to keep up to speed with it, and helped customers via https://github.com/sourcegraph/customer/issues/96 https://github.com/sourcegraph/customer/issues/99 https://github.com/sourcegraph/customer/issues/90 and had syncs/calls with https://sourcegraph.slack.com/archives/CMB6K7SMN/p1598639367073200 https://sourcegraph.slack.com/archives/CMB6K7SMN/p1598644460077400 as well as an interview and others. Next week: More code reviews and RFCs to review, then hopefully I'll start on my 3.20 planned work. |
update: worked on dhall deploy-sourcegraph. have to admit, it's a lot of fun. geoffrey and i landed the migration tool. playing with the idea of ingesting a k8s dhall schema in golang and transforming it so that lists become records with keys extracted from some specific fields in the list elements. we need this to get to these specific elements. you can access list elements by index but finding out which element you are currently looking at is tricky because dhall intentionally limits you so you are forced to compose stronger schemas (lists are weak). so far i think this is the only substantial hurdle. we also want to decide if we want to be as detailed as the current k8s schemas or more simplified. both approaches have advantages and disadvantages. |
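The list-to-record idea in the update above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Go mentioned in the update, and it assumes the key lives in a `name` field. The function name `key_list_by_field` and the sample container entries are hypothetical, not taken from the actual migration tool.

```python
from typing import Any, Dict, List


def key_list_by_field(items: List[Dict[str, Any]], field: str = "name") -> Dict[str, Dict[str, Any]]:
    """Turn a list of records into a record keyed by one field of each element.

    Hypothetical helper: the choice of `name` as the default key field is an
    assumption made for illustration.
    """
    keyed: Dict[str, Dict[str, Any]] = {}
    for item in items:
        key = item[field]
        if key in keyed:
            # A keyed record cannot hold two entries under the same key.
            raise ValueError(f"duplicate key {key!r}")
        keyed[key] = item
    return keyed


# Made-up sample data shaped like a list of containers in a k8s manifest.
containers = [
    {"name": "frontend", "image": "example/frontend:1"},
    {"name": "jaeger-agent", "image": "example/jaeger:1"},
]

by_name = key_list_by_field(containers)
# The element is now addressed by key instead of by list index.
print(by_name["frontend"]["image"])
```

With the keyed form, a specific element can be reached by name, which is the access pattern the update says is hard to get from a plain list.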
Update:
|
last week Worked on a GitHub Actions-based workflow for driving updates from deploy-sourcegraph to deploy-sourcegraph-k8s-dogfood-2 (including feedback on failure), which is pretty much complete. Worked with Gonza on identifying relevant changes for the original dogfooding environment. Attempted to update k8s.sgdev.org, but was unable to get the deploy to work - will probably abandon the effort in favour of deploying the new dogfood config. Set up PRs for the remaining issues in the dogfooding project (https://github.com/sourcegraph/sourcegraph/pull/13449 , https://github.com/sourcegraph/deploy-sourcegraph-dot-com/pull/3304 ) but this requires a bit more work / validation - spent a lot of time going over Renovate docs and trying to figure out how best to keep deploy-sourcegraph up to date. Looked into various alerts that have been coming through #opsgenie and tried to follow up on some of them. Looked at dhall stuff too, as Geoffrey noted. this week I've been waiting for an update to come in to deploy-sourcegraph to try out the test PR to mark it as done, but I'll figure out a good way to just go ahead and verify that this week and mark it as complete. Will also be trying to wrap up work on the new dogfood environment (including deploying it). I would like to write some docs for sourcegraph/about#1468 as well, since following up on the new alerts has been a bit of a topic recently, possibly treating this as my "debt" ticket. I am also transitioning to a new part-time schedule. |
Last week Worked on closing out #12101 and resolving the subsequent issues with virtualbox running on GCP. The decision to use the vagrant-google plugin has been made, as it's a lot more reliable, requires fewer resources than local testing and is a lot quicker. Met with Geoffrey and began proper onboarding to work on Dhall. Some additional work to upgrade the bigdata cluster and assist $CUSTOMER with an upgrade. This week Working on any last-minute feedback for sourcegraph/deploy-sourcegraph-docker#141 and beginning work on Dhall with the help of Uwe and Geoffrey. This will initially be a smaller project, then on to more assigned issues as delegated by Geoffrey. |
Last week: Last week was mostly a technical week. I worked through bugs, issues and alerts to try and reduce the number of events we receive daily. We identified a problem with some searches used in saved searches that caused them to scan all repositories. This week: I'll continue to work on how to improve our API for support requests as well as how we prioritize them to ensure we are working effectively. On a related task, I would like to define how we prioritize the backlog, which affects how we revisit tasks/issues that were not set as high priority. Team update: We have closed the per-team alerts project 🎉! |
This week Learned some more Dhall and discussed architecture plan with everyone, put forward RFCs to deprecate single container deployments, sync'd with Bunny on proposal to stop versioning our docs by branches, and had other regular meetings. Around Wed I had some personal / cat issues and had to take off Thur and Fri. Next week I am hoping to make forward progress on my assigned issues with the aim of completing everything assigned to me, I think it is still a reasonable workload currently and am optimistic things in my personal life will calm down soon, but will have to play it by ear a bit. |
This week Closed out the first phase of e2e testing for deploy-sourcegraph-docker with the help of Gonza and Stephen. The rest of my time has been spent learning dhall and syncing with Geoffrey and Uwe, who have been really helpful in helping me level up. Next week There are still some outstanding tasks related to e2e testing, which I will clarify with Stephen. I'm hoping to attack some of the tasks in the Dhall POC. In addition to this I want to work with Gonza and Robert on our next steps related to monitoring (site24x7 and blackbox exporter) and how we get the best out of both. |
this week Attempting my new part-time schedule. Verified dogfood PR automation is working (https://github.com/sourcegraph/deploy-sourcegraph-dogfood-k8s-2/pull/20) and landed some work on improving Sourcegraph -> deploy-sourcegraph image updates, but this doesn't seem to be working as expected currently. Made various changes to docs (updated release process, fixing links, reorganizing to add space for more deployment details). Did some Dhall learning (task, reading up about ideas for architecture). next week Wrap up work on dogfooding environment - finalize the image update changes, and have scheduled time with @pecigonzalo to run through deployment of the new environment. Did not get around to sourcegraph/about#1468 this week, and still have my eyes on that, as well as exploring Dhall PoC tasks. |
Last week:
next week:
|
last week:
next week:
|
Last week: I started to work on our long-term objectives and scaling the team, but did not make the progress I wanted to. I'll continue to work on it this week. Dan opened an initial draft of our escalation process and I intend to draft a PR to update our incident response and support "on-call/hero" rotation as discussed during our sync. This week: As I was unable to make sufficient progress last week, this week's focus remains the same for the most part. Team update: We will additionally kick off planning for 3.21. |
Last week More or less fully on e2e testing. Setting up the e2e tests running on a vagrant box instead of the unreliable docker in docker solution. A little bit of dhall but to be honest, I feel as though I've dropped the ball on that and given my own desire in getting e2e running, I have been more of a passenger with Dhall. This week Close out e2e testing and catch any low-hanging fruit with Dhall. Also spoke with Gonza about our next steps with site24x7 vs blackbox exporter, which may be pulled in for 3.20 but will more than likely be pushed out to 3.21 as tech debt. |
Dear all, This is your release captain speaking. 🚂🚂🚂 Branch cut for the 3.20 release is scheduled for tomorrow. Is this issue / PR going to make it in time? Please change the milestone accordingly. Thank you |
Last week This week Team update |
Last week Interviewed Cloud and CE candidates, overhauled customer-facing managed instance docs, brainstormed LSIF postgres move with Eric. Chatted with https://app.hubspot.com/contacts/2762526/company/861679490/ about multi-region deployments and more. Figured out next steps of Dhall with Uwe and Geoffrey and swapped some of my planned work to help out further there, spending about 6h total on my Dhall PoC. This week Close out my planned work for this iteration, 3.21 planning. |
Last week Got really sucked into dogfooding with various issues with GitHub Actions payloads and formatting, Renovate confusion and a bug, and actually deploying the whole thing with @pecigonzalo . The first two ended up being very time consuming due to the tedious nature of testing this kind of stuff (act not being a perfect simulation of Actions, Renovate needing to be set up in the target repository before things can be tested, the bug was rather hard to track down). This week Finalize dogfood deployment (https://github.com/sourcegraph/sourcegraph/issues/13792) and land the Renovate bug fix (renovatebot/renovate#7274) to close out the remaining issues for this iteration, and start looking at 3.21 tasks. Also merge the new release steps (sourcegraph/about#1517). |
Last week:
This week:
|
last week I spent most of the week finalizing the work on the new k8s.sgdev.org deployment with @pecigonzalo 's help, and have wrapped up most of that work and made the DNS switch to have the k8s.sgdev.org domain point to the new deployment (announcement). I wrote up documentation updating our information about our existing deployments as well as adding details about the new dogfood cluster. this week I took a look at some of the 3.21 tasks for single-day releases, and will start tackling some of them this week. I'll also figure out how to fix one last outstanding issue with the k8s.sgdev.org deployment (thread) and given no complaints, spin down the old cluster to close out https://github.com/sourcegraph/sourcegraph/issues/13792 . |
last week:
next week:
|
Last week Next week Team updates |
Plan
Support new and existing deployments
This is an ongoing expense; we anticipate this taking no more than 10d of work spread across the entire team.
Support teams' migration to per-team alerts
We enabled per-team alerts and on-call rotations in 3.19. As teams onboard to this new workflow, we will need to provide support and guide them through the transition.
Reduce upgrade overhead
We decided to move forward with the Dhall implementation and we will work on defining a roadmap for it (spike). Additionally, we will work on a "yaml-to-dhall" and define a Dhall architecture that supports customizations (spike).
Enable failed e2e test notifications and blocking the pipeline
Although we run the e2e tests daily, their failures are not visible to everyone, which means that at the end of the iteration we must ask teams to pitch in on short notice.
We will inline our e2e tests and notify engineers when a merge breaks our e2e tests to ensure our main branch is always in a working state.
Dogfood Kubernetes deployments
Deployments to dogfood-k8s are automated from our latest images and reflect our customers' workflow. deploy-sourcegraph is kept up to date with our latest images.
Availability
Period is from August 20th to September 19th (22 working days). Please write the days you won't be working and the number of working days for the period.
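As a quick check of the 22-day figure, the weekday count for the period can be reproduced with a short sketch. It assumes the year is 2020 (the section does not state the year) and that public holidays are not excluded. The helper name `working_days` is hypothetical.

```python
from datetime import date, timedelta


def working_days(start: date, end: date) -> int:
    """Count Monday-to-Friday days from start to end, inclusive.

    Public holidays are deliberately ignored (assumption).
    """
    days = 0
    current = start
    while current <= end:
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days += 1
        current += timedelta(days=1)
    return days


# Assumed year: 2020.
period_days = working_days(date(2020, 8, 20), date(2020, 9, 19))
print(period_days)  # 22
```

Each person's working days for the period would then be this total minus the days they list as not working.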
Tracked issues
@bobheadxi
on-call: document actions to follow up on critical alerts #1468
dogfood-k8s: finalize migration over to new cluster #13792
@davejrt
@daxmc99
@ggilmore
ci: build and pin tool apks in CI for release #13297 🧶
@keegancsmith
@pecigonzalo
dogfood-k8s: finalize migration over to new cluster #13792
@slimsag
sourcegraph/customer#71 👩
alpine with a linter #13247 🎩
sourcegraph/customer#90 🐛👩
Run e2e tests on bare-metal Buildkite agents on every commit to master (non-blocking) #12339
Run e2e "regression" tests on bare-metal Buildkite agents on every commit to master (non-blocking) #12340
License report for syntect_server & its dependencies; remove syntaxes with questionable licenses #11269 1d 👩
sourcegraph/customer#97 👩
@uwedeportivo: 2.00d
Repo-updater component always outputs debug logs #13191 1d 👩🎩
Legend