🚀 Release v1.39.0 #775

Closed
9 tasks done
elisabettai opened this issue Nov 22, 2022 · 9 comments
Labels
release Preparation for pre-release/release

Comments

@elisabettai
Contributor

elisabettai commented Nov 22, 2022

In preparation for release. Here is an initial (incomplete) list of tasks to prepare before releasing:

  • Prepare staging
  • No blockers
  • Check changelog 🚨
  • Check devops ⚠️
  • Test assessment: e2e-testing
  • Test assessment: targeted-testing
  • Test assessment: user-testing ✅
  • Release summary
  • Release assessment

Prepare staging

  • Motivation: release outcome from switzer sprint.
    • all features/enhancements from the sprint (already in staging)
    • fixes since the sprint (some in staging and @pcrespov will cherry pick the rest to staging)
  • Release date: week 48

Blocker

Changelog

## Added / Changed / Removed
- ✨ O2IL:  ooil executable in a docker image (#3458)
- ✨ O2IL: Is3418/validation with ``ooil test my/osparc/service`` (#3479)
- ✨ I/O: Is686/api port schemas public api: api-server 0.4.1 (#3485) ✅ 
- ✨ I/O: Is686/list_service_ports in catalog service API: catalog 0.4.0 (#3484)
- ♻️  I/O: Is3517/refactor service io and 🔨diagnostics tool concept (#3537)
- ✨ Optimizer: Is355/optimizer projects ports (#3504)
- ✨ Services deprecation: Deprecated and Retired (#3512)
- ✨ Computation: Ensure memory swap for computational services is same as memory (⚠️ devops) (ITISFoundation/osparc-simcore#3510) ✅ 
- ✨♻️ Email: Differentiate between TLS and STARTTLS in web-mailserver  (⚠️ devops)  (#2965)
- ✨ UI App: Tooltip on node links (ITISFoundation/osparc-simcore#3441)
- ✨ UI: Sort files and NodeTreeItem menu's bgColor (#3523)
- ✨ UI S4L Lite: 11.08 meeting feedback (⚠️ devops) (#3534)
- ✨ UI:S4L-lite product (ITISFoundation/osparc-simcore#3503)(#3508)(#3511)
- ✨ UI TIP: TIP Follow up I (ITISFoundation/osparc-simcore#3466)
- ✨ UI App: Add instructions to App Mode steps (#3491)
- ✨ Dy-Services: Adding agent service with dyv volumes removal (#3465)(#3513)
- ✨ Dy-Services: Allow Starting dynamic services when idle OR failed (#3501)  ✅  
- ✨ Dy-Services: Allow selective start/stop of dynamic services (⚠️ devops) (#3449) ✅
- 🗑️ Dy-Services: removing dynamic_sidecar_network from dy-sidecar (#3467)

## Fixed
- 🐛 UI: optional instructions (#3593)
- 🐛 UI: force progress value to be between 1 and 99 (#3560)
- 🐛 Agent rclone configuration fixes (#3580)
- 🐛 Agent running in production now has access to rclone (#3571)
- 🐛 DatCore: datcore-adapter stops calling into pennsieve after too many requests are done? (#3473) [📌 ``staging_switzer_3``]
- 🐛 DataCore listing makes pennsieve client fail (#3464)
- 🐛 fix/low-CPU-load healthcheck for migration service (#3477) [📌 ``staging_switzer_3``]
- 🐛 Study: Ensure adding/deleting node is thread safe (#3490)
- 🐛 Sharing: Fix/sanitize old data for usergroups.thumbnail (#3498) ✅  [ 📌 ``v1.38.4`` ]
- 🐛 Storage: Ensure uploaded outputs always have a unique S3 object name (#3462)  ✅
- 🐛 Storage: crash when not a file in the project (#3483)  [📌 ``v1.38.1``]
- ♻️ Dy-Services: changed dierctor-v2 -> dy-sidecar API retry policy (#3583)
- 🐛 Dy-Services: Stopping container without starting them no longer raises error (#3589)
- 🐛 Dy-Services: No more /health errors when starting sidecars (#3586)
- ⚗️🐛 Dy-services: Add a test for reproducing potential 400 issue with upload to AWS (#3538)
- 🐛 Dy-services: Fix s4l-lite test (#3539)
- 🐛 Dy-services: adds tests for S3TransferError; refactoring flaky CI test; better logging for long running task errors (#3525)
- 🐛 Copy: Project copy failing when pennsieve token is active (#3509)  [ 📌 ``v1.38.3`` ]

## Security / Maintenance

- ♻️ Rerevise docker networks dk (bis)  (⚠️ devops) (#3564)
- ♻️ dont add /var/lib/docker/volumes in global docker-compose file (#3563)
- 🔨CI: Ensure CI uses the correct ENVs in master (#3482)
- ♻️ CI: Only run tests jobs on path changes in pull requests (#3429)
- 🔨CI: Maintenance/typecheck steps in CI (#3475)
- 🔨CI: Fix/CI build&deploy jobs with integration-library image (#3474) [📌 ``staging_switzer_3``]
- 🔨CI: Ensure built images are used for testing (#3481)
- 🔨CI: Fixes CI test issues introduced by faulty #3524 (#3527)
- 🔨CI: integration tests not run when they should (#3529)
- 🔨CI: Adds CI ``python-linting`` job in python 3.11 (#3489)
- 🔨test: Maintenance/fix registry tests (#3553)
- 🔨test: Reduce test flakyness (#3542)
- ♻️ test: Fixes flaky test_update_profile and cleanup tests (#3528)
- ♻️ test: Fixing webserver 02 unit test (#3532)
- 🔨 e2e: open outputs folder in some cases (#3495)
- ♻️ Revise docker networks (#3543) and reverted (#3556)
- 🔨 Github template for maintenance issues
- ⬆️ Update datcore-adapter requirements (#3463)
- ⬆️ Upgrade aio-pika to latest version 8.2.4 (#3492)
- ⬆️ Upgrade postgres to 14.5 alpine (⚠️ devops) (#3500) ✅
- ⬆️ Upgrades tests+tooling requirements (#3524)
- ⬆️ 🔨 Workaround to avoid test failures due to pytest-sugar (#3514)
- ⬆️ 🔒️ Upgrades pytest, aiohttp, jupyter-core and pillow (#3497)
- ⬆️ Use latest rabbit MQ service (#3496)

**Legend**

- ✨ New feature
- 🐛 Fixes bugs
- ♻️ Refactors code
- ⬆️ Upgrades dependencies
- 🔒️ Fixes security issues
- 🔨 Adds or updates development scripts or CI.
- 📌 can be cherry-picked to production or staging
- ✅ Target/User tests done

Check devops ⚠️

  • @mrnicegyu11 something to add here?
  • Switch t2.medium instances on aws: ops issue 336
  • Migrate PGSQL database: ops issue 314

Test assessment: e2e-testing

  • Mon: occasional failures (not critical)
  • Tue: same

Test assessment: targeted-testing ✅

DONE

Test assessment: user-testing

  • Done on Fri. 25 with @newton1985.
  • Video of the session can be found in itis-osparc/DEVELOPERS/TESTS_SESSIONS/2022-11-25_usertest_TN.mp4
  • New issues created and linked to this issue.

Release summary

  • what: make release-prod version=1.39.0 git_sha=d64136858355270b1c8977efecea0cb1df59b261
  • who: @Surfict @mrnicegyu11
  • when: THURSDAY Dec.1

@Surfict @mrnicegyu11 when running the make recipe, the list of commits should start and end as:

- ♻️ changed dierctor-v2 -> dy-sidecar API retry policy (#3583)
- 🐛 Stopping container without starting them no longer raises error (#3589)
- 🐛♻️ No more /health errors when starting sidecars (#3586)
 ...

Discard the end of the list (there is a bug in the recipe that I need to discuss with @sanderegg).

🚨 After this release, we need to create a hotfix so that we can include hotfix 1.38.5. This change is only in master (next time I will make sure I hotfix in staging and production sequentially).

Release assessment

@mrnicegyu11
Member

mrnicegyu11 commented Dec 2, 2022

Write Up:

  • Issue: There were issues in the GitHub Actions CI that builds the docker images, and the upload of the release images failed
  • Issue: There was an issue in the call used by the deployment agent to determine the last-changed-time ordering of git tags. The hotfix v1.38.5 was determined to be "more recent" than the release v1.39.0. I guess this can only happen if less recent code is deployed in the minor-version release than in the patch-versioned hotfix... This needs investigation
  • Due to the above issue, in some deployments the version-selecting regex of the deployment agent was modified and v.1.39 was hardcoded into it. Once the previous bug is analyzed and resolved, this needs to be reverted. Until then, a release with a changed minor or major version will not deploy on production
  • Issue: During the upgrade of the postgres database, the passwords of the auxiliary users grafanareader, rds_admin and postgres might have been changed accidentally, and this needs validation on all 3 production platforms
  • Issue: There is a wrong mail-sender in tip.itis.swiss that cannot be resolved by DevOps, it might be a simcore bug?
  • Issue: Graylog has repeatedly reported corrupted data on aws-prod and lost all past logs. This needs investigation and fixes.
  • Issue: The GPU grafana exporters are not running on tip.itis.swiss due to a placement constraint mismatch
  • Issue: Graylog dashboard provisioning does not work
  • Minor Issue: The redis-commander on tip.itis.swiss shows staging_ databases, which are not present on tip.itis.swiss. This is a slight misconfiguration
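
For illustration only, the difference between a general version-selecting pattern and the hardcoded workaround described above might look like the sketch below. The deployment agent's real regex is not shown in this thread, so both patterns, the helper name `select_tags`, and the example tag `v1.40.0` are assumptions.

```python
import re

# Hypothetical patterns; the deployment agent's actual regex is not shown in this issue.
GENERAL_RELEASE = re.compile(r"v\d+\.\d+\.\d+")  # any vMAJOR.MINOR.PATCH tag
HARDCODED_1_39 = re.compile(r"v1\.39\.\d+")      # workaround: only v1.39.x tags


def select_tags(tags, pattern):
    """Return the tags that fully match the given pattern, in input order."""
    return [tag for tag in tags if pattern.fullmatch(tag)]


# "v1.40.0" is a made-up future release used only to show the effect.
tags = ["v1.38.5", "v1.39.0", "v1.40.0", "staging_switzer_3"]
print(select_tags(tags, GENERAL_RELEASE))  # ['v1.38.5', 'v1.39.0', 'v1.40.0']
print(select_tags(tags, HARDCODED_1_39))   # ['v1.39.0']
```

With the hardcoded pattern, a later minor or major release is never selected, which is why the workaround has to be reverted.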

Spontaneous changes:

  • As requested by @sanderegg, the placement constraint for legacy services, supplied to director-v0 as an env-var, has been changed from dynamicsidecar==true to the more restrictive standardworker==true

CC @Surfict @pcrespov @sanderegg

@sanderegg
Member

@mrnicegyu11 :

  • which image did not upload?
  • After some checking: the code you showed me with sort=creatordate has a flaw, as it takes the commit date. The 1.39.0 commit date is older than the 1.38.5 one, which can happen since the commit in master might be older than the last release hotfix. We should therefore investigate whether there is a way to use the tag creation date instead.
  • Postgres: the grafanareader password was indeed changed, you can check the osparc-metrics e2e that shows red as of now
  • Thanks for the configuration change with the legacy services!
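
A minimal sketch of the ordering problem described above, assuming made-up dates that only reproduce the situation (the commit behind v1.39.0 is older than the commit behind v1.38.5, while the v1.39.0 tag itself was created later). `TagInfo` and `newest` are hypothetical names, not part of any osparc tooling.

```python
from datetime import datetime
from typing import NamedTuple


class TagInfo(NamedTuple):
    name: str
    commit_date: datetime  # date of the commit the tag points to
    tag_date: datetime     # date the tag itself was created


# Hypothetical dates, chosen only to mirror the case discussed in this thread.
TAGS = [
    TagInfo("v1.38.5", commit_date=datetime(2022, 11, 30), tag_date=datetime(2022, 11, 30)),
    TagInfo("v1.39.0", commit_date=datetime(2022, 11, 29), tag_date=datetime(2022, 12, 1)),
]


def newest(tags, key):
    """Return the name of the tag that is most recent according to `key`."""
    return max(tags, key=key).name


print(newest(TAGS, key=lambda t: t.commit_date))  # v1.38.5
print(newest(TAGS, key=lambda t: t.tag_date))     # v1.39.0
```

Ordering by commit date selects the hotfix, whereas ordering by tag creation date selects the release.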

@mrnicegyu11
Member

@sanderegg:

@sanderegg
Member

sanderegg commented Dec 2, 2022

@mrnicegyu11
thanks!

  • concerning the links you passed me I do not think that is an issue. Here is what happened:
    • @Surfict created a hotfix branch, and ran the release process before the hotfix branch CI had completed. So there was no image and that makes perfect sense
    • Then, the hotfix branch CI failed because a PR was missing in the branch (--> pylint jobs failed). I cherry-picked the fix, but I did not hear back from @Surfict whether that worked or not
    • let's discuss on Monday whether I'm missing something there.
  • I think it is correct that something released might be older (in the use case of hotfixing in the middle, since some commits in a hotfix may very well be more recent than the next production release).
  • But I agree we could/should check the sort=taggerdate (in a way simulating what the deployment agent does) to ensure that the deployment will take the correct tag, very good idea! Let's create an issue out of this.
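
Such a pre-release check could be sketched roughly as follows. This is only an assumption of how it might be done: the function names are hypothetical, the sample text is invented, and the git command in the comment is one possible way to obtain tagger dates for annotated tags.

```python
# Assumed way to produce the input (run inside the repository):
#   git for-each-ref refs/tags --format="%(refname:short) %(taggerdate:unix)"
# Lightweight tags have no tagger date, so their lines carry only the tag name.

# Invented sample output, for illustration only.
SAMPLE_OUTPUT = """\
v1.38.5 1669800000
v1.39.0 1669890000
"""


def parse_tag_dates(text):
    """Map tag name -> tagger timestamp, skipping lines that have no date."""
    dates = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            dates[parts[0]] = int(parts[1])
    return dates


def check_release_is_newest(text, expected_tag):
    """Return True if expected_tag has the most recent tagger date."""
    dates = parse_tag_dates(text)
    return bool(dates) and max(dates, key=dates.get) == expected_tag


print(check_release_is_newest(SAMPLE_OUTPUT, "v1.39.0"))  # True
```

Running a check like this before releasing would flag the case where the deployment agent's ordering would pick a hotfix tag over the intended release tag.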

@pcrespov
Member

pcrespov commented Dec 2, 2022

  • Issue: There is a wrong mail-sender in tip.itis.swiss that cannot be resolved by DevOps, it might be a simcore bug?

@mrnicegyu11 @Surfict I wonder if this can be solved by ITISFoundation/osparc-simcore#3576

@Surfict
Contributor

Surfict commented Dec 2, 2022

  • Issue: There is a wrong mail-sender in tip.itis.swiss that cannot be resolved by DevOps, it might be a simcore bug?

@mrnicegyu11 @Surfict I wonder if this can be solved by ITISFoundation/osparc-simcore#3576

It seems that it's actually already using the product email, but from the wrong product.

@pcrespov
Member

pcrespov commented Dec 2, 2022

@mrnicegyu11
Member

@pcrespov @Surfict :
In the end, the SMTP_SENDER env-var in the webserver turned out to be non-mandatory and it was unset. The tip.itis.swiss email works again :--)

I will now create ops-tickets from the encountered issues to tackle them one by one.

@pcrespov
Member

pcrespov commented Dec 5, 2022

@pcrespov @Surfict : In the end, the SMTP_SENDER env-var in the webserver turned out to be non-mandatory and it was unset. The tip.itis.swiss email works again :--)

I will now create ops-tickets from the encountered issues to tackle them one by one.

@mrnicegyu11
Yes, you can see in the code that it had a default.
In addition, as I mentioned above, it will be removed in this PR ITISFoundation/osparc-simcore#3576

@pcrespov pcrespov closed this as completed Dec 5, 2022