
Store deployments in a common bucket #1424

Merged (5 commits, Sep 13, 2022)
Conversation

amarthadan
Contributor

Closes #1402

So sorry about the massive commit 😞 I was changing stuff all over the place and didn't really split it into some meaningful pieces.

I would suggest reviewing the code in airnode-deployer/src/infrastructure as a whole, not looking at the diff, as those files were pretty much rewritten.

I'm still missing tests (and the current ones are broken) but I wanted to open a draft PR for review while I finish them.

@amarthadan amarthadan requested review from aquarat, Siegrift and a team September 7, 2022 09:32
@amarthadan amarthadan self-assigned this Sep 7, 2022
@Siegrift Siegrift mentioned this pull request Sep 8, 2022
@martinkolenic
Contributor

martinkolenic commented Sep 8, 2022

Hi @amarthadan, I've run multiple deployments with different stages both on GCP and AWS and it seems the changes are working fine. I was able to complete a request and the shared bucket now replaces the set of buckets we had previously.

@amarthadan
Contributor Author

Hi @amarthadan, I've run multiple deployments with different stages both on GCP and AWS and it seems the changes are working fine. I was able to complete a request and the shared bucket now replaces the set of buckets we had previously.

Just to be clear, you tried multiple deployments at the same time, right? Have you also tried multiple deployments with a different Airnode address? That's important as well.
Also, you now shouldn't be able to redeploy if you change the region in the configuration file. Please, test that as well.

@martinkolenic
Contributor

martinkolenic commented Sep 8, 2022

Hi @amarthadan, I've run multiple deployments with different stages both on GCP and AWS and it seems the changes are working fine. I was able to complete a request and the shared bucket now replaces the set of buckets we had previously.

Just to be clear, you tried multiple deployments at the same time, right?

If running the deploy commands shortly one after another counts, then yes.

Have you also tried multiple deployments with a different Airnode address? That's important as well.

No, I haven't - will do.

Also, you now shouldn't be able to redeploy if you change the region in the configuration file. Please, test that as well.

Will do.

@amarthadan
Contributor Author

If running the deploy commands shortly one after another counts, then yes.

No, I mean having actually 2 or more Airnodes deployed (with different stage names) at the same time, not being deployed at the same time. So they are running and have resources in the cloud at the same time. That's the main thing.

Contributor

@Siegrift Siegrift left a comment

Some minor comments, but I've only done a shallow review. Will try the feature properly later.

packages/airnode-deployer/src/infrastructure/aws.ts (outdated; resolved)
  cloudProvider: config.nodeSettings.cloudProvider as CloudProvider,
  stage: config.nodeSettings.stage,
  nodeVersion: config.nodeSettings.nodeVersion,
- timestamp,
+ timestamp: new Date().toISOString(),
Contributor

Mhhh, this will make the timestamp on the cloud provider different from what is in the receipt. Shouldn't those two be the same?

Contributor Author

I was thinking about that and didn't know which approach would be better. These two can vary by up to 15 min (GCP is so slooow). I think the timestamp in the receipt should be the time the deployment ended. But I guess that's up to interpretation. It could be the time it started 🤷‍♂️ But that's odd, isn't it?

Contributor

Won't users identify a particular deployment by that timestamp?

Contributor Author

You mean in the cloud? I don't think they will. First of all, they shouldn't need to deal with the cloud resources manually. And second, a deployment is distinguished by the Airnode address and stage, and your currently deployed version is the one with the latest timestamp, no matter what's in your receipt. But I get what you're saying; I can sync those two timestamps, I don't have a strong preference.
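A rough sketch of the identification scheme described here, assuming the bucket key layout `<airnodeAddress>/<stage>/<timestamp>/default.tfstate` that appears in the error messages later in this thread (the helper names are illustrative, not the deployer's actual API):

```typescript
// Illustrative only: these helpers are not the deployer's actual API.
// A deployment is identified by (airnodeAddress, stage); each deployment
// attempt gets its own millisecond-timestamp directory in the shared bucket,
// and the currently active version is the one with the latest timestamp.
interface DeploymentId {
  airnodeAddress: string; // e.g. "0x5691ceD2b03593a8d67BdfF4Fe96107bfe603825"
  stage: string; // e.g. "dev3"
}

function tfstateKey(id: DeploymentId, timestamp: number): string {
  return `${id.airnodeAddress}/${id.stage}/${timestamp}/default.tfstate`;
}

// Given all keys under one (address, stage) prefix, pick the newest attempt.
function latestTimestamp(keys: string[]): number {
  return Math.max(...keys.map((key) => Number(key.split("/")[2])));
}
```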

packages/airnode-deployer/src/utils/infrastructure.ts (outdated; resolved)
packages/airnode-deployer/src/utils/infrastructure.ts (outdated; resolved)
packages/airnode-deployer/src/infrastructure/gcp.ts (outdated; resolved)
@amarthadan amarthadan force-pushed the deployments-common-bucket branch 4 times, most recently from 62cbf7c to e8276b8 Compare September 12, 2022 07:58
@amarthadan amarthadan marked this pull request as ready for review September 12, 2022 07:58
@amarthadan amarthadan force-pushed the deployments-common-bucket branch 2 times, most recently from 5b587bd to 697d24f Compare September 12, 2022 09:31
Contributor

@aquarat aquarat left a comment

I find this change exciting and it's a lot of work. A very interesting read 🔥.
I enjoyed the path traversal reduce and comments around it.
I don't see any major issues, just questions and perhaps enhancements that may save us pain in future.

packages/airnode-deployer/src/cli/commands.ts (outdated; resolved)
packages/airnode-deployer/src/utils/infrastructure.ts (outdated; resolved)
packages/airnode-deployer/src/utils/infrastructure.ts (outdated; resolved)
packages/airnode-deployer/terraform/gcp/main.tf (outdated; resolved)
Contributor

@Siegrift Siegrift left a comment

I encountered an issue when running this which @amarthadan reproduced, will take another look when he debugs the issue.

@martinkolenic
Contributor

Hi @amarthadan and @Siegrift, it seems that despite my best efforts I did something wrong during testing and I can no longer reproduce the working results. I'm currently investigating.

@amarthadan
Contributor Author

@Siegrift It should be fixed now.

@martinkolenic
Contributor

martinkolenic commented Sep 12, 2022

@Siegrift It should be fixed now.

I managed to deploy and send a request, and it seems to be working correctly now. The fact that I did not catch the issue earlier might be due to a human error on my side, launching something different.

Contributor

@Siegrift Siegrift left a comment

TL;DR:

All basic ops seem to be working, but there were some unexpected failures along the way. I wrote notes for the full journey, maybe it helps... Will take a look at the code once more tomorrow.

Notes

  • Deployed Airnode with both gateways
    • success
  • I forgot to specify the HTTP trigger, so I redeployed with a modified config (same stage)
    • failed (because it was already there) and autoremove removed the original Airnode
    • this is an awkward side effect of autoremove, but it’s technically the correct thing
    • BUG: This should not have happened (but later this exact thing works again)

[screenshot]

  • Deployed again
    • success, both gateways work and RRP request works
  • Changed stage, deployed again
    • failed, because gateway keys already exist, but autoremove was successful
    • the error is kinda annoying, but I don't think we can do anything about it
    • previous airnode works
  • Changed secrets and redeployed
    • failed, got message “Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its minimum value of [100].”
    • The error is correct. The account's unreserved concurrency is 189, and the example reserves 100 for RRP and 20 per gateway = 140
    • TODO(new task): Decrease the example's reserved concurrency to 1
  • Changed “maxConcurrency” to 1
    • success, gateways + RRP works, aws folder structure is correct
    • I actually also changed the mnemonic so it deployed a completely different Airnode

[screenshot: bucket objects]

  • Created new secrets again and changed stage, but preserved mnemonic, deployed again
    • success, everything works

0x6abEdc0A4d1A79eD62160396456c95C5607369D3

  • Changed mnemonic to the old Airnode, but preserved the same stage (and forgot to update gateway keys)
    • wanted to stop deployment by quickly pressing ctrl+z, deployer got stuck so I killed it the hard way (stopped the docker)
    • TODO(task): The docker (or the spinner) should not get frozen when pressing ctrl+z
    • the original airnode is still operational (both gateways, didn’t try RRP)

[screenshot]

  • Created new gateway keys, trying to redeploy
    • failed, error: “Failed to copy file '0x5691ceD2b03593a8d67BdfF4Fe96107bfe603825/dev3/1663007611553/default.tfstate' to file '0x5691ceD2b03593a8d67BdfF4Fe96107bfe603825/dev3/1663007887640/default.tfstate' within S3 bucket 'airnode-6d46164e9ddd': NoSuchKey: The specified key does not exist.”
    • No idea what it means, autoremove worked
  • Trying again the same thing
    • both gateways work, aws structure is correct
  • removing the last airnode (and lost the receipts for others)
    • success
  • deploying again
    • success
  • deploying again (to verify re-deployment bug)
    • this time success

[screenshot]

  • trying to deploy again 2x
    • success, success
  • change region, deploy
    • deploy fail, autoremove fail (which is actually a success)

[screenshot]

  • changed config.json by disabling http gateway, deploy
    • deploy fail, autoremove success (so it removed the airnode)

[screenshot]

  • changed the config back, successfully redeployed, changed the config to disable one of the gateways, deployed again
    • worked
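The ReservedConcurrentExecutions failure in the notes above follows from AWS's rule that an account's unreserved concurrency may not drop below 100. A minimal sketch of that arithmetic (numbers taken from the notes; the function name is illustrative):

```typescript
// AWS rejects a ReservedConcurrentExecutions setting that would push the
// account's unreserved concurrency below its minimum of 100.
const UNRESERVED_MINIMUM = 100;

function canReserve(unreservedAvailable: number, requested: number): boolean {
  return unreservedAvailable - requested >= UNRESERVED_MINIMUM;
}

// The failing deploy: 189 unreserved, requesting 100 (RRP) + 2 gateways × 20.
const requested = 100 + 2 * 20; // 140; 189 - 140 = 49 < 100, hence the error
```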

@amarthadan
Contributor Author

@Siegrift Thanks for the thorough test.

As far as I can tell these are the issues that need to be investigated/fixed:

  1. The docker (or the spinner) should not get frozen when pressing ctrl+c
  2. Failed to copy file issue - This most likely happened because you forcefully stopped the deployment. But I agree that we can prevent at least some cases.
  3. changed config.json by disabling http gateway, deploy -> deploy fail - I have no idea what this is, will investigate.

As for the topmost issue (screenshot), I don't think that happened because the Airnode was already there. From the screenshot, it looks like a network error. There's not much we can do about that. Or are they two different issues?

If you're ok with it I would address these in separate PRs. Also, do these need to be fixed for 0.9 to be released? That's more of a question for @aquarat I guess. I would like to address them but if we're waiting for 0.9 they are not critical and can be fixed later.
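For reference on issue 2: the quoted "Failed to copy file" error only swaps the timestamp segment of the tfstate key. A hypothetical helper (not the deployer's actual code) showing that transformation:

```typescript
// Hypothetical helper, not the deployer's actual code: derive the destination
// tfstate key for a new deployment attempt by replacing only the timestamp
// segment of an `<address>/<stage>/<timestamp>/default.tfstate` key.
function retimestampTfstateKey(sourceKey: string, newTimestamp: number): string {
  const [address, stage, , ...rest] = sourceKey.split("/");
  return [address, stage, String(newTimestamp), ...rest].join("/");
}
```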

@Siegrift
Contributor

If you're ok with it I would address these in separate PRs.

Yeah. Btw. as you can see, I only tested the AWS part. Leaving the GCP testing to @martinkolenic.
I'll take one more look at the code and give the final review...

Contributor

@Siegrift Siegrift left a comment

Nice work. I've left a few comments and typo fixes, and I am not sure about updating using a different nodeVersion...

@@ -15,7 +15,7 @@
  "clean": "rimraf -rf *.tsbuildinfo ./dist ./.webpack ./build *.tgz",
  "cli": "ts-node bin/deployer.ts",
  "compile": "tsc --build tsconfig.json",
- "copy:terraform": "copyfiles terraform/**/**/**/*.tf terraform/**/**/**/*.tpl dist/",
+ "copy:terraform": "copyfiles terraform/**/**/*.tf terraform/**/**/*.tpl dist/",
Contributor

Mhh, is the pattern "/**/" (x2) necessary? The ** stands for any nested directory, so the repeated /**/ feels redundant.

Contributor Author

You would think so, but it doesn't work that way for copyfiles. I can double-check, but I'm pretty sure I had to go as deep as the directory structure was in order to not miss any files.

Contributor Author

Yes, it needs to be like this 🤷‍♂️

packages/airnode-deployer/src/infrastructure/aws.test.ts (outdated; resolved)
packages/airnode-deployer/src/infrastructure/aws.test.ts (outdated; resolved)
packages/airnode-deployer/src/infrastructure/aws.test.ts (outdated; resolved)
packages/airnode-deployer/src/infrastructure/index.ts (outdated; resolved)
packages/airnode-deployer/src/infrastructure/index.ts (outdated; resolved)
@martinkolenic

This comment was marked as resolved.

@amarthadan
Contributor Author

Hi @amarthadan, with the newest commit fa7340ca, running yarn build in the root directory of the repo results in a failure to build airnode-deployer, see screenshot.

That was just a temporary state as I accepted the suggested changes. Should be fixed now.

Contributor

@Siegrift Siegrift left a comment

👍 LGTM, but haven't tested the GCP side.

btw. thx for not squashing the last commits.


@martinkolenic
Contributor

Hi, @amarthadan, my test results (GCP):

  • deploy Airnode with both gateways enabled - PASS
  • redeploy with a changed stage (and new API keys) - PASS
  • redeploy with a changed mnemonic (same stage) - FAIL
    • just like @Siegrift, I forgot to change the API keys so I tried to stop the deployment with Ctrl+Z, which did not work (the spinner kept spinning though) and so I killed the process after about 34 seconds.
    • I hit the quota for regional per-project gateways, screenshot below
    • Autoremove succeeded though; only the latest Airnode things were removed
      [screenshot]

The folder structure was correct, though: all Airnodes were kept in the same bucket and all the functions in another bucket. The error I encountered was only due to a quota set on GCP's side, so that's not on us.

Successfully merging this pull request may close these issues.

Rework deployments to use a common bucket