Store deployments in a common bucket #1424
Conversation
Hi @amarthadan, I've run multiple deployments with different stages both on GCP and AWS and it seems the changes are working fine. I was able to complete a request and the shared bucket now replaces the set of buckets we had previously.
Just to be clear, you tried multiple deployments at the same time, right? Have you also tried multiple deployments with a different Airnode address? That's important as well.
If running the deploy commands shortly one after another counts, then yes.
No, I haven't - will do.
Will do.
No, I mean actually having 2 or more Airnodes deployed (with different stage names) at the same time, not being deployed at the same time. So they are running and have resources in the cloud at the same time. That's the main thing.
Some minor comments, but I've only done a shallow review. Will try the feature properly later.
cloudProvider: config.nodeSettings.cloudProvider as CloudProvider,
stage: config.nodeSettings.stage,
nodeVersion: config.nodeSettings.nodeVersion,
timestamp,
timestamp: new Date().toISOString(),
Hmm, this will make the timestamp on the cloud provider different from what is in the receipt. Shouldn't those two be the same?
I was thinking about that and wasn't sure which approach would be better. These two can vary by up to 15 min (GCP is so slow). I think the timestamp in the receipt should be the time the deployment ended, but I guess that's up to interpretation. It could be the time it started 🤷♂️ but that seems odd, doesn't it?
Won't users identify a particular deployment by that timestamp?
You mean in the cloud? I don't think they will. First of all, they should not need to deal with the cloud resources manually. And second, the deployment is distinguished by Airnode address and stage. Your currently deployed version is the one with the latest timestamp, no matter what's in your receipt. But I get what you're saying, I can sync those two timestamps, I don't have a strong preference.
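For illustration, a minimal TypeScript sketch of what syncing the two timestamps could look like (the variable names and the receipt shape are hypothetical, not the actual deployer code): the value is generated once and reused for both the bucket path and the receipt.

// Hypothetical sketch: generate the timestamp once per deployment and reuse it.
const airnodeAddress = '0x...';                // placeholder Airnode address
const stage = 'dev';
const timestamp = Date.now();                  // single source of truth for this deployment
// Cloud path: <Airnode address>/<stage>/<timestamp>/...
const bucketKey = `${airnodeAddress}/${stage}/${timestamp}/default.tfstate`;
// Receipt: the same instant, just formatted as an ISO string
const receipt = { airnodeAddress, stage, timestamp: new Date(timestamp).toISOString() };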
I find this change exciting and it's a lot of work. A very interesting read 🔥.
I enjoyed the path traversal reduce and the comments around it.
I don't see any major issues, just questions and perhaps enhancements that may save us pain in the future.
I encountered an issue when running this, which @amarthadan reproduced. I'll take another look once he debugs it.
Hi @amarthadan and @Siegrift, it seems that despite my best efforts I did something wrong during testing and I cannot reproduce the functioning results now. I'm currently investigating it.
@Siegrift It should be fixed now.
I managed to deploy and send a request, and it seems to be working correctly now. The fact that I did not catch the issue earlier might be due to human error on my side, launching something different.
TL;DR:
All basic ops seem to be working, but there were some unexpected failures along the way. I wrote notes for the full journey, maybe they help... Will take a look at the code once more tomorrow.
Notes
- Deployed Airnode with both gateways
- success
- I forgot to specify the HTTP trigger, so I redeployed with a modified config (same stage)
- failed (because it was already there) and autoremove removed the original Airnode
- this is an awkward side effect of autoremove, but it’s the technically correct thing
- BUG: This should not have happened (but later this exact thing worked again)
- Deployed again
- success, both gateways work and RRP request works
- Changed stage, deployed again
- failed because the gateway keys already existed, but autoremove was successful
- the error is kind of annoying, but I don't think we can do anything about it
- the previous Airnode works
- Changed secrets and redeployed
- failed, got message
“Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its minimum value of [100].”
- The error is correct. The account's unreserved concurrency is 189, and the example reserves 100 for RRP and 20 per gateway = 140 (see the sketch after these notes)
- TODO(new task): Decrease the example's reserved concurrency to 1
- failed, got message
- Changed “maxConcurrency” to 1
- success, gateways + RRP work, AWS folder structure is correct
- I actually also changed the mnemonic so it deployed a completely different Airnode
- Created new secrets again and changed stage, but preserved mnemonic, deployed again
- success, everything works
- Changed mnemonic to the old Airnode, but preserved the same stage (and forgot to update gateway keys)
- wanted to stop the deployment by quickly pressing ctrl+z, the deployer got stuck so I killed it the hard way (stopped the Docker container)
- TODO(task): The Docker container (or the spinner) should not freeze when pressing ctrl+z
- the original Airnode is still operational (both gateways, didn't try RRP)
- Created new gateway keys, trying to redeploy
- failed, error: “Failed to copy file '0x5691ceD2b03593a8d67BdfF4Fe96107bfe603825/dev3/1663007611553/default.tfstate' to file '0x5691ceD2b03593a8d67BdfF4Fe96107bfe603825/dev3/1663007887640/default.tfstate' within S3 bucket 'airnode-6d46164e9ddd': NoSuchKey: The specified key does not exist.”
- No idea what it means, autoremove worked
- Tried the same thing again
- both gateways work, AWS structure is correct
- removed the last Airnode (and lost the receipts for the others)
- success
- deploying again
- success
- deploying again (to verify the redeployment bug)
- this time success
- trying to deploy again 2x
- success, success
- change region, deploy
- deploy failed, autoremove failed (which is actually a success)
- changed config.json by disabling the HTTP gateway, deployed
- deploy failed, autoremove succeeded (so it removed the Airnode)
- changed the config back, successfully redeployed, changed the config to disable one of the gateways, deployed again
- worked
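The concurrency failure above comes down to simple arithmetic; here is a minimal TypeScript sketch of the check AWS is effectively applying (the numbers are the ones from the test account, and the snippet is only illustrative, not part of the deployer):

// AWS Lambda requires at least 100 unreserved concurrent executions to remain after all reservations.
const unreservedPool = 189;                    // account's unreserved concurrency before this deployment
const requested = 100 + 2 * 20;                // 100 reserved for the RRP handler + 20 per gateway = 140
const remaining = unreservedPool - requested;  // 49
console.log(remaining >= 100 ? 'reservation accepted' : 'AWS rejects the reservation'); // rejects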
@Siegrift Thanks for the thorough test. As far as I can tell these are the issues that need to be investigated/fixed:
As for the topmost issue (screenshot), I don't think that happened because the Airnode was already there. From the screenshot, it looks like a network error. There's not much we can do about that. Or are they two different issues? If you're ok with it, I would address these in separate PRs. Also, do these need to be fixed for 0.9 to be released? That's more of a question for @aquarat, I guess. I would like to address them, but if we're waiting for 0.9 they are not critical and can be fixed later.
Yeah. Btw, as you can see, I only tested the AWS part. Leaving the GCP testing to @martinkolenic.
Nice work. I've left a few comments and typo fixes, and I am not sure about updating using a different nodeVersion...
@@ -15,7 +15,7 @@
"clean": "rimraf -rf *.tsbuildinfo ./dist ./.webpack ./build *.tgz",
"cli": "ts-node bin/deployer.ts",
"compile": "tsc --build tsconfig.json",
"copy:terraform": "copyfiles terraform/**/**/**/*.tf terraform/**/**/**/*.tpl dist/",
"copy:terraform": "copyfiles terraform/**/**/*.tf terraform/**/**/*.tpl dist/",
Hmm, is the "/**/" pattern (x2) necessary? The ** already stands for any nested directory, so repeating it feels redundant.
You would think so, but it doesn't work that way for copyfiles. I can double-check, but I'm pretty sure I had to go as deep as the directory structure in order to not miss any files.
Yes, it needs to be like this 🤷♂️
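For illustration, assuming a layout two directories deep under terraform/ (the names here are made up, not the actual repository structure), the patterns from the script match like this under the copyfiles behavior described above:

terraform/
  aws/
    rrp/
      main.tf        <- picked up by terraform/**/**/*.tf
      handler.tpl    <- picked up by terraform/**/**/*.tpl

With standard globbing a single ** would already cover any nesting depth, but as noted above, copyfiles only picked the files up when the pattern spelled out each directory level explicitly.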
That was just a temporary state as I accepted the suggested changes. Should be fixed now.
👍 LGTM, but I haven't tested the GCP side.
Btw, thanks for not squashing the last commits.
Hi @amarthadan, my test results (GCP):
The folder structure was correct, though: all Airnodes were kept in the same bucket and all the functions in another bucket. The error I encountered was only due to a quota set on GCP's side, so that's not on us.
Close #1402
So sorry about the massive commit 😞 I was changing stuff all over the place and didn't really split it into meaningful pieces.
I would suggest reviewing the code in airnode-deployer/src/infrastructure as a whole, not looking at the diff, as those files were pretty much rewritten. I'm still missing tests (and the current ones are broken), but I wanted to open a draft PR for review while I finish them.