
FAQ: AWS private regions #396

Merged

Conversation

@cgwalters (Member)

This has come up a few times.

@cgwalters (Member Author)

Didn't test this e2e, but this is how it should work.

@cgwalters (Member Author)

Some people are hitting:

```
IMPORTIMAGETASKS        x86_64  rhcos-43.81.201912030353.0-aws.x86_64.vmdk import-ami-fg6w6vod     Linux   deleted ClientError: EFI partition detected. UEFI booting is not supported in EC2.
```

Offhand, I don't know why our pipeline isn't hitting this. One theory is that it's somehow specific to the `aws` CLI tool, but that seems unlikely. Another possibility is that only newer EC2 regions implement this check; we upload to the One True Region (us-east-1) and replicate from there.

@cgwalters (Member Author)

All that said, maybe we should just delete the UEFI partition in our EC2 images. Part of me feels that it unnecessarily breaks the uniformity we have across platform images, and it's also explicitly against where we want to go in the future (using UEFI more across the board), but...

@darkmuggle

Uniformity for a dead UEFI partition that we know won't be used doesn't really buy us much; in this case the uniformity is academic. Dropping the UEFI partition seems like a minor change: we can just delete the partition and not change anything else, and with GPT we can still keep root on partition 4.

@jlebon (Member) commented Jan 14, 2020

I think the property of having each platform image be a simple transform step away from the others is really nice, but I'm not strongly opposed to dropping it. Something subtle is going on here, though, if neither RHCOS nor FCOS hits this in our pipeline. Probably worth investigating a bit before using the nuclear option?

Note also if we do this, we'll probably have to adapt the mount generator too.

@jaredhocutt

Due to the UEFI issue described by @cgwalters, I've been working on a workaround for getting an RHCOS image into AWS manually (especially in private AWS regions). Here are the details of how I took the RHCOS bare metal BIOS raw image and modified it to work: https://github.com/jaredhocutt/openshift4-aws/tree/master/rhcos#how-we-got-it-to-work

It's not ideal and I wouldn't expect anyone to actually do it that way if they want a supported cluster, but I did want to pass along what I've figured out.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 22, 2020
@cgwalters (Member Author)

> Here are the details of how I took the RHCOS bare metal BIOS raw image and modified it to work: https://github.com/jaredhocutt/openshift4-aws/tree/master/rhcos#how-we-got-it-to-work

Eek, no, please don't do it that way. By snapshotting a booted system, you've captured per-machine state like the SSH host keys, random seed, etc., so every machine will share them.

What you want to do is zap the partition offline. You should be able to do this by taking the raw VMDK file and running any partitioning tool (`fdisk`, `sgdisk`, etc.) against it.

You'll then have a failed systemd unit on startup looking for it, as mentioned above, so you'd probably also need to replace the partition with a non-FAT one. (Or disable the unit, but that's a bit awkward to do in a way that persists across upgrades.)
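
A minimal sketch of that offline zap, untested, with the partition number as an assumption (verify which partition is the EFI System Partition with `sgdisk -p` first):

```sh
# Convert the AWS VMDK to a raw image we can edit directly.
qemu-img convert -f vmdk -O raw rhcos-aws.x86_64.vmdk rhcos.raw

# Print the GPT and find the EFI System Partition (type code EF00).
sgdisk -p rhcos.raw

# Delete it; "2" is an assumption -- use the number sgdisk actually printed.
sgdisk --delete=2 rhcos.raw

# Convert back to the stream-optimized VMDK that EC2 import expects.
qemu-img convert -f raw -O vmdk -o subformat=streamOptimized rhcos.raw rhcos-noefi.vmdk
```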

We're discussing potential upstream fixes here.

@jaredhocutt

> Eek, no, please don't do it that way. By snapshotting a booted system, you've captured per-machine state like the SSH host keys, random seed, etc., so every machine will share them.

I mounted it as a secondary disk and did not boot it. So I did exactly what you said, just by attaching it to an EC2 instance instead of doing it on my laptop.

@cgwalters (Member Author)

> I mounted it as a secondary disk and did not boot it. So I did exactly what you said, just by attaching it to an EC2 instance instead of doing it on my laptop.

Got it, sorry. Yes, that's fine.

@jaredhocutt

I was also able to figure out how to get the 4.3 AWS VMDK image to work, which I've added to the same GitHub page just below the details for the 4.2 bare metal image.

The big issue is that with the current 4.3 AWS VMDK, you cannot use `aws ec2 import-image` to import the image as-is, because it fails with the EFI partition detected error. The `aws ec2 import-image` command tries to "help" you by inspecting the image, and as soon as it sees the EFI partition, it fails.

However, I was able to import the image as a plain snapshot using `aws ec2 import-snapshot` and then use `aws ec2 register-image` to create the AMI. That worked, and I was able to boot the image with a bootstrap.ign file (I didn't go through with a full install though).

So this works for now, but we really need an image that works with `aws ec2 import-image` as-is.
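
For reference, a sketch of that snapshot-then-register workflow with the `aws` CLI; the bucket, key, image name, and snapshot ID below are placeholders, and the device names may need adjusting:

```sh
# Stage the VMDK in S3 (bucket and key are hypothetical).
aws s3 cp rhcos-aws.x86_64.vmdk s3://my-bucket/rhcos.vmdk

# Import it as a plain EBS snapshot -- no image inspection happens on this path.
aws ec2 import-snapshot \
    --description "rhcos-4.3" \
    --disk-container "Format=vmdk,UserBucket={S3Bucket=my-bucket,S3Key=rhcos.vmdk}"

# Poll until the task completes, then note the resulting SnapshotId.
aws ec2 describe-import-snapshot-tasks

# Register an AMI on top of the snapshot (snap-... is a placeholder).
aws ec2 register-image \
    --name rhcos-4.3 \
    --architecture x86_64 \
    --virtualization-type hvm \
    --ena-support \
    --root-device-name /dev/xvda \
    --block-device-mappings "DeviceName=/dev/xvda,Ebs={SnapshotId=snap-0123456789abcdef0}"
```

The Ignition config can then be passed as instance user data at launch, e.g. `aws ec2 run-instances ... --user-data file://bootstrap.ign`.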

@jlebon (Member) commented Jan 22, 2020

> However, I was able to import the image as a plain snapshot using `aws ec2 import-snapshot` and then use `aws ec2 register-image` to create the AMI

Ahh yup, this matches up with what `ore aws upload` does, so it makes sense that this works. (And actually, anyone can use `ore` to do this, but clearly the `aws` CLI is the standard tool.)

> So this works for now, but we really need an image that works with `aws ec2 import-image` as-is.

Hmm, it might be worth asking AWS to refine that API to not reject images that have an EFI partition when they also have a BIOS boot partition. Or, barring that, some kind of "I know what I'm doing" flag.

@cgwalters (Member Author)

> However, I was able to import the image as a plain snapshot using `aws ec2 import-snapshot` and then use `aws ec2 register-image` to create the AMI. That worked, and I was able to boot the image with a bootstrap.ign file (I didn't go through with a full install though).

If that works consistently, then I think it's much simpler to just document it. I'll update this PR. Further, for OpenShift, the installer should have a high-level command for this.

This has come up a few times.
@cgwalters cgwalters force-pushed the faq-aws-private-region branch from 12aa9ad to 67f1cb7 on January 23, 2020 16:28
@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 23, 2020
@jaredhocutt

> If that works consistently, then I think it's much simpler to just document it. I'll update this PR. Further, for OpenShift, the installer should have a high-level command for this.

It may be simpler to document, but it's not how AWS users expect it to work. The AWS documentation describes `aws ec2 import-image` as the way to import AMIs, so the RHCOS AWS image should be in a format that works with that command.

@jlebon (Member) commented Jan 24, 2020

> It may be simpler to document, but it's not how AWS users expect it to work. The AWS documentation describes `aws ec2 import-image` as the way to import AMIs, so the RHCOS AWS image should be in a format that works with that command.

I've opened a support case with AWS to fix the problematic `ImportImage` heuristic.

@jlebon (Member) commented Jan 24, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 24, 2020
@openshift-merge-robot openshift-merge-robot merged commit a9f3a42 into openshift:master Jan 24, 2020
@jlebon (Member) commented Jan 27, 2020

Got a response from AWS about this. Essentially, the `ImportImage` API isn't just the equivalent of `ImportSnapshot` + `RegisterImage`. It's really part of the VM import/export path for people migrating existing workloads to AWS: https://docs.aws.amazon.com/vm-import/latest/userguide/vmie_prereqs.html

As such, the API is much more invasive. For example, for Windows images it'll detect UEFI boot partitions and convert them to MBR; it doesn't support Linux UEFI images at all.

But the point is that there's a mismatch of intent: the API's goal is to apply automatic conversion heuristics, which we don't want. So overall, I think we should stick with the `ImportSnapshot` workflow to be sure the final image is exactly as we intend.

@jaredhocutt

@jlebon Thanks for the update. In that case, when we document this method, it would be nice to do two things:

  1. Describe the steps for getting an RHCOS image into AWS, and also link to this page in the AWS documentation for reference: https://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-import-snapshot.html

  2. Specifically call out that the VM import method documented by AWS will not work. I suspect a lot of people will try that first, and when they finally come back to read our documentation, it would be nice for them to find an explicit statement that VM import will not work and that they should follow these instructions instead.

jlebon added a commit to jlebon/os that referenced this pull request Jan 31, 2020
Flesh things out a bit more based on discussions in
openshift#396.
@jlebon (Member) commented Jan 31, 2020

@jaredhocutt I posted a follow-up here: #398.

@jaredhocutt

> @jaredhocutt I posted a follow-up here: #398.

Awesome! Thanks @jlebon :)

@dmc5179 commented May 18, 2020

The AMI export process fails for the same reason: the UEFI partition. This means it is not possible to get RHCOS images onto AWS Snowball Edge devices. I've spoken with the AWS TAMs at the NGA, and they said it is not possible to import snapshots and register images against Snowball Edge devices the way you can in standard AWS regions, as described in this issue.

cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this pull request May 18, 2020
Nothing in the OS touches the ESP by default, so there's
no reason to mount it by default, particularly writable.
This is good for avoiding wear & tear on the filesystem, but
I am specifically doing this as preparation for potentially
removing the ESP from AWS images, because AWS `ImportImage`
chokes on its presence:
openshift/os#396
cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this pull request May 25, 2020
Preparation for potentially removing the ESP from AWS images,
because AWS `ImportImage` chokes on its presence:
openshift/os#396
@jlebon
Copy link
Member

jlebon commented Jan 14, 2021

Re: Snowball, it looks like the EFI partition is no longer an issue, as Dan mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1794157#c14.
