---
title: openstack-upi
authors:
- "@tomassedovic"
reviewers:
- "@mandre"
- "@luis5tb"
- "@wking"
- "@cuppett"
approvers:
- TBD
creation-date: 2019-09-10
last-updated: 2019-12-03
status: implementable
---

# OpenStack UPI

## Release Signoff Checklist

- [x] Enhancement is `implementable`
- [x] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [x] Graduation criteria
- [ ] User-facing documentation is created in [openshift/docs]

## Summary

The initial OpenStack support for OpenShift 4 centered on the IPI
workflow. While this is the most convenient approach for OpenStack
end-users, there are situations where it falls short. People who want
to use OpenStack UPI today have to fall back on
the [bare metal][baremetal-upi] document or try to translate concepts
from the AWS one.

[baremetal-upi]: https://github.com/openshift/installer/blob/master/docs/user/metal/install_upi.md

Similar to AWS and GCP, we want to provide the documentation and
automation (via Ansible) that is specific to OpenStack -- using the
terminology and tooling that OpenStack people are familiar with and
highlighting considerations specific to that platform.

## Motivation

In addition to the general use cases supported by UPI (e.g. control
over the creation of the resources of the cloud provider, custom
Ignition configuration), there are additional reasons for having an
OpenStack-specific UPI.

First, unlike fully hosted solutions such as AWS or GCP, there is no
single OpenStack.

Some deployments are public clouds akin to AWS while others are only
available within a single organisation, possibly disconnected from the
internet. Moreover, the actual services and configuration can vary
greatly between deployments.

The OpenStack UPI should be able to support scenarios such as these:

* Lack of Swift Object Storage (which we use for serving the bootstrap
Ignition config as well as the registry storage)
* No floating IP addresses (e.g. in deployments with provider
networks)
* Disconnected from the internet
* Desire to integrate all traffic with a load balancer or DNS that
already exists within the organisation running the OpenStack

### Goals

* OpenStack UPI documentation available under:
https://github.com/openshift/installer/blob/master/docs/user/openstack/install_upi.md
* Ansible playbooks for automating the OpenStack resource creation
available under:
https://github.com/openshift/installer/tree/master/upi/openstack
* CI job executing these playbooks running wherever the AWS and GCP
jobs run
* Optional documentation and playbook sections for Kuryr

### Non-Goals

Configurations and integrations that are not currently supported by
the OpenStack IPI are out of scope. For example, detailed steps for
provider network support, using Octavia or Designate for the load
balancer / DNS, custom storage integration and so on can be added as
optional sections once the initial UPI work is complete.


## Proposal

The proposal consists of the following work items:

* Write the Ansible playbooks that automate the OpenStack resource
  creation (networks, subnets, ports, servers, etc.)
* Add the Ansible dependencies to the OpenStack CI image
* Write the UPI document, linking the Ansible templates
* Create a UPI job in the [Release repository][openshift-release]
* Add Kuryr documentation and playbooks as an optional SDN alternative

[openshift-release]: https://github.com/openshift/release/

### User Stories

#### Deployment without Swift

In OpenStack deployments without Swift or its `tempurl` support, the
IPI deployment will fail when it tries to upload the bootstrap Ignition
file to the object store. With UPI, we can generate the Ignition
configs separately, upload them to a different location and configure
the servers that way.

The person following the UPI process could upload the bootstrap
Ignition config to any other location and specify that URL when
booting up the bootstrap node:

1. Create `install-config.yaml`, `manifests` and `ignition-configs`
1. Upload the bootstrap Ignition config file to a location accessible
by the OpenStack servers you are going to create
* This can be an internal object storage service, a local HTTP server, etc.
1. Run the UPI Ansible playbooks that create the networking, security
   groups and OpenStack servers, following the UPI documentation (see
   the sketch after this list)
1. After all the servers boot up, feel free to remove the Ignition
configs
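
The following is a minimal sketch of what this could look like. The
host names, the `upi` asset directory and the `bootstrap-shim.ign`
file are illustrative assumptions, not something this proposal
prescribes:

```sh
# Generate the installer assets
openshift-install create install-config --dir=upi
openshift-install create manifests --dir=upi
openshift-install create ignition-configs --dir=upi

# Upload the bootstrap Ignition config to any HTTP location reachable
# from the OpenStack servers (an internal web server in this example)
scp upi/bootstrap.ign webhost:/var/www/html/bootstrap.ign

# Boot the bootstrap server with user-data pointing at that URL.
# "bootstrap-shim.ign" stands for a small Ignition file that merges in
# http://webhost.example.com/bootstrap.ign -- the full bootstrap config
# is generally too large to pass as Nova user-data directly.
openstack server create \
  --image rhcos --flavor m1.xlarge \
  --user-data bootstrap-shim.ign \
  --network openshift-nodes \
  openshift-bootstrap
```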


#### Deployment without Floating IP addresses

If the OpenStack cloud uses [provider networks][provider-networks],
any server (VM) created is already accessible via its fixed IP
address, and tenant networks, subnets and floating IP addresses might
not even be available.

The IPI installation would fail trying to create these.

[provider-networks]: https://docs.openstack.org/networking-ovn/queens/admin/refarch/provider-networks.html

A UPI process would get around this by simply not creating these and
relying on the fact that everything is networked already:

1. Create `install-config.yaml`, `manifests` and `ignition-configs`
1. Modify the Ansible playbooks to not create tenant networks, subnets
   or floating IP addresses (a sketch follows after this list)
1. Upload the bootstrap Ignition file to Swift or any other HTTP
   storage available to the servers
1. Run the modified Ansible playbooks
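
As an illustration, a modified playbook could create the server ports
directly on a pre-existing provider network instead of creating tenant
networks and subnets. The module, variable and network names below are
assumptions made for the sake of the example:

```yaml
# Hypothetical fragment: attach the control plane ports to an existing,
# routable provider network rather than a newly created tenant network.
- name: Create the master ports on the provider network
  os_port:
    name: "{{ cluster_id }}-master-port-{{ item }}"
    network: "{{ provider_network }}"
    security_groups:
      - "{{ cluster_id }}-master"
  loop: [0, 1, 2]
```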


### Implementation Details/Notes/Constraints

#### Scope

The UPI implementation covered here will mirror the existing OpenStack
IPI deployment (with and without Kuryr). The initial goal is to
approach feature parity with IPI.

It will point out places where the OpenShift administrator can specify
their custom configuration, but any specific steps, customisation or
automation not supported by IPI are out of scope for this effort.

For the initial implementation, the only network topology being tested
will be a typical VXLAN-based tenant network setup. In other words,
OpenStack end-users will create their own Neutron networks and subnets
and provide external access via Neutron routers and Floating IP
addresses.
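
For illustration, this assumed topology roughly corresponds to
resources such as the following (the names, the external network and
the CIDR are placeholders):

```sh
openstack network create openshift-nodes
openstack subnet create --network openshift-nodes \
  --subnet-range 10.0.0.0/16 openshift-nodes-subnet
openstack router create openshift-router
openstack router set --external-gateway external openshift-router
openstack router add subnet openshift-router openshift-nodes-subnet
openstack floating ip create external
```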

Any other topologies (explicit spine and leaf, flat provider networks,
etc.) will not be described or tested initially.

#### Automation

The automation for the OpenStack resource creation (networks, subnets,
ports, servers, security groups, etc.) will be provided via Ansible
playbooks.

The AWS UPI uses CloudFormation. While OpenStack has a similar project
(called Heat), a lot of existing OpenStack deployments do not run it.
It needs to be set up by the OpenStack operators and if it's not
available, there's nothing the end-user can do about it.

Ansible is a tool system administrators already tend to be familiar
with, and it does not depend on any projects or code running inside
OpenStack itself.
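
A possible way of running the playbooks is sketched below; the
playbook and inventory file names are illustrative and the final
layout will be decided as part of the implementation:

```sh
cd upi/openstack
ansible-playbook -i inventory.yaml network.yaml
ansible-playbook -i inventory.yaml security-groups.yaml
ansible-playbook -i inventory.yaml bootstrap.yaml
ansible-playbook -i inventory.yaml control-plane.yaml
```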

#### Kuryr

Just like in OpenStack IPI, Kuryr is an SDN that can optionally be
used to improve the networking performance of pod-to-pod traffic.

It is something the OpenShift deployer needs to opt into, but it is in
scope for this enhancement.

We will provide Ansible templates to create the extra resources Kuryr
requires, and document the steps necessary to enable it as well as the
extra dependencies and quota requirements.
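
As with IPI, opting into Kuryr is expected to happen through the
cluster network type in `install-config.yaml`, roughly as sketched
below (the surrounding fields are omitted):

```yaml
networking:
  networkType: Kuryr
```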

### Risks and Mitigations

The UPI work will mirror the IPI process closely, just with
documentation and Ansible instead of the fully automated process done
by the installer. As such, we do not anticipate any specific risks or
mitigations connected to the UPI process itself.

Running it in the CI does pose additional risks, however.

The OpenStack cloud which hosts the CI has a limited capacity
available to us. Adding more jobs can exceed that capacity and might
require increasing the quota (which is something the OpenStack team
was granted in the past, but cannot be taken for granted).

This will be exacerbated by the potential addition of Kuryr CI (IPI
and UPI), as Kuryr creates vastly more networks and subnets, as well
as a number of load balancers.

In addition, any future work that aims to extend the UPI by e.g.
integrating with OpenStack configurations or projects might require
additional quota or features that the current OpenStack CI provider
does not provide.

## Design Details

### Test Plan

In general, the testing strategy should follow the existing UPI
platforms (e.g. AWS, GCP):

- There will be a new e2e job exercising the UPI Ansible templates
- This proposal does not expect to make changes to the existing
  codebase, so there is no expectation of unit tests. If we do end up
  making changes, we will add unit tests as appropriate.

### Graduation Criteria

The UPI process in general is well understood and the OpenStack
platform is not proposing to do anything special. As such, we propose
to go `GA` when CI end-to-end jobs exist and the feedback from CI, QE
and people trying it is positive.

More specifically:

- The UPI document is published in an OpenShift repository
- The OpenStack developers have successfully deployed a UPI OpenShift
  cluster following the document
- The UPI document and Ansible playbooks exist
- They have both been validated by more than one developer
- The deployment has been verified on more than one OpenStack cloud
- The CI jobs exist and are being exercised regularly
- The end-to-end jobs are stable and passing with the required rate
  (same as the IPI jobs)
- The UPI deployment process has been followed by people outside of
the development team

We intend to target General Availability in the 4.4 release.

### Upgrade Strategy

Upgrades that will not require changes to the underlying topology of
the OpenStack resources should rely on standard OpenShift upgrade
mechanisms.

If a new OpenShift release does require changes to the OpenStack
resources (e.g. a new Neutron subnet, additional ports, etc.) these
changes will have to be added into the UPI document and OpenShift
errata.

We will attempt to provide adequate upgrade testing in the CI
(focusing on upgrades from the one-before-last release to the latest),
but this might be limited by the availability of CI resources.

## Implementation History

Major milestones in the life cycle of a proposal should be tracked in
`Implementation History`.

## Drawbacks

Implementing this feature requires development and testing resources
that could otherwise be utilised elsewhere. In addition, it increases
the support surface of the OpenShift and OpenStack integration.

## Alternatives

### Don't Do It

People whose use cases are not covered by OpenStack IPI can follow the
Bare Metal UPI document. The drawback of this approach is the extra
work on the part of the person deploying OpenShift.

They will need to figure out the OpenStack-specific parts on their
own, and there will be no automation in place to help them out.

### Extend the OpenStack IPI support

We could identify the most common use cases that drive OpenStack users
towards UPI and add them to the IPI installer.

This would likely increase the number of configuration options we need
to support (going against the IPI spirit of using only the absolute
minimum configuration necessary).

Moreover, this approach can be taken in addition to the UPI work --
by providing the UPI documentation and scripts, we can look for common
usage patterns and consider including them in the IPI installer on a
case-by-case basis.


## Infrastructure Needed

The OpenStack developers have all the infrastructure they need right
now. The same OpenStack clouds used for the IPI development can be
used for UPI as well.

Depending on the CI resource usage, they might need to add additional
quota. In addition, any subsequent integrations of additional
projects, storage solutions etc. will require infrastructure that
supports them -- both for dev and CI.
