Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for application gateways with multiple backend pools #644

Closed
Justin-W opened this issue Nov 10, 2021 · 9 comments · Fixed by #651 or cloudfoundry/docs-bosh#761
Closed

Support for application gateways with multiple backend pools #644

Justin-W opened this issue Nov 10, 2021 · 9 comments · Fixed by #651 or cloudfoundry/docs-bosh#761

Comments

@Justin-W
Copy link
Contributor

Is your feature request related to a problem? Please describe.

A single Azure Application Gateway can have multiple backend pools of VMs, and route traffic to different pools in various ways (e.g. path-based routing) so that a single APG front-end handles all traffic for multiple backends. However, it appears that the CPI's existing support for adding VMs to APGs (via the application_gateway property of vm_types/vm_extensions config data) which was added in #169 and #312 always attaches any Bosh-created VMs to the first backend pool, making it impossible to configure Bosh to manage the VMs of any APG backend pool(s) except the first pool, which currently forces us to configure a separate APG (with a separate frontend, etc. per APG) for each Bosh-managed backend pool.

Describe the solution you'd like

I would like to be able to use Bosh to manage multiple backend pools of VMs, where:

  1. multiple pools are attached to the same APG; (e.g. a VM needs to be attached to a single pool, but it's not the AP's first pool)
  2. some VMs may need to be attached to multiple backend pools (of the same APG); (e.g. such VMs handle traffic for more than one backend port/protocol/path/etc.)
  3. (Optionally) some VMs may need to be attached to (single or multiple) backend pools of multiple APGs; (e.g. such VMs handle traffic for multiple APGs/frontends; and also potentially multiple backends of some or all of those APGs, as in requirement 2)

The first 2 requirements above would seem to be most easily addressed by adding support for a new application_gateway_backend_pools property (as a list/array, to address requirement 2 mentioned above) in the vm_types/vm_extensions config data.

And although the 3rd requirement adds additional value hypothetically, my current use case doesn't technically require it (although I've addressed it in one of the alternative implementations discussed below).

Describe alternatives you've considered

Alternative 1: We could change the existing application_gateway property of vm_types/vm_extensions config data from a String to a Map/Hash which would support 2 properties (e.g. name and backend_pools), but that would be backwards-incompatible with existing configurations which already pass a String value.

Alternative 2: We could (maybe?) change the existing application_gateway property of vm_types/vm_extensions config data to allow either a String or a Map/Hash (as described in the previous alternative), since that would address backwards-compatibility. But this might be more complex to implement, test, and document/understand. Also, I'm not sure if multi-typed config properties are possible and/or permitted, as I don't recall any other such multi-typed properties in this or other CPIs' cloud properties. So I figured that this alternative was probably infeasible.

Alternative 3: We could add support for a new application_gateway_backend_pool property (as a String instead of a List), but that would not address requirement 2 (which is the most important of the 3 requirements for my use cases).

Alternative 4: we could add a new application_gateways property (as a List/Array of Maps/Hashes like those described in alternative 2 above) of vm_types/vm_extensions config data, as a more flexible alternative to the existing application_gateway String property. This would enable VMs to be attached to multiple backend pools of multiple APGs (optional requirement 3 above). And the config schema would seem consistent with how some other sections of Bosh config (e.g. the az and azs properties documented here) currently handle multi-typed config data alternatives, and would be backwards-compatible. But this alternative would require the CPI implementation support, test, etc. both the old and new properties, including extending the existing APG functionality to support VMs being attached to both multiple backends (requirement 2) and multiple APGs (requirement 3), and I'm not certain how much more complex it would be to implement support for the 3rd requirement (which I don't currently have a direct need for). However, if anyone will have a legitimate need for the 3rd requirement, then supporting it now would probably be better than adding support for it later (which would probably involve even more backwards-compatibility-related schema issues).

Additional context

@Justin-W
Copy link
Contributor Author

FYI: I'm planning on trying to implement this feature myself ASAP, but given the config schema backwards-compatibility issues I mentioned above, I was hoping for a bit of guidance on the design/implementation alternatives I outlined above.

I'm thinking that the primary design I recommended above (i.e. a new application_gateway_backend_pools property) or else alternative 4 are the most feasible (especially from a backwards-compatibility perspective). However, alternatives 1 and 2 may also be feasible if someone can provide me with a bit of guidance (about how to proceed with backwards-incompatible config schema changes and/or multi-typed String/Hash config schema elements, respectively).

@klakin-pivotal @rkoster @bgandon Any thoughts or tips before I start work on a PR?

@donghoang89
Copy link

We are facing the same issue. This would be a great feature to have.

@rkoster
Copy link
Contributor

rkoster commented Nov 15, 2021

I'm in favor of alternative 4, since, as you pointed out already, it is in line with how similar changes have been rolled out before (the az, azs example being one of those).

@beyhan
Copy link
Member

beyhan commented Nov 16, 2021

+1
We should also mark the old way as deprecated in case this feature gets implemented and add a short example in the docs how to switch from the old property to the new one.

@bosh-admin-bot
Copy link

This issue was marked as Stale because it has been open for 21 days without any activity. If no activity takes place in the coming 7 days it will automatically be close. To prevent this from happening remove the Stale label or comment below.

@Justin-W
Copy link
Contributor Author

FYI: PR #651 is a combination of previously-discussed (see above) alternatives 2 and 4. Once I dug into the code, I realized that the vm_type/load_balancer config already allowed either a String or a Hash (and also allowed a hacky version of multiple LBs via a delimited String, instead of an Array). Since there is much symmetry between the vm_type/load_balancer and vm_type/application_gateway config sections, it seemed best to maintain that symmetry (by making both sections capable of supporting an Array of Hashes), and making both sections support the same 3 sub-properties.

Justin-W added a commit to Justin-W/docs-bosh that referenced this issue Dec 10, 2021
…s related to the following Azure CPI PRs/issues:

- cloudfoundry/bosh-azure-cpi-release#651 (Fixes cloudfoundry/bosh-azure-cpi-release#644)
- cloudfoundry/bosh-azure-cpi-release#541 (Fixed cloudfoundry/bosh-azure-cpi-release#328)
- cloudfoundry/bosh-azure-cpi-release#638 (Fixed cloudfoundry/bosh-azure-cpi-release#111)

Note: This commit assumes that PR cloudfoundry#651 will be included in a new `v37.7.0` release of the Azure CPI (in order to properly document the CPI version requirements of new features). If PR cloudfoundry#651 is included in a different CPI version, then this PR will need to be amended appropriately.
@Justin-W
Copy link
Contributor Author

FYI: PR cloudfoundry/docs-bosh#761 updates the CPI docs to be consistent with the changes in #651 (and also some previous PRs; see cloudfoundry/docs-bosh/pull/761 descriptions for details).

We should also mark the old way as deprecated in case this feature gets implemented and add a short example in the docs how to switch from the old property to the new one.

Note that I didn't make the "old" String format deprecated (in either the CPI code or docs PRs), because:

  • the old String-only format continues to work fine for some use cases, and involves simpler config than the newer, more powerful Hash and Array formats
  • the original and newer formats are both supported by the new implementation
  • there are still many specs, documentation pages, etc. that continue to use the original String-only format
  • I ended up not needing to create a new, separate config entry, so there is no possibility of conflicts from configs containing both formats (unless they have dupe config properties with different formats, which would be more of a YAML correctness issue than a Bosh config issue)

Justin-W added a commit to Justin-W/docs-bosh that referenced this issue Dec 15, 2021
…ample of" sections for alternate Load Balancer configurations, for new and existing features related to the following Azure CPI PRs/issues:

- cloudfoundry/bosh-azure-cpi-release#651 (Fixes cloudfoundry/bosh-azure-cpi-release#644)
- cloudfoundry/bosh-azure-cpi-release#541 (Fixed cloudfoundry/bosh-azure-cpi-release#328)
- cloudfoundry/bosh-azure-cpi-release#638 (Fixed cloudfoundry/bosh-azure-cpi-release#111)

Note: This commit assumes that PR cloudfoundry#651 will be included in a new `v37.7.0` release of the Azure CPI (in order to properly document the CPI version requirements of new features). If PR cloudfoundry#651 is included in a different CPI version, then this PR will need to be amended appropriately.
Justin-W added a commit to Justin-W/docs-bosh that referenced this issue Dec 15, 2021
…y configurations, for new and existing features related to the following Azure CPI PRs/issues:

- cloudfoundry/bosh-azure-cpi-release#651 (Fixes cloudfoundry/bosh-azure-cpi-release#644)
- cloudfoundry/bosh-azure-cpi-release#541 (Fixed cloudfoundry/bosh-azure-cpi-release#328)
- cloudfoundry/bosh-azure-cpi-release#638 (Fixed cloudfoundry/bosh-azure-cpi-release#111)

Note: This commit assumes that PR cloudfoundry#651 will be included in a new `v37.7.0` release of the Azure CPI (in order to properly document the CPI version requirements of new features). If PR cloudfoundry#651 is included in a different CPI version, then this PR will need to be amended appropriately.
@bosh-admin-bot
Copy link

This issue was marked as Stale because it has been open for 21 days without any activity. If no activity takes place in the coming 7 days it will automatically be close. To prevent this from happening remove the Stale label or comment below.

@rkoster rkoster removed the Stale label Jan 6, 2022
Justin-W added a commit to Justin-W/bosh-azure-cpi-release that referenced this issue Jan 12, 2022
…foundry#111 and cloudfoundry#328, in preparation for cloudfoundry#644)

Note: There was already a high degree of symmetry (of both functionality and implementation) between the CPI's support for LBs and AGWs. However, previous CPI enhancements (for example: cloudfoundry#541 for cloudfoundry#111 and cloudfoundry#638 for cloudfoundry#328) had reduced this symmetry to some degree, and also caused some divergence between the names of some methods and variables (vs their implementations/values), which would make it harder to both implement and review the new functionality for cloudfoundry#644, and also harder to maintain the symmetry between the LB and AGW functionality.

These changes include:

- refactoring of existing code/implementation. E.g. Renamed some vars and methods to more accurately reflect their current values/behavior.
- minor configuration API enhancements. E.g. Allow the `load_balancer` config to be configured as an Array, instead of the (previously-undocumented, but already-supported) comma-delimited string of LB names, to support multiple LB configs which differ by more than just the LB name (which now enables the functionality of cloudfoundry#111 and cloudfoundry#328 to be used more independently within a single config).
- adding multiple new/missing unit tests, which cover existing functionality (previously added by cloudfoundry#541 for cloudfoundry#111 and cloudfoundry#638 for cloudfoundry#328) which was previously not covered/tested by the unit test suite.

---

COMMIT HISTORY:

Modified the `VMCloudProps._parse_load_balancer_config` method to allow the `load_balancer` config to be configured as an Array.
Note: When specifying an Array of LBs, each element of the Array should be a String and/or Hash of the same format which was previously (and still is) valid when configuring a single LB as a non-Array.

Refactoring: Minor refactoring in the `VMManager._get_load_balancers` method.

Added additional data type validation to the `VMCloudProps._parse_load_balancer_config` method.

Modified the `VMCloudProps._parse_load_balancer_config` method to return `nil` when the `load_balancer` config is omitted/missing.

Added additional specs for the `VMCloudProps._parse_load_balancer_config` method.

Refactoring: Renamed the `VMCloudProps.load_balancer` attribute.

Refactoring of the `VMCloudProps._parse_load_balancer_config` and `VMManager._get_load_balancers` methods' implementations.

Refactoring: Cleanup of the `VMCloudProps._parse_load_balancer_config` method's implementation.

Refactoring: Minor cleanup of the `VMManager._get_load_balancers` method's implementation.

Renamed a key within the Hash returned by the `AzureClient.parse_network_interface` method.

Fixed the `AzureClient.parse_network_interface` method to correctly return all of the NIC's load_balancers' info (instead of only the first LB's info).

Refactoring: Minor refactoring in the `AzureClient.create_network_interface` method.

Refactoring: Renamed some vars in the `AzureClient.create_network_interface` method.

Refactoring: Renamed a key within the `nic_params` Hash (from "load_balancer" to "load_balancers"), and fixed inaccurate doc comments for that key.

Fixed a comment typo.

Refactoring: Renamed a method and some vars: vm_manager_network.rb.
Justin-W added a commit to Justin-W/bosh-azure-cpi-release that referenced this issue Jan 12, 2022
…some typos).

Note: The existing implementation was insufficiently commented to convey an immediate understanding of the relationship and locations of some separate-but-coupled sections of code related to "primary" NIC/network data. It initially seemed like these sections may possibly need to be updated as part of cloudfoundry#644, and I ended up having to investigate this code's implementation rather deeply before I was able to determine that the aforementioned code didn't need to be updated (functionally) for cloudfoundry#644. However, to try to reduce the "magic code" factor of these code sections, I added some additional code comments to (hopefully) prevent anyone else from needing to perform a similar (unnecessary) deep dive in the future.
Justin-W added a commit to Justin-W/bosh-azure-cpi-release that referenced this issue Jan 12, 2022
Note: This increases/restores the symmetry between the LB and AGW functionality (and configuration), by enhancing AGW support with new (for AGWs) functionality which had already been previously supported for LBs (by cloudfoundry#541 for cloudfoundry#111 and cloudfoundry#638 for cloudfoundry#328). Restoring this level of symmetry partially resolves cloudfoundry#644.

These changes include:

- refactoring of existing code/implementation (and tests). E.g. Renamed some vars and methods to more accurately reflect their current values/behavior.
- significant configuration API enhancements, to enable the (previously LB-only) functionality of cloudfoundry#111 and cloudfoundry#328 to be used with both LB and/or AGW configs. E.g. Allow the `application_gateway` config to be configured as a Hash; allow the `application_gateway` config to be configured as an Array (new in this PR for both LBs and AGWs); added support for `application_gateway/resource_roup_name` config.

---

SQUASHED COMMIT HISTORY:

issue-644: multi-AGW: Refactored the `AzureClient.get_application_gateway_by_name` method to make the `resource_group_name` param a positional param (instead of an optional keyword param).
Note: This change was made so that the `AzureClient.get_application_gateway_by_name` method's signature is consistent with the signatures of the other `AzureClient.get_*` methods in requiring the `resource_group_name` param to be passed as the first positional param.

issue-644: multi-AGW: Modified the `VMCloudProps._parse_application_gateway_config` method to allow the `application_gateway` config to be configured as an Array.
Note: When specifying an Array of AGWs, each element of the Array should be a String and/or Hash of the same format which was previously (and still is) valid when configuring a single AGW (as a non-Array).

issue-644: multi-AGW: Modified the `VMCloudProps._parse_application_gateway_config` method to accept either String or Hash config (instead of String only).

issue-644: multi-AGW: Fixed the `AzureClient.parse_network_interface` method to correctly return all of the NIC's application_gateways' info (instead of only the first LB's info).
NOTE: All unit test specs are now passing again.

issue-644: multi-AGW: Update `VMManager._get_application_gateways` method to support multi-AGW (and also to use the new `ApplicationGatewayConfig.resource_group_name` property/data).

issue-644: multi-AGW: Modified the `VMCloudProps._parse_application_gateway_config` method to init `ApplicationGatewayConfig.resource_group_name` as `nil`, unless a value is explicitly configured. (See code comment for details.)

issue-644: multi-AGW: Update `AzureClient.parse_network_interface` method to use the new `resource_group_name` keyword param of the `AzureClient.get_application_gateway_by_name` method.

issue-644: multi-AGW: Added a new, optional `resource_group_name` keyword param to the `AzureClient.get_application_gateway_by_name` method.

issue-644: multi-AGW: Updated `AzureClient.create_network_interface` method to support multi-AGW.

Renamed a spec example in the Integration test suite, to make the meaning clearer: application_gateway_spec.rb.

COMMENTS: Added/updated comments and doc comments: AzureClient, VMManager.

issue-644: multi-AGW: Renamed a key within the `nic_params` Hash (from "application_gateway" to "application_gateways").

issue-644: multi-AGW: Rename and update `VMManager._get_application_gateways` method

issue-644: multi-AGW: Renamed various vars/keys (from `application_gateway` to `application_gateways`)

issue-644: multi-AGW: Added additional specs for the `VMCloudProps._parse_application_gateway_config` method.

issue-644: multi-AGW: Add new `VMCloudProps._parse_application_gateway_config` method.
NOTE: This change is part of a multi-commit set of incremental changes. Some of the existing unit test specs are now failing. They will be fixed/updated by the time the incremental changes are completed.
Justin-W added a commit to Justin-W/bosh-azure-cpi-release that referenced this issue Jan 12, 2022
…eways (cloudfoundry#644)

Note: The new, optional `application_gateway/backend_pool_name` sub-property is an optional key of each AGW Hash config (whether specifying an Array of AGWs or a single AGW).

Note: This completes the remaining feature enhancements for cloudfoundry#644.

---

SQUASHED COMMIT HISTORY:

issue-644: named-pool-AGW: shared_stuff.rb: Refactored the shared spec setup helper code to be able to support the `application_gateway` and/or `load_balancer` vm_props as a Hash (instead of a String).
Note: This commit has not changed the shared data props referenced above to Hashes, but that change is now easy to make if needed (in future, or by overriding let statements in specific specs/examples).

issue-644: named-pool-AGW: Updated `VMManager._get_application_gateways` to support the new `application_gateway/backend_pool_name` property (from the `vm_type` config).

issue-644: named-pool-AGW: Updated `AzureClient.get_application_gateway` to return additional data about the AGW's `backend_address_pools`.

issue-644: named-pool-AGW: Added new pending specs: create_network_interface_spec.rb.

issue-644: named-pool-AGW: Modified `VMCloudProps` to allow a new, optional `application_gateway/backend_pool_name` sub-property for the `application_gateway` config.
Note: The new sub-property is an optional key of each AGW Hash config, whether specifying an Array of AGWs or a single AGW.
Justin-W added a commit to Justin-W/bosh-azure-cpi-release that referenced this issue Jan 12, 2022
…loudfoundry#644)

Note: The new, optional `load_balancer/backend_pool_name` sub-property is an optional key of each LB Hash config (whether specifying an Array of LBs or a single LB).

Note: This maintains/restores functional symmetry between the LB and AGW configs (although the symmetric LB functionality was not an explicitly requested part of the feature enhancements for cloudfoundry#644, adding the feature only for AGWs would break functional symmetry).

---

SQUASHED COMMIT HISTORY:

issue-644: named-pool-LB: Added a new, optional `load_balancer/backend_pool_name` sub-property for the `load_balancer` config.
Note: The new sub-property is an optional key of each LB Hash config, whether specifying an Array of LBs or a single LB.
Justin-W added a commit to Justin-W/bosh-azure-cpi-release that referenced this issue Jan 12, 2022
)

Note: This also includes refactoring of the 2 modified spec files, to reduce code duplication and improve maintainability.

Note: These unit test suite enhancements were made in parallel for the new functionality/configuration of both LBs and AWGs (and separately from the implementation of the functionality being tested) due to the high degree of symmetry between the 2 sets of functionality. Given the original implementation (especially of `create_network_interface_spec.rb`), making these change independently (e.g. by including them as part of the previous LB-specific and AGW-specific commits) would have resulted in either significant test code/setup duplication (plus increased complexity) and/or would have required significant after-the-fact refactoring to prevent/fix such duplication. Implementing (plus refactoring) the new specs in parallel significantly expedited the implementation of these tests, and made it easier to maintain appropriate symmetry between the unit tests of the underlying (also mostly symmetric) LB and AGW functionality.

---

SQUASHED COMMIT HISTORY:

issue-644: multi-pool-AGW: Refactored specs to reduce dupe `let`s: create_network_interface_spec.rb.

issue-644: multi-pool-AGW: Implemented pending specs: create_network_interface_spec.rb.

issue-644: multi-pool-LB: Refactored specs to reduce dupe `let`s: create_network_interface_spec.rb.

issue-644: multi-pool-LB: Implemented pending specs: create_network_interface_spec.rb.

issue-644: specs: create_network_interface_spec.rb: Removed obsolete rubocop inline comments for `RSpec/RepeatedExampleGroupBody`.

issue-644: multi-pool-AGW: pre-refactoring: create_network_interface_spec.rb.

issue-644: multi-pool-LB: pre-refactoring: create_network_interface_spec.rb.

issue-644: multi-pool-AGW: Implemented pending specs: dynamic_public_ip_spec.rb.

issue-644: multi-pool-LB: Implemented pending specs: dynamic_public_ip_spec.rb.
@bosh-admin-bot
Copy link

This issue was marked as Stale because it has been open for 21 days without any activity. If no activity takes place in the coming 7 days it will automatically be close. To prevent this from happening remove the Stale label or comment below.

Repository owner moved this from Waiting for Changes | Open for Contribution to Done in Foundational Infrastructure Working Group Jan 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment