OSSM-2256: Add IOR #680

Merged
merged 4 commits into from
Nov 16, 2022
jwendell
Member

No description provided.

jewertow and others added 4 commits November 15, 2022 13:22
* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

* MAISTRA-1400: Add IOR to Pilot (maistra#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (maistra#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from maistra#269.

* MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, because leader election is not 100% accurate, meaning
that for a small window of time there might be two instances
being the leader - which could lead to duplicated routes
being created if a new gateway is created in that time frame -
we also change the way the Route name is created: Instead of
having a generateName field, we now explicitly pass a name to
the Route object to be created. Being deterministic, it allows
the Route creation to fail when there's already a Route object
with the same name (created by the other leader in that time frame).

Use an exclusive leader ID for IOR
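
A minimal sketch of why a deterministic name helps, assuming a hypothetical naming scheme (namespace, gateway name, and a short hash of the host) and a hypothetical in-memory `fakeStore` that stands in for the API server. The real IOR naming format is not shown in this PR conversation, so `routeName` is only an illustration of the idea:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// routeName derives a stable name from the Gateway identity and host.
// The exact format is an assumption made for this example.
func routeName(namespace, gateway, host string) string {
	sum := sha256.Sum256([]byte(namespace + "/" + gateway + "/" + host))
	return fmt.Sprintf("%s-%s-%s", namespace, gateway, hex.EncodeToString(sum[:8]))
}

// fakeStore is a hypothetical stand-in for an API server that rejects
// a create request when an object with the same name already exists.
type fakeStore map[string]bool

func (s fakeStore) create(name string) error {
	if s[name] {
		return fmt.Errorf("route %q already exists", name)
	}
	s[name] = true
	return nil
}

func main() {
	store := fakeStore{}
	name := routeName("istio-system", "my-gateway", "example.com")

	// Two "leaders" compute the same name, so the second create fails
	// instead of producing a duplicate Route.
	fmt.Println(store.create(name))
	fmt.Println(store.create(name))
}
```

With `generateName`, both creates would succeed under different server-generated names, which is the duplication the commit message describes.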

* Manual cherrypick of maistra#275

* MAISTRA-1813: Add unit tests for IOR (maistra#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (maistra#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests were added to make sure we do not make more calls to the API server
than necessary, to avoid regressions.

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.
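
The opt-out check could look roughly like the sketch below. The annotation key comes from the commit message; the function name `shouldManageRoute` and the handling of missing or unparsable values (treated here as "managed") are assumptions, not the actual IOR code:

```go
package main

import (
	"fmt"
	"strconv"
)

// Annotation key taken from the commit message above.
const manageRouteAnnotation = "maistra.io/manageRoute"

// shouldManageRoute reports whether routes should be created and managed
// for a Gateway carrying the given annotations.
func shouldManageRoute(annotations map[string]string) bool {
	value, ok := annotations[manageRouteAnnotation]
	if !ok {
		return true // no annotation: default behavior is to manage routes
	}
	managed, err := strconv.ParseBool(value)
	if err != nil {
		return true // assumption: unparsable values fall back to the default
	}
	return managed
}

func main() {
	fmt.Println(shouldManageRoute(nil))
	fmt.Println(shouldManageRoute(map[string]string{manageRouteAnnotation: "false"}))
}
```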

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <[email protected]>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events with
the new object being equal to the old one. As IOR always deletes and
recreates routes when receiving an UPDATE event, this might lead to some
service downtime, given that for a few moments the route will not exist.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <[email protected]>
Co-authored-by: Jonh Wendell <[email protected]>

Fix debug log formatting

OSSM-1800: Copy gateway labels to routes

Simplify the comparison of resource versions

We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
There's no need to loop over the routes to perform the comparison.

This also fixes the corner case where the gateway has one host and for
some reason OCP rejects the creation of the route (e.g., when hostname is already
taken). In this case the `syncRoute` object exists with zero routes in
it. Thus the loop is a no-op and the function wrongly returns with an
error of `eventDuplicatedMessage`. By comparing directly using the
`syncRoute.metadata` we fix this.

OSSM-1105: Support namespace portion in gateway hostnames

They are not used by routes, so we essentially ignore the namespace part
- anything on the left side of a "namespace/hostname" string.
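
As an illustration, dropping the namespace part could be done as in this sketch. The helper name `stripNamespace` is hypothetical; only the "namespace/hostname" format comes from the commit message:

```go
package main

import (
	"fmt"
	"strings"
)

// stripNamespace removes everything up to and including the first "/"
// in a Gateway host entry, leaving only the hostname used by the Route.
func stripNamespace(host string) string {
	if i := strings.Index(host, "/"); i >= 0 {
		return host[i+1:]
	}
	return host
}

func main() {
	for _, h := range []string{"bookinfo/example.com", "*/example.com", "example.com"} {
		fmt.Println(stripNamespace(h))
	}
}
```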

OSSM-1650 Make sure initialSync and event loop behave the same (maistra#551)
The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.

Co-authored-by: Marko Lukša <[email protected]>

@bartoszmajsak bartoszmajsak left a comment

A few minor things we could improve here.

Comment on lines +58 to +59
github.com/openshift/api v0.0.0-20200929171550-c99a4deebbe5
github.com/openshift/client-go v0.0.0-20200929181438-91d71ef2122c

Is there any particular reason for using a two-year-old version?

// KubeClient is an extension of `kube.Client` with auxiliary functions for IOR
type KubeClient interface {
	IsRouteSupported() bool
	GetActualClient() kube.Client

If we can choose the name here, what do you think about UnwrapClient instead?

)

// KubeClient is an extension of `kube.Client` with auxiliary functions for IOR
type KubeClient interface {

Then why not simply call it IORClient?

)

// IORLog is IOR-scoped log
var IORLog = log.RegisterScope("ior", "IOR logging", 0)

Does it need to be exported?

}(stop)

IORLog.Debugf("Registering IOR into Istio's Gateway broadcast")
kind := collections.IstioNetworkingV1Alpha3Gateways.Resource().GroupVersionKind()

Shouldn't we switch to v1beta1 here?

   http:
   - route:
     - destination:
         host: localhost
         port:
           number: 8080
-`, virtualServiceName, virtualServiceName, gatewayNs)
+`, virtualServiceName, virtualServiceName, gatewayNs, gatewayName)

Instead of duplicating arguments here, you could use positional arguments and refer to the virtual service name with %[1]s, so it would become something like:

metadata:
  name: %[1]s
spec:
  hosts:
  - "%[1]s.maistra.io"
  gateways:
  - %[2]s/%[3]s
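
For reference, a small self-contained sketch of how explicit argument indexes behave in `fmt.Sprintf`. The template mirrors the suggestion above; the `render` helper and the sample values are hypothetical:

```go
package main

import "fmt"

// virtualServiceTmpl uses explicit argument indexes: %[1]s is the virtual
// service name (used twice), %[2]s the gateway namespace, %[3]s the gateway name.
const virtualServiceTmpl = `metadata:
  name: %[1]s
spec:
  hosts:
  - "%[1]s.maistra.io"
  gateways:
  - %[2]s/%[3]s
`

// render fills the template; each value is passed exactly once.
func render(name, gatewayNs, gatewayName string) string {
	return fmt.Sprintf(virtualServiceTmpl, name, gatewayNs, gatewayName)
}

func main() {
	fmt.Print(render("my-vs", "istio-system", "my-gateway"))
}
```

The trade-off is that the template is slightly harder to read, while the argument list no longer has to repeat `virtualServiceName`.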

@jwendell
Member Author

@bartoszmajsak IOR is dying. In 2.4 it will be deprecated (OSSM-2254). In 2.5 (or 3.0) it will be removed. Yann is refactoring some parts of it (OSSM-1689). Ideally I wouldn't like to introduce new changes. At least not in this PR. IMO any improvements (if any at all) should be done in follow ups. This is all about cherry-picking stuff from 2.3.

@bartoszmajsak
Contributor

@jwendell As you wish, but I think it would be nice to improve the places we can, as long as that does not interfere with the upcoming PR.

@maistra-bot maistra-bot merged commit 38e7b16 into maistra:maistra-2.4 Nov 16, 2022
yannuil pushed a commit to yannuil/maistra-istio that referenced this pull request Aug 23, 2023
* [ior] OSSM-2256: Add IOR

* [ior] MAISTRA-1400 Add IOR to Pilot

* [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

* MAISTRA-1400: Add IOR to Pilot (maistra#135)

* MAISTRA-1400: Add IOR to Pilot

* [MAISTRA-1744] Add route annotation propagation (maistra#158)

* MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

* MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

* MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

If Gateway's httpsRedirect is set to true, create the OpenShift Route
with Insecure Policy set to `Redirect`.

Manual cherrypick from maistra#269.

* MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

In scenarios where multiple replicas of istiod are running,
only one IOR should be in charge of keeping routes in sync
with Istio Gateways. We achieve this by making sure IOR only
runs in the leader replica.

Also, because leader election is not 100% accurate, meaning
that for a small window of time there might be two instances
being the leader - which could lead to duplicated routes
being created if a new gateway is created in that time frame -
we also change the way the Route name is created: Instead of
having a generateName field, we now explicitly pass a name to
the Route object to be created. Being deterministic, it allows
the Route creation to fail when there's already a Route object
with the same name (created by the other leader in that time frame).

Use an exclusive leader ID for IOR

* Manual cherrypick of maistra#275

* MAISTRA-1813: Add unit tests for IOR (maistra#286)

* MAISTRA-2051 fixes for maistra install

* MAISTRA-2164: Refactor IOR internals (maistra#295)

Instead of doing lots of API calls on every event - this
does not scale well with lots of namespaces - keep the state
in memory, by doing an initial synchronization on start up and
updating it when receiving events.

The initial synchronization is more complex, as we have to deal with
asynchronous events (e.g., we have to wait for the Gateway store to
be warmed up). Once it's initialized, handling events as they arrive
becomes trivial.

Tests were added to make sure we do not make more calls to the API server
than necessary, to avoid regressions.

* MAISTRA-2205: Add an option to opt-out for automatic route creation

If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
then IOR ignores it and doesn't attempt to create or manage route(s) for
this Gateway.

Also, ignore Gateways with the annotation `istio: egressgateway` as
these are not meant to have routes.

* Add integration test for IOR

Signed-off-by: Jacek Ewertowski <[email protected]>

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

* OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

For some obscure reason, it looks like we may receive UPDATE events with
the new object being equal to the old one. As IOR always deletes and
recreates routes when receiving an UPDATE event, this might lead to some
service downtime, given that for a few moments the route will not exist.

We guard against this behavior by comparing the `resourceVersion` field
of the new object and the one stored in the Route object.

* Add test

Co-authored-by: Brian Avery <[email protected]>
Co-authored-by: Jonh Wendell <[email protected]>

Fix debug log formatting

OSSM-1800: Copy gateway labels to routes

Simplify the comparison of resource versions

We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
There's no need to loop over the routes to perform the comparison.

This also fixes the corner case where the gateway has one host and for
some reason OCP rejects the creation of the route (e.g., when hostname is already
taken). In this case the `syncRoute` object exists with zero routes in
it. Thus the loop is a no-op and the function wrongly returns with an
error of `eventDuplicatedMessage`. By comparing directly using the
`syncRoute.metadata` we fix this.

OSSM-1105: Support namespace portion in gateway hostnames

They are not used by routes, so we essentially ignore the namespace part
- anything on the left side of a "namespace/hostname" string.

OSSM-1650 Make sure initialSync and event loop behave the same (maistra#551)

* OSSM-1301 Wait for Route resource type to become available on ior startup (maistra#631)

* OSSM-2109 Fix flaky IOR unit test (maistra#648)

The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.

Co-authored-by: Marko Lukša <[email protected]>

* OSSM-2006 Fix multiNamespaceInformer.HasSynced()

Co-authored-by: Jacek Ewertowski <[email protected]>
Co-authored-by: Marko Lukša <[email protected]>
Co-authored-by: maistra-bot <[email protected]>
Signed-off-by: Yann Liu <[email protected]>
yannuil pushed a commit to yannuil/maistra-istio that referenced this pull request Sep 4, 2023
yannuil pushed a commit to yannuil/maistra-istio that referenced this pull request Sep 4, 2023
yannuil added a commit to yannuil/maistra-istio that referenced this pull request Sep 6, 2023
commit 466ae69
Author: Yang Liu <[email protected]>
Date:   Thu Mar 23 04:22:40 2023 +0800

    OSSM-1689 Simplify IOR (maistra#747)

    * Rework IOR initialization

    Signed-off-by: Yann Liu <[email protected]>

    * Remove `initialSync`

    `initialSync` is not needed.

    - During bootstrap, `SetNamespaces` is always called with no namespaces.
    - When removing or adding a namespace, the underlying informer will
      trigger an `ADD` event for all resources the informer watches

    Signed-off-by: Yann Liu <[email protected]>

    * Disable TestPref

    Signed-off-by: Yann Liu <[email protected]>

    * Rename

    Signed-off-by: Yann Liu <[email protected]>

    * Call `findService` once for each gateway

    Signed-off-by: Yann Liu <[email protected]>

    * Use original host to generate Route name

    Signed-off-by: Yann Liu <[email protected]>

    * Skip duplicate update test

    Signed-off-by: Yann Liu <[email protected]>

    * Improve concurrency test

    Signed-off-by: Yann Liu <[email protected]>

    * Introduce update Route on Gateway update

    Signed-off-by: Yann Liu <[email protected]>

    * Fix data race

    Signed-off-by: Yann Liu <[email protected]>

    * Format and lint

    Signed-off-by: Yann Liu <[email protected]>

    * Respect log level

    Signed-off-by: Yann Liu <[email protected]>

    * Refactor IOR

    - `gatewayMap` is removed. `Routes` are retrieved via API.
    - `reconcileGateway` is used to achieve the desired state.
    - `processEvent` will only process the latest and try to abort early.

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unused functions

    Signed-off-by: Yann Liu <[email protected]>

    * Use `Lister` for finding target service

    Signed-off-by: Yann Liu <[email protected]>

    * Start IOR before kube client

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unused properties

    Signed-off-by: Yann Liu <[email protected]>

    * Rework test initialization

    Signed-off-by: Yann Liu <[email protected]>

    * Log correct debug information

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unnecessary parameters

    Signed-off-by: Yann Liu <[email protected]>

    * Remove ResourceVersion usage

    Signed-off-by: Yann Liu <[email protected]>

    * Avoid deletion of a route when failing to update

    Signed-off-by: Yann Liu <[email protected]>

    * Update FakeRouter to record API call counts

    Signed-off-by: Yann Liu <[email protected]>

    * Rework initialization

    Signed-off-by: Yann Liu <[email protected]>

    * Keep startup process order consistent

    Signed-off-by: Yann Liu <[email protected]>

    * Fix creating matching service

    Signed-off-by: Yann Liu <[email protected]>

    * Test IOR to be idempotent

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unused parameters

    Signed-off-by: Yann Liu <[email protected]>

    * Rename symbol

    Signed-off-by: Yann Liu <[email protected]>

    * Remove used struct

    Signed-off-by: Yann Liu <[email protected]>

    * Improve styling and wording

    Signed-off-by: Yann Liu <[email protected]>

    * Add support for listing across namespaces in faker

    Signed-off-by: Yann Liu <[email protected]>

    * Lint and format

    Signed-off-by: Yann Liu <[email protected]>

    * Introduce OpenShift Route informer

    Signed-off-by: Yann Liu <[email protected]>

    * Lint

    Signed-off-by: Yann Liu <[email protected]>

    * Run make gen

    Signed-off-by: Yann Liu <[email protected]>

    * Fix data race

    Signed-off-by: Yann Liu <[email protected]>

    * Fix test data race

    Signed-off-by: Yann Liu <[email protected]>

    * Lint

    Signed-off-by: Yann Liu <[email protected]>

    * Rename variables

    Signed-off-by: Yann Liu <[email protected]>

    * Fix update route

    Signed-off-by: Yann Liu <[email protected]>

    * Lint

    Signed-off-by: Yann Liu <[email protected]>

    * Increase wait for the delete

    Signed-off-by: Yann Liu <[email protected]>

    * Maximize time to wait for the route deletion

    * Fix route update

    Signed-off-by: Yann Liu <[email protected]>

    * Fix route update

    Signed-off-by: Yann Liu <[email protected]>

    * Test with a 30 second wait

    Signed-off-by: Yann Liu <[email protected]>

    * Fix flaky test

    Signed-off-by: Yann Liu <[email protected]>

    * Add disabling IOR and clean up

    Signed-off-by: Yann Liu <[email protected]>

    * Defer clean up

    Signed-off-by: Yann Liu <[email protected]>

    * Clear only ior routes

    Signed-off-by: Yann Liu <[email protected]>

    * rename newRoute to newRouteController

    * rename route.go to controller.go

    ---------

    Signed-off-by: Yann Liu <[email protected]>
    Co-authored-by: Marko Lukša <[email protected]>
    Signed-off-by: Yann Liu <[email protected]>

commit afe4692
Author: Jonh Wendell <[email protected]>
Date:   Wed Nov 16 08:10:44 2022 -0500

    OSSM-2256: Add IOR (maistra#680)

    * [ior] OSSM-2256: Add IOR

    * [ior] MAISTRA-1400 Add IOR to Pilot

    * [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (maistra#135) (maistra#240)

    * MAISTRA-1400: Add IOR to Pilot (maistra#135)

    * MAISTRA-1400: Add IOR to Pilot

    * [MAISTRA-1744] Add route annotation propagation (maistra#158)

    * MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (maistra#190)

    * MAISTRA-1089 Add support for IOR routes in all namespaces (maistra#193)

    * MAISTRA-2131: ior: honor Gateway's httpsRedirect (maistra#276)

    If Gateway's httpsRedirect is set to true, create the OpenShift Route
    with Insecure Policy set to `Redirect`.

    Manual cherrypick from maistra#269.

    * MAISTRA-2149: Make IOR robust in multiple replicas (maistra#282)

    In scenarios where multiple replicas of istiod are running,
    only one IOR should be in charge of keeping routes in sync
    with Istio Gateways. We achieve this by making sure IOR only
    runs in the leader replica.

    Also, because leader election is not 100% accurate, meaning
    that for a small window of time there might be two instances
    being the leader - which could lead to duplicated routes
    being created if a new gateway is created in that time frame -
    we also change the way the Route name is created: Instead of
    having a generateName field, we now explicitly pass a name to
    the Route object to be created. Being deterministic, it allows
    the Route creation to fail when there's already a Route object
    with the same name (created by the other leader in that time frame).

    Use an exclusive leader ID for IOR

    * Manual cherrypick of maistra#275

    * MAISTRA-1813: Add unit tests for IOR (maistra#286)

    * MAISTRA-2051 fixes for maistra install

    * MAISTRA-2164: Refactor IOR internals (maistra#295)

    Instead of doing lots of API calls on every event - this
    does not scale well with lots of namespaces - keep the state
    in memory, by doing an initial synchronization on start up and
    updating it when receiving events.

    The initial synchronization is more complex, as we have to deal with
    asynchronous events (e.g., we have to wait for the Gateway store to
    be warmed up). Once it's initialized, handling events as they arrive
    becomes trivial.

    Tests were added to make sure we do not make more calls to the API server
    than necessary, to avoid regressions.

    * MAISTRA-2205: Add an option to opt out of automatic route creation

    If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
    then IOR ignores it and doesn't attempt to create or manage route(s) for
    this Gateway.

    Also, ignore Gateways with the annotation `istio: egressgateway` as
    these are not meant to have routes.
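
The opt-out check could look roughly like the sketch below. Only the annotation key and the `false` value come from the commit message. The `shouldManageRoute` helper, the case-insensitive comparison, and the default of managing routes when the annotation is absent are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// manageRouteAnnotation is the annotation key quoted in the commit message.
const manageRouteAnnotation = "maistra.io/manageRoute"

// shouldManageRoute reports whether routes should be managed for a Gateway
// with the given annotations. Hypothetical helper, not the IOR function.
// Assumption: any value other than "false" (compared case-insensitively),
// including a missing annotation, keeps route management enabled.
func shouldManageRoute(annotations map[string]string) bool {
	value, found := annotations[manageRouteAnnotation]
	if !found {
		return true
	}
	return !strings.EqualFold(strings.TrimSpace(value), "false")
}

func main() {
	optedOut := map[string]string{manageRouteAnnotation: "false"}
	fmt.Println(shouldManageRoute(optedOut)) // false
	fmt.Println(shouldManageRoute(nil))      // true
}
```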

    * Add integration test for IOR

    Signed-off-by: Jacek Ewertowski <[email protected]>

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (maistra#516)

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

    For some obscure reason, it looks like we may receive UPDATE events with
    the new object being equal to the old one. As IOR always deletes and
    recreates routes when receiving an UPDATE event, this might lead to some
    service downtime, given that for a few moments the route will not exist.

    We guard against this behavior by comparing the `resourceVersion` field
    of the new object and the one stored in the Route object.
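
The guard can be pictured as a plain equality check, as in this sketch. The `syncState` type and `isDuplicateUpdate` function are hypothetical names. The commit says the previously seen value is kept with the Route object, which this example simplifies to a struct field.

```go
package main

import "fmt"

// syncState is a hypothetical record of what was last reconciled for a Gateway.
type syncState struct {
	GatewayResourceVersion string
}

// isDuplicateUpdate reports whether an UPDATE event carries the same
// resourceVersion that was already reconciled, in which case it can be skipped.
// resourceVersion is treated as opaque, so only equality is checked.
func isDuplicateUpdate(stored syncState, incomingResourceVersion string) bool {
	return stored.GatewayResourceVersion != "" &&
		stored.GatewayResourceVersion == incomingResourceVersion
}

func main() {
	stored := syncState{GatewayResourceVersion: "1042"}
	fmt.Println(isDuplicateUpdate(stored, "1042")) // true
	fmt.Println(isDuplicateUpdate(stored, "1043")) // false
}
```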

    * Add test

    Co-authored-by: Brian Avery <[email protected]>
    Co-authored-by: Jonh Wendell <[email protected]>

    Fix debug log formatting

    OSSM-1800: Copy gateway labels to routes

    Simplify the comparison of resource versions

    We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
    There's no need to loop over the routes to perform the comparison.

    This also fixes the corner case where the gateway has one host and for
    some reason OCP rejects the creation of the route (e.g., when hostname is already
    taken). In this case the `syncRoute` object exists with zero routes in
    it. Thus the loop is a no-op and the function wrongly returns with an
    error of `eventDuplicatedMessage`. By comparing directly using the
    `syncRoute.metadata` we fix this.

    OSSM-1105: Support namespace portion in gateway hostnames

    They are not used by routes, so we essentially ignore the namespace part
    - anything on the left side of a "namespace/hostname" string.
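
As an illustration of dropping the namespace part, the sketch below keeps only what follows the first `/`. The helper name `hostWithoutNamespace` is an assumption and this is not the code from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// hostWithoutNamespace returns the hostname part of a Gateway host entry.
// Entries may be written as "namespace/hostname"; anything up to and
// including the first "/" is dropped. Hypothetical helper for illustration.
func hostWithoutNamespace(host string) string {
	if i := strings.Index(host, "/"); i >= 0 {
		return host[i+1:]
	}
	return host
}

func main() {
	fmt.Println(hostWithoutNamespace("bookinfo/www.example.com")) // www.example.com
	fmt.Println(hostWithoutNamespace("www.example.com"))          // www.example.com
}
```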

    OSSM-1650 Make sure initialSync and event loop behave the same (maistra#551)

    * OSSM-1301 Wait for Route resource type to become available on ior startup (maistra#631)

    * OSSM-2109 Fix flaky IOR unit test (maistra#648)

    The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

    Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.
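
The effect of tying the sleep to the timeout can be sketched with a generic polling loop. `waitFor` is a hypothetical stand-in for `ensureNamespaceExists`, whose real body is not shown in this commit message. Only the 1/100 ratio is taken from the text.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitFor polls check until it returns true or timeout elapses.
// The pause between attempts is timeout/100, so a short test timeout
// still allows many attempts instead of a single one.
func waitFor(check func() bool, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	interval := timeout / 100
	for {
		if check() {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for condition")
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// The condition becomes true on the third attempt.
	err := waitFor(func() bool { attempts++; return attempts == 3 }, time.Second)
	fmt.Println(err, attempts) // <nil> 3
}
```

With a fixed 100ms sleep and a 1ms timeout, the deadline has passed by the time the first sleep returns, which matches the single-iteration behavior the commit describes.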

    Co-authored-by: Marko Lukša <[email protected]>

    * OSSM-2006 Fix multiNamespaceInformer.HasSynced()

    Co-authored-by: Jacek Ewertowski <[email protected]>
    Co-authored-by: Marko Lukša <[email protected]>
    Co-authored-by: maistra-bot <[email protected]>
    Signed-off-by: Yann Liu <[email protected]>

Signed-off-by: Yann Liu <[email protected]>
openshift-merge-robot pushed a commit that referenced this pull request Sep 6, 2023
commit 466ae69
Author: Yang Liu <[email protected]>
Date:   Thu Mar 23 04:22:40 2023 +0800

    OSSM-1689 Simplify IOR (#747)

    * Rework IOR initialization

    Signed-off-by: Yann Liu <[email protected]>

    * Remove `initialSync`

    `initialSync` is not needed.

    - During bootstrap, `SetNamespaces` is always called with no namespaces.
    - When removing or adding a namespace, the underlying informer will
      trigger an `ADD` event for all resources the informer watches

    Signed-off-by: Yann Liu <[email protected]>

    * Disable TestPref

    Signed-off-by: Yann Liu <[email protected]>

    * Rename

    Signed-off-by: Yann Liu <[email protected]>

    * Call `findService` once for each gateway

    Signed-off-by: Yann Liu <[email protected]>

    * Use original host to generate Route name

    Signed-off-by: Yann Liu <[email protected]>

    * Skip duplicate update test

    Signed-off-by: Yann Liu <[email protected]>

    * Improve concurrency test

    Signed-off-by: Yann Liu <[email protected]>

    * Introduce update Route on Gateway update

    Signed-off-by: Yann Liu <[email protected]>

    * Fix data race

    Signed-off-by: Yann Liu <[email protected]>

    * Format and lint

    Signed-off-by: Yann Liu <[email protected]>

    * Respect log level

    Signed-off-by: Yann Liu <[email protected]>

    * Refactor IOR

    - `gatewayMap` is removed. `Routes` are retrieved via API.
    - `reconcileGateway` is used to achieve the desired state.
    - `processEvent` will only process the latest event and try to abort early.

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unused functions

    Signed-off-by: Yann Liu <[email protected]>

    * Use `Lister` for finding target service

    Signed-off-by: Yann Liu <[email protected]>

    * Start IOR before kube client

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unused properties

    Signed-off-by: Yann Liu <[email protected]>

    * Rework test initialization

    Signed-off-by: Yann Liu <[email protected]>

    * Log correct debug information

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unnecessary parameters

    Signed-off-by: Yann Liu <[email protected]>

    * Remove ResourceVersion usage

    Signed-off-by: Yann Liu <[email protected]>

    * Avoid deletion of a route when failing to update

    Signed-off-by: Yann Liu <[email protected]>

    * Update FakeRouter to record API call counts

    Signed-off-by: Yann Liu <[email protected]>

    * Rework initialization

    Signed-off-by: Yann Liu <[email protected]>

    * Keep startup process order consistent

    Signed-off-by: Yann Liu <[email protected]>

    * Fix creating matching service

    Signed-off-by: Yann Liu <[email protected]>

    * Test IOR to be idempotent

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unused parameters

    Signed-off-by: Yann Liu <[email protected]>

    * Rename symbol

    Signed-off-by: Yann Liu <[email protected]>

    * Remove unused struct

    Signed-off-by: Yann Liu <[email protected]>

    * Improve styling and wording

    Signed-off-by: Yann Liu <[email protected]>

    * Add support for listing across namespaces in faker

    Signed-off-by: Yann Liu <[email protected]>

    * Lint and format

    Signed-off-by: Yann Liu <[email protected]>

    * Introduce Openshift Route informer

    Signed-off-by: Yann Liu <[email protected]>

    * Lint

    Signed-off-by: Yann Liu <[email protected]>

    * Run make gen

    Signed-off-by: Yann Liu <[email protected]>

    * Fix data race

    Signed-off-by: Yann Liu <[email protected]>

    * Fix test data race

    Signed-off-by: Yann Liu <[email protected]>

    * Lint

    Signed-off-by: Yann Liu <[email protected]>

    * Rename variables

    Signed-off-by: Yann Liu <[email protected]>

    * Fix update route

    Signed-off-by: Yann Liu <[email protected]>

    * Lint

    Signed-off-by: Yann Liu <[email protected]>

    * Increase wait for the delete

    Signed-off-by: Yann Liu <[email protected]>

    * Maximize time to wait for the route deletion

    * Fix route update

    Signed-off-by: Yann Liu <[email protected]>

    * Fix route update

    Signed-off-by: Yann Liu <[email protected]>

    * Test with a 30 second wait

    Signed-off-by: Yann Liu <[email protected]>

    * Fix flaky test

    Signed-off-by: Yann Liu <[email protected]>

    * Add disabling IOR and clean up

    Signed-off-by: Yann Liu <[email protected]>

    * Defer clean up

    Signed-off-by: Yann Liu <[email protected]>

    * Clear only ior routes

    Signed-off-by: Yann Liu <[email protected]>

    * rename newRoute to newRouteController

    * rename route.go to controller.go

    ---------

    Signed-off-by: Yann Liu <[email protected]>
    Co-authored-by: Marko Lukša <[email protected]>
    Signed-off-by: Yann Liu <[email protected]>

commit afe4692
Author: Jonh Wendell <[email protected]>
Date:   Wed Nov 16 08:10:44 2022 -0500

    OSSM-2256: Add IOR (#680)

    * [ior] OSSM-2256: Add IOR

    * [ior] MAISTRA-1400 Add IOR to Pilot

    * [MAISTRA-1089][MAISTRA-1400][MAISTRA-1744][MAISTRA-1811]: Add IOR to Pilot (#135) (#240)

    * MAISTRA-1400: Add IOR to Pilot (#135)

    * MAISTRA-1400: Add IOR to Pilot

    * [MAISTRA-1744] Add route annotation propagation (#158)

    * MAISTRA-1811 Store resourceVersion of reconciled Gateway resource (#190)

    * MAISTRA-1089 Add support for IOR routes in all namespaces (#193)

    * MAISTRA-2131: ior: honor Gateway's httpsRedirect (#276)

    If Gateway's httpsRedirect is set to true, create the OpenShift Route
    with Insecure Policy set to `Redirect`.

    Manual cherrypick from #269.

    * MAISTRA-2149: Make IOR robust in multiple replicas (#282)

    In scenarios where multiple replicas of istiod are running,
    only one IOR should be in charge of keeping routes in sync
    with Istio Gateways. We achieve this by making sure IOR only
    runs in the leader replica.

    Also, because leader election is not 100% accurate, meaning
    that for a small window of time there might be two instances
    being the leader - which could lead to duplicated routes
    being created if a new gateway is created in that time frame -
    we also change the way the Route name is created: Instead of
    having a generateName field, we now explicitly pass a name to
    the Route object to be created. Being deterministic, it allows
    the Route creation to fail when there's already a Route object
    with the same name (created by the other leader in that time frame).

    Use an exclusive leader ID for IOR

    * Manual cherrypick of #275

    * MAISTRA-1813: Add unit tests for IOR (#286)

    * MAISTRA-2051 fixes for maistra install

    * MAISTRA-2164: Refactor IOR internals (#295)

    Instead of doing lots of API calls on every event - this
    does not scale well with lots of namespaces - keep the state
    in memory, by doing an initial synchronization on start up and
    updating it when receiving events.

    The initial synchronization is more complex, as we have to deal with
    asynchronous events (e.g., we have to wait for the Gateway store to
    be warmed up). Once it's initialized, handling events as they arrive
    becomes trivial.

    Tests were added to make sure we do not make more calls to the API server
    than necessary, to avoid regressions.

    * MAISTRA-2205: Add an option to opt out of automatic route creation

    If the Istio Gateway contains the annotation `maistra.io/manageRoute: false`
    then IOR ignores it and doesn't attempt to create or manage route(s) for
    this Gateway.

    Also, ignore Gateways with the annotation `istio: egressgateway` as
    these are not meant to have routes.

    * Add integration test for IOR

    Signed-off-by: Jacek Ewertowski <[email protected]>

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same (#516)

    * OSSM-1442: IOR: Ignore UPDATE events if resourceVersions are the same

    For some obscure reason, it looks like we may receive UPDATE events with
    the new object being equal to the old one. As IOR always deletes and
    recreates routes when receiving an UPDATE event, this might lead to some
    service downtime, given that for a few moments the route will not exist.

    We guard against this behavior by comparing the `resourceVersion` field
    of the new object and the one stored in the Route object.

    * Add test

    Co-authored-by: Brian Avery <[email protected]>
    Co-authored-by: Jonh Wendell <[email protected]>

    Fix debug log formatting

    OSSM-1800: Copy gateway labels to routes

    Simplify the comparison of resource versions

    We store the gateway resource version (the whole metadata actually) in the `syncRoute` object.
    There's no need to loop over the routes to perform the comparison.

    This also fixes the corner case where the gateway has one host and for
    some reason OCP rejects the creation of the route (e.g., when hostname is already
    taken). In this case the `syncRoute` object exists with zero routes in
    it. Thus the loop is a no-op and the function wrongly returns with an
    error of `eventDuplicatedMessage`. By comparing directly using the
    `syncRoute.metadata` we fix this.

    OSSM-1105: Support namespace portion in gateway hostnames

    They are not used by routes, so we essentially ignore the namespace part
    - anything on the left side of a "namespace/hostname" string.

    OSSM-1650 Make sure initialSync and event loop behave the same (#551)

    * OSSM-1301 Wait for Route resource type to become available on ior startup (#631)

    * OSSM-2109 Fix flaky IOR unit test (#648)

    The sleep in ensureNamespaceExists was hardcoded to 100ms, regardless of r.handleEventTimeout. This timeout during unit tests is only 1ms, so the 100ms sleep caused the for loop to only run once.

    Here we change the duration of the sleep to be 1/100 of r.handleEventTimeout. This change preserves the production sleep time of 100ms, but reduces the sleep time in unit tests to 10μs. This makes ensureNamespaceExists() run the for loop multiple times before giving up, fixing the test's flakiness.

    Co-authored-by: Marko Lukša <[email protected]>

    * OSSM-2006 Fix multiNamespaceInformer.HasSynced()

    Co-authored-by: Jacek Ewertowski <[email protected]>
    Co-authored-by: Marko Lukša <[email protected]>
    Co-authored-by: maistra-bot <[email protected]>
    Signed-off-by: Yann Liu <[email protected]>

Signed-off-by: Yann Liu <[email protected]>