proposal: New architecture of Apache APISIX Ingress controller #610

Closed
tao12345666333 opened this issue Jul 30, 2021 · 38 comments · Fixed by #1803

@tao12345666333
Member

In the current architecture of the Apache APISIX Ingress controller, the controller acts as a control-plane component.
The user creates CRs of the supported types in Kubernetes, and the controller converts them into data structures that Apache APISIX can accept, then creates, modifies, or deletes them by calling the Admin API.
Such an architecture has the following advantages:

  • The separation of CP and DP ensures that even if the CP component fails, the DP can still run properly;
  • Users can deploy the DP anywhere they like, including outside the Kubernetes cluster.

But such an architecture also has a disadvantage: users need to maintain a complete Apache APISIX cluster, and that cannot be done simply by modifying the replicas field of the Apache APISIX Ingress controller.

I hope to introduce an architecture similar to ingress-nginx, which is widely used in Kubernetes.

In this way, users can complete the deployment with just a Pod, and scale it simply by modifying the replicas parameter.

Synced from the mailing list: https://lists.apache.org/thread.html/r929a6dfa9620d96874056750c6b07b8139b4952c8f168670553dfb86%40%3Cdev.apisix.apache.org%3E
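For context, here is a hedged example of the kind of CR the controller translates today; the resource and field names follow the ApisixRoute CRD as I understand it and may differ across CRD versions:

```yaml
# Sketch of an ApisixRoute CR: in the current architecture the controller
# watches resources like this and turns them into APISIX routes/upstreams
# by calling the Admin API. Field names may vary between CRD versions.
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  name: httpbin-route
  namespace: default
spec:
  http:
    - name: rule1
      match:
        hosts:
          - httpbin.example.com
        paths:
          - /get
      backends:
        - serviceName: httpbin
          servicePort: 80
```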

@tao12345666333 tao12345666333 self-assigned this Jul 30, 2021
@tao12345666333 tao12345666333 added the enhancement New feature or request label Jul 30, 2021
@gxthrj
Contributor

gxthrj commented Jul 30, 2021

Agree +1

@juzhiyuan
Member

+1

@tokers
Contributor

tokers commented Aug 1, 2021

+1

@github-actions

This issue has been marked as stale due to 90 days of inactivity. It will be closed in 30 days if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jun 29, 2022
@tao12345666333 tao12345666333 added discuss triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed stale labels Jun 29, 2022
@tao12345666333
Member Author

In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

I guess it will be:

APISIX Ingress (gRPC server)  -->  gRPC client
                                       |
                                       +--> APISIX (standalone mode)

APISIX may become a child process managed by another component we implement.

@tokers
Contributor

tokers commented Aug 15, 2022

What about using an etcd adapter to let the custom component support the etcd APIs, so that we can avoid any changes to APISIX?

@tao12345666333
Member Author

APISIX standalone mode fully replaces the configuration on each update, which will have some impact on health checks and caching.

@sober-wang

In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

I guess it will be:

APISIX Ingress (gRPC server)  -->  gRPC client
                                       |
                                       +--> APISIX (standalone mode)

APISIX may become a child process managed by another component we implement.

Why use gRPC? gRPC is more complex than a REST API: we would have to define more and more protobuf files, which increases code complexity.

I recommend using the default APISIX Admin API.

@tao12345666333
Member Author

In the simplest terms, to make it easier to scale and manage, etcd must be removed. So there is a high probability that we will no longer use the Admin API.

If there is no storage component, then we will drop the Admin API. @sober-wang

Using gRPC allows the server to actively push configuration.
Even introducing xDS here is an option.

gRPC is more complex than a REST API: we would have to define more and more protobuf files, which increases code complexity.
It is normal for new features to introduce some code changes.

While the current mode is really simple, I obviously want it to be more powerful.
I won't hold back for fear of having to write code or add complexity.

@macmiranda
Contributor

macmiranda commented Nov 21, 2022

I'm new to APISIX so I started reviewing the Architecture and the Deployment modes, alongside the documentation of the Ingress Controller itself (since my goal is to manage APISIX via Kubernetes CRDs and use it as an alternate Ingress Class for K8s Ingresses)

If my understanding is correct, the APISIX Ingress Controller makes the APISIX Control Plane (and the Admin API for that matter) almost entirely obsolete (at least in the context being discussed here).

If the actual configuration of APISIX is now done via Custom Resources (which are ultimately persisted in the etcd of the cluster itself) why would we need another etcd cluster to persist configuration data? I'm guessing that the current architecture of the Ingress Controller was meant to be a functional adapter for the APISIX Admin API without having to introduce any significant changes to the latter whilst still making it possible to use APISIX in the context of a Kubernetes Cluster (and its resources).

While it does what it's meant to do, it also introduces certain problems, some of which deserve serious consideration:

  • the etcd cluster installed with APISIX in the Traditional mode does not get backed up through regular cluster backups
  • in order to be able to talk to the Admin API and configure routes, upstreams, etc. the Ingress Controller needs to store the admin user key in a Config Map which is essentially insecure and makes it harder for credentials to be rotated. Setting those keys in Helm values also isn't a good solution since we store all the values in our SCM. The best would be if we could get those values from Vault. [edit] Just found out about this issue which goes in line with what I said here [/edit]
  • the Ingress Controllers' IP addresses need to be hard-coded in the APISIX Control Plane configuration if we want to take advantage of admin.allow.ipList. It can be quite challenging to get those values when installing both charts via Helm (without hard-coding them, since they are essentially dynamic), so you end up having to allow the whole cluster network range or 0.0.0.0/0
  • maintaining an extra Control Plane and an extra etcd cluster means more overhead for the Platform teams and introduces more points of failure. [edit] As an illustration of this, check how many of the apisix-helm-chart repo's issues are related to etcd alone (37 of 80) [/edit]

It would be great if the Ingress Controller could talk directly to the Data Plane in standalone mode.

@tao12345666333
Member Author

You are right!

That's the main reason why I came up with this idea.

This will be my third priority: I will deal with #1465 first and then release v1.6, and then I will start working on this one.

It won't be long before I post my thoughts 💡 here to discuss with you all.

@tao12345666333
Member Author

[screenshot: 2023-01-18 09-56-25]

I have a new idea.

Since APISIX v3 has added gRPC client capability, some optimizations have been made to the CP/DP deployment model in APISIX v3.
So we can apply this model in APISIX Ingress: implement an etcd-like gRPC server in apisix-ingress-controller, let it serve as the control plane, and have APISIX, which is actually the data plane, connect to it through gRPC.

In this way, the data-plane APISIX is exactly the same as a normal APISIX (not in standalone mode), so you can use all the capabilities of APISIX without any modification to APISIX.

WDYT?
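To make this concrete, the data-plane APISIX would just be pointed at the controller's etcd-compatible endpoint through its normal deployment configuration. The snippet below is only a sketch of that wiring; the service name, port, and exact keys are assumptions and may differ from what finally ships:

```yaml
# Hypothetical data-plane config.yaml excerpt: APISIX runs as an ordinary
# data plane and treats the ingress controller's etcd-like gRPC server as
# its config provider. Keys and values are illustrative, not a final design.
deployment:
  role: data_plane
  role_data_plane:
    config_provider: etcd
  etcd:
    host:
      - "http://apisix-ingress-controller-svc:12379"  # assumed controller address
    prefix: /apisix
    timeout: 30
```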

@macmiranda
Contributor

Sounds good to me, though I'm not the most familiar with the APISIX architecture, especially when it comes to the gRPC components.
One question though: would the ingress controller also need some kind of state store, or would it work fine just reading state from the Kubernetes resources?
Also, I'm not sure how the client would authenticate to the server. Would mTLS be an option?

@tao12345666333
Member Author

In the new architecture, the ingress controller is a stateless component.

It can simply read state from, and store resource status in, Kubernetes resources.

For authentication, we can add certificates to protect the connection.
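As an illustration of "store resource status in Kubernetes resources", the controller could record sync results in each CR's status subresource; the condition fields below are only a sketch, not the project's actual status schema:

```yaml
# Hypothetical status written back by the controller after syncing a route.
# Condition types and fields are illustrative only.
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  name: httpbin-route
status:
  conditions:
    - type: ResourcesAvailable
      status: "True"
      reason: ResourcesSynced
      observedGeneration: 2
```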

@caibirdme

Is it possible for the controller to just modify the apisix.yaml ConfigMap, so that standalone APISIX instances can watch those changes?

@tao12345666333
Member Author

Is it possible for the controller to just modify the apisix.yaml ConfigMap, so that standalone APISIX instances can watch those changes?

@caibirdme no, this is not designed for standalone mode.

Are you using standalone mode? I want to understand your use case

@sober-wang

[screenshot: 2023-01-18 09-56-25]

I have a new idea.

Since APISIX v3 has added gRPC client capability, some optimizations have been made to the CP/DP deployment model in APISIX v3. So we can apply this model in APISIX Ingress: implement an etcd-like gRPC server in apisix-ingress-controller, let it serve as the control plane, and have APISIX, which is actually the data plane, connect to it through gRPC.

In this way, the data-plane APISIX is exactly the same as a normal APISIX (not in standalone mode), so you can use all the capabilities of APISIX without any modification to APISIX.

WDYT?

It looks like APISIX pulls its configuration from the apisix-ingress-controller.
Will the APISIX team members implement that part? Are you sure?

Maybe I have misunderstood, so could you clarify the direction of the data flow?

@tao12345666333
Member Author

Currently, APISIX v3 already supports the decoupled mode.
DP and CP are separate.

The CP provides an etcd-like service.

In the new architecture of APISIX Ingress, we only need to let the ingress controller assume the role of the CP.
The APISIX instances on the data-plane side then play only the role of the DP.
@caibirdme

Are you using standalone mode? I want to understand your use case

I'm using APISIX in standalone mode as the ingress gateway. I don't want to use Ingress, because it's only designed for HTTP, and I don't want to deploy an etcd cluster either.
Now I have a Deployment with 3-10 APISIX replicas, configured by apisix.yaml (a ConfigMap). When I want to update apisix.yaml, I just update the ConfigMap in the Helm chart and upgrade it. About a minute later the ConfigMap is updated in the pod, and APISIX picks up the change right away.
By doing this I don't need to learn the apisix-ingress CRDs, I don't need an etcd cluster, and I follow the GitOps manner: all changes are managed by Git. After reading the APISIX docs, I can configure my ingress as both an L4 proxy and an L7 proxy.
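For readers unfamiliar with this setup, a minimal standalone-mode apisix.yaml looks roughly like the following (the route and upstream values are placeholders); APISIX re-reads the file when it changes, which is what makes the ConfigMap-driven workflow described above work:

```yaml
# Minimal standalone-mode apisix.yaml, mounted into the pod via a ConfigMap.
# Values are placeholders; standalone mode expects the file to end with #END.
routes:
  - uri: /hello
    upstream:
      type: roundrobin
      nodes:
        "hello-svc.default.svc.cluster.local:80": 1
#END
```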

@mchtech

mchtech commented Jun 6, 2023

Let me discuss a scenario.

If these four conditions are met:

  1. the k8s control plane works well
  2. apps are in the middle of a rolling update
  3. the ingress controller cannot sync ingress rules (or syncs with a long delay), because
    3.1 the ingress controller is in a crash loop,
    3.2 or its nodes are down,
    3.3 or it cannot connect to the apiserver (node network problem),
    3.4 or the APISIX etcd is down (old architecture)
  4. the k8s/APISIX administrators don't notice what happened

then the data plane (upstream) will reference obsolete pod IPs, which leads to:

  1. app A's pod IP being recycled by the CNI IPAM, so app A's redundancy is reduced
  2. or an app A pod having been terminated and its IP re-assigned to an app B pod, so requests to app A will randomly return HTTP 404

How about an architecture where DP and CP run in the same pod? I think it can minimize the risk (3.2 and 3.3 would only affect the corresponding APISIX DP, not all APISIX DPs).

@zhuoyang
Contributor

zhuoyang commented Jun 12, 2023

Is there any decision on how to implement this feature? Folks at my current company are willing to spend some engineering time on this.

@tao12345666333
Member Author

@zhuoyang I'm glad to hear this news.

Currently, the plan in #1803 is to implement an etcd server.

In fact, there are still many things we need to do. I will write a detailed technical plan and break down the tasks as soon as possible. Hope we can work together to complete this feature.

@zhuoyang
Contributor

zhuoyang commented Aug 1, 2023

Cool! Let's keep in touch.

@nagidocs

nagidocs commented Aug 10, 2023

can configure my ingress as both L4 proxy and

Can you please share the code somewhere showing how the setup with standalone mode is done? I mean how you refer to backend Services in different namespaces and pass the apisix.yaml config in the Deployment, as I can't find that capability in the present Helm chart of APISIX.
Do we need to manually tweak the Deployment afterwards?

@seethedoor

Great idea! This can elevate apisix-ingress-controller to a core position.

In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRD and the other for Apisix's own data.

But when the ingress controller goes down, it has to re-fetch, regenerate, and redistribute the route entries upon restart. In scenarios with a large number of Ingresses, might this put more pressure on the apiserver, or result in longer recovery times?

@tao12345666333 tao12345666333 added this to the v1.7.0 milestone Aug 31, 2023
@tao12345666333 tao12345666333 linked a pull request Aug 31, 2023 that will close this issue
@tao12345666333
Member Author

In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRD and the other for Apisix's own data.

I didn't fully understand your meaning, are you referring to the new architecture or the existing one?

@seethedoor

In fact, the current approach is equivalent to storing two sets of data in two etcd instances: one for the Ingress CRD and the other for Apisix's own data.

I didn't fully understand your meaning, are you referring to the new architecture or the existing one?

The existing one, and I mean that your new design would avoid this. That is the benefit.

@tao12345666333
Member Author

https://github.com/apache/apisix-ingress-controller/releases/tag/v1.7.0

v1.7.0 has been released with this feature. Thanks, all!
I will close this one.

@mfractal

Thanks @tao12345666333! Is there documentation ready for the new feature?

@mlasevich

I must be missing something, but doesn't APISIX already support reading/monitoring its config from a YAML file? The YAML file can/should be mounted as a ConfigMap by the APISIX pods, so all the ingress controller needs to do is monitor Ingress/CRD records and update the ConfigMap as necessary. The new architecture mimicking an etcd service seems like WAAAY overkill, no?

@mfractal

I must be missing something, but doesn't APISIX already support reading/monitoring its config from a YAML file? The YAML file can/should be mounted as a ConfigMap by the APISIX pods, so all the ingress controller needs to do is monitor Ingress/CRD records and update the ConfigMap as necessary. The new architecture mimicking an etcd service seems like WAAAY overkill, no?

Not if you want to use CRDs

@mlasevich

Not if you want to use CRDs

Can you please say more?

Is there something in the CRDs that is not available in the configuration file in standalone mode?

@lpiob

lpiob commented Oct 3, 2023

I partially agree with @mlasevich: since APISIX can be controlled by a ConfigMap, this ConfigMap could be produced and updated by the ingress controller, and we would not need a mock etcd. It does not rule out CRDs.

However, I assume that you can't have everything at once, and I can accept the mock etcd as a transitional state.

I have a question about the pod configuration. The composite.md documentation shows an example in which one pod has both apisix-ingress-controller and apisix-gateway. This certainly makes testing easier, but I'm not sure whether it's intended to be used that way "in production".

The same scheme was duplicated within the actual ingress-controller chart, making the apisix-gateway chart unnecessary and causing the ingress controller to be scaled simultaneously with apisix-gateway.

Was this intended? If not, I can prepare a fix for the chart.

@tao12345666333
Member Author

It seems that there are still some doubts about this, so let me answer the questions involved.

1. Why not use standalone mode directly?

We know that APISIX has a standalone mode (obviously), but standalone mode covers only a subset of APISIX's functions; it cannot provide all of APISIX's capabilities (especially the dynamic behavior that is important in production environments). This is also why we spent a lot of time and energy implementing an etcd-mocking server: we want to provide the complete capability of APISIX.

2. About Pod configuration

The essence of this architecture is to simplify deployment, eliminate the need for users to maintain an etcd service, and make scaling easier. Therefore, in the current state, we recommend deploying them in the same Pod (a rough sketch follows at the end of this comment).

3. About Helm chart

Since APISIX needs its own configuration in this mode, an APISIX configuration file has been added. That said, not all APISIX configuration options are fully supported at this stage.
Of course, even with the introduction of this new architecture, we will not abandon the existing architecture, because it has its own advantages. Therefore, we will not disrupt any existing deployment methods or Helm charts.

We look forward to hearing more feedback and test results from the community.
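As a rough sketch of what point 2 above describes (container names, images, and ports here are assumptions, not the chart's actual values), the controller and the gateway sit in one Deployment and scale together:

```yaml
# Illustrative same-Pod deployment: the ingress controller (acting as the
# etcd-like control plane) and the APISIX data plane share a Pod, so scaling
# the gateway is just a matter of changing spec.replicas.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: apisix-composite            # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: apisix-composite
  template:
    metadata:
      labels:
        app: apisix-composite
    spec:
      containers:
        - name: ingress-controller
          image: apache/apisix-ingress-controller   # tag omitted
        - name: apisix
          image: apache/apisix                      # tag omitted
```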

@lpiob

lpiob commented Oct 19, 2023

  1. About Pod configuration
    The essence of this architecture is to simplify deployment, eliminate the need for users to maintain etcd services, and make scaling easier. Therefore, in the current state, we recommend deploying them in the same Pod.

I understand the intention of these changes: they simplify a lot, but at the same time we lose a LOT of functionality that we had with a separate APISIX deployment.

For example:

  • the apisix-ingress-controller Helm chart does not allow setting resource limits/requests for the apisix-gateway
  • configuration values like plugins, customPlugins, pluginAttrs, and configurationSnippet are completely missing from the new chart.

Is the desired way forward to add this missing functionality back to the apisix-ingress-controller Helm chart?

@mfractal

@mfractal FYI https://github.com/apache/apisix-ingress-controller/blob/master/docs/en/latest/composite.md

Sir, thank you! I am finally getting to this now :)
Is it possible to perform the installation of this configuration via the Helm chart, or only via the instructions provided in the link?

@nightguide

nightguide commented Nov 21, 2023

Hi all!

I want to share my thoughts on this matter.
The idea of the composite architecture is good, but it has certain disadvantages.

In composite mode, the data plane only receives changes to its APISIX configuration; the controller does not manage the configuration of the Kubernetes resources themselves: Deployments, Services, ConfigMaps, etc.

For example:

  1. No way to change the number of data-plane replicas via an APISIX Ingress Controller Custom Resource (CR)

  2. No way to configure an additional TCP/UDP proxy, which additionally requires a change to the Kubernetes Service resource

  3. No way to attach external configuration from a ConfigMap

  4. No way to easily and simply configure and update the APISIX version on standalone instances

I think it would be nice to look towards the Operator SDK, which would allow using CRs to configure and deploy standalone instances of APISIX.

This approach would not only allow configuring standalone instances, but would also allow managing the Kubernetes resources for APISIX.

I think this approach would be more Kubernetes-native.

Have you thought about this?
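To illustrate the suggestion, an operator-style CR might look something like the sketch below. No such CRD exists in the project today; every name and field here is hypothetical:

```yaml
# Purely hypothetical CR for an Operator-SDK-style approach: an operator
# would reconcile this into a Deployment, Service, and ConfigMap for
# standalone APISIX instances.
apiVersion: apisix.example.org/v1alpha1
kind: ApisixInstance
metadata:
  name: edge-gateway
spec:
  version: "3.6"            # desired APISIX version
  replicas: 5               # data-plane replica count
  service:
    tcpProxies:             # extra L4 listeners, reflected in the K8s Service
      - port: 9100
  extraConfigFrom:
    configMapRef:
      name: apisix-extra-config
```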
