Implement DHCP server support #102
Comments
Item 1 is done. Design doc link: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/docs/dhcp_programming.adoc |
Item 2 is under review: futurewei-cloud/alcor#193 |
I'm interested in this issue, and I have some experience in network stack development. Could this issue be assigned to me? Thanks.
@w2520n2520 Absolutely, and thank you! This issue has been assigned. |
Update to Item 2: PR futurewei-cloud/alcor#193 has been merged to alcor/master. |
Hi Liguang and Eric, |
Hi @w2520n2520 - you asked the right question and are on the right track. This DHCP server needs to intercept DHCP packets using OpenFlow rules, parse them, and reply with a DHCP_OFFER and later a DHCP_ACK message. More information is available in the reference section of the design doc: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/docs/dhcp_programming.adoc
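To make the parse-and-reply step concrete, here is a minimal C++ sketch of how a handler could read the message type out of an intercepted DHCP payload and pick the reply type. The layout constants (236-byte BOOTP header, magic cookie, option 53) come from RFC 2131/2132; the function and enum names are hypothetical and are not taken from the ACA code base or the design doc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// DHCP message types carried in option 53 (RFC 2132). kDhcpNone is a
// hypothetical sentinel for "not a parsable DHCP message".
enum DhcpMsgType : uint8_t {
  kDhcpNone = 0, kDhcpDiscover = 1, kDhcpOffer = 2, kDhcpRequest = 3,
  kDhcpDecline = 4, kDhcpAck = 5, kDhcpNak = 6, kDhcpRelease = 7, kDhcpInform = 8
};

const std::size_t kBootpHeaderLen = 236;  // fixed BOOTP fields before the cookie
const uint8_t kMagicCookie[4] = {0x63, 0x82, 0x53, 0x63};
const uint8_t kOptPad = 0, kOptMsgType = 53, kOptEnd = 255;

// Scan the options of a DHCP message (the UDP payload, not the whole frame)
// and return its message type, or kDhcpNone if it is malformed or absent.
DhcpMsgType parse_dhcp_msg_type(const std::vector<uint8_t>& payload) {
  std::size_t pos = kBootpHeaderLen;
  if (payload.size() < pos + 4) return kDhcpNone;
  for (std::size_t i = 0; i < 4; ++i)
    if (payload[pos + i] != kMagicCookie[i]) return kDhcpNone;
  pos += 4;
  while (pos < payload.size()) {
    const uint8_t code = payload[pos++];
    if (code == kOptEnd) break;
    if (code == kOptPad) continue;           // pad has no length byte
    if (pos >= payload.size()) break;        // length byte missing
    const uint8_t len = payload[pos++];
    if (pos + len > payload.size()) break;   // option runs past the buffer
    if (code == kOptMsgType && len == 1) {
      const uint8_t v = payload[pos];
      return (v >= 1 && v <= 8) ? static_cast<DhcpMsgType>(v) : kDhcpNone;
    }
    pos += len;
  }
  return kDhcpNone;
}

// DISCOVER is answered with OFFER, REQUEST with ACK; other types get no
// reply in this sketch.
DhcpMsgType reply_type_for(DhcpMsgType request) {
  assert(request <= kDhcpInform);
  switch (request) {
    case kDhcpDiscover: return kDhcpOffer;
    case kDhcpRequest:  return kDhcpAck;
    default:            return kDhcpNone;
  }
}
```

A real server would also read the client MAC (chaddr) and transaction ID (xid) from the fixed header and may answer a REQUEST with a NAK; those parts are left out here.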
When building, you may hit a "libgtest.so can't open or doesn't exist" error; please refer to https://blog.csdn.net/bocksong/article/details/93207753 to resolve it.
Did you encounter this "libgtest.so can't open or doesn't exist" issue? The intent is to run aca_tests inside the build container, which already has all the dependencies set up.
Followed build and execution
Well, build and test should be executed in the generated Docker container "a1"; that was my misunderstanding.
Seeing 18 tests pass in the unit/functional tests is good enough for now. What kind of error do you see when you run ./build/bin/AlcorControlAgent? It will try to connect to Kafka, so those errors may be expected if Kafka was not set up. The next step in the DHCP implementation is to develop a standalone DHCP application based on this design. We can do the integration into AlcorControlAgent later.
Thanks
Thanks Eric. Just trying to build up my working ground here. |
Hi Eric, So these resources will always be updated together? Is there any chance they can be updated independently? Thanks. @er1cthe0ne
Hi Nan Wu, Good question, the GoalState message contains:
Aca_Comm_Manager will try to update the whole GoalState in an efficient manner. For DHCP create, the likely GoalState message would look like:
Or DHCP update, it could look like:
Does it make sense? Let me know if you have other questions. @w2520n2520 |
Hi Nan Wu, Do you think you can have the standalone DHCP application available in a few weeks? It would be great if we can complete the integration into AlcorControlAgent by the month of June. @w2520n2520 |
Hi Nan Wu, Checking in here. Do you think we can meet the target of June to have a standalone DHCP application based on this design and integrate it with AlcorControlAgent? Let me know. @w2520n2520 |
Hi Eric, |
Hi Nan Wu, Checking in here to see if there is anything I can help with. Maybe we can break down the standalone DHCP application task into smaller pieces? e.g.:
How does that sound? @w2520n2520
Hi Eric,
Q1: Where should I put dhcp_server? Should it run in an independent thread, or in the same one as aca_main (maybe not a good idea)? About the "??" part: net_handler uses RPC to talk to Mizar's transit_daemon, but dhcp_server is supposed to be on the same node, so RPC may not be necessary here. Then again, the network DHCP server will be on a different node, so using the same communication method would be beneficial. I have limited understanding of the overall design behind alcor-agent, so I may need your involvement here.
Q2: What does the code flow look like for the 3rd item? I didn't find the interface for packet_in under the current src dir.
Thanks for your guidance and help.
Hi Nan Wu, Thanks for the questions, I will answer them one by one. Do let me know if you have other questions.
Thanks, |
Hi Eric, Thanks for the reply.
[Nan]: OK. I thought I was supposed to start from here. We can do it later.
[Nan]: No, I mean the packet_in flow here, not the control message flow (goal state). The DHCP design doc mentions that OpenFlow table rules will be used to forward DHCP packets to the dhcp-server. The question is: if the dataplane is Mizar, there will be no OpenFlow tables, right? Another point: if OpenFlow tables are used, there will be two flows, one for the local dhcp-server and the other for the network-dhcp-server with low priority. When the local one fails, its corresponding flow should be removed as well, so packets will be forwarded to the network one.
[Nan]: Yes, about the openflow rule programming part. |
Hi Nan Wu,
No problem, feel free to ask :)
The current focus is the OVS dataplane, and the current design only supports one dataplane per host. The backup network-dhcp-server is used when the local ACA is down and did not have a chance to set up the local-dhcp-server flow. If ACA exits gracefully, it should remove the local-dhcp-server flow. If ACA exits unexpectedly, it will try to restart a few times, and if ACA really cannot get back to a running state, the Alcor controller would detect it and perform corrective actions. In summary, I am not sure how the local-dhcp-server and network-dhcp-server flows would both work at the same time, since only one of them will be used based on priority, unless we set a timeout on the local-dhcp-server flow, but then ACA would need to keep renewing it.
Did I answer your question above? Let me know.
OK, please go ahead and execute a system call for now (see execute_system_command). ACA will be adding better OpenFlow client support in the future (per the current design), and the DHCP code can leverage that when it is ready. I hope all of this makes sense to you. BTW, once you have some code implemented, it would be great to send a PR so that we can look at it and discuss if needed. @w2520n2520
More information on the packet_in_handler flow. In order to have DHCP packets sent to ACA, we will need to implement an OpenFlow controller and have an OpenFlow rule send the matched DHCP packets to the OpenFlow controller, which is ACA in our case. We may use something similar to the ovs-ofctl implementation, which acts as an OpenFlow controller.
Below is an experiment to show that it should work:
root@fw0016589: ping -I 192.168.0.131 -c1 192.168.0.124
Br-int is letting all the traffic go now:
root@fw0016589: ovs-ofctl dump-flows br-int
Adding a new OpenFlow rule to send all packets to CONTROLLER, which is ovs-ofctl in this case:
Ping doesn't work anymore because the packets have been sent to CONTROLLER!
--- 192.168.0.124 ping statistics ---
Printed out by ovs-ofctl!
The flow rules show that the packets are going to CONTROLLER:
@w2520n2520 - let me know if you have questions on the approach or have a better suggestion.
I think this is the reason:
Solving: Any idea? @er1cthe0ne @cj-chung |
@w2520n2520 - allow me to suggest a few things; let me know if that makes sense. The first thing is to set up a local compiling environment: Since @chenpiaoping is looking into ACA, maybe he can give a hand with it. Once you have the local build set up, we can resolve the issues quickly. If there is a need to update the cmake version on our CI to 3.12.4, we can make that modification in our CI environment, assuming that's the solution that resolves all the compiling issues.
Tried in local env, same issue. |
Let's update your local environment's cmake version to 3.12.4 or higher, apply the fix you tried previously on CMakeLists.txt and see if that would address the issues. Please show us the error message so that we can take a look. |
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") |
Hi @w2520n2520 and @gure, I was able to get your branch to compile, please see the below steps.
# Set the version number.
#add_compile_options(-O0) # enable no optimization during development
add_subdirectory(src)
Hi @er1cthe0ne , @Gzure Adding g_ofctl_command to both gtest and functest makes the compilation work.
Would it be possible that g_ofctl_command is self-contained inside AlcorControlAgentLib since it is a lib? |
I am thinking about removing it. In point 4 of issue #120, I suggest removing g_ofctl_command since we may not need it.
All related unit tests passed. Requesting to merge. @er1cthe0ne @cj-chung
@w2520n2520 @gure, please refer to this script for physical machine setup of ACA:
Hi @er1cthe0ne, @Gzure and I do this for testing:
Hi @w2520n2520, I am not sure I understand the concern. Can you tell me what is your question?
This could be a limitation based on the OVS code we use, but I don't think it is a blocking issue because we would only monitor br-int for the scenarios we defined. @cj-chung to correct me if I am wrong. |
4.1
4.2
Since we couldn't find the caller of control, we changed the entry point in main to br-int to debug the packet processing.
Yes, that's the correct call stack.
Hi @cj-chung , One question:
Should another flow be installed for packets replying from server to client?
In short, we see no errors in the code flow now, but no packet-out is observed on the network. We could use your help to figure it out. Thanks. @Gzure @er1cthe0ne
The "in_port" indicates where the packet is sent to, so the packet should be sent to the controller. If you use tcpdump to capture packets on br-tun or br-int, you should be able to see the packet on these bridges. You can use the following command to test the packet-out function: and use
Hi @cj-chung @er1cthe0ne ,
It seems this command only sends an "actual packet", which means the DHCP code needs to encapsulate the whole packet, from the application layer down to Ethernet, instead of only the DHCP payload.
@w2520n2520 Yes. You need the whole packet in the hex string, since I send the packet string directly to OVS.
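Since the packet-out path takes a hex string of the whole frame, the DHCP reply has to be wrapped in Ethernet, IPv4 and UDP headers before it is hex-encoded. Below is a minimal C++ sketch of that encapsulation. All names (build_udp_frame, to_hex, and so on) are hypothetical and not from the ACA code; it assumes no IP options, no fragmentation, a payload small enough for the 16-bit length fields, and a UDP checksum of zero, which IPv4 allows to mean "not computed".

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;
using Mac = std::array<uint8_t, 6>;

// Append a 16-bit value in network (big-endian) byte order.
static void put16(Bytes& out, uint16_t v) {
  out.push_back(static_cast<uint8_t>(v >> 8));
  out.push_back(static_cast<uint8_t>(v & 0xff));
}

// Append a 32-bit value in network byte order.
static void put32(Bytes& out, uint32_t v) {
  put16(out, static_cast<uint16_t>(v >> 16));
  put16(out, static_cast<uint16_t>(v & 0xffff));
}

// Internet checksum (RFC 1071) over an even-length header.
uint16_t ipv4_checksum(const uint8_t* hdr, std::size_t len) {
  assert(len % 2 == 0);
  uint32_t sum = 0;
  for (std::size_t i = 0; i < len; i += 2)
    sum += (static_cast<uint32_t>(hdr[i]) << 8) | hdr[i + 1];
  while (sum >> 16) sum = (sum & 0xffff) + (sum >> 16);  // fold carries
  return static_cast<uint16_t>(~sum & 0xffff);
}

// Wrap a UDP payload (e.g. a DHCP reply) in Ethernet + IPv4 + UDP headers.
// IP addresses are passed as host-order 32-bit integers.
Bytes build_udp_frame(const Mac& dst_mac, const Mac& src_mac,
                      uint32_t src_ip, uint32_t dst_ip,
                      uint16_t src_port, uint16_t dst_port,
                      const Bytes& payload) {
  Bytes f;
  // Ethernet header: destination MAC, source MAC, EtherType 0x0800 (IPv4).
  f.insert(f.end(), dst_mac.begin(), dst_mac.end());
  f.insert(f.end(), src_mac.begin(), src_mac.end());
  put16(f, 0x0800);
  // IPv4 header: 20 bytes, no options.
  const std::size_t ip_off = f.size();
  f.push_back(0x45);  // version 4, header length 5 words
  f.push_back(0x00);  // DSCP/ECN
  put16(f, static_cast<uint16_t>(20 + 8 + payload.size()));  // total length
  put16(f, 0);        // identification
  put16(f, 0);        // flags + fragment offset
  f.push_back(64);    // TTL
  f.push_back(17);    // protocol: UDP
  put16(f, 0);        // checksum placeholder, filled in below
  put32(f, src_ip);
  put32(f, dst_ip);
  const uint16_t csum = ipv4_checksum(&f[ip_off], 20);
  f[ip_off + 10] = static_cast<uint8_t>(csum >> 8);
  f[ip_off + 11] = static_cast<uint8_t>(csum & 0xff);
  // UDP header: ports, length, checksum left as zero.
  put16(f, src_port);
  put16(f, dst_port);
  put16(f, static_cast<uint16_t>(8 + payload.size()));
  put16(f, 0);
  f.insert(f.end(), payload.begin(), payload.end());
  return f;
}

// Lowercase hex encoding of the frame, two characters per byte.
std::string to_hex(const Bytes& data) {
  static const char* digits = "0123456789abcdef";
  std::string s;
  s.reserve(data.size() * 2);
  for (uint8_t b : data) {
    s.push_back(digits[b >> 4]);
    s.push_back(digits[b & 0x0f]);
  }
  return s;
}
```

For a server-to-client reply the UDP source port would be 67 and the destination port 68; the string returned by to_hex over the built frame is what would be handed to the packet-out call.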
Success Criteria:
The agent supports DHCP programming and allows VMs/containers to receive their assigned IP address through DHCP.
Details:
We need to implement DHCP support in an OpenStack environment, taking over the responsibility of the neutron DHCP agent. This task includes:
4a. We need to add an OpenFlow rule to capture DHCP packets and send them to the OpenFlow controller (ACA). We can add that to the DHCP class init function, called by the Aca_Goal_State_Handler::Aca_Goal_State_Handler() constructor
-the add rule should look like: add-flow br-int "table=0,priority=25,udp,udp_src=68,udp_dst=67,actions=CONTROLLER"
-the delete rule should look like: del-flows br-int udp,udp_src=68,udp_dst=67
-to program the openflow rules, use: ACA_OVS_L2_Programmer::get_instance().execute_openflow_command
4b. When the aca_ovs_control code receives a DHCP packet, it needs to call a DHCP function to parse and process it. Please provide the interface to call, and @cj-chung can tell you where to change the code to call it.
5a. Please see DISABLED_2_ports_ROUTING_test_traffic_one_machine in https://github.com/futurewei-cloud/alcor-control-agent/blob/master/test/gtest/aca_tests.cpp for an example of how we used docker + ovs-docker on a physical machine or VM to create containers for testing. We can create a container, assign a MAC address to it, and let it do DHCP to test our DHCP implementation. See https://goldmann.pl/blog/2014/01/30/assigning-ip-addresses-to-docker-containers-via-dhcp/
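The add and delete rules in item 4a can be assembled from a shared match string so the two never drift apart. The C++ sketch below only builds the command strings quoted above; the helper names are hypothetical, and handing the result to ACA_OVS_L2_Programmer::get_instance().execute_openflow_command is left out because that function's signature is not shown in this thread.

```cpp
#include <cassert>
#include <string>

// Match for client-to-server DHCP traffic: UDP from port 68 to port 67.
std::string dhcp_client_match() {
  return "udp,udp_src=68,udp_dst=67";
}

// Build the add-flow command that sends matched DHCP packets to the
// controller. Bridge, table and priority are parameters so that the
// values from the issue (br-int, 0, 25) are not hard-coded.
std::string build_dhcp_add_flow(const std::string& bridge, int table, int priority) {
  assert(table >= 0 && priority >= 0);
  return "add-flow " + bridge + " \"table=" + std::to_string(table) +
         ",priority=" + std::to_string(priority) + "," + dhcp_client_match() +
         ",actions=CONTROLLER\"";
}

// Build the matching del-flows command, for use on graceful shutdown.
std::string build_dhcp_del_flows(const std::string& bridge) {
  return "del-flows " + bridge + " " + dhcp_client_match();
}
```

Calling build_dhcp_add_flow("br-int", 0, 25) and build_dhcp_del_flows("br-int") reproduces the two rule strings listed in 4a.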