Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve feature mode switch process #12188

Merged
merged 6 commits into from
Nov 2, 2022

Conversation

lixiaoyuner
Copy link
Contributor

@lixiaoyuner lixiaoyuner commented Sep 27, 2022

Signed-off-by: Yun Li [email protected]

Why I did it

Based on our new design about k8s project, need to refine container upgrade process, this PR fix several existing bugs to ensure container can be taken over by kube and can be upgraded successfully and can fallback, if something wrong happens, container can go back to local mode and run the latest stable version image.

How I did it

Refine ctrmgrd service, feature service and start service inside feature container

  1. Fix reading node labels with header bug, need to add --without-header
  2. Remove higher version checker to support fallback
  3. Tag k8s scheduled stable version to latest, ensure when container go back to local mode, it would run the latest stable version
  4. Remove ipv6 parameter when join, this parameter is not necessary

How to verify it

Enable k8s feature is SONiC device, after SONiC device joins k8s cluster, container deployed by k8s can replace SONiC local container and upgrade successfully, if something wrong happens, container can go back to local mode and run the latest stable version image.

Which release branch to backport (provide reason below if selected)

Fix existing bugs

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205

Description for the changelog

Refine ctrmgrd service, feature service and start service inside feature container

Ensure to add label/tag for the feature raised. example - PR#2174 where, Generic Config and Update feature has been labelled as GCU.

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

def tag_latest_image(server, feat, docker_id, image_ver):
res = 1
if not UNIT_TESTING:
status = os.system("docker ps |grep {} >/dev/null".format(docker_id))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure the running time of the container >= expected min time.
By the time, the timer fires, potentially it could have restarted.

Said that, systemd would have failed for frequent restarts.

Think through for any possible hole.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, I set "tag_latest_image_on_wait_seconds" = 600s, it means when a k8s pod running for over 10 minutes(by monitor the pod's container id), I will tag the container's image. About this default value, do you have any ideas?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be not. Suggest

  1. Try restarting service during this period
  2. Kill the main process in the container that will make the container exit within this period
  3. Any more .... ;)

Copy link
Contributor Author

@lixiaoyuner lixiaoyuner Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The situation you say should be:

  • k8s schedules container and remote_state is running
  • ctrmgrd capture remote_state ready -> running during last process, register timer_handler(600s, tag_function(previous_docker_id))
  • during the 600s, systemctl restart feature

The current reactions are:

  • feature systemd service will stop -> start -> wait, when stop, set remote_state=stopped, when start, not docker start, when wait, wait docker_id
  • k8s finds its scheduled container is killed, try to reschedule, container_up set remote_state=pending
  • ctrmgrd captures remote_state=pending, set remote_state=ready
  • container_up set remote_state=running and set docker_id
  • ctrmgrd capture remote_state ready -> running, will register timer_handler(600s, tag_function(current_docker_id))
  • and tag_latest_image_on_wait_seconds is up, tag function finds the previous_docker_id is disappeared, will do nothing
  • and tag_latest_image_on_wait_seconds is up, tag function finds the current_docker_id, tag latest.

Have tested this case and some other cases, all good.
Any other corner cases you want me to test?

And by the way, about removing 'tag as local' from reboot script, you want me to do it in the PR or in the future? I want to do in the future, because it seems redundant, but no real impacts. And it can ensure tag as local when reboot, maybe necessary for us.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer we do it in this PR, as this PR's major purpose is to tag local.

One more Q: After you tag as "local" when you do service restart, who wins, systemd or kube.

I prefer systemd takes over. This would add the new version in label that should auto-block kube from deploying.
Can you check this out too?

Copy link
Contributor Author

@lixiaoyuner lixiaoyuner Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tag as local is in reboot script in sonic-utilities repo, removing from reboot script, PR is here. #Remove tag as local

Q: After you tag as "local" when do service restart, who wins?
Kube wins.
I see label such as feature_version_enabled=false will be set only when containers start up as local. Actually, when we restart the service, systemd stop will set remote_state=stopped so that systemd will wait k8s to reschedule. K8s finds its container is down, will reschedule immediately. In this process, container has no chance to start up as local to set feature_version_enabled=false. So, there is no auto-block.
About this problem, I think it's right that service is taken over by kube in this situation, because the set-owner is kube and k8s is connected. Taken over by kube should be the proper action.

By the way. I want to remove the instance_higher checker, because we want to support fallback in the future, if post-check failed, we could fall back to the previous version, I think the version should be managed by scheduler. What's your concern about this before, is there any necessary things so that we always need the higher version? #remove in this commit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two points:

  1. Please merge the reboot fix PR into this PR -- Like to keep all related in one PR.
    For example, if we revert one, it should affect both. it should both either stay or go as one piece.
  2. I agree with your explanation. Earlier tagging upon reboot only did not need any change, as state-db starts empty. Now that we are preponing the tagging, we need to do required updates that systemd takes over, if kube image is tagged as latest. In fact, if it makes simple, we can change from "kube running" to "kube-transient" once you tag it latest. On next systemd restart, seeing it kube-transient" we can go local path.

No need for kube to manage after local tagging. It is more stable if systemd takes over.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point 1: About the two PRs, the reason is that the reboot script is in another repo called sonic-utilities, seems we can't raise a PR including two repos code.

Point 2: About after tagging, it's easy to let feature go back to local after restart, but one of our visions is that k8s can collect the containers real-time metrics, if features are taken over by systemd after tagging, we can't keep collecting the metrics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point 1 -- Agree.
Point 2 -- Debatable. What happens if device gets rebooted ? We have STATE-DB for counters/metrics.

Let it be so for now.

Copy link
Contributor

@renukamanavalan renukamanavalan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
Just take care of "any possible loop hole" before tagging as local.
If you are doing this, we could remove the "tagging as local" from the reboot script.

@lixiaoyuner
Copy link
Contributor Author

Looks good to me. Just take care of "any possible loop hole" before tagging as local. If you are doing this, we could remove the "tagging as local" from the reboot script.

Thanks @renukamanavalan, about possible loop hole, let me try to test this part of code carefully. And about removing "tagging as local" in reboot script, I don't think there will be some conflicts with PR, maybe remove it int the future if needed?

@renukamanavalan
Copy link
Contributor

Looks good to me. Just take care of "any possible loop hole" before tagging as local. If you are doing this, we could remove the "tagging as local" from the reboot script.

Thanks @renukamanavalan, about possible loop hole, let me try to test this part of code carefully. And about removing "tagging as local" in reboot script, I don't think there will be some conflicts with PR, maybe remove it int the future if needed?

We will do it as one set. It is most likely to slip through the cracks otherwise. We just want one way of tagging local.
Just one more: Make sure, after tagged as local, the next service restart will let systemd take over and we don't need kube to manage this anymore.

@lixiaoyuner lixiaoyuner merged commit e1440f0 into sonic-net:master Nov 2, 2022
do_freeze(feature, "bail out as current deploy version {} is not higher".
format(version))
return
# if not instance_higher(feature, state_data[VERSION], version):
Copy link
Collaborator

@qiluo-msft qiluo-msft Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not instance_higher(feature, state_data[VERSION], version):

Do not just comment code, remove them if they are dead. #Closed

USE_K8S_PROXY: ""
}

def log_debug(m):
msg = "{}: {}".format(inspect.stack()[1][3], m)
print(msg)
#print(msg)
Copy link
Collaborator

@qiluo-msft qiluo-msft Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#print(msg)

Remove the debug code. #Closed

@@ -172,11 +176,11 @@ def register_db(self, db_name):
self.db_connectors[db_name] = swsscommon.DBConnector(db_name, 0)


def register_timer(self, ts, handler):
def register_timer(self, ts, handler, args=()):
Copy link
Collaborator

@qiluo-msft qiluo-msft Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

args=()

Even it is allowed to use empty tuple (which is immutable) as default value, high suggest using None as default value. #Closed

err = ""
out = ""
ret = 0
try:
local_ipv6 = _get_local_ipv6()
#local_ipv6 = _get_local_ipv6()
#_download_file(server, port, insecure)
Copy link
Collaborator

@qiluo-msft qiluo-msft Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete dead code. #Closed

@lixiaoyuner lixiaoyuner mentioned this pull request Nov 3, 2022
7 tasks
qiluo-msft pushed a commit that referenced this pull request Nov 18, 2022
* Fix kube mode to local mode long duration issue

* Remove IPV6 parameters which is not necessary

* Fix read node labels bug

* Tag the running image to latest if it's stable

* Disable image_version_higher check

* Change image_version_higher checker test case

Signed-off-by: Yun Li <[email protected]>
@mssonicbld
Copy link
Collaborator

@lixiaoyuner PR conflicts with 202205 branch

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Dec 2, 2022
* Fix kube mode to local mode long duration issue

* Remove IPV6 parameters which is not necessary

* Fix read node labels bug

* Tag the running image to latest if it's stable

* Disable image_version_higher check

* Change image_version_higher checker test case

Signed-off-by: Yun Li <[email protected]>
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202205: #12919

lixiaoyuner added a commit that referenced this pull request Dec 4, 2022
* Fix kube mode to local mode long duration issue

* Remove IPV6 parameters which is not necessary

* Fix read node labels bug

* Tag the running image to latest if it's stable

* Disable image_version_higher check

* Change image_version_higher checker test case

Signed-off-by: Yun Li <[email protected]>

Signed-off-by: Yun Li <[email protected]>
Co-authored-by: lixiaoyuner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants