OSD Migration for changed disk slots #1302
Conversation
srv/modules/runners/proposal.py (outdated)

    3. Remove proposal file with "-replace" in its name
    """

    # import ipdb; ipdb.set_trace()
I'll remove this, that should not be in here
Just from reading through it I think this is headed in a good direction. Some docs on how this is expected to be used never hurt. Looking forward to giving this a spin!
srv/modules/runners/proposal.py (outdated)

    :param profile_dir: Profile directory, e.g. "/srv/pillar/ceph/proposals/profile-default"
    """
    ToReplace = namedtuple('ToReplace', ['fullpath', 'filename'])
    dir = '{}/stack/default/ceph/minions'.format(profile_dir)
`dir` is a Python built-in function. Maybe use `dir_` or `_dir` to avoid confusion?
oh, right! Will change that.
srv/modules/runners/proposal.py (outdated)

    """
    ToReplace = namedtuple('ToReplace', ['fullpath', 'filename'])
    dir = '{}/stack/default/ceph/minions'.format(profile_dir)
    files = [f for f in os.listdir(dir) if isfile(join(dir, f))]
You could add the `f.endswith('-replace')` here and thus only iterate once over all profile files. Something like

    files = [f for f in os.listdir(dir) if isfile(join(dir, f)) and f.endswith('-replace')]
Good catch, that does not need two iterations over all profiles.
srv/modules/runners/replace.py (outdated)

    return ""


    def _strip_enddigits(string):
Does this do something different than `string.rstrip('0123456789')`? If I'm not missing anything this method is unnecessary.
No, this is a case of "do before properly looking if there is a built-in that does the same" 😆 I'll adapt the code.
srv/modules/runners/proposal.py
Outdated
for minion in minions_to_replace: | ||
ReplaceDiskOn(minion).replace() | ||
return True | ||
else: |
Does this mean that if a replacement is ongoing no new proposals are created? Wondering if that'll bite us at some point when someone replaces a disk and adds a new node in one go...
That is actually a good point I did not think about. I guess we could remove the minions in minions_to_replace from the targets and run proposal.generate on the rest.
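That idea could be sketched like this; the function and argument names are hypothetical, not DeepSea's actual API:

```python
def generation_targets(all_minions, minions_to_replace):
    """Minions with a pending disk replacement keep their adapted
    proposal; proposals are regenerated only for the remaining minions."""
    pending = set(minions_to_replace)
    return [m for m in all_minions if m not in pending]
```
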
Force-pushed 840041f to 0c930cc
srv/modules/runners/replace.py (outdated)

    @@ -52,23 +53,24 @@ def osd(*args, **kwargs):

    local = salt.client.LocalClient()
    host_osds = local.cmd('I@roles:storage', 'osd.list', tgt_type='compound')
    assert isinstance(host_osds, list)
`local.cmd('I@roles:storage', 'osd.list', tgt_type='compound')` returns a dict of host: list_of_osds.
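So the check should be `isinstance(host_osds, dict)`, and consumers iterate over its items. A sketch with a stubbed Salt return value (the helper name is hypothetical):

```python
def hosts_carrying_osd(host_osds, osd_id):
    """host_osds mimics what LocalClient.cmd() returns for 'osd.list':
    a mapping of minion id -> list of OSD ids, not a flat list."""
    assert isinstance(host_osds, dict)
    return [host for host, osds in host_osds.items() if osd_id in osds]

# stubbed return value, as Salt would deliver it:
stub = {'node1': ['0', '1'], 'node2': ['2']}
```
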
srv/salt/_modules/osd.py (outdated)

    @@ -1793,8 +1793,7 @@ def partitions(self, osd_id):

    for device_type in ['journal', 'block', 'block.db', 'block.wal', 'block_dmcrypt']:
        result = self._uuid_device("{}/{}".format(mount_dir, device_type))
        if result:
            _partitions[device_type] = result
        _partitions[device_type] = result
The `if result:` is still needed as the `partitions` method also feeds the grains.
If no /dev/disk/by-id symlink is found, e.g. because that path does not exist (as is the case in VMs), the devicename that was provided in the first place is returned. Signed-off-by: Alexander Graul <[email protected]>
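The fallback described in that commit message could look roughly like this; a sketch under stated assumptions, not the actual osd.py code:

```python
import os

def device_by_id(devicename, by_id_dir='/dev/disk/by-id'):
    """Prefer a persistent /dev/disk/by-id path for a device.
    Fall back to the original device name when the by-id directory
    is missing (as in VMs) or no symlink resolves to the device."""
    if not os.path.isdir(by_id_dir):
        return devicename
    for entry in os.listdir(by_id_dir):
        link = os.path.join(by_id_dir, entry)
        if os.path.realpath(link) == os.path.realpath(devicename):
            return link
    # no matching symlink found: return what we were given
    return devicename
```
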
Force-pushed 0de3628 to f0c2e89
Force-pushed f0c2e89 to bd6f3f5
Tried:

lgtm after those two flapping unittests are fixed.
Force-pushed bd6f3f5 to b27e1b9
Force-pushed 931f4e0 to 2a372b2
Real hardware yields some problems. In-depth explanation following.
Force-pushed 2a372b2 to fb52800
Force-pushed fb52800 to d18ac8f
OSD Migration currently only works if the same slot is used. This works because the generation of proposals is deterministic, so if all disks are in the same position we end up with the same proposal. This commit changes that behaviour: instead of creating a new proposal, the old one is used and adapted. This happens in the following steps:

1) When `salt-run replace.osd X` is used and a proposal file (e.g. "data1.ceph.yml" in /srv/pillar/proposals/...) gets the "-replace" suffix, a new attribute is written for the disks that get replaced
2) When `salt-run proposals.populate` is used (e.g. in stage.1), all proposal files that have the "-replace" suffix get parsed and the disks in the proposal are compared to the currently installed disks on the node.
3) Disks in the old proposal that have the attribute inserted in 1) and are still present (i.e. replaced in place) plus new disks (disks that are not in the old proposal but present) are considered free.
4) The disks in the old proposal get iterated over and if they are labeled for replacement they are swapped with one of the free disks.
5) This modified proposal gets saved in the place of the original one and can be used by later stages.

Once the replacement in the proposal went through, new proposals are generated on all other nodes in case replacements and node additions happen at once.

Signed-off-by: Alexander Graul <[email protected]>
Signed-off-by: Alexander Graul <[email protected]>
Force-pushed d18ac8f to 02858eb
@susebot run teuthology

retest this please

Commit 02858eb is NOT OK for suite suse:tier1.

8/9 failures are infrastructure related
Fixes #1215
Please have a look already; there are still a few things to do (I've noted them below) but the core business logic should stay more or less the same (unless you find some problems 😉)
Description:

OSD Migration currently only works if the same slot is used. This works because the generation of proposals is deterministic, so if all disks are in the same position we end up with the same proposal.

This commit changes that behaviour: instead of creating a new proposal, the old one is used and adapted. This happens in the following steps:

1. When `salt-run replace.osd X` is used and a proposal file (e.g. "data1.ceph.yml" in /srv/pillar/proposals/...) gets the "-replace" suffix, a new attribute is written for the disks that get replaced
2. When `salt-run proposals.populate` is used (e.g. in stage.1), all proposal files that have the "-replace" suffix get parsed and the disks in the proposal are compared to the currently installed disks on the node.
3. Disks in the old proposal that have the attribute inserted in step 1 and are still present (i.e. replaced in place) plus new disks (disks that are not in the old proposal but present) are considered free.
4. The disks in the old proposal get iterated over and if they are labeled for replacement they are swapped with one of the free disks.
5. This modified proposal gets saved in the place of the original one and can be used by later stages.
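The free-disk accounting and swap in the last three steps can be sketched as a pure function over the parsed proposal; names and the flat-list structure are illustrative, not the real proposal schema:

```python
def adapt_proposal(old_disks, replace_marked, present_disks):
    """old_disks: disks from the old proposal, in order.
    replace_marked: subset of old_disks flagged by replace.osd.
    present_disks: disks currently installed on the node.
    Returns the adapted disk list with marked disks swapped for free ones."""
    # Free disks: marked disks that are still present (replaced in the
    # same slot) plus disks that are present but absent from the old proposal.
    free = [d for d in replace_marked if d in present_disks]
    free += [d for d in present_disks if d not in old_disks]
    adapted = []
    for disk in old_disks:
        if disk in replace_marked:
            adapted.append(free.pop(0))  # swap in a free disk
        else:
            adapted.append(disk)        # keep the untouched disk
    return adapted
```

For a disk replaced in a new slot, `adapt_proposal(['a', 'b', 'c'], {'b'}, ['a', 'c', 'd'])` swaps `b` for the new disk `d`; an in-place replacement leaves the proposal unchanged.
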
Things that should still be done:

Checklist:
- unit and/or functional tests