Reuse OSD ID in migrations, support replace.osds #1216
Conversation
You are not allowed to trigger tests. Please ask to get whitelisted.
Looking at
Now, I know the smoketests destroy all the OSDs on one node, then run migrations involving only two OSDs on that node, but while this is all running, the OSDs appear in the "localhost" bucket, not in the "data1" bucket. Is that indicative of a problem at all?
...actually, my environment somehow got broken partway through this:
Rebooting seems to have fixed it.
""" | ||
# Parameters for osd.remove module | ||
supported = ['force', 'timeout', 'delay'] | ||
passed = ["{}={}".format(k, v) for k, v in kwargs.items() if k in supported] |
Would a notification that someone passed unsupported parameters be helpful here? That'd also help people who made a typo.
The underscore variables get passed from Salt when called via a state file, iirc. Without extra processing to separate out Salt's internal variables, the warning would be a wall of text. I thought the debug message would be enough.
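For what it's worth, a rough sketch of what such a warning could look like, assuming Salt's internal kwargs all start with a double underscore (e.g. __pub_fun, __pub_jid); the helper name _filter_kwargs is made up for illustration:

import logging

log = logging.getLogger(__name__)

def _filter_kwargs(**kwargs):
    """Keep only supported parameters and warn about anything else that is
    not a Salt-internal double-underscore variable."""
    supported = ['force', 'timeout', 'delay']
    passed = ["{}={}".format(k, v) for k, v in kwargs.items() if k in supported]
    unknown = [k for k in kwargs
               if k not in supported and not k.startswith('__')]
    if unknown:
        log.warning("Ignoring unsupported parameter(s): {}".format(", ".join(unknown)))
    return passed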
srv/modules/runners/replace.py
Outdated
""" | ||
Usage | ||
""" | ||
usage = ('salt-run replace.osd id [id ...][force=True]:\n\n' |
That should reflect the supported args of the respective module: ['force', 'timeout', 'delay'].
Hmm, the timeout and delay are passed through to the osd.remove module, so the defaults aren't set here. Also, we only use those two in the smoketests currently. I will have to think about what to put here.
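Just as an illustration (the exact wording is an assumption, since the timeout and delay defaults live in osd.remove, not here), the usage text could mention all three supported keyword arguments:

usage = ('salt-run replace.osd <id> [<id> ...] [force=True] [timeout=<seconds>] [delay=<seconds>]:\n\n'
         '    Removes the listed OSDs while keeping their IDs reserved for the\n'
         '    replacement disks. timeout and delay are passed through to osd.remove.\n')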
if len(osds) > 1:
    # Pause for a moment, let the admin see what they passed
    print("Removing osds {} from minions\nPress Ctrl-C to abort".format(", ".join(osds)))
Maybe tell the admin how long they'll be able to hit Ctrl-C.
Let me take a look.
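A rough sketch of what that could look like; the 5-second window and the helper name are assumptions, not the actual implementation:

import time

PAUSE_SECONDS = 5  # assumed value - long enough to read the list and abort

def _pause_before_removal(osds):
    """Give the admin a visible window to abort before any OSD is touched."""
    print("Removing osds {} from minions\n"
          "Press Ctrl-C within {} seconds to abort".format(", ".join(osds), PAUSE_SECONDS))
    time.sleep(PAUSE_SECONDS)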
completed = osds
for osd_id in osds:
    host = _find_host(osd_id, host_osds)
    if host:
We should be verbose in case no host was found. Currently it will just not do anything and bail out silently.
The remove.osd command also calls this module. If remove.osd has already been called and the OSD is no longer present on the minions, then the OSD will be removed from Ceph (a message is printed for that). Currently, the behavior prints messages for whatever actions it takes.
I could add a verbose option, but I would prefer the default to not be verbose. If an admin is using a wildcard for dozens of OSDs and rerunning the command multiple times, then the output gets shorter as the Ceph cluster reaches the desired state.
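If we ever want a middle ground, something like the sketch below could print one summary line for OSDs that no longer map to any minion without making the normal output noisier; _report_missing is a hypothetical helper built on the existing _find_host:

def _report_missing(osds, host_osds):
    """Collect OSD IDs that no longer map to any minion and say so once,
    instead of silently skipping them."""
    missing = [osd_id for osd_id in osds if not _find_host(osd_id, host_osds)]
    if missing:
        print("No minion hosts OSD(s) {}; removing them from Ceph only".format(
            ", ".join(missing)))
    return missing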
srv/modules/runners/replace.py
Outdated
return False


def _remove_minion(local, master_minion, osd_id, passed, host):
The name of the function should be _remove_osd_from_minion or something along those lines.
Maybe just '_remove_osd'.
return "" | ||
|
||
|
||
def help_(): |
Seems like there is something missing here?
This is a policy question: the smoketest runner lives in the qa package. Which way do we want to go with help messages? Do they follow all runners or only those in the main package?
Has anyone tried this with a custom crushmap, to verify the OSDs remain where they should during migration?
I did a manual test with a custom crushmap and osd crush update on start = false in ceph.conf:
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-12 1.00000 root alt-root
-11 1.00000 host data1-fake
2 hdd 1.00000 osd.2 up 1.00000 1.00000
-1 0.42195 root default
-3 0.07286 host data1
6 hdd 0.01457 osd.6 up 1.00000 1.00000
10 hdd 0.01457 osd.10 up 1.00000 1.00000
14 hdd 0.01457 osd.14 up 1.00000 1.00000
18 hdd 0.01457 osd.18 up 1.00000 1.00000
22 hdd 0.01457 osd.22 up 1.00000 1.00000
[...]
This was originally deployed with filestore. I flipped to a bluestore profile for the data1 host, ran stage 2, then ceph.migrate.osds; the migration succeeded and everything remained where it was meant to be in the crushmap.
return "" | ||
|
||
|
||
def minion_profile(minion): |
I'd suggest calling this _rename_minion_profile.
This is the public module name, so the runner call is "salt-run replace.minion_profile".
Ah, my bad.
srv/modules/runners/select.py
Outdated
@@ -51,7 +51,7 @@ def _grain_host(client, minion):
     """
     Return the host grain for a given minion, for use a short hostname
     """
-    return list(client.cmd(minion, 'grains.item', ['host']).values())[0]['host']
+    return list(client.cmd(minion, 'grains.item', ['nodename']).values())[0]['nodename']
Why nodename? We use grains['host'] fairly extensively elsewhere.
This will take me a bit to remember. I believe I noticed a difference in default behavior between Salt versions.
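One quick way to see whether the two grains actually differ on a given setup (a sketch following the same LocalClient pattern as select.py; the function name is made up):

import salt.client

def _compare_host_grains(target='*'):
    """Print every minion whose 'host' and 'nodename' grains disagree."""
    local = salt.client.LocalClient()
    for minion, grains in local.cmd(target, 'grains.item', ['host', 'nodename']).items():
        if grains.get('host') != grains.get('nodename'):
            print("{}: host={} nodename={}".format(
                minion, grains.get('host'), grains.get('nodename')))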
@@ -797,6 +797,12 @@ def clean(self):

        Note: expected to only run inside of "not is_prepared"
        """
        if (self.osd.disk_format != 'filestore' and
                self.osd.disk_format != 'bluestore'):
In which circumstances did you see that happening?
I did this primarily for the smoketests. Overriding Salt dictionaries had some issues, so I set a disk_format of 'none'.
On the chance that an admin does mistype a format, it's better not to wipe all the partitions on that drive until it's corrected.
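The same guard can also be expressed as a membership test, which reads a bit more clearly; this is only a sketch, and the helper name and log message are not from the PR:

import logging

log = logging.getLogger(__name__)

def _safe_to_clean(disk_format):
    """Refuse to wipe partitions unless the format is one we know how to
    redeploy; a typo (or the smoketests' 'none') then costs nothing."""
    if disk_format not in ('filestore', 'bluestore'):
        log.error("Unknown disk format '{}' - not cleaning the device".format(disk_format))
        return False
    return True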
srv/salt/_modules/osd.py
Outdated
@@ -1244,6 +1250,9 @@ def prepare(self):
        if self.osd.device:
            cmd = "PYTHONWARNINGS=ignore ceph-disk -v prepare "

            # OSD ID
Maybe a bit more descriptive comment. The other option would be to extend the docstring and explain what the osd_id is used for.
Updating
srv/salt/_modules/osd.py
Outdated
self._weight = weight
self._grains = grains
self.force = force
self.keyring = None
self.client = None
if 'keyring' in kwargs:
I know we have that in a lot of places. But this notation is easier to read and allows assigning default values:
self.keyring = kwargs.get('keyring', 'a default value')
Changing
srv/salt/_modules/osd.py
Outdated
if msg:
    log.error(msg)
    return msg
log.debug("OSD {} marked and recorded".format(self.osd_id))
That could be a log.info/warning (something that a user sees on their screen).
We are in a module, so the message will not come back to the admin on the Salt master. I'll set it to info on the off chance that somebody runs this as a salt-call.
log.error(msg)
return msg

# Remove grain
Lately we ran into a situation where the remove.osd process was supposedly successful but did not actually run through completely, which resulted in stale grains. We should somehow make people more aware that this command needs to be executed until it succeeds, or bad things will happen :/
The new behavior of remove.osd and replace.osd will keep retrying on timeouts. The other error messages should be much better now and actually reach the admin, so the admin will be aware that the OSD is not removed.
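Roughly, the retry behavior could look like the sketch below; the helper name, the attempt count and the delay are placeholders, not the actual implementation:

import time

def _retry_until_removed(remove_step, osd_id, attempts=10, delay=6):
    """Keep retrying a removal step that timed out, so a slow rebalance does
    not leave stale grains behind. remove_step is any callable that returns
    True on success or an error message otherwise."""
    for attempt in range(1, attempts + 1):
        result = remove_step(osd_id)
        if result is True:
            return True
        print("OSD {} not removed yet (attempt {}/{}): {}".format(
            osd_id, attempt, attempts, result))
        time.sleep(delay)
    return False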
srv/salt/_modules/osd.py
Outdated
log.debug("content: {} {}".format(type(content), content)) | ||
|
||
if device in content: | ||
# Exit early, no by-pass equivalent from previous run |
by-pass → by-path?
changed
srv/salt/_modules/osd.py
Outdated
_rc, _stdout, _stderr = __salt__['helper.run'](cmd)
if _stdout:
    _devices = _stdout.split()
    if _devices[0]:
Even the conditional check on a potentially empty array will raise an IndexError. I'd rather:
if devices:
    return devices[0]
This could still produce an IndexError. if devices and devices[0] should do the trick. What are the repercussions though if we don't throw something here? Might be a good idea to fail loudly here?
This could still produce an IndexError.

The if devices check only returns True if there is at least one item in the list, and that's what we are asking for here: the first element. Since Python does not allow setting custom indices (or at least we don't do that here), we are fine with that check, aren't we?
In [1]: foo = ['bar']
In [2]: if foo:
...: print(foo[0])
...:
bar
In [3]: foo = []
In [4]: if foo:
...: print(foo[0])
...:
In [5]:
Yes, you're right... we can rely on having a list here.
changed
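For reference, the agreed-upon shape of the check applied to the snippet above (a sketch; __salt__ is only available inside a Salt execution module, and the function name is made up):

def _first_device(cmd):
    """Return the first device the command reports, or '' if there is none."""
    _rc, _stdout, _stderr = __salt__['helper.run'](cmd)
    if _stdout:
        _devices = _stdout.split()
        if _devices:  # an empty list is falsy, so no IndexError here
            return _devices[0]
    return ""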
A comment on the smoketests: I just tried blue->file migration and it removed 3/5 of my disks on the first osd node and recreated 2/5. I don't think that this is intentional.
It's intentional. The smoke tests remove all the OSDs on one storage node, then run a whole bunch of different migrations by first creating two osds in one format or another, then migrating them to another.
In regards to @jschmid1's comment about a failed OSD removal: would it make sense to have a function that makes sure Ceph and DeepSea agree on which OSDs are currently deployed? I'm not sure of all the failure modes an OSD removal can produce, but the theory is that if Ceph doesn't know it, we shouldn't either. Not sure, though, whether a running OSD process can re-add itself to Ceph or something fun like that.
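A very rough sketch of what such a consistency check could look like; the grain name ('ceph') and its structure are assumptions here, as is everything apart from 'ceph osd ls':

import json
import subprocess

import salt.client

def _osd_mismatch():
    """Compare the OSD IDs Ceph reports with the OSD-related grains DeepSea
    keeps on the minions, and return whatever only one side knows about."""
    ceph_ids = set(json.loads(
        subprocess.check_output(['ceph', 'osd', 'ls', '--format=json']).decode()))
    local = salt.client.LocalClient()
    minions = local.cmd('*', 'grains.get', ['ceph'])
    # Assumption: the 'ceph' grain on a storage minion is keyed by OSD ID.
    grain_ids = {int(osd) for osds in minions.values() for osd in (osds or {})}
    return {'only_in_ceph': sorted(ceph_ids - grain_ids),
            'only_in_grains': sorted(grain_ids - ceph_ids)}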
ok, must have missed that.
Smoke tests went fine for me initially. Had several repeated runs and everything was green. I just did another smoketest run with one pool and ~23GB of data in it. This one did not finish and my cluster is still in warn with 3 PGs that can't seem to repair themselves (2 peering, 1 remapped and peering). There are two factors that might play a role here:
In any case, we should absolutely test this and make sure the migrations work and behave correctly with a somewhat filled-up cluster.
rebased to capture additional tests.
@susebot run teuthology
@jschmid1 I only just now re-enabled the migrate and replace functional tests, so you might need to re-run teuthology.
Commit 2c32311 is NOT OK.
@jschmid1 In this PR, Stage 3 is failing with:
All the tests fail in the same way. To me, it looks like something in this PR is causing it.
These functions are essentially equivalent and bundled together.
Signed-off-by: Eric Jackson <[email protected]>
I did a clean install with 8e2592c on my vagrant setup, and stages 0-3 plus ceph.functests.1node.migrate seem to work fine (this cluster has no data in it, though).
@susebot run teuthology
All tests are still failing with
I also just deployed this branch locally and could not reproduce this issue. @smithfarm
Commit 8e2592c is NOT OK.
@jschmid1 That would indicate some incompatibility of this PR with the teuthology environment (host name, etc.). Preparing a debugging env for you to SSH into.
I suspect that
Signed-off-by: Joshua Schmid <[email protected]>
That's what I thought too. On my VMs nodename == host, but on teuthology it's not. Assuming that it's the same for real hardware.
Manually triggered
Results are in, and
@smithfarm
migrate functests orchestration failed again:
In response to #1216 (comment): The root cause for this is, imo, that we are running these tests on a single node. That triggers rebalancing of data from one OSD to another on a single host, which, in this case, resulted in PG inconsistency. Therefore the removal process timed out and left traces of a $wrong_deployment, which triggered osd.correct to fail. When I tried to reproduce this on my cluster (5 OSD nodes x 6 OSDs), the rebalance happened quickly and the test advanced. Our take on this would be to increase the number of nodes we require for testing the migration to at least 2 (3 to be on the safe side).
In light of the recent findings [0], I think it's time to merge this beast of a PR. Migration tests are failing for a couple of reasons:
The first two points are being fixed by adapting the way we execute tests (more nodes, fewer tests on one node). The last two points will be fixed shortly after this is merged (pending commit 8d1b303).
NOTE: for all of us the tests pass locally. All the points mentioned above are only valid for OVH.
@smithfarm @tserong @jan--f Please speak up if you see any problem with this.
[0] https://etherpad.nue.suse.com/p/Deepsea_standup_2018_08_21
[1] target147135132110:~ # l /dev/vdd*
brw-rw---- 1 root disk 254, 48 Aug 22 14:33 /dev/vdd
brw-rw---- 1 ceph ceph 254, 58 Aug 22 14:33 /dev/vdd10
-rw-r--r-- 1 root root 0 Aug 22 13:54 /dev/vdd2
-rw-r--r-- 1 root root 0 Aug 22 14:30 /dev/vdd8
brw-rw---- 1 ceph ceph 254, 57 Aug 22 14:33 /dev/vdd9
[2] vdd 254:48 0 10G 0 disk
├─vdd9 254:57 0 100M 0 part
└─vdd10 254:58 0 100M 0 part
@jschmid1 I don't have any reservations about merging this. The migrate functional test orchestration still fails in OVH even with your most recent fixes (wip-adapt-migrate-functests), but it almost passes (179 states succeeded, 1 failed). I'm confident in your ability to get it completely fixed in a follow-up PR 👍
These functions are essentially equivalent and bundled together. See #1146 for a detailed description.
Also, this PR should not be merged until a decision about #1215 happens.
#1188 #1259 is a prerequisite for this PR. Otherwise, the smoketests may fail.
Signed-off-by: Eric Jackson [email protected]