Reuse OSD ID in migrations, support replace.osds #1216

Merged: 2 commits merged on Aug 23, 2018

Conversation

@swiftgist (Contributor) commented Jul 6, 2018

These functions are essentially equivalent and are bundled together. See #1146 for a detailed description.
Also, this PR should not be merged until a decision is made about #1215.

#1188/#1259 is a prerequisite for this PR; otherwise, the smoketests may fail.

Signed-off-by: Eric Jackson [email protected]

@susebot (Collaborator) commented Jul 6, 2018

You are not allowed to trigger tests. Please ask to get whitelisted.

@denisok commented Jul 19, 2018

@jan--f @jschmid1 @tserong please review - this needs to be landed ASAP

@tserong (Member) commented Jul 20, 2018

Looking at ceph osd tree while running smoketests, I get output like:

ID  CLASS WEIGHT  TYPE NAME          STATUS    REWEIGHT PRI-AFF 
 -1       0.46562 root default                                  
 -5       0.07758     host data1                                
 10   hdd 0.01939         osd.10     destroyed        0 1.00000 
 14   hdd 0.01939         osd.14     destroyed        0 1.00000 
 18   hdd 0.01939         osd.18     destroyed        0 1.00000 
 22   hdd 0.01939         osd.22     destroyed        0 1.00000 
 -3       0.11636     host data2                                
  0   hdd 0.01939         osd.0             up  1.00000 1.00000 
  4   hdd 0.01939         osd.4             up  1.00000 1.00000 
  8   hdd 0.01939         osd.8             up  1.00000 1.00000 
 12   hdd 0.01939         osd.12            up  1.00000 1.00000 
 16   hdd 0.01939         osd.16            up  1.00000 1.00000 
 21   hdd 0.01939         osd.21            up  1.00000 1.00000 
 -7       0.11636     host data3                                
  1   hdd 0.01939         osd.1             up  1.00000 1.00000 
  7   hdd 0.01939         osd.7             up  1.00000 1.00000 
 11   hdd 0.01939         osd.11            up  1.00000 1.00000 
 15   hdd 0.01939         osd.15            up  1.00000 1.00000 
 19   hdd 0.01939         osd.19            up  1.00000 1.00000 
 23   hdd 0.01939         osd.23            up  1.00000 1.00000 
 -9       0.11636     host data4                                
  3   hdd 0.01939         osd.3             up  1.00000 1.00000 
  5   hdd 0.01939         osd.5             up  1.00000 1.00000 
  9   hdd 0.01939         osd.9             up  1.00000 1.00000 
 13   hdd 0.01939         osd.13            up  1.00000 1.00000 
 17   hdd 0.01939         osd.17            up  1.00000 1.00000 
 20   hdd 0.01939         osd.20            up  1.00000 1.00000 
-11       0.03896     host localhost                            
  2   hdd 0.01947         osd.2           down        0 1.00000 
  6   hdd 0.01949         osd.6             up  1.00000 1.00000 

Now, I know the smoketests destroy all the OSDs on one node, then do migrations involving only two OSDs on that node, but while this is all running, the OSDs appear in the "localhost" bucket, not in the "data1" bucket. Is that indicative of a problem at all?

@tserong (Member) commented Jul 20, 2018

...actually, my environment somehow got broken partway through this:

# salt '*' grains.get host
data1.ceph:
    localhost
data2.ceph:
    localhost
admin.ceph:
    localhost
data3.ceph:
    localhost
data4.ceph:
    localhost

Rebooting seems to have fixed it.

"""
# Parameters for osd.remove module
supported = ['force', 'timeout', 'delay']
passed = ["{}={}".format(k, v) for k, v in kwargs.items() if k in supported]
Contributor

Would a notification that someone passed unsupported parameters be helpful here? That would also help people who made a typo.

Contributor Author

IIRC, the underscore variables get passed in from Salt when called via a state file. The warning would be a wall of text without extra processing to separate out the Salt-internal variables. I thought the debug message would be enough.
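
For illustration, a hedged sketch of what such a warning could look like while skipping Salt's internal double-underscore arguments (the helper name is hypothetical, not part of this PR):

import logging

log = logging.getLogger(__name__)

def _filter_params(kwargs, supported=('force', 'timeout', 'delay')):
    # Keep only the supported arguments in "key=value" form.
    passed = ["{}={}".format(k, v) for k, v in kwargs.items() if k in supported]
    # Warn about anything else, ignoring Salt-internal variables such as __pub_fun.
    ignored = [k for k in kwargs if k not in supported and not k.startswith('__')]
    if ignored:
        log.warning("Ignoring unsupported parameters: %s", ", ".join(ignored))
    return passed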

"""
Usage
"""
usage = ('salt-run replace.osd id [id ...][force=True]:\n\n'
Contributor

That should reflect the supported args of the respective module: ['force', 'timeout', 'delay'].

Contributor Author

Hmm, the timeout and delay are passed through to the osd.remove module, so the defaults aren't set here. Also, only those two are currently used in the smoketests. I will have to think about what to put here.
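
As a rough sketch only, the usage text could mention the forwarded arguments without fixing their defaults here (wording is illustrative):

usage = ('salt-run replace.osd id [id ...] [force=True] [timeout=<sec>] [delay=<sec>]:\n\n'
         '    Replace one or more OSDs by ID. The timeout and delay values are\n'
         '    passed through to the osd.remove module.\n')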


if len(osds) > 1:
# Pause for a moment, let the admin see what they passed
print("Removing osds {} from minions\nPress Ctrl-C to abort".format(", ".join(osds)))
Contributor

Maybe tell the admin how long they'll be able to hit Ctrl-C.

Contributor Author

Let me take a look.
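
A minimal sketch of one way to do that; the pause length and wording are illustrative, not what the PR implements:

import time

def _pause_before_removal(osds, pause=5):
    # Give the admin an explicit window to abort before anything is removed.
    print("Removing osds {} from minions\n"
          "Press Ctrl-C within {} seconds to abort".format(", ".join(osds), pause))
    time.sleep(pause)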

completed = osds
for osd_id in osds:
host = _find_host(osd_id, host_osds)
if host:
Contributor

We should be verbose in case no host was found. Currently it just does nothing and bails out silently.

Contributor Author

The remove.osd command also calls this module. If remove.osd has already been called and the OSD is no longer present on the minions, then the OSD is removed from Ceph (and a message is printed for that). Currently, the behavior is to print messages for the actions it takes.

I could add a verbose option, but I would prefer the default not to be verbose. If an admin is using a wildcard for dozens of OSDs and rerunning the command multiple times, the output gets shorter as the Ceph cluster reaches the desired state.
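
For illustration, a hedged sketch of reporting OSD IDs that no minion claims; it assumes host_osds maps minions to lists of OSD IDs, and the helper name is hypothetical:

import logging

log = logging.getLogger(__name__)

def _report_unclaimed(osds, host_osds):
    # Return the OSD IDs that no minion reports owning, logging each one.
    unclaimed = [osd_id for osd_id in osds
                 if not any(osd_id in ids for ids in host_osds.values())]
    for osd_id in unclaimed:
        log.info("OSD %s not present on any minion; removing from Ceph only", osd_id)
    return unclaimed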

return False


def _remove_minion(local, master_minion, osd_id, passed, host):
Contributor

The name of the function should be _remove_osd_from_minion or something along those lines.

Contributor Author

Maybe just '_remove_osd'.

return ""


def help_():
@jan--f (Contributor) Jul 20, 2018

Seems like there is something missing here?

Contributor Author

This is a policy question: the smoketest runner lives in the qa package. Which way do we want to go with help messages? Should they cover all runners, or only those in the main package?

@tserong (Member) left a comment

Has anyone tried this with a custom crushmap, to verify the OSDs remain where they should during migration?

I did a manual test with a custom crushmap and osd crush update on start = false in ceph.conf:

# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF 
-12       1.00000 root alt-root                               
-11       1.00000     host data1-fake                         
  2   hdd 1.00000         osd.2           up  1.00000 1.00000 
 -1       0.42195 root default                                
 -3       0.07286     host data1                              
  6   hdd 0.01457         osd.6           up  1.00000 1.00000 
 10   hdd 0.01457         osd.10          up  1.00000 1.00000 
 14   hdd 0.01457         osd.14          up  1.00000 1.00000 
 18   hdd 0.01457         osd.18          up  1.00000 1.00000 
 22   hdd 0.01457         osd.22          up  1.00000 1.00000 
[...]

This was originally deployed with filestore; I flipped to a bluestore profile for the data1 host, ran stage 2, then ceph.migrate.osds. The migration succeeded and everything remained where it was meant to be in the crushmap.

return ""


def minion_profile(minion):
Member

I'd suggest calling this _rename_minion_profile.

Contributor Author

This is the public module name, so the runner call is "salt-run replace.minion_profile".

Member

Ah, my bad.

@@ -51,7 +51,7 @@ def _grain_host(client, minion):
"""
Return the host grain for a given minion, for use a short hostname
"""
return list(client.cmd(minion, 'grains.item', ['host']).values())[0]['host']
return list(client.cmd(minion, 'grains.item', ['nodename']).values())[0]['nodename']
Member

Why nodename? We use grains['host'] fairly extensively elsewhere.

Contributor Author

This will take me a bit to remember. I believe I noticed a difference in default behavior between Salt versions.

@@ -797,6 +797,12 @@ def clean(self):

Note: expected to only run inside of "not is_prepared"
"""
if (self.osd.disk_format != 'filestore' and
self.osd.disk_format != 'bluestore'):
Contributor

In which circumstances did you see that happening?

Contributor Author

I did this primarily for the smoketests. Overriding Salt dictionaries had some issues, so I set a disk_format of 'none'.

On the chance that an admin mistypes a format, it's better not to wipe all the partitions on that drive until it's corrected.
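
A minimal sketch of the guard being described, assuming the only valid formats are 'filestore' and 'bluestore' (the function name is illustrative):

import logging

log = logging.getLogger(__name__)

def _known_format(disk_format):
    # Refuse to wipe partitions when the profile names an unknown format,
    # whether from a typo or the smoketest value 'none'.
    if disk_format not in ('filestore', 'bluestore'):
        log.warning("Unknown disk format '%s'; leaving partitions untouched", disk_format)
        return False
    return True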

@@ -1244,6 +1250,9 @@ def prepare(self):
if self.osd.device:
cmd = "PYTHONWARNINGS=ignore ceph-disk -v prepare "

# OSD ID
Contributor

Maybe a bit more descriptive comment. The other option would be to extend the docstring and explain what the osd_id is used for.

Contributor Author

Updating
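
For context, a hedged sketch of what the OSD ID step accomplishes, assuming ceph-disk prepare accepts an --osd-id flag for reusing a destroyed OSD's ID (the helper below is illustrative, not the PR's code):

def _with_osd_id(cmd, osd_id):
    # Append the reused OSD ID so Ceph assigns the same ID to the new OSD.
    if osd_id is not None:
        cmd += "--osd-id {} ".format(osd_id)
    return cmd

# e.g. _with_osd_id("PYTHONWARNINGS=ignore ceph-disk -v prepare ", 10)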

self._weight = weight
self._grains = grains
self.force = force
self.keyring = None
self.client = None
if 'keyring' in kwargs:
@jschmid1 (Contributor) Jul 20, 2018

I know we have that in a lot of places, but this notation is easier to read and allows assigning default values:

self.keyring = kwargs.get('keyring', 'a default value')

Contributor Author

Changing

if msg:
log.error(msg)
return msg
log.debug("OSD {} marked and recorded".format(self.osd_id))
Contributor

That could be a log.info/warning (something that the user sees on their screen).

Contributor Author

We are in a module, so the message will not come back to the admin on the Salt master. I'll set it to info on the off chance that somebody runs this as a salt-call.

log.error(msg)
return msg

# Remove grain
Contributor

Lately we ran into a situation where the remove.osd process was supposedly successful but actually did not run through completely, which resulted in stale grains. We should somehow make people more aware that this command needs to be re-run until it succeeds, or bad things will happen :/

Contributor Author

The new behavior of the remove.osd and replace.osd will keep retrying on timeouts. The other error messages should be much better now and actually reach the admin, so the admin will be aware that the OSD is not removed.

log.debug("content: {} {}".format(type(content), content))

if device in content:
# Exit early, no by-pass equivalent from previous run
Contributor

by-pass path?

Contributor Author

changed

_rc, _stdout, _stderr = __salt__['helper.run'](cmd)
if _stdout:
_devices = _stdout.split()
if _devices[0]:
@jschmid1 (Contributor) Jul 20, 2018

Even the conditional check on a potentially empty array will raise an "IndexError".

I'd rather:

if devices:
  return devices[0]

Contributor

This could still produce an IndexError.
if devices and devices[0] should do the trick. What are the repercussions though if we don't throw something here? Might be a good idea to fail loudly here?

Contributor

This could still produce an IndexError.

The if devices only returns True if there is at least one item in the list, and that's what we are asking for here - the first element. Since Python does not allow setting custom indices (or at least we don't do that here), we are fine with that check, aren't we?

In [1]: foo = ['bar']

In [2]: if foo:
   ...:     print(foo[0])
   ...: 
bar

In [3]: foo = []

In [4]: if foo:
   ...:     print(foo[0])
   ...: 

In [5]: 

Contributor

Yes, you're right... we can rely on having a list here.

Contributor Author

changed
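
A minimal sketch of the agreed-upon pattern, relying on list truthiness rather than indexing a possibly empty split() result (names mirror the snippet above):

def _first_device(stdout):
    devices = stdout.split() if stdout else []
    if devices:
        return devices[0]
    return ""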

@jschmid1 (Contributor)

A comment on the smoketests:

I just tried blue->file migration and it removed 3/5 of my disks on the first osd node and recreated 2/5. I don't think that this is intentional.

@tserong (Member) commented Jul 20, 2018

I just tried blue->file migration and it removed 3/5 of my disks on the first osd node and recreated 2/5. I don't think that this is intentional.

It's intentional. The smoke tests remove all the OSDs on one storage node, then run a whole bunch of different migrations by first creating two osds in one format or another, then migrating them to another.

@jan--f (Contributor) commented Jul 20, 2018

Regarding @jschmid1's comment about a failed OSD removal: would it make sense to have a function that makes sure Ceph and DeepSea agree on which OSDs are currently deployed? Along the lines of "if ceph osd ls doesn't contain $id, make sure no minion has a grain for $id or a directory /var/lib/ceph/osd/ceph-$id". IIUC, this is the wrench a borked OSD removal throws in our gears.

I'm not sure of all the failure modes an OSD removal can produce. The theory is that if Ceph doesn't know about it, we shouldn't either. Not sure, though, whether a running OSD process can re-add itself to Ceph or something fun like that.
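
As a rough, hypothetical sketch of such a check (the targeting expression and the per-minion osd.list call are assumptions, not confirmed DeepSea APIs):

import salt.client

def stale_osd_grains(ceph_osd_ids):
    # Return, per minion, any OSD IDs the minion still reports that
    # 'ceph osd ls' no longer knows about.
    local = salt.client.LocalClient()
    reported = local.cmd('I@roles:storage', 'osd.list', tgt_type='compound')
    stale = {}
    for minion, osd_ids in reported.items():
        leftovers = [i for i in (osd_ids or []) if int(i) not in ceph_osd_ids]
        if leftovers:
            stale[minion] = leftovers
    return stale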

@jschmid1 (Contributor)

It's intentional. The smoke tests remove all the OSDs on one storage node, then run a whole bunch of different migrations by first creating two osds in one format or another, then migrating them to another.

ok, must have missed that.

@jan--f (Contributor) commented Jul 20, 2018

Smoke tests went fine for me initially. Had several repeated runs and everything was green.

I just did another smoketest run with one pool and ~23 GB of data in it. This one did not finish, and my cluster is still in warn with 3 PGs that can't seem to repair themselves (2 peering, 1 remapped and peering). There are two factors that might play a role here:

  • it's a vagrant cluster, so repair is slow and the smoketest wait has timed out
  • the smoketests remove OSDs before starting. Not sure if the smoketests wait before starting the migrations (they should), but that might have put extra pressure on the cluster

In any case, we should absolutely test this and make sure the migrations work and the tests pass with a somewhat filled-up cluster.

@swiftgist (Contributor Author)

rebased to capture additional tests.

@jschmid1 (Contributor)

@susebot run teuthology

@smithfarm (Contributor)

@jschmid1 I only just now re-enabled the migrate and replace functional tests, so you might need to re-run teuthology.

@susebot (Collaborator) commented Aug 17, 2018

Commit 2c32311 is NOT OK.
Check tests results in the Jenkins job: http://158.69.90.90:8080/job/deepsea-pr/43/

@smithfarm (Contributor) commented Aug 17, 2018

@jschmid1 In this PR, Stage 3 is failing with:

2018-08-17T13:09:47.233 INFO:teuthology.orchestra.run.target167114247232.stdout:----------
2018-08-17T13:09:47.233 INFO:teuthology.orchestra.run.target167114247232.stdout:          ID: mgr keyrings
2018-08-17T13:09:47.233 INFO:teuthology.orchestra.run.target167114247232.stdout:    Function: salt.state
2018-08-17T13:09:47.234 INFO:teuthology.orchestra.run.target167114247232.stdout:      Result: False
2018-08-17T13:09:47.234 INFO:teuthology.orchestra.run.target167114247232.stdout:     Comment: Run failed on minions: target167114247232.teuthology
2018-08-17T13:09:47.234 INFO:teuthology.orchestra.run.target167114247232.stdout:     Started: 13:09:46.770026
2018-08-17T13:09:47.234 INFO:teuthology.orchestra.run.target167114247232.stdout:    Duration: 431.877 ms
2018-08-17T13:09:47.235 INFO:teuthology.orchestra.run.target167114247232.stdout:     Changes:
2018-08-17T13:09:47.235 INFO:teuthology.orchestra.run.target167114247232.stdout:              target167114247232.teuthology:
2018-08-17T13:09:47.235 INFO:teuthology.orchestra.run.target167114247232.stdout:              ----------
2018-08-17T13:09:47.235 INFO:teuthology.orchestra.run.target167114247232.stdout:                        ID: /var/lib/ceph/mgr/ceph-target167114247232/keyring
2018-08-17T13:09:47.236 INFO:teuthology.orchestra.run.target167114247232.stdout:                  Function: file.managed
2018-08-17T13:09:47.236 INFO:teuthology.orchestra.run.target167114247232.stdout:                    Result: False
2018-08-17T13:09:47.236 INFO:teuthology.orchestra.run.target167114247232.stdout:                   Comment: Unable to manage file: none of the specified sources were found
2018-08-17T13:09:47.236 INFO:teuthology.orchestra.run.target167114247232.stdout:                   Started: 13:09:47.145161
2018-08-17T13:09:47.236 INFO:teuthology.orchestra.run.target167114247232.stdout:                  Duration: 32.161 ms
2018-08-17T13:09:47.237 INFO:teuthology.orchestra.run.target167114247232.stdout:                   Changes:
2018-08-17T13:09:47.237 INFO:teuthology.orchestra.run.target167114247232.stdout:
2018-08-17T13:09:47.237 INFO:teuthology.orchestra.run.target167114247232.stdout:              Summary for target167114247232.teuthology
2018-08-17T13:09:47.237 INFO:teuthology.orchestra.run.target167114247232.stdout:              ------------
2018-08-17T13:09:47.238 INFO:teuthology.orchestra.run.target167114247232.stdout:              Succeeded: 0
2018-08-17T13:09:47.238 INFO:teuthology.orchestra.run.target167114247232.stdout:              Failed:    1
2018-08-17T13:09:47.238 INFO:teuthology.orchestra.run.target167114247232.stdout:              ------------
2018-08-17T13:09:47.238 INFO:teuthology.orchestra.run.target167114247232.stdout:              Total states run:     1
2018-08-17T13:09:47.238 INFO:teuthology.orchestra.run.target167114247232.stdout:              Total run time:  32.161 ms

All the tests fail in the same way. To me, it looks like something in this PR is causing it.

These functions are essentially equivalent and bundled together.

Signed-off-by: Eric Jackson <[email protected]>
@tserong (Member) commented Aug 20, 2018

I did a clean install with 8e2592c on my vagrant setup, and stages 0-3 plus ceph.functests.1node.migrate seem to work fine (this cluster has no data in it, though)

@smithfarm (Contributor)

@susebot run teuthology

@jschmid1 (Contributor)

All tests are still failing with

2018-08-20T10:34:15.177 INFO:teuthology.orchestra.run.target054037030145.stdout:          ID: mgr keyrings
2018-08-20T10:34:15.177 INFO:teuthology.orchestra.run.target054037030145.stdout:    Function: salt.state
2018-08-20T10:34:15.178 INFO:teuthology.orchestra.run.target054037030145.stdout:      Result: False
2018-08-20T10:34:15.178 INFO:teuthology.orchestra.run.target054037030145.stdout:     Comment: Run failed on minions: target054037030145.teuthology
2018-08-20T10:34:15.178 INFO:teuthology.orchestra.run.target054037030145.stdout:     Started: 10:34:14.698933
2018-08-20T10:34:15.178 INFO:teuthology.orchestra.run.target054037030145.stdout:    Duration: 446.225 ms
2018-08-20T10:34:15.178 INFO:teuthology.orchestra.run.target054037030145.stdout:     Changes:
2018-08-20T10:34:15.178 INFO:teuthology.orchestra.run.target054037030145.stdout:              target054037030145.teuthology:
2018-08-20T10:34:15.178 INFO:teuthology.orchestra.run.target054037030145.stdout:              ----------
2018-08-20T10:34:15.179 INFO:teuthology.orchestra.run.target054037030145.stdout:                        ID: /var/lib/ceph/mgr/ceph-target054037030145/keyring
2018-08-20T10:34:15.179 INFO:teuthology.orchestra.run.target054037030145.stdout:                  Function: file.managed
2018-08-20T10:34:15.179 INFO:teuthology.orchestra.run.target054037030145.stdout:                    Result: False
2018-08-20T10:34:15.179 INFO:teuthology.orchestra.run.target054037030145.stdout:                   Comment: Unable to manage file: none of the specified sources were found
2018-08-20T10:34:15.179 INFO:teuthology.orchestra.run.target054037030145.stdout:                   Started: 10:34:15.065326
2018-08-20T10:34:15.179 INFO:teuthology.orchestra.run.target054037030145.stdout:                  Duration: 41.795 ms
2018-08-20T10:34:15.180 INFO:teuthology.orchestra.run.target054037030145.stdout:                   Changes:
2018-08-20T10:34:15.180 INFO:teuthology.orchestra.run.target054037030145.stdout:
2018-08-20T10:34:15.180 INFO:teuthology.orchestra.run.target054037030145.stdout:              Summary for target054037030145.teuthology
2018-08-20T10:34:15.180 INFO:teuthology.orchestra.run.target054037030145.stdout:              ------------
2018-08-20T10:34:15.180 INFO:teuthology.orchestra.run.target054037030145.stdout:              Succeeded: 0
2018-08-20T10:34:15.180 INFO:teuthology.orchestra.run.target054037030145.stdout:              Failed:    1
2018-08-20T10:34:15.180 INFO:teuthology.orchestra.run.target054037030145.stdout:              ------------
2018-08-20T10:34:15.181 INFO:teuthology.orchestra.run.target054037030145.stdout:              Total states run:     1
2018-08-20T10:34:15.181 INFO:teuthology.orchestra.run.target054037030145.stdout:              Total run time:  41.795 ms
2018-08-20T10:34:15.181 INFO:teuthology.orchestra.run.target054037030145.stdout:
2018-08-20T10:34:15.181 INFO:teuthology.orchestra.run.target054037030145.stdout:Summary for target054037030145.teuthology_master
2018-08-20T10:34:15.181 INFO:teuthology.orchestra.run.target054037030145.stdout:------------
2018-08-20T10:34:15.181 INFO:teuthology.orchestra.run.target054037030145.stdout:Succeeded: 6 (changed=4)
2018-08-20T10:34:15.181 INFO:teuthology.orchestra.run.target054037030145.stdout:Failed:    1

I also just deployed this branch locally and could not reproduce this issue. @smithfarm

@susebot (Collaborator) commented Aug 20, 2018

Commit 8e2592c is NOT OK.
Check tests results in the Jenkins job: http://158.69.90.90:8080/job/deepsea-pr/47/

@smithfarm (Contributor)

@jschmid1 That would indicate some incompatibility of this PR with the teuthology environment (host name, etc.). Preparing a debugging env for you to SSH into.

@smithfarm (Contributor)

I suspect that {{ grains['host'] }} is not behaving as expected in the teuthology environment. Perhaps it returns the long hostname instead of the short one? How can we verify this?

@jschmid1 (Contributor)

I suspect that {{ grains['host'] }} is not behaving as expected in the teuthology environment. Perhaps it returns the long hostname instead of the short? How to verify this?

That's what I thought too. On my VMs nodename == host, but on teuthology it's not. Assuming it's the same on real hardware.
Pushed a fix for it.
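
One quick, illustrative way to compare the two grains across minions from the master (a sketch, not part of the PR):

import salt.client

local = salt.client.LocalClient()
for minion, grains in sorted(local.cmd('*', 'grains.item', ['host', 'nodename']).items()):
    flag = '' if grains.get('host') == grains.get('nodename') else '  <-- differ'
    print("{}: host={} nodename={}{}".format(
        minion, grains.get('host'), grains.get('nodename'), flag))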

@smithfarm (Contributor) commented Aug 20, 2018

Manually triggered suse/tier1/functional: http://137.74.25.20:8081/ubuntu-2018-08-20_15:14:07-suse:tier1:functional-ses6---basic-openstack/

    deepsea:
      branch: wip-migrate
      exec:
      - suites/basic/health-ok.sh
      - salt-run state.orch ceph.functests.1node.replace
      repo: https://github.com/SUSE/DeepSea.git

@smithfarm (Contributor)

Results are in, and ceph.functests.1node.migrate does not pass :-( Please take a look.

@jschmid1 (Contributor)

Results are in, and ceph.functests.1node.migrate does not pass :-( Please take a look.

@smithfarm
That isn't necessarily a failure. I saw that it fails to go active+clean, which could also be a hiccup in the deployment. We should run the tests again to be sure it's a bug.

@smithfarm (Contributor)

migrate functests orchestration failed again:

2018-08-21T09:18:10.545 INFO:teuthology.orchestra.run.target149202187000.stdout:target149202187000.teuthology_master:
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:  Name: Check environment ftob - Function: salt.state - Result: Clean Started: - 09:15:19.000101 Duration: 1979.115 ms
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:  Name: Remove OSDs ftob - Function: salt.state - Result: Changed Started: - 09:15:20.979742 Duration: 151400.922 ms
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:  Name: Remove destroyed ftob - Function: salt.state - Result: Changed Started: - 09:17:52.380958 Duration: 5616.153 ms
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:  Name: Initialize OSDs ftob - Function: salt.state - Result: Changed Started: - 09:17:57.997634 Duration: 9774.949 ms
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:  Name: smoketests.checklist - Function: salt.runner - Result: Changed Started: - 09:18:07.772990 Duration: 2151.255 ms
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:----------
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:          ID: Check reset OSDs ftob
2018-08-21T09:18:10.546 INFO:teuthology.orchestra.run.target149202187000.stdout:    Function: salt.state
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:      Result: False
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:     Comment: Run failed on minions: target149202187000.teuthology
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:     Started: 09:18:09.924634
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:    Duration: 574.146 ms
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:     Changes:
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:              target149202187000.teuthology:
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:              ----------
2018-08-21T09:18:10.547 INFO:teuthology.orchestra.run.target149202187000.stdout:                        ID: check
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:                  Function: osd.correct
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:                    Result: False
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:                   Comment:
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:                   Started: 09:18:10.288754
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:                  Duration: 194.844 ms
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:                   Changes:
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:
2018-08-21T09:18:10.548 INFO:teuthology.orchestra.run.target149202187000.stdout:              Summary for target149202187000.teuthology
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:              ------------
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:              Succeeded: 0
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:              Failed:    1
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:              ------------
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:              Total states run:     1
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:              Total run time: 194.844 ms
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:Summary for target149202187000.teuthology_master
2018-08-21T09:18:10.549 INFO:teuthology.orchestra.run.target149202187000.stdout:------------
2018-08-21T09:18:10.550 INFO:teuthology.orchestra.run.target149202187000.stdout:Succeeded: 5 (changed=4)
2018-08-21T09:18:10.550 INFO:teuthology.orchestra.run.target149202187000.stdout:Failed:    1
2018-08-21T09:18:10.550 INFO:teuthology.orchestra.run.target149202187000.stdout:------------
2018-08-21T09:18:10.550 INFO:teuthology.orchestra.run.target149202187000.stdout:Total states run:     6
2018-08-21T09:18:10.550 INFO:teuthology.orchestra.run.target149202187000.stdout:Total run time: 171.497 s

@jschmid1 (Contributor) commented Aug 22, 2018

In response to #1216 (comment):

The root cause for this is, IMO, that we are running these tests on a single node.

That triggers rebalancing of data from one OSD to another on a single host, which, in this case, resulted in PG inconsistency. As a result, the removal process timed out and left traces of a $wrong_deployment, which caused osd.correct to fail.

When I tried to reproduce this on my cluster (5 OSD nodes x 6 OSDs), the rebalance happened quickly and the test advanced.

Our take on this would be to increase the number of nodes we require for testing the migration to at least 2 (3 to be on the safe side).

@jschmid1 (Contributor) commented Aug 22, 2018

In light of the recent findings [0], I think it's time to merge this beast of a PR.

Migration tests are failing for a couple of reasons:

  • Too few OSDs and OSD hosts, resulting in rebalancing issues (fixed)
  • The kernel not being notified after partitions are removed [1] (orphans in /dev/vdx[0-9] while lsblk does not show anything; removed with sgdisk -Z, fdisk, or parted [2]). This resulted in issues during deployment.
  • umount fails and leaves mounted OSDs behind
  • no failhard on states, so the actual failure happened on the wrong end and showed weird results

The first two points are being fixed by adapting the way we execute tests (more nodes, fewer tests per node).

The last two points will be fixed shortly after this is merged (pending commit 8d1b303).

NOTE: for all of us the tests pass locally. All the points mentioned above apply only to OVH.

@smithfarm @tserong @jan--f Please speak up if you see any problem with this.

[0] https://etherpad.nue.suse.com/p/Deepsea_standup_2018_08_21

[1]

target147135132110:~ # l /dev/vdd*
brw-rw---- 1 root disk 254, 48 Aug 22 14:33 /dev/vdd
brw-rw---- 1 ceph ceph 254, 58 Aug 22 14:33 /dev/vdd10
-rw-r--r-- 1 root root       0 Aug 22 13:54 /dev/vdd2
-rw-r--r-- 1 root root       0 Aug 22 14:30 /dev/vdd8
brw-rw---- 1 ceph ceph 254, 57 Aug 22 14:33 /dev/vdd9

[2]

vdd     254:48   0   10G  0 disk
├─vdd9  254:57   0  100M  0 part
└─vdd10 254:58   0  100M  0 part

@smithfarm (Contributor)

@jschmid1 I don't have any reservations about merging this. The migrate functional test orchestration still fails in OVH even with your most recent fixes (wip-adapt-migrate-functests), but it almost passes (179 states succeeded, 1 failed). I'm confident in your ability to get it completely fixed in a follow-up PR 👍
