now you can use a network link name with a hyphen #1045

Merged (1 commit, Oct 31, 2016)

Conversation

wind0204 (Contributor)
leseb (Member) commented Oct 25, 2016

Have you been able to test and validate that?
It'd be nice if you could share some logs :), thanks!

wind0204 (Contributor, Author) commented Oct 25, 2016

@leseb Sure; though I've gotten stuck on another issue ( #760 ).

Here is my (admittedly weak) proof:

root@0-h0:~/ceph-ansible# cat /etc/ceph/ceph.conf
[global]
mon host = 192.168.255.227,192.168.255.254
...
root@peta-s-227:~/mnt# cat /etc/ceph/ceph.conf
[global]
mon host = 192.168.255.227,192.168.255.254
...

The other settings are the same as in the issue thread ( #1038 ).

leseb (Member) commented Oct 25, 2016

How many mons do you have? Only 2?
If you can't collect keys, it means the monitors didn't start or are not in quorum.
Please check whether the processes are running; if so, check their logs.

wind0204 (Contributor, Author) commented Oct 25, 2016

@leseb I have 2 monitors and they seem to be up and running:

root@peta-s-227:~/mnt# systemctl status ceph-mon@*
● [email protected] - Ceph cluster monitor daemon
   Loaded: loaded (/lib/systemd/system/[email protected]; enabled)
   Active: active (running) since Tue 2016-10-25 20:02:55 KST; 3h 17min ago
 Main PID: 25182 (ceph-mon)
   CGroup: /system.slice/system-ceph\x2dmon.slice/[email protected]
           └─25182 /usr/bin/ceph-mon -f --cluster ceph --id peta-s-227 --setuser ceph --setgroup ceph

Oct 25 20:02:55 peta-s-227 systemd[1]: Started Ceph cluster monitor daemon.
Oct 25 20:02:57 peta-s-227 ceph-mon[25182]: starting mon.peta-s-227 rank -1 at 192.168.255.227:6789/0 mon_data /var/lib/ceph/mon/ceph-peta-s-227 f...5b39fc6bf
Oct 25 23:20:28 peta-s-227 systemd[1]: [/lib/systemd/system/[email protected]:24] Unknown lvalue 'TasksMax' in section 'Service'
Hint: Some lines were ellipsized, use -l to show in full.
root@peta-s-227:~/mnt# ps -el | grep ceph-mon
4 S 64045 25182     1 30  80   0 - 3776679 -    ?        01:00:08 ceph-mon
doeals0@0-h0:~$ systemctl status ceph-mon@*
● [email protected] - Ceph cluster monitor daemon
   Loaded: loaded (/lib/systemd/system/[email protected]; enabled)
   Active: active (running) since Tue 2016-10-25 20:53:04 KST; 2h 29min ago
 Main PID: 25945 (ceph-mon)
   CGroup: /system.slice/system-ceph\x2dmon.slice/[email protected]
           └─25945 /usr/bin/ceph-mon -f --cluster ceph --id 0-h0 --setuser ceph --setgroup ceph
doeals0@0-h0:~$ ps -el | grep ceph-mon
4 S 64045 25945     1 19  80   0 - 2538652 -    ?        00:29:18 ceph-mon

leseb (Member) commented Oct 25, 2016

Two monitors don't make a useful quorum. Drop down to 1 or go with at least 3.

wind0204 (Contributor, Author)
@leseb I could run a Ceph cluster with only 2 monitors when I used ceph-deploy; what makes you say so? I don't have enough resources to deploy 3 monitors, nor do I want only 1 monitor and risk availability.

leseb (Member) commented Oct 25, 2016

No no no. Monitors form a quorum, which needs a strict majority of the configured mons, so you want an odd number of machines. Having 1 or 2 mons gives the same fault tolerance: with 2 mons the majority is still 2, so if you lose one of the two the remaining one cannot form a quorum and the cluster stalls. So please DON'T DO THIS.


wind0204 (Contributor, Author) commented Oct 26, 2016

@leseb Alright, thanks a lot for the information. I've reduced the monitor count from 2 to 1, and the monitor is up and running ( http://docs.ceph.com/docs/jewel/rados/configuration/mon-config-ref/#initial-members ).

Now I get another error, this time from the ceph-osd role ( related pull request: #1035 (comment) ):

TASK [ceph-osd : include] ******************************************************
fatal: [peta-s-225]: FAILED! => {"failed": true, "reason": "no action detected in task. This often indicates a misspelled module name, or incorrect module path.\n\nThe error appears to have been in '/home/doeals0/ceph-ansible/roles/ceph-osd/tasks/pre_requisite.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: enable extras repo on centos\n  ^ here\n\n\nThe error appears to have been in '/home/doeals0/ceph-ansible/roles/ceph-osd/tasks/pre_requisite.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: enable extras repo on centos\n  ^ here\n"}
fatal: [peta-s-226]: FAILED! => {"failed": true, "reason": "no action detected in task. This often indicates a misspelled module name, or incorrect module path.\n\nThe error appears to have been in '/home/doeals0/ceph-ansible/roles/ceph-osd/tasks/pre_requisite.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: enable extras repo on centos\n  ^ here\n\n\nThe error appears to have been in '/home/doeals0/ceph-ansible/roles/ceph-osd/tasks/pre_requisite.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: enable extras repo on centos\n  ^ here\n"}

NO MORE HOSTS LEFT *************************************************************
        to retry, use: --limit @/home/doeals0/ceph-ansible/site.retry

PLAY RECAP *********************************************************************
0-h0                       : ok=56   changed=0    unreachable=0    failed=0   
peta-s-225                 : ok=45   changed=0    unreachable=0    failed=1   
peta-s-226                 : ok=44   changed=0    unreachable=0    failed=1   

wind0204 (Contributor, Author) commented Oct 26, 2016

@leseb I got 'ansible-playbook site.yml' to run through by undoing the change from pull request #1035 (comment), and I presume my pull request helps as well; though I'm entirely new to Jinja2 and the rest of the stack, so I'm not very sure about the safety of my code.
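
For clarity, the gist of my change is to normalise the interface name before the Ansible fact lookup: Ansible stores per-interface facts with hyphens converted to underscores, so the facts for an interface named br-mon end up under ansible_br_mon. A minimal sketch of the idea (the task, interface and fact names below are only illustrative, not the exact diff):

# Sketch only -- illustrative, not the exact code from the repository.
# Ansible converts '-' to '_' in interface fact names (br-mon -> ansible_br_mon),
# so the configured interface name is normalised before the hostvars lookup.
- name: resolve the monitor address even if the interface name contains a hyphen
  set_fact:
    monitor_address: "{{ hostvars[inventory_hostname]['ansible_' + (monitor_interface | replace('-', '_'))]['ipv4']['address'] }}"

With that normalisation the lookup works whether or not the link name contains a hyphen.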

PLAY RECAP *********************************************************************
0-h0                       : ok=56   changed=0    unreachable=0    failed=0   
peta-s-225                 : ok=65   changed=0    unreachable=0    failed=0   
peta-s-226                 : ok=64   changed=0    unreachable=0    failed=0   

doeals0@0-h0:~/ceph-ansible$ ssh ansibled@0-h0 "sudo ceph osd tree"                                                                                           
ID WEIGHT  TYPE NAME           UP/DOWN REWEIGHT PRIMARY-AFFINITY 
-1 4.09140 root default                                          
-2 2.04570     host peta-s-226                                   
 0 0.68190         osd.0            up  1.00000          1.00000 
 2 0.68190         osd.2            up  1.00000          1.00000 
 5 0.68190         osd.5            up  1.00000          1.00000 
-3 2.04570     host peta-s-225                                   
 1 0.68190         osd.1            up  1.00000          1.00000 
 3 0.68190         osd.3            up  1.00000          1.00000 
 4 0.68190         osd.4            up  1.00000          1.00000 

doeals0@0-h0:~/ceph-ansible$ ssh ansibled@0-h0 "sudo ceph status"                                                                                             
    cluster afe5c536-0d76-4dcc-89ed-b1c5b39fc6bf
     health HEALTH_WARN
            8 pgs degraded
            64 pgs stuck unclean
            8 pgs undersized
            too few PGs per OSD (21 < min 30)
     monmap e1: 1 mons at {0-h0=192.168.255.254:6789/0}
            election epoch 5, quorum 0 0-h0
     osdmap e25: 6 osds: 6 up, 6 in; 56 remapped pgs
            flags sortbitwise
      pgmap v56: 64 pgs, 1 pools, 0 bytes data, 0 objects
            201 MB used, 4189 GB / 4189 GB avail
                  31 active+remapped
                  25 active
                   8 active+undersized+degraded

leseb (Member) commented Oct 31, 2016

Alright, thanks for the comments; the code looks good to me.
Let me give it a try myself and I'll let you know.

Thanks!

leseb (Member) commented Oct 31, 2016

test this please

leseb (Member) commented Oct 31, 2016

LGTM, @font do you mind testing this against a containerized deployment?
Thanks!

font (Contributor) commented Oct 31, 2016

@leseb It should work for containerized deployments as this change only applies when not mon_containerized_deployment and not mon_containerized_deployment_with_kv. I will test anyway to make sure.
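
In other words, a task of this kind would be guarded roughly like this (a sketch under those assumptions, not the actual task from the role):

# Sketch only -- not copied from the role; shows how such a task would be guarded.
- name: resolve the monitor address from the interface facts (non-containerized only)
  set_fact:
    monitor_address: "{{ hostvars[inventory_hostname]['ansible_' + (monitor_interface | replace('-', '_'))]['ipv4']['address'] }}"
  when:
    - not mon_containerized_deployment
    - not mon_containerized_deployment_with_kv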

font (Contributor) commented Oct 31, 2016

@leseb Tests on a containerized deployment look good.

leseb (Member) commented Oct 31, 2016

@font thanks a lot!
