mod_unique_id causes Apache to fail restart on first converge #259

Closed
niven01 opened this issue Oct 9, 2014 · 17 comments

@niven01

niven01 commented Oct 9, 2014

Hello,

I've encountered a problem after introducing unique_id where Apache fails to restart.
The workaround at the moment is to run killall httpd and re-run chef-client.

This can be reproduced in test-kitchen by just including two recipes:

name: upstream_apache
run_list:
  - "recipe[apache2]"
  - "recipe[apache2::mod_unique_id]"

The behavior seen is that httpd is stopped but only the master process is killed. A bunch of orphaned httpd processes are left behind, still listening on ports 80 and 443.

When Apache tries to start it fails as ports are already in use.

Every now and then it will succeed, which suggests to me this may be a timing issue and that httpd start is happening before the stop has completed.
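One way to check the timing theory is to poll the listening port after the stop and see how long it stays bound. The following is a minimal sketch in plain Ruby, not part of the cookbook; the helper names `port_in_use?` and `wait_until_port_free` are hypothetical, and checking ports below 1024 (such as 80 or 443) assumes the script runs as root.

```ruby
require 'socket'

# Hypothetical helper: returns true if something is already bound to the
# given port on the given host. It tries to bind a throwaway listener and
# treats "Address already in use" as proof that the port is taken.
def port_in_use?(port, host = '0.0.0.0')
  TCPServer.new(host, port).close
  false
rescue Errno::EADDRINUSE
  true
end

# Hypothetical helper: polls until the port is free or the timeout expires.
# Returns true if the port became free in time, false otherwise.
def wait_until_port_free(port, host = '0.0.0.0', timeout: 10, interval: 0.5)
  deadline = Time.now + timeout
  while port_in_use?(port, host)
    return false if Time.now >= deadline
    sleep interval
  end
  true
end
```

Running `wait_until_port_free(80)` right after `/sbin/service httpd stop` would show whether the leftover processes release the port on their own or hold it indefinitely.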

Error:

           Error executing action `restart` on resource 'service[apache2]'

       ================================================================================

           Mixlib::ShellOut::ShellCommandFailed
           ------------------------------------
           Expected process to exit with [0], but received '1'
           ---- Begin output of /sbin/service httpd restart ----
           STDOUT: Stopping httpd: [  OK  ]
           Starting httpd: [FAILED]
           STDERR: [Thu Oct 09 11:56:26 2014] [warn] NameVirtualHost *:80 has no VirtualHosts
           (98)Address already in use: make_sock: could not bind to address [::]:80
           (98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
           no listening sockets available, shutting down
           Unable to open logs
           ---- End output of /sbin/service httpd restart ----
           Ran /sbin/service httpd restart returned 1


       Resource Declaration:
           ---------------------
           # In /tmp/kitchen/cache/cookbooks/apache2/recipes/default.rb

            24: service 'apache2' do
            25:   service_name node['apache']['package']
            26:   case node['platform_family']
            27:   when 'rhel'
            28:     reload_command '/sbin/service httpd graceful'
            29:   when 'debian'
            30:     provider Chef::Provider::Service::Debian
            31:   when 'arch'
            32:     service_name 'httpd'
            33:   end
            34:   supports [:start, :restart, :reload, :status]

           Compiled Resource:
           ------------------
           # Declared in /tmp/kitchen/cache/cookbooks/apache2/recipes/default.rb:24:in `from_file'

           service("apache2") do
             action [:enable, :start]
             updated true
             supports {:restart=>true, :reload=>true, :status=>true, :start=>true}
             retries 0
             retry_delay 2
             guard_interpreter :default
             service_name "httpd"
             enabled true
             running true
             pattern "apache2"
             reload_command "/sbin/service httpd graceful"
             cookbook_name "apache2"
             recipe_name "default"
             only_if "/usr/sbin/httpd -t"
           end
@niven01
Author

niven01 commented Oct 9, 2014

Forgot to add: this is on CentOS 6.5.

@rosstimson

I'm encountering this issue on Amazon Linux too. In my case I'm adding some extra modules by setting the following attribute in my wrapper cookbook:

default['apache']['default_modules'] = %w(
  status alias auth_basic authn_core auth_digest authn_file authz_core
  authz_groupfile authz_host authz_user autoindex dir env mime mime_magic
  expires headers negotiation rewrite setenvif log_config logio ssl
)

The problem with my use case is that I'm trying to use this to build an AMI using Packer. The build fails, and manually fixing it by running killall httpd isn't really an option.

I've seen it work a few times without modification though; it's almost as if there is some sort of race condition (in my case anyway).

@rosstimson

Doh, if I'd have read your original issue fully I'd have realised that you've already suggested that this looks like some sort of timing issue.

@niven01
Author

niven01 commented Oct 9, 2014

Yes, killall httpd is a bad workaround. It just so happens it's something I can do in this situation, but going forward it's not an option.

@niven01
Author

niven01 commented Oct 9, 2014

@rosstimson I notice you're not including mod_unique_id. Have you been able to narrow it down to a specific module, or is it any combination? I ask as I have a few modules in my recipe but it seems to be specific to mod_unique_id. As soon as I include it I start seeing problems.

@rosstimson

@niven01 I removed every module and started adding them back in one by one. The downside of this is I realised that the issue was caused by my custom mod_security module that I had created in my fork, which looks like so:

case node['platform_family']
when 'rhel'
  package 'mod_security' do
    action :install
    notifies :run, resources(execute: 'generate-module-list'), :immediately
  end

  file "#{node['apache']['dir']}/conf.d/mod_security.conf" do
    action :delete
    backup false
  end
end

apache_module 'security2' do
  conf true
end

I can also, consistently, replicate the same problem by including the mod_unique_id module that you are having issues with.

@drpebcak drpebcak added the bug label Oct 16, 2014
@svanzoest
Contributor

Since you are seeing this on CentOS 6.5, I assume this is with Apache httpd 2.2? Do we see the same issue on CentOS 7?

@fernandohonig

Hi @svanzoest indeed, this is with httpd 2.2. We have not tried with CentOS 7.

@rosstimson

Yeah, Amazon Linux is using httpd 2.2 as well. Not had a chance to try this with CentOS 7.

@niven01
Author

niven01 commented Nov 6, 2014

Just tried CentOS 7 and it is succeeding every time (so far).

@svanzoest svanzoest added bug and removed bug labels Dec 1, 2014
@svanzoest
Contributor

This seems to be an issue with the 2.2 init scripts. We did remove a && sleep 1 from the service resource in the cookbook (f0fc1ce). Did this start showing up when we removed it?

We clearly do not need it on the newer platforms, but it seems like mod_security and mod_unique_id cause things to take longer in a way that the init scripts do not handle gracefully. When I tried to look back to see why the sleeps were added, there was nothing in the commits that indicated a reason. I guess we found two. ;-)

@drpebcak
Contributor

drpebcak commented May 8, 2015

I believe this should fix it - it seems like this is only being experienced on rhel systems.

@gitgc

gitgc commented Dec 17, 2015

I'm still seeing this exact issue on Apache 2.2/CentOS 6.5, using the latest 3.1.0 version of this cookbook. As described in the original post, the parent apache process is killed, but a bunch of zombie apache processes are still running and holding the ports, which prevents a restart without a killall httpd.

@drpebcak
Contributor

@gitgc what happens when you run service httpd restart manually on that box? Does it take a really long time or cause some sort of error?

@kalebwalton

I am having this issue using the latest version of this cookbook as well. When I adjust the restart/reload lines to be as follows, all is well (I'm not saying it's the right thing to do... just that it resolves my issue):

service 'apache2' do
  service_name node['apache']['service_name']
  case node['platform_family']
  when 'rhel'
    restart_command '/sbin/service httpd restart && sleep 1'
    reload_command '/sbin/service httpd graceful && sleep 1'
  when 'debian'
    provider Chef::Provider::Service::Debian
  when 'arch'
    service_name 'httpd'
  end
  supports [:start, :restart, :reload, :status]
  action [:enable, :start]
  only_if "#{node['apache']['binary']} -t", :environment => { 'APACHE_LOG_DIR' => node['apache']['log_dir'] }, :timeout => 10
end
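
A fixed sleep 1 only helps if the leftover workers exit within a second. As a hedged alternative, the restart could stop httpd, poll until no httpd process remains (up to a limit), and only then start it. The sketch below is plain Ruby that builds such a command string; the helper name `httpd_restart_command` and the 10 second default are assumptions for illustration, not something from the cookbook, and it relies on `pgrep` and `seq` being available on the box.

```ruby
# Hypothetical helper: builds a shell command that stops httpd, waits for
# leftover httpd processes to exit, then starts httpd again.
# max_wait is the maximum number of one-second polls before giving up
# and attempting the start anyway.
def httpd_restart_command(max_wait = 10)
  wait_loop = "for i in $(seq 1 #{Integer(max_wait)}); do " \
              'pgrep -x httpd >/dev/null || break; sleep 1; done'
  "/sbin/service httpd stop; #{wait_loop}; /sbin/service httpd start"
end

# Possible use inside the service resource on 'rhel' (untested assumption):
#   restart_command httpd_restart_command(10)
```

If processes are still around after `max_wait` polls, the start will fail with the same "Address already in use" error as before, so this narrows the race rather than removing the underlying init script problem.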

@niven01
Author

niven01 commented Feb 9, 2016

I'd just like to add that this is still happening for me with version 3.1.0 of this cookbook.

@lock

lock bot commented Jul 24, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jul 24, 2018