mod_unique_id causes Apache to fail restart on first converge #259

niven01 · 2014-10-09T11:59:53Z

Hello,

I've encountered a problem after introducing unique_id where Apache fails to restart.
The workaround at the moment is to run killall httpd and re-run chef-client.

This can be reproduced in test-kitchen by just including two recipes:

name: upstream_apache
run_list:
  - "recipe[apache2]"
  - "recipe[apache2::mod_unique_id]"

The behavior seen is that httpd is stopped but only the master process is killed. A number of orphaned httpd processes remain, still listening on ports 80 and 443.

When Apache tries to start, it fails because the ports are already in use.

Every now and then it succeeds, which suggests to me this may be a timing issue: httpd start is happening before the stop has completed.
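One way to test the timing theory is to check whether anything is still accepting connections on the port between the stop and the start. The following is a minimal sketch in plain Ruby, not part of the apache2 cookbook; the helper names `port_in_use?` and `wait_for_port_free` and the default timeouts are assumptions made for illustration.

```ruby
require 'socket'

# Hypothetical helper: true if something accepts TCP connections on host:port.
def port_in_use?(port, host = '127.0.0.1')
  # Socket.tcp with a block closes the socket afterwards and returns the block's value.
  Socket.tcp(host, port, connect_timeout: 1) { true }
rescue Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::EHOSTUNREACH
  false
end

# Hypothetical helper: poll until the port is free or the timeout elapses.
# Returns true if the port became free, false if it was still bound at the deadline.
def wait_for_port_free(port, timeout: 10, interval: 0.2)
  deadline = Time.now + timeout
  while port_in_use?(port)
    return false if Time.now >= deadline
    sleep interval
  end
  true
end
```

If `wait_for_port_free(80)` returns false after `service httpd stop` has reported OK, something is still bound to the port, which would be consistent with leftover httpd processes rather than a slow start.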

Error:

           Error executing action `restart` on resource 'service[apache2]'

       ================================================================================

           Mixlib::ShellOut::ShellCommandFailed
           ------------------------------------
           Expected process to exit with [0], but received '1'
           ---- Begin output of /sbin/service httpd restart ----
           STDOUT: Stopping httpd: [  OK  ]
           Starting httpd: [FAILED]
           STDERR: [Thu Oct 09 11:56:26 2014] [warn] NameVirtualHost *:80 has no VirtualHosts
           (98)Address already in use: make_sock: could not bind to address [::]:80
           (98)Address already in use: make_sock: could not bind to address 0.0.0.0:80

       no listening sockets available, shutting down
           Unable to open logs
           ---- End output of /sbin/service httpd restart ----
           Ran /sbin/service httpd restart returned 1


       Resource Declaration:
           ---------------------
           # In /tmp/kitchen/cache/cookbooks/apache2/recipes/default.rb

            24: service 'apache2' do
            25:   service_name node['apache']['package']
            26:   case node['platform_family']
            27:   when 'rhel'
            28:     reload_command '/sbin/service httpd graceful'
            29:   when 'debian'
            30:     provider Chef::Provider::Service::Debian
            31:   when 'arch'
            32:     service_name 'httpd'
            33:   end
            34:   supports [:start, :restart, :reload, :status]

           Compiled Resource:
           ------------------
           # Declared in /tmp/kitchen/cache/cookbooks/apache2/recipes/default.rb:24:in `from_file'

           service("apache2") do

         action [:enable, :start]
             updated true
             supports {:restart=>true, :reload=>true, :status=>true, :start=>true}
             retries 0
             retry_delay 2
             guard_interpreter :default
             service_name "httpd"
             enabled true
             running true
             pattern "apache2"
             reload_command "/sbin/service httpd graceful"
             cookbook_name "apache2"
             recipe_name "default"
             only_if "/usr/sbin/httpd -t"
           end

niven01 · 2014-10-09T13:22:03Z

Forgot to add: this is on CentOS 6.5.

rosstimson · 2014-10-09T14:24:05Z

I'm encountering this issue on Amazon Linux too. In my case I'm adding some extra modules by setting the following attribute in my wrapper cookbook:

default['apache']['default_modules'] = %w(
  status alias auth_basic authn_core auth_digest authn_file authz_core
  authz_groupfile authz_host authz_user autoindex dir env mime mime_magic
  expires headers negotiation rewrite setenvif log_config logio ssl
)

The problem with my use case is that I'm trying to use this to build an AMI with Packer: the build fails, and manually fixing it by running killall httpd isn't really an option.

I've seen it work a few times without modification though; it's almost like there is some sort of race condition (in my case anyway).

rosstimson · 2014-10-09T14:27:54Z

Doh, if I'd read your original issue fully I'd have realised that you've already suggested that this looks like some sort of timing issue.

niven01 · 2014-10-09T17:10:03Z

Yes, killall httpd is a bad workaround. It just so happens it's something I can do in this situation, but going forward it's not an option.

niven01 · 2014-10-09T17:43:37Z

@rosstimson I notice you're not including mod_unique_id. Have you been able to narrow it down to a specific module, or is it any combination? I ask because I have a few modules in my recipe, but it seems to be specific to mod_unique_id. As soon as I include it I start seeing problems.

rosstimson · 2014-10-16T15:27:25Z

@niven01 I removed every module and started adding them back in one by one. The downside is that I realised the issue was caused by my custom mod_security module that I had created in my fork, which looks like this:

case node['platform_family']
when 'rhel'
  package 'mod_security' do
    action :install
    notifies :run, resources(execute: 'generate-module-list'), :immediately
  end

  file "#{node['apache']['dir']}/conf.d/mod_security.conf" do
    action :delete
    backup false
  end
end

apache_module 'security2' do
  conf true
end

I can also, consistently, replicate the same problem by including the mod_unique_id module that you are having issues with.

svanzoest · 2014-11-01T18:11:01Z

Since you are seeing this on CentOS 6.5, I assume this is with Apache httpd 2.2? Do we see the same issue on CentOS 7?

fernandohonig · 2014-11-04T13:33:58Z

Hi @svanzoest indeed, this is with httpd 2.2. We have not tried with CentOS 7.

rosstimson · 2014-11-04T15:27:22Z

Yeah, Amazon Linux is using httpd 2.2 as well. Not had a chance to try this with CentOS 7.

niven01 · 2014-11-06T13:22:11Z

Just tried CentOS 7 and it is succeeding every time (so far).

svanzoest · 2015-05-08T01:46:34Z

This seems to be an issue with the 2.2 init scripts. We removed a && sleep 1 from the service resource in the cookbook (f0fc1ce); it seems like this showed up when we removed it?

We clearly do not need it on the newer platforms, but it seems like mod_security and mod_unique_id cause things to take longer in a way that the init scripts do not handle gracefully. I tried to look back to see why the sleeps were added, but there was nothing in the commits that indicated a reason. I guess we found two. ;-)

drpebcak · 2015-05-08T06:34:17Z

I believe this should fix it - it seems like this is only being experienced on rhel systems.

gitgc · 2015-12-17T18:56:52Z

I'm still seeing this exact issue on Apache 2.2/CentOS 6.5, using the latest 3.1.0 version of this cookbook. As described in the original post, the parent apache process is killed, but a bunch of zombie apache processes are still running and holding the ports, preventing a restart without a killall httpd.

drpebcak · 2015-12-17T21:36:58Z

@gitgc what happens when you run service httpd restart manually on that box? Does it take a really long time or cause some sort of error?
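When running the restart by hand, it may help to list which httpd processes do not belong to the current master. Below is a minimal sketch in plain Ruby that parses the text output of `ps -eo pid,ppid,comm`; the helper name `leftover_pids` and its parameters are hypothetical and not part of the cookbook. With no `master_pid` given (for example right after a stop), every process with the given name is reported.

```ruby
# Hypothetical helper: given the output of `ps -eo pid,ppid,comm`, return the
# PIDs of processes called `name` that are neither the given master process
# nor one of its direct children.
def leftover_pids(ps_output, name: 'httpd', master_pid: nil)
  ps_output.each_line.drop(1).map do |line| # drop(1) skips the header row
    pid, ppid, comm = line.split
    next unless comm == name
    next if master_pid && [pid.to_i, ppid.to_i].include?(master_pid)
    pid.to_i
  end.compact
end

# Illustrative, made-up process table (not taken from the thread).
sample = <<~PS
  PID  PPID COMMAND
    1     0 init
  2001     1 httpd
  2002  2001 httpd
  3001     1 httpd
  3002     1 sshd
PS

p leftover_pids(sample, master_pid: 2001)
```

On a real box the input could come from `` `ps -eo pid,ppid,comm` `` and the master PID from the httpd pid file; both of those are assumptions about the environment rather than something confirmed in this thread.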

kalebwalton · 2016-01-27T18:29:28Z

I am having this issue using the latest version of this cookbook as well. When I adjust the restart/reload lines to be as follows, all is well (I'm not saying it's the right thing to do... just that it resolves my issue):

service 'apache2' do
  service_name node['apache']['service_name']
  case node['platform_family']
  when 'rhel'
    restart_command '/sbin/service httpd restart && sleep 1'
    reload_command '/sbin/service httpd graceful && sleep 1'
  when 'debian'
    provider Chef::Provider::Service::Debian
  when 'arch'
    service_name 'httpd'
  end
  supports [:start, :restart, :reload, :status]
  action [:enable, :start]
  only_if "#{node['apache']['binary']} -t", :environment => { 'APACHE_LOG_DIR' => node['apache']['log_dir'] }, :timeout => 10
end
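As an alternative to a fixed sleep, the restart command could wait until no httpd processes remain before starting again. This is a rough sketch only, untested on CentOS 6; the helper name `rhel_restart_command` and the `max_wait` parameter are hypothetical, and it assumes `pgrep` and `seq` are available on the node.

```ruby
# Hypothetical helper, not part of the apache2 cookbook: builds a shell command
# that stops the service, waits up to max_wait seconds for every process with
# that exact name to exit, then starts the service again.
def rhel_restart_command(service: 'httpd', max_wait: 10)
  [
    "/sbin/service #{service} stop",
    # Poll once per second; leave the loop as soon as pgrep finds no match.
    "for i in $(seq 1 #{max_wait}); do pgrep -x #{service} >/dev/null || break; sleep 1; done",
    "/sbin/service #{service} start"
  ].join('; ')
end

puts rhel_restart_command
```

The resulting string could be passed to `restart_command` in the `when 'rhel'` branch in place of the `&& sleep 1` variant. Note that if workers never exit, the start still runs after `max_wait` seconds and would fail with the same bind error.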

niven01 · 2016-02-09T08:57:34Z

I'd just like to add that this is still happening for me with version 3.1.0 of this cookbook.

lock · 2018-07-24T10:20:00Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

drpebcak added the bug label Oct 16, 2014

svanzoest added bug and removed bug labels Dec 1, 2014

svanzoest assigned drpebcak May 8, 2015

drpebcak closed this as completed May 14, 2015

niven01 mentioned this issue Feb 9, 2016

resolve reload problem on rhel based systems using apache2.2 #408

Closed

svanzoest unassigned drpebcak Nov 8, 2016

lock bot locked as resolved and limited conversation to collaborators Jul 24, 2018
