Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'closed stream' error when using concurrency > 1 #176

Closed
cburroughs opened this issue Jul 10, 2015 · 12 comments
Closed

'closed stream' error when using concurrency > 1 #176

cburroughs opened this issue Jul 10, 2015 · 12 comments
Assignees
Labels

Comments

@cburroughs
Copy link

Line 125: https://github.com/test-kitchen/kitchen-vagrant/blob/v0.18.0/lib/kitchen/driver/vagrant.rb#L125

Closing the stream seems like what this line is trying to do anyway?

I'm seeing this 1-5% of the time in a CI environment, and somewhat more likely with parrallel builds.

I, [2015-07-09T22:30:14.898466 #51619]  INFO -- Kitchen: -----> Starting Kitchen (v1.4.0)
D, [2015-07-09T22:30:14.906098 #51619] DEBUG -- Kitchen: [Vagrant command] BEGIN (vagrant --version)
D, [2015-07-09T22:30:15.120979 #51619] DEBUG -- Kitchen: [Vagrant command] END (0m0.20s)
I, [2015-07-09T22:30:15.428904 #51619]  INFO -- Kitchen: -----> Cleaning up any prior instances of <default-centos-66>
I, [2015-07-09T22:30:15.429084 #51619]  INFO -- Kitchen: -----> Destroying <default-centos-66>...
I, [2015-07-09T22:30:15.430265 #51619]  INFO -- Kitchen: -----> Testing <default-centos-66>
I, [2015-07-09T22:30:15.430392 #51619]  INFO -- Kitchen: -----> Creating <default-centos-66>...
I, [2015-07-09T22:31:58.217803 #51619]  INFO -- Kitchen: -----> Converging <default-centos-66>...
I, [2015-07-09T22:32:05.578834 #51619]  INFO -- Kitchen: -----> Setting up <default-centos-66>...
I, [2015-07-09T22:32:05.579970 #51619]  INFO -- Kitchen: -----> Verifying <default-centos-66>...
I, [2015-07-09T22:32:17.621145 #51619]  INFO -- Kitchen: -----> Destroying <default-centos-66>...
D, [2015-07-09T22:32:17.624432 #51619] DEBUG -- Kitchen: ------Exception-------
D, [2015-07-09T22:32:17.624527 #51619] DEBUG -- Kitchen: Class: Kitchen::ActionFailed
D, [2015-07-09T22:32:17.624597 #51619] DEBUG -- Kitchen: Message: Failed to complete #destroy action: [closed stream]
D, [2015-07-09T22:32:17.624667 #51619] DEBUG -- Kitchen: ---Nested Exception---
D, [2015-07-09T22:32:17.624734 #51619] DEBUG -- Kitchen: Class: IOError
D, [2015-07-09T22:32:17.624801 #51619] DEBUG -- Kitchen: Message: closed stream
D, [2015-07-09T22:32:17.624896 #51619] DEBUG -- Kitchen: ------Backtrace-------
D, [2015-07-09T22:32:17.624963 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/ruby_compat.rb:30:in `select'
D, [2015-07-09T22:32:17.625031 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/ruby_compat.rb:30:in `io_select'
D, [2015-07-09T22:32:17.625096 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/transport/packet_stream.rb:75:in `available_for_read?'
D, [2015-07-09T22:32:17.625204 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/transport/packet_stream.rb:87:in `next_packet'
D, [2015-07-09T22:32:17.625327 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/transport/session.rb:183:in `block in poll_message'
D, [2015-07-09T22:32:17.625406 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/transport/session.rb:178:in `loop'
D, [2015-07-09T22:32:17.625481 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/transport/session.rb:178:in `poll_message'
D, [2015-07-09T22:32:17.625546 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/connection/session.rb:461:in `dispatch_incoming_packets'
D, [2015-07-09T22:32:17.625611 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/connection/session.rb:222:in `preprocess'
D, [2015-07-09T22:32:17.625686 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/connection/session.rb:206:in `process'
D, [2015-07-09T22:32:17.625773 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/connection/session.rb:170:in `block in loop'
D, [2015-07-09T22:32:17.625864 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/connection/session.rb:170:in `loop'
D, [2015-07-09T22:32:17.625981 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/connection/session.rb:170:in `loop'
D, [2015-07-09T22:32:17.626059 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/net-ssh-2.9.2/lib/net/ssh/connection/session.rb:119:in `close'
D, [2015-07-09T22:32:17.626133 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/transport/ssh.rb:91:in `close'
D, [2015-07-09T22:32:17.626209 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/gems/2.1.0/gems/kitchen-vagrant-0.18.0/lib/kitchen/driver/vagrant.rb:125:in `destroy'
D, [2015-07-09T22:32:17.626316 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:424:in `public_send'
D, [2015-07-09T22:32:17.626384 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:424:in `block in perform_action'
D, [2015-07-09T22:32:17.626465 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:485:in `call'
D, [2015-07-09T22:32:17.626575 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:485:in `block in synchronize_or_call'
D, [2015-07-09T22:32:17.626641 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:483:in `synchronize'
D, [2015-07-09T22:32:17.626708 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:483:in `synchronize_or_call'
D, [2015-07-09T22:32:17.626785 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:453:in `block in action'
D, [2015-07-09T22:32:17.626865 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/lib/ruby/2.1.0/benchmark.rb:279:in `measure'
D, [2015-07-09T22:32:17.626947 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:452:in `action'
D, [2015-07-09T22:32:17.627023 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:424:in `perform_action'
D, [2015-07-09T22:32:17.627090 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:411:in `destroy_action'
D, [2015-07-09T22:32:17.627156 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:341:in `block in transition_to'
D, [2015-07-09T22:32:17.627223 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:340:in `each'
D, [2015-07-09T22:32:17.627299 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:340:in `transition_to'
D, [2015-07-09T22:32:17.627365 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:171:in `destroy'
D, [2015-07-09T22:32:17.627430 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:195:in `ensure in test'
D, [2015-07-09T22:32:17.627511 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/instance.rb:195:in `test'
D, [2015-07-09T22:32:17.627589 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/command.rb:176:in `public_send'
D, [2015-07-09T22:32:17.627673 #51619] DEBUG -- Kitchen: /opt/chefdk/embedded/apps/test-kitchen/lib/kitchen/command.rb:176:in `block (2 levels) in run_action'
D, [2015-07-09T22:32:17.627755 #51619] DEBUG -- Kitchen: ----------------------
@kamaradclimber
Copy link

We are impacted by this issue as well.
It seems to be more frequent since we've started to use concurrency=2

@kamaradclimber
Copy link

@tyler-ball any thoughs on this issue ?

@cheeseplus
Copy link
Contributor

We're happy to try to look into this more but can we get some more detail?

@kamaradclimber
Copy link

Hard to give more details than @cburroughs.
It happens exclusively on our CI environment (with many kitchen jobs launched concurrently).
It happens sometimes with N executor per physical server, using concurrency level 2 with N/2 executor increases a lot frequency of errors.

We reduced the issue to zero (so far) when using a per-job socket to discuss with virtualbox. Maybe @rveznaver could tell you more about this tweak if you are interested

@rveznaver
Copy link

I honestly do not remember the root cause. I think it is somewhat linked to the fact that Vagrant doesn't provide a parallel option for its VirtualBox provider.
I've solved the issue by going one layer deeper and isolating each VirtualBox process via IPC sockets (see: #202), so Vagrant can do whatever it likes (there are several commands in VirtualBox with a global effect on all the instances for a given process), but it will only affect its one instance.

@cheeseplus
Copy link
Contributor

We've done some work in upstream kitchen around some of the stream handling so I'd be curious if this is still seen in your environments? Also is this only virtualbox?

@kamaradclimber
Copy link

No users have referenced this since @rveznaver has isolated via IPC sockets.

@cheeseplus cheeseplus changed the title closing connnection during destory can confusingly fail with 'closed stream' 'closed stream' error when using concurrency > 1 Jul 26, 2017
@cheeseplus cheeseplus self-assigned this Jul 26, 2017
@amontalban
Copy link

amontalban commented Aug 9, 2017

Hey guys,

Sorry if this is not 100% related but we are getting this closed stream error when running multiple parallel kitchen test processes inside Jenkins.

This is the log output:

       [INFO    ] Loading fresh modules for state activity
       [INFO    ] Completed state [libgd] at time 03:30:53.509570 duration_in_ms=4641.062
       [INFO    ] Running state [pango] at time 03:30:53.538154
       [INFO    ] Executing state pkg.latest for pango
       [INFO    ] Executing command ['pkg', '--version'] in directory '/home/vagrant'
       [INFO    ] Executing command ['pkg', 'search', '-S', 'name', '-Q', 'version', '-e', '-q', 'pango'] in directory '/home/vagrant'
       [INFO    ] Executing command ['pkg', 'install', '-y', 'pango'] in directory '/home/vagrant'
-----> Destroying <appserver-freebsd-103>...
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: 1 actions failed.
>>>>>>     Failed to complete #destroy action: [closed stream] on appserver-freebsd-103
>>>>>> ----------------------
>>>>>> Please see .kitchen/logs/kitchen.log for more details
>>>>>> Also try running `kitchen diagnose --all` for configuration
script returned exit code 20

And it happens at same time that another VM finishes the kitchen test process, so we are pretty sure this has to be related to concurrency (At this time we are running 3 VirtualBox VMs and 2 Docker instances but we expect to have more and the error is getting pretty frequent).

I'm going to test #202 and give you our feedback on the matter.

Thanks!

@abednarik
Copy link

Hi all

We applied #202 in our develop branch and we detect something that we want to comment. Every time we run a kitchen test we see this

[ack_fix_vagrant_concurrency-MVVQYBRDAXXGZQN4AXNNULFWB3J7O576EUXB5VC5I6FZKQN5EPEQ] Running shell script
Using /home/jenkins-slave/.rvm/gems/ruby-2.4.1
-----> Starting Kitchen (v1.16.0)
-----> Cleaning up any prior instances of <cronmaster-freebsd-103>
-----> Destroying <cronmaster-freebsd-103>...
       Finished destroying <cronmaster-freebsd-103> (0m0.00s).
-----> Testing <cronmaster-freebsd-103>
-----> Creating <cronmaster-freebsd-103>...
       Bringing machine 'default' up with 'virtualbox' provider...
       ==> default: Preparing master VM for linked clones...
           default: This is a one time operation. Once the master VM is prepared,
           default: it will be used as a base for linked clones, making the creation
           default: of new VMs take milliseconds on a modern system.
       ==> default: Importing base box 'perceptyx/freebsd-10.3-amd64'...
       
Progress: 10%
Progress: 20%
Progress: 30%
Progress: 40%
Progress: 50%
Progress: 60%
Progress: 70%
Progress: 80%
Progress: 90%
==> default: Cloning VM...

==> default: Matching MAC address for NAT networking...

Note that the vagrant box is imported every time we run the job. This was not the previous behavior, box was imported just one time and the next runs didn't re-import the box.

[SaltStack_master-IOZJWEBD3FDDA3KENYEYT6IBYV3FNSFNVIG4QIQRNTATKP7ORNDA] Running shell script
Using /home/jenkins-slave/.rvm/gems/ruby-2.4.1
-----> Starting Kitchen (v1.16.0)
-----> Cleaning up any prior instances of <cronmaster-freebsd-103>
-----> Destroying <cronmaster-freebsd-103>...
       Finished destroying <cronmaster-freebsd-103> (0m0.00s).
-----> Testing <cronmaster-freebsd-103>
-----> Creating <cronmaster-freebsd-103>...
       Bringing machine 'default' up with 'virtualbox' provider...
       ==> default: Cloning VM...
       
==> default: Matching MAC address for NAT networking...

Is this something we should expect or is a bug?

Thanks in advance.
Cheers.

@cheeseplus
Copy link
Contributor

@abednarik please open that as a separate issue - that behaviour seems suspicious and a side effect of #202 that needs to be addressed

@cheeseplus
Copy link
Contributor

I'm closing this one out as we have implemented #202 and anything that is fallout of that should be opened as a new issue.

@abednarik
Copy link

Done #322

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants