Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#40] Allow SSHKit.run executing on hosts in parallel #131

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 14 additions & 4 deletions lib/sshkit.ex
Original file line number Diff line number Diff line change
Expand Up @@ -308,17 +308,27 @@ defmodule SSHKit do
assert "Hello World!\n" == stdout
```
"""
def run(context, command) do
def run(context, command, mode \\ :sequential, timeout \\ :infinity) do
cmd = Context.build(context, command)

run = fn host ->
op = fn host ->
{:ok, conn} = SSH.connect(host.name, host.options)
res = SSH.run(conn, cmd)
res = SSH.run(conn, cmd, [timeout: timeout])
:ok = SSH.close(conn)
res
end

Enum.map(context.hosts, run)
perform(context.hosts, op, mode)
end

defp perform(hosts, op, :sequential) do
Enum.map(hosts, op)
end

defp perform(hosts, op, :parallel) do
hosts
|> Enum.map(fn host -> Task.async(fn -> op.(host) end) end)
|> Enum.map(fn task -> Task.await(task, :infinity) end)
end

@doc ~S"""
Expand Down
16 changes: 16 additions & 0 deletions test/sshkit_functional_test.exs
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,22 @@ defmodule SSHKitFunctionalTest do
assert name == host.options[:user]
end

@tag boot: [@bootconf]
test "connects as the login user and runs commands in parallel", %{hosts: [host]} do
begin_time = Time.utc_now()
[{:ok, output1, 0},{:ok, output2, 0}] =
[host, host]
|> SSHKit.context()
|> SSHKit.run("sleep 2; id -un", :parallel)
end_time = Time.utc_now()
run_time = Time.diff(end_time, begin_time, :second)

assert run_time < 4
assert String.trim(stdout(output1)) == host.options[:user]
assert String.trim(stdout(output2)) == host.options[:user]

end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 For the test, I would rather start 2 hosts. I would expect this to be the more common use case for the :parallel feature.

You can start another host by adding another config entry to the :boot tag. Maybe with a different user, so the output is actually different.

A small note on formatting:

  • Please remove the blank line (line 33)
  • Throughout the code base we use spaces after the comma between list items (on line 23, seems like Credo does not catch that)

Thank you 🙂

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, I would like to avoid relying on time in the tests since that may fail for a bunch of reasons and make the test suite flaky. I wonder if this could be tested in a different way? 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tessi do you have an idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@tessi tessi Sep 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might need two tests :)

  1. to test parallel execution on two different hosts (as @pmeinhardt said) which is not required to test the actualy parallelism but instead focuses on correct execution (so we see when things break for real users in a real'ish scenario).
  2. test parallel execution on the same host, in this time also testing that things really run in parallel.

For 2. I agree that the time approach might* work -- but it makes our test suite at least 2 seconds slower. Instead I would suggest to test a side effect on that host which can only happen when two parallel ssh sessions happen. The question is which side effect to test. Possible options are:

  • monitor currently active ssh shells. The command could monitor active ssh sessions and wait until two sessions are seen (maybe with a timeout to prevent lifelock).
  • kinda crappy: a command could append numbers from 1 to 1.000.000 into a file. If both ssh sessions do that the file content would contain "unsorted" numbers
  • if the two commands could be different: one could open a socket/file and read/wait on it. the other could write to that socket/file.

I personally prefer the first option, as it's the least crappy one :)
It seems the who command lists currently logged in users

The who utility displays a list of all users currently logged on, showing for each user the login name, tty name, the date and time of login, and hostname if not local.

We could write (and then execute) a little bash script which runs who | wc -l until two users are logged in. That bash script could be a while-loop wrapped in the timeout command.

*) might because when tests are run on CI timing is always an issue. Maybe some tests needs longer than others and result in flaky tests. Things will always break in unexpected ways :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and again (because I forgot to write this before), thanks @seungjin for your contribution 💚 I would love this feature to go in.


@tag boot: [@bootconf]
test "runs commands and returns their output and exit status", %{hosts: [host]} do
context = SSHKit.context(host)
Expand Down