Commit 0af3d23

Tunnel tests (#260)
* Updated tunnel code, tests.
* Added host output tests
* Added scp and sftp tests. Updated sftp code.
* Added ssh-python client tests
* Updated tunnel shutdown
* Updated single client
* Fix issue with identity auth - #222
* Updated documentation
* Updated readme
1 parent 8f4d7c4 commit 0af3d23

20 files changed: +752 -347 lines changed

Changelog.rst

Lines changed: 18 additions & 0 deletions
@@ -1,6 +1,24 @@
 Change Log
 ============
 
+2.5.0
++++++
+
+Changes
+-------
+
+* Python 2 no longer supported.
+* Updated class arguments, refactor for ``pssh.clients.native.tunnel``.
+
+Fixes
+-----
+
+* Closed clients with proxy host enabled would not shut down their proxy servers.
+* Clients with proxy host enabled would not disconnect the proxy client on ``.disconnect`` being called.
+* Default identity files would not be used when private key was not specified - #222.
+* ``ParallelSSHClient(<..>, identity_auth=False)`` would not be honoured.
+
+
 2.4.0
 +++++
 

README.rst

Lines changed: 65 additions & 140 deletions
Large diffs are not rendered by default.

doc/advanced.rst

Lines changed: 10 additions & 10 deletions
@@ -513,7 +513,7 @@ Stderr is empty:
 
 .. code-block:: python
 
-   for line in output[client.hosts[0]].stderr:
+   for line in output[0].stderr:
        print(line)
 
 No output from ``stderr``.
@@ -523,9 +523,9 @@ No output from ``stderr``.
 SFTP and SCP
 *************
 
-SFTP and SCP are both supported by ``parallel-ssh`` and functions are provided by the client for copying files with SFTP to and from remote servers - default native client only.
+SFTP and SCP are both supported by ``parallel-ssh`` and functions are provided by the client for copying files to and from remote servers - default native clients only.
 
-Neither SFTP nor SCP have a shell interface and no output is provided for any SFTP/SCP commands.
+Neither SFTP nor SCP have a shell interface and no output is sent for any SFTP/SCP commands.
 
 As such, SFTP functions in ``ParallelSSHClient`` return greenlets that will need to be joined to raise any exceptions from them. :py:func:`gevent.joinall` may be used for that.
 
@@ -542,15 +542,15 @@ To copy the local file with relative path ``../test`` to the remote relative pat
 
    client = ParallelSSHClient(hosts)
 
-   greenlets = client.copy_file('../test', 'test_dir/test')
-   joinall(greenlets, raise_error=True)
+   cmds = client.copy_file('../test', 'test_dir/test')
+   joinall(cmds, raise_error=True)
 
 To recursively copy directory structures, enable the ``recurse`` flag:
 
 .. code-block:: python
 
-   greenlets = client.copy_file('my_dir', 'my_dir', recurse=True)
-   joinall(greenlets, raise_error=True)
+   cmds = client.copy_file('my_dir', 'my_dir', recurse=True)
+   joinall(cmds, raise_error=True)
 
 .. seealso::
 
@@ -570,8 +570,8 @@ Copying remote files in parallel requires that file names are de-duplicated othe
 
    client = ParallelSSHClient(hosts)
 
-   greenlets = client.copy_remote_file('remote.file', 'local.file')
-   joinall(greenlets, raise_error=True)
+   cmds = client.copy_remote_file('remote.file', 'local.file')
+   joinall(cmds, raise_error=True)
 
 The above will create files ``local.file_host1`` where ``host1`` is the host name the file was copied from.
 
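The de-duplicated naming described in the last context line of this hunk can be pictured with a short sketch. The helper name ``local_name_for`` is hypothetical and not part of ``parallel-ssh``; it only assumes, as the text states, that the host name is appended to the local file name with an underscore.

```python
def local_name_for(local_file, host):
    """Hypothetical helper: per-host local file name as described in the docs."""
    # Assumption: the separator is an underscore, per ``local.file_host1``.
    return "_".join([local_file, host])


names = [local_name_for("local.file", host) for host in ("host1", "host2")]
print(names)
```

Each host ends up with its own local file, so parallel copies from several hosts do not overwrite one another.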

@@ -855,7 +855,7 @@ Clients for hosts that are no longer on the host list are removed on host list a
    <..>
 
 
-When wanting to reassign host list frequently, it is best to sort or otherwise ensure order is maintained to avoid reconnections on hosts that are still in the host list but in a different order.
+When reassigning host list frequently, it is best to sort or otherwise ensure order is maintained to avoid reconnections on hosts that are still in the host list but in a different position.
 
 For example, the following will cause reconnections on both hosts, though both are still in the list.
 
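The hunk ends before the documentation's own example. As a rough stand-in, the position sensitivity it describes can be modelled without any SSH connection. The function ``positions_needing_reconnect`` is hypothetical, and the rule it encodes (a client is reused only when the same host stays at the same index) is an assumption inferred from the sentence above, not taken from the library source.

```python
def positions_needing_reconnect(old_hosts, new_hosts):
    """Toy model: indexes in new_hosts whose host differs from old_hosts."""
    changed = []
    for index, host in enumerate(new_hosts):
        # Assumption: a client is only reused for the same host at the same index.
        if index >= len(old_hosts) or old_hosts[index] != host:
            changed.append(index)
    return changed


old = ["host1", "host2"]
swapped = positions_needing_reconnect(old, ["host2", "host1"])
kept = positions_needing_reconnect(sorted(old), sorted(["host2", "host1"]))
print(swapped, kept)
```

Under this model, swapping the two hosts changes both positions, while sorting before each reassignment leaves every position unchanged.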

doc/alternatives.rst

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+Comparison With Alternatives
+*****************************
+
+There are not many alternatives for SSH libraries in Python. Of the few that do exist, here is how they compare with ``parallel-ssh``.
+
+As always, it is best to use a tool that is suited to the task at hand. ``parallel-ssh`` is a library for programmatic and non-interactive use. If requirements do not match what it provides then it is best not used. The same applies for the tools described below.
+
+Paramiko
+________
+
+The default SSH client library in ``parallel-ssh<=1.6.x`` series.
+
+Pure Python code, while having native extensions as dependencies, with poor performance and numerous bugs compared to both OpenSSH binaries and the ``libssh2`` based native clients in ``parallel-ssh`` ``1.2.x`` and above. Recent versions have regressed in performance and have `blocker issues <https://github.com/ParallelSSH/parallel-ssh/issues/83>`_.
+
+It does not support non-blocking mode, so to make it non-blocking monkey patching must be used which affects all other uses of the Python standard library.
+
+Based on its use in historical ``parallel-ssh`` releases as well as `performance testing <https://parallel-ssh.org/post/parallel-ssh-libssh2>`_, paramiko is very far from being mature enough to be used.
+
+This is why ``parallel-ssh`` has moved away from paramiko entirely since ``2.0.0`` where it was dropped as a dependency.
+
+asyncssh
+________
+
+Pure Python ``asyncio`` framework using client library. License (`EPL`) is not compatible with GPL, BSD or other open source licenses and `combined works cannot be distributed <https://www.eclipse.org/legal/eplfaq.php#USEINANOTHER>`_.
+
+Therefore unsuitable for use in many projects, including ``parallel-ssh``.
+
+Fabric
+______
+
+Port of Capistrano from Ruby to Python. Intended for command line use and is heavily systems administration oriented rather than non-interactive library. Same maintainer as Paramiko.
+
+Uses Paramiko and suffers from the same limitations. Moreover, uses threads for parallelisation, while `not being thread safe <https://github.com/fabric/fabric/issues/1433>`_, and exhibits very poor performance and extremely high CPU usage even for limited number of hosts - 1 to 10 - with scaling limited to one core.
+
+Library API is non-standard, poorly documented and with numerous issues as API use is not intended.
+
+Ansible
+_______
+
+A configuration management and automation tool that makes use of SSH remote commands. Uses, in parts, both Paramiko and OpenSSH binaries.
+
+Similarly to Fabric, uses threads for parallelisation and suffers from the poor scaling that this model offers.
+
+See `The State of Python SSH Libraries <https://parallel-ssh.org/post/ssh2-python/>`_ for what to expect from scaling SSH with threads, as compared `to non-blocking I/O <https://parallel-ssh.org/post/parallel-ssh-libssh2/>`_ with ``parallel-ssh``.
+
+Again similar to Fabric, its intended and documented use is interactive via command line rather than library API based. It may, however, be an option if Ansible is already being used for automation purposes with existing playbooks, the number of hosts is small, and when the use case is interactive via command line.
+
+``parallel-ssh`` is, on the other hand, a suitable option for Ansible as an SSH client that would improve its parallel SSH performance significantly.
+
+ssh2-python
+___________
+
+Bindings for ``libssh2`` C library. Used by ``parallel-ssh`` as of ``1.2.0`` and is by same author.
+
+Does not do parallelisation out of the box but can be made parallel via Python's ``threading`` library relatively easily and as it is a wrapper to a native library that releases Python's GIL, can scale to multiple cores.
+
+``parallel-ssh`` uses ``ssh2-python`` in its native non-blocking mode with event loop and co-operative sockets provided by ``gevent`` for an extremely high performance library without the side-effects of monkey patching - see `benchmarks <https://parallel-ssh.org/post/parallel-ssh-libssh2>`_.
+
+In addition, ``parallel-ssh`` uses native threads to offload CPU bound tasks like authentication in order to scale to multiple cores while still remaining non-blocking for network I/O.
+
+``pssh.clients.native.SSHClient`` is a single host natively non-blocking client for users that do not need parallel capabilities but still want a fully featured client with native code performance.
+
+Out of all the available Python SSH libraries, ``libssh2`` and ``ssh2-python`` have been shown, see benchmarks above, to perform the best with the least resource utilisation and ironically for a native code extension the least amount of dependencies. Only ``libssh2`` C library and its dependencies which are included in binary wheels.
+
+However, it lacks support for some SSH features present elsewhere like GSS-API and certificate authentication.
+
+ssh-python
+__________
+
+Bindings for ``libssh`` C library. A client option in ``parallel-ssh``, same author. Similar performance to ssh2-python above.
+
+For non-blocking use, only certain functions are supported. SCP/SFTP in particular cannot be used in non-blocking mode, nor can tunnels.
+
+Supports more authentication options compared to ``ssh2-python`` like GSS-API (Kerberos) and certificate authentication.

doc/index.rst

Lines changed: 2 additions & 0 deletions
@@ -76,6 +76,8 @@ Single host client is also available with similar API.
    advanced
    api
    clients
+   scaling
+   alternatives
    Changelog
    api_upgrade_2_0
 

doc/installation.rst

Lines changed: 11 additions & 0 deletions
@@ -56,3 +56,14 @@ Or for developing changes:
 
    pip install -r requirements_dev.txt
 
+
+Python 2
+--------
+
+As of January 2021, Python 2 is no longer supported by either the Python Software Foundation or ``parallel-ssh`` - see `Sunset Python 2 <https://www.python.org/doc/sunset-python-2/>`_.
+
+Versions of ``parallel-ssh<=2.4.0`` will still work.
+
+Future releases are not guaranteed to be compatible or work at all with Python 2.
+
+If your company requires Python 2 support, contact the author directly at the email address on GitHub commits to discuss rates.

doc/quickstart.rst

Lines changed: 10 additions & 12 deletions
@@ -68,7 +68,7 @@ Output::
 
 
 Step by Step
--------------
+============
 
 Make a list or other iterable of the hosts to run on:
 
@@ -119,7 +119,7 @@ Standard output, aka ``stdout``, for a given :py:class:`HostOutput <pssh.output.
 
 Iterating over ``stdout`` will only end when the remote command has finished unless interrupted.
 
-The ``timeout`` keyword argument to ``run_command`` may be used to cause output generators to timeout if no output is received after the given number of seconds - see `join and output timeouts <advanced.html#join-and-output-timeouts>`_.
+The ``read_timeout`` keyword argument to ``run_command`` may be used to cause reading to timeout if no output is received after the given number of seconds - see `join and output timeouts <advanced.html#join-and-output-timeouts>`_.
 
 ``stdout`` is a generator. To retrieve all of stdout can wrap it with list, per below.
 
@@ -176,8 +176,8 @@ First, ensure that all commands have finished by either joining on the output ob
 .. code-block:: python
 
    client.join(output)
-   for host, host_output in output:
-       print("Host %s exit code: %s" % (host, host_output.exit_code))
+   for host_output in output:
+       print("Host %s exit code: %s" % (host_output.host, host_output.exit_code))
 
 As of ``1.11.0``, ``client.join`` is not required as long as output has been gathered.
 
@@ -235,8 +235,6 @@ To use files under a user's ``.ssh`` directory:
 
 .. code-block:: python
 
-   import os
-
    client = ParallelSSHClient(hosts, pkey='~/.ssh/my_pkey')
 
 
@@ -271,8 +269,8 @@ The helper function :py:func:`pssh.utils.enable_host_logger` will enable host lo
    from pssh.utils import enable_host_logger
    enable_host_logger()
 
-   output = client.run_command('uname')
-   client.join(output, consume_output=True)
+   client.run_command('uname')
+   client.join(consume_output=True)
 
 :Output:
    .. code-block:: python
@@ -288,10 +286,10 @@ The ``stdin`` attribute on :py:class:`HostOutput <pssh.output.HostOutput>` is a
 
 .. code-block:: python
 
-   output = client.run_command('read')
+   output = client.run_command('read line; echo $line')
    host_output = output[0]
    stdin = host_output.stdin
-   stdin.write("writing to stdin\\n")
+   stdin.write("writing to stdin\n")
    stdin.flush()
    for line in host_output.stdout:
        print(line)
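Why the documented command changed from ``read`` to ``read line; echo $line`` can be checked locally. The sketch below uses a local ``sh`` process as a stand-in for the remote host, which is an assumption made purely for illustration; ``run_with_stdin`` is a hypothetical helper and nothing here uses the ``parallel-ssh`` API.

```python
import subprocess


def run_with_stdin(command, text):
    """Run a shell command locally, feed it text on stdin, return its stdout."""
    result = subprocess.run(
        ["sh", "-c", command],
        input=text,
        capture_output=True,
        text=True,
    )
    return result.stdout


# A bare `read` writes nothing to stdout, so iterating stdout shows nothing.
silent = run_with_stdin("read", "writing to stdin\n")
# Echoing the variable sends the line written to stdin back on stdout.
echoed = run_with_stdin("read line; echo $line", "writing to stdin\n")
print(repr(silent), repr(echoed))
```

The same reasoning explains the second fix in the hunk: ``"\\n"`` writes a literal backslash and ``n`` rather than a newline, so ``read`` would not see the end of the line.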
@@ -325,8 +323,8 @@ With this flag, the ``exception`` output attribute will contain the exception on
 :Output:
    .. code-block:: python
 
-      host1: 0, None
-      host2: None, AuthenticationError <..>
+      Host host1: exit code 0, exception None
+      Host host2: exit code None, exception AuthenticationError <..>
 
 .. seealso::
 
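The updated quickstart iterates over output objects directly and reads ``host``, ``exit_code`` and ``exception`` from each one. A minimal sketch of that loop follows, using a hypothetical ``FakeHostOutput`` stand-in so it runs without SSH servers. Only those three attribute names come from the documentation above; the class, the ``summarise`` helper and the use of ``ConnectionError`` in place of the library's authentication error are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeHostOutput:
    """Hypothetical stand-in exposing only the attributes used in the docs."""
    host: str
    exit_code: Optional[int] = None
    exception: Optional[Exception] = None


def summarise(output):
    """Format one line per host in the style of the updated docs output."""
    return [
        "Host %s: exit code %s, exception %s" % (o.host, o.exit_code, o.exception)
        for o in output
    ]


output = [
    FakeHostOutput("host1", exit_code=0),
    # ConnectionError is a placeholder for the library's own exception type.
    FakeHostOutput("host2", exception=ConnectionError("auth failed")),
]
for line in summarise(output):
    print(line)
```

Because the host name now lives on the output object, the loop no longer unpacks ``(host, host_output)`` pairs.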

doc/scaling.rst

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+********
+Scaling
+********
+
+Some guidelines on scaling ``parallel-ssh`` and pool size numbers.
+
+In general, long lived commands with little or no output *gathering* will scale better. Pool sizes in the multiple thousands have been used successfully with little CPU overhead in the single thread running them in these use cases.
+
+Conversely, many short lived commands with output gathering will not scale as well. In this use case, smaller pool sizes in the hundreds are likely to perform better with regards to CPU overhead in the event loop.
+
+Multiple Python native threads, each of which can get its own event loop, may be used to scale this use case further as number of CPU cores allows. Note that ``parallel-ssh`` imports *must* be done within the target function of the newly started thread for it to receive its own event loop. ``gevent.get_hub()`` may be used to confirm that the worker thread event loop differs from the main thread.
+
+Gathering is highlighted here as output generation does not affect scaling. Only when output is gathered either over multiple still running commands, or while more commands are being triggered, is overhead increased.
+
+Technical Details
+******************
+
+To understand why this is, consider that in co-operative multi tasking, which is being used in this project via the ``gevent`` library, a co-routine (greenlet) needs to ``yield`` the event loop to allow others to execute - *co-operation*. When one co-routine is constantly grabbing the event loop in order to gather output, or when co-routines are constantly trying to start new short-lived commands, it causes contention with other co-routines that also want to use the event loop.
+
+This manifests itself as increased CPU usage in the process running the event loop and reduced performance with regards to scaling improvements from increasing pool size.
+
+On the other end of the spectrum, long lived remote commands that generate *no* output only need the event loop at the start, when they are establishing connections, and at the end, when they are finished and need to gather exit codes, which results in practically zero CPU overhead at any time other than start or end of command execution.
+
+Output *generation* is done remotely and has no effect on the event loop until output is gathered - output buffers are iterated on. Only at that point does the event loop need to be held.
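The contention argument can be made concrete with a toy round-robin scheduler built on plain generators. This is only a model of co-operative multitasking and does not use ``gevent``; the names ``quiet_command``, ``gathering_command`` and ``run`` are hypothetical, and "one turn on the loop per gathered line" is a simplifying assumption.

```python
from collections import deque


def quiet_command():
    """Long lived command, no output gathered: needs the loop at start and end."""
    yield  # connected; the remote command now runs without needing the loop
    # Resumed one final time to collect the exit code, then finishes.


def gathering_command(lines):
    """Command whose output is gathered: one extra turn per line read."""
    yield  # connected
    for _ in range(lines):
        yield  # co-operate after reading each line


def run(tasks):
    """Round-robin over tasks; return how many times any task held the loop."""
    queue = deque(tasks)
    resumes = 0
    while queue:
        task = queue.popleft()
        resumes += 1
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not reschedule
        queue.append(task)
    return resumes


quiet = run(quiet_command() for _ in range(1000))
busy = run(gathering_command(lines=100) for _ in range(1000))
print(quiet, busy)
```

In this model the quiet pool costs two loop turns per command regardless of how long the remote command runs, while the gathering pool costs turns in proportion to the output read, which is the overhead the section attributes to gathering.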

pssh/clients/base/parallel.py

Lines changed: 9 additions & 10 deletions
@@ -88,12 +88,7 @@ def hosts(self, _hosts):
     def _check_host_config(self):
         if self.host_config is None:
             return
-        host_len = 0
-        try:
-            host_len = len(self.hosts)
-        except TypeError:
-            # Generator
-            return
+        host_len = len(self.hosts)
         if host_len != len(self.host_config):
             raise ValueError(
                 "Host config entries must match number of hosts if provided. "
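The behavioural effect of dropping the ``try``/``except TypeError`` can be seen in isolation. The function below is a stand-alone sketch and not the library's method: ``check_host_config`` is a hypothetical name, and the error message is shortened to the part visible in the hunk. It assumes hosts could still reach this check as a generator; other parts of the commit, not shown here, may convert them earlier.

```python
def check_host_config(hosts, host_config):
    """Stand-alone sketch of the simplified length check."""
    if host_config is None:
        return
    # len() raises TypeError for generators; this is no longer swallowed.
    host_len = len(hosts)
    if host_len != len(host_config):
        raise ValueError(
            "Host config entries must match number of hosts if provided.")


check_host_config(["host1", "host2"], [{}, {}])  # lengths match, no error

results = []
for hosts in (["host1"], (h for h in ["host1", "host2"])):
    try:
        check_host_config(hosts, [{}, {}])
    except (ValueError, TypeError) as ex:
        results.append(type(ex).__name__)
print(results)
```

A list of the wrong length is rejected with ``ValueError``, while a generator now fails loudly with ``TypeError`` instead of silently skipping the check.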
@@ -169,8 +164,10 @@ def join_shells(self, shells, timeout=None):
         finished_shells = [g.get() for g in finished]
         unfinished_shells = list(set(shells).difference(set(finished_shells)))
         if len(unfinished_shells) > 0:
-            raise Timeout("Timeout of %s sec(s) reached with commands "
-                          "still running", timeout, finished_shells, unfinished_shells)
+            raise Timeout(
+                "Timeout of %s sec(s) reached with commands still running",
+                timeout, finished_shells, unfinished_shells,
+            )
 
     def run_command(self, command, user=None, stop_on_errors=True,
                     host_args=None, use_pty=False, shell=None,
@@ -354,8 +351,10 @@ def join(self, output=None, consume_output=False, timeout=None,
         if unfinished_cmds:
             finished_output = self.get_last_output(cmds=finished_cmds)
             unfinished_output = list(set.difference(set(output), set(finished_output)))
-            raise Timeout("Timeout of %s sec(s) reached with commands "
-                          "still running", timeout, finished_output, unfinished_output)
+            raise Timeout(
+                "Timeout of %s sec(s) reached with commands still running",
+                timeout, finished_output, unfinished_output,
+            )
 
     def _join(self, host_out, consume_output=False, timeout=None,
               encoding="utf-8"):
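Both ``raise Timeout(...)`` calls pass the message template, the timeout, and the finished and unfinished items as separate positional arguments, so a caller can recover them from ``ex.args``. A minimal sketch of that pattern follows, using a stand-in ``Timeout`` class (a plain ``Exception`` subclass) and a hypothetical ``join`` function rather than the library's own.

```python
class Timeout(Exception):
    """Stand-in for the library's Timeout; a plain Exception subclass."""


def join(outputs, finished, timeout):
    """Hypothetical join: raise if anything in outputs has not finished."""
    unfinished = [o for o in outputs if o not in finished]
    if unfinished:
        raise Timeout(
            "Timeout of %s sec(s) reached with commands still running",
            timeout, finished, unfinished,
        )


try:
    join(["host1", "host2", "host3"], ["host1"], timeout=5)
except Timeout as ex:
    # The message is an unformatted template; the values travel in ex.args.
    template, secs, finished, unfinished = ex.args
    message = template % secs
    print(message)
    print(finished, unfinished)
```

Keeping the finished and unfinished collections in the exception arguments lets the caller decide what to do with commands that are still running.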

pssh/clients/base/single.py

Lines changed: 5 additions & 4 deletions
@@ -323,7 +323,10 @@ def auth(self):
     def _password_auth(self):
         raise NotImplementedError
 
-    def _pkey_auth(self, password=None):
+    def _pkey_auth(self, pkey_file, password=None):
+        raise NotImplementedError
+
+    def _open_session(self):
         raise NotImplementedError
 
     def open_session(self):
@@ -500,9 +503,7 @@ def copy_file(self, local_file, remote_file, recurse=False,
         raise NotImplementedError
 
     def _sftp_put(self, remote_fh, local_file):
-        with open(local_file, 'rb') as local_fh:
-            for data in local_fh:
-                self._eagain(remote_fh.write, data)
+        raise NotImplementedError
 
     def sftp_put(self, sftp, local_file, remote_file):
         raise NotImplementedError
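These hunks move the base class further towards a set of hooks that raise ``NotImplementedError`` and are filled in by concrete clients. A small sketch of that pattern, with the new ``_pkey_auth(pkey_file, password=None)`` signature, follows. ``BaseClient``, ``RecordingClient`` and the dispatch logic in ``auth`` are hypothetical and are not the library's implementation.

```python
class BaseClient:
    """Hypothetical base class: public method dispatches to overridable hooks."""

    def __init__(self, pkey_file=None, password=None):
        self.pkey_file = pkey_file
        self.password = password

    def auth(self):
        # Assumed dispatch for illustration only.
        if self.pkey_file is not None:
            return self._pkey_auth(self.pkey_file, password=self.password)
        return self._password_auth()

    def _password_auth(self):
        raise NotImplementedError

    def _pkey_auth(self, pkey_file, password=None):
        raise NotImplementedError


class RecordingClient(BaseClient):
    """Concrete client that records which hook was called."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.calls = []

    def _password_auth(self):
        self.calls.append(("password",))

    def _pkey_auth(self, pkey_file, password=None):
        self.calls.append(("pkey", pkey_file))


client = RecordingClient(pkey_file="~/.ssh/my_pkey")
client.auth()
print(client.calls)
```

Passing the key file explicitly to the hook, rather than having the hook read it from instance state, lets a caller try several candidate identity files with the same method.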
