SimpleSSH: A Basic Interoperability Profile for the SSH Protocol
================================================================

Peter Gutmann, circa 2009.

Abstract
--------

The widespread adoption of SSH has seen the emergence of numerous SSH
implementations, but also numerous interoperability problems among many of the
non-mainstream versions.  This problem arises because the complexity and in
places ambiguity of the specification makes it possible to create
specification-compliant but non-interoperable implementations, and is
exacerbated by the fact that in many cases where SSH is used, for example for
the control interface of an embedded device or a Windows file transfer
facility, the developers are required to implement a specification designed to
provide a full-blown Unix VPN solution even though in their case they'll never
use the majority of its facilities.

This document describes a simplified profile of SSH that provides a standard
minimal feature set for use in applications that just require a basic
no-frills secure channel from A to B, building on a decade of SSH
implementation experience to avoid known problem areas in the SSH protocol.
As a side-effect this minimal profile reduces the large attack surface of SSH
to a more manageable level by eliminating much of the complexity in the
protocol.

Introduction
------------

The adoption of SSH has spawned numerous implementations for a variety of
platforms, many of which are some way removed from the original Unix design
target.  Because of the complexity of the SSH protocol it's become somewhat
standard practice to use the most common SSH server (let's call it O) and the
most common SSH client (let's call it P) as a form of compliance-test suite.
Once a new SSH implementation can successfully connect to O or receive a
connection from P, it's declared complete and shipped to users.  Unfortunately
when two implementations that aren't O or P meet, all manner of problems can
arise.  Since O and P communicate using a stereotyped subset of SSH's
capabilities, any deviation from this stereotyped exchange can result in
anything from the remote system crashing or hanging through to security
breaches via carefully-chosen (but fully specification-compliant) message
exchanges (each section of this document contains a rationale explaining the
reasons for any changes made, along with specific examples of how the original
version has caused interoperability or security problems).

Part of this problem is due to the fact that SSH started life as a general-
purpose Unix VPN solution (whether it was deliberately intended as such or
not), with the result that it contains many facilities that only make sense in
a Unix environment (ptys, shells, a stderr facility, PAM authentication, and
so on).  Implementers creating SSH servers or clients for other systems
typically resort to going through the motions sufficiently convincingly to
satisfy the other side (where the other side is either O or P as appropriate)
but no further.

In order to provide for a simplified secure-pipe implementation, SimpleSSH
cuts down the number of options and capabilities to the minimum required for
this purpose, creating a profile that's tailored for a single channel of
either encrypted terminal traffic (for example to remotely administer an
embedded network device like a router) or file transfer (for example to
securely copy a file from one Windows PC to another).  The intent is to
increase interoperability among implementations that don't require a full Unix
VPN solution by removing much of the protocol's complexity and ambiguity.  A
side effect of this removal of complexity and locking down of details is that
the SSH attack surface is greatly reduced, since only a small number of
(hopefully) well-defined operations are now possible, and in particular ones
that have proven particularly troublesome in the past are removed and left to
implementations of the full SSH protocol.

[NB: The above is intended purely to provide a problem statement, it's not
meant as a criticism of any SSH implementation].

Design Goals
------------

The overall design goal for SimpleSSH is to provide a basic profile of the SSH
protocol to simplify implementation and increase interoperability among
implementations that don't require the full-scale VPN capabilities provided by
the SSH protocol, at the same time reducing the attack surface by constraining
the protocol flow to a subset of basic, well-defined operations.  This overall
design goal is achieved through the following sub-goals:

- Ambiguous protocol operations are locked down.  For example the SSH
specification leaves the order of some messages undefined, leading to
interoperability problems among implementations that don't follow certain
stereotyped patterns.

- Known problem areas are eliminated if possible.  For example rehandshakes
have been an ongoing source of trouble for implementations and are only really
useful for long-running VPN sessions (if that), so they're left to
implementations of the full SSH protocol to support.

- Complex and error-prone aspects of the protocol are reduced to basic
operations sufficient for most tasks.  For example the SSH authentication
process allows for arbitrarily iterated exchanges with partial results
triggering further exchanges, when a typical encrypted-telnet session to a
router requires no more than a user name and password.

- Encompassing all of the above, a side-effect of the simplification of the
protocol is a significant reduction in its attack surface, which is of
particular importance in embedded systems which often contain very cut-down
implementations due to resource constraints, or Windows implementations that
merely contain SSH functionality as a feature-checkbox add-on to an
established application rather than being the entire application as O and P
are.

Signalling SimpleSSH Use
------------------------

The SSH protocol contains a (currently unused) flags field in the header of
the SSH_MSG_KEXINIT message for use with future extensions.  This profile uses
two bits in the field to indicate an implementation's compliance with
SimpleSSH.  The two flags are SSH_FLAG_MAY_SIMPLESSH to indicate that the peer
may, at their discretion, choose to employ the SimpleSSH profile to
communicate, and SSH_FLAG_MUST_SIMPLESSH, to indicate that the peer must use
the SimpleSSH profile to communicate:

  Symbolic name                Value
  -------------                ------
  SSH_FLAG_MAY_SIMPLESSH        0x01
  SSH_FLAG_MUST_SIMPLESSH       0x02

If the SSH_FLAG_MUST_SIMPLESSH is asserted by either side then only SimpleSSH
is permitted.  If one side is unwilling or unable to communicate using
SimpleSSH then it must discontinue the handshake by disconnecting with an
SSH_MSG_DISCONNECT.

If the SSH_FLAG_MAY_SIMPLESSH is asserted by both sides then the handshake
continues using the SimpleSSH profile.  If only one side asserts
SSH_FLAG_MAY_SIMPLESSH then the handshake continues with standard SSH.

In the event that an implementation (erroneously) asserts both
SSH_FLAG_MAY_SIMPLESSH and SSH_FLAG_MUST_SIMPLESSH, the behaviour for
SSH_FLAG_MUST_SIMPLESSH takes precedence.
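
The flag rules above can be sketched as a small decision function.  This is an
illustrative sketch rather than a normative algorithm: the function name
negotiate_profile and its three result strings are hypothetical, and it
assumes that a peer asserting neither flag is the "unwilling or unable" case.

```python
SSH_FLAG_MAY_SIMPLESSH = 0x01
SSH_FLAG_MUST_SIMPLESSH = 0x02


def negotiate_profile(local_flags: int, peer_flags: int) -> str:
    """Return "simplessh", "standard" or "disconnect" for a pair of flag values.

    Assumption: a side that asserts neither flag cannot use SimpleSSH.
    """
    def stance(flags: int) -> str:
        # MUST takes precedence if both bits are (erroneously) set.
        if flags & SSH_FLAG_MUST_SIMPLESSH:
            return "must"
        if flags & SSH_FLAG_MAY_SIMPLESSH:
            return "may"
        return "none"

    stances = (stance(local_flags), stance(peer_flags))
    if "must" in stances:
        # Only SimpleSSH is permitted; a peer that can't comply is dropped
        # with SSH_MSG_DISCONNECT.
        return "disconnect" if "none" in stances else "simplessh"
    if stances == ("may", "may"):
        return "simplessh"
    return "standard"
```

For example negotiate_profile(0x01, 0x01) selects SimpleSSH, while
negotiate_profile(0x01, 0x00) falls back to standard SSH.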

Rationale
---------

The SimpleSSH flags are intended to be used as a straightforward means for a
client or server to announce their intentions (or lack thereof) in regard to
using SimpleSSH.  If either side definitely must use SimpleSSH (either for
implementation or security reasons) then the semantics of the
SSH_FLAG_MUST_SIMPLESSH ensure that both sides use SimpleSSH, or the handshake
negotiations end.  If either side can optionally use SimpleSSH and the other
side decides not to then the handshake continues with standard SSH semantics.
In other words in the very first message exchange the two sides decide whether
they wish to continue using the full SSH protocol or the SimpleSSH profile.

If SimpleSSH is being used for security reasons then there is a potential risk
of a rollback attack in which an attacker clears the SSH_FLAG_MUST_SIMPLESSH /
SSH_FLAG_MAY_SIMPLESSH flags in the first message to cause a fallback from the
restricted SimpleSSH to the full SSH protocol and then employs an attack that
takes advantage of the SSH protocol's complexity before the final
authentication exchange can detect the change to the first message.  If
SSH_FLAG_MUST_SIMPLESSH was asserted then this attack isn't possible because
any attempt to perform protocol steps outside the SimpleSSH framework should
cause the handshake to fail.  However if SSH_FLAG_MAY_SIMPLESSH was asserted
then this rollback can't be easily detected until the final authentication
step.  If the motivation for the use of SimpleSSH is primarily security then
SSH_FLAG_MUST_SIMPLESSH should be used in preference to SSH_FLAG_MAY_SIMPLESSH
in order to force the use of the restricted SimpleSSH profile.

Neither RFC 4250 nor RFC 4253 defines any process for assigning values to the
flags field.  In the absence of any other information, this document assumes
that the process specified in RFC 4250 for SSH assigned numbers will be
followed for any future extensions to this field.

SSH Transport
-------------

When performing the protocol version exchange both sides send a single line of
version information with no additional preceding lines of data.  The version ID
is formatted as "SSH-vvv-name-yyy", where 'vvv' is the SSH version (currently
'2.0'), 'name' is the SSH implementation name, for example 'FooSSH', and 'yyy'
is an optional software version, for example '1.95'.  For a hypothetical
FooSSH the version string would be "SSH-2.0-FooSSH-1.95", or "SSH-2.0-FooSSH"
if the software version is omitted.  Ref: TRANS 4.2.
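
Because the format is fixed, a version ID can be handled with a single regular
expression instead of a per-implementation parsing front-end.  The sketch
below is illustrative only: the helper names are hypothetical, and it assumes
(the profile doesn't say so explicitly) that neither the implementation name
nor the software version contains a hyphen or whitespace.

```python
import re

# SSH-vvv-name or SSH-vvv-name-yyy, with the trailing software version optional.
_VERSION_ID = re.compile(
    r"SSH-(?P<proto>[0-9.]+)-(?P<name>[^-\s]+)(?:-(?P<version>[^-\s]+))?"
)


def parse_version_id(line: str):
    """Split a version ID into (protocol, name, software_version_or_None)."""
    match = _VERSION_ID.fullmatch(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("not a SimpleSSH-style version ID: %r" % line)
    return match.group("proto"), match.group("name"), match.group("version")


def format_version_id(name: str, version=None, proto: str = "2.0") -> str:
    """Build a version ID, leaving out the optional software version."""
    base = "SSH-%s-%s" % (proto, name)
    return base if version is None else "%s-%s" % (base, version)
```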

When sending the SSH_MSG_KEXINIT the server always speaks first.  The client
waits for the server's SSH_MSG_KEXINIT and then responds in kind.  Since the
client is responding to the server's options its own SSH_MSG_KEXINIT contains
only the single algorithm choice that it prefers for each option, not the full
range of available algorithm options. The complex algorithm-matching process
outlined in the SSH specification is not required.  Ref: TRANS 6.1.

The algorithm pairs specified in the SSH_MSG_KEXINIT (client-to-server and
server-to-client) are the same in both directions.  In other words if the
client-to-server algorithm list is "a,b,c" then the server-to-client list must
also be "a,b,c".  Ref: TRANS 6.1.

Key exchange guessing is not used, i.e. first_kex_packet_follows in
SSH_MSG_KEXINIT is always false.
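
The client side of this exchange can be sketched as follows.  The function
name and the dictionary layout are hypothetical (the field names simply follow
the SSH_MSG_KEXINIT name-lists), and the sketch assumes that the server's
offer has already been decoded into one list per algorithm category, which is
sufficient because both directions carry the same list.

```python
UNDIRECTED = ("kex_algorithms", "server_host_key_algorithms")
DIRECTIONAL = ("encryption_algorithms", "mac_algorithms",
               "compression_algorithms")


def choose_client_kexinit(server_offer: dict, client_prefs: dict) -> dict:
    """Build the client's reply: one algorithm per category, same both ways."""
    reply = {}
    for category in UNDIRECTED + DIRECTIONAL:
        offered = server_offer[category]
        # Take the client's most-preferred algorithm that the server offered.
        choice = next((alg for alg in client_prefs[category] if alg in offered),
                      None)
        if choice is None:
            raise ValueError("no common algorithm for " + category)
        if category in DIRECTIONAL:
            # Identical single-entry lists in both directions.
            reply[category + "_client_to_server"] = [choice]
            reply[category + "_server_to_client"] = [choice]
        else:
            reply[category] = [choice]
    # Key exchange guessing is never used in this profile.
    reply["first_kex_packet_follows"] = False
    return reply
```

Since every list in the reply has exactly one entry, the server has nothing
left to match and can proceed directly with the chosen algorithms.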

The only key and signature formats used are "ssh-rsa" and "ssh-dss", not
"spki-sign-rsa" or "pgp-sign-dss" or "x509v3-sign-rsa" or ... Ref: TRANS 5.6.

As for SSH_MSG_KEXINIT, the server speaks first for SSH_MSG_NEWKEYS.  Ref:
TRANS 6.3.

Diffie-Hellman group exchange requests use SSH_MSG_KEX_DH_GEX_REQUEST, not
SSH_MSG_KEX_DH_GEX_REQUEST_OLD.

[NB: Could this be a problem?  At one point a number of implementations would
fail if sent a SSH_MSG_KEX_DH_GEX_REQUEST rather than a
SSH_MSG_KEX_DH_GEX_REQUEST_OLD].

Rehandshake is never performed.  Ref: TRANS 8.

Rationale
---------

The SSH specification allows the version string to be preceded by arbitrary
amounts of free-form text.  This comes as a complete surprise to many
implementations, who simply disconnect.  The version string itself has no
defined format (apart from the "SSH-2.0" at the start) which causes
considerable problems for implementations that parse the version string to
work around bugs (or at least specification-compliant but unexpected
features), with the result that developers resort to complex
regular-expression parsing front-ends to sort out the different versions.
This is further complicated by the fact that the lack of a clear format means
that some SSH implementations use ID formats that result in them being mis-
identified as older (buggy) versions of other implementations, with the
inevitable effect on interoperability.  Requiring a basic ID in a predictable
format should remove these problems.

Since the software version reveals details about potentially vulnerable
implementations, it's left optional.

The SSH_MSG_KEXINIT is one of the locations where the SSH specification leaves
the order of messages undefined so that if two implementations that disagree
over the order meet, the result is deadlock.  Alternatively, both sides can
speak at the same time, with the specification containing an awkward conflict-
resolution mechanism that complicates implementation and leads to
interoperability problems.  This profile simplifies the process by requiring
that the server speak first, offering its available algorithms and mechanisms,
and the client respond, specifying in its response the one algorithm that it
has chosen for each purpose.  In this manner the complexity and guesswork of
the initial negotiations vanish.

The SSH specification (theoretically) allows for different algorithms to be
used in different directions.  A straw poll on the SSH list indicated that
nothing actually does this; this profile makes the behaviour explicit.

A number of SSH key and signature formats are underspecified or ambiguous,
with some actually being dropped as the RFC draft progressed when no-one could
figure out what the format required.  This profile simplifies implementation
requirements by requiring only the two universally-supported formats "ssh-rsa"
and "ssh-dss", and by extension any future successors to these well-defined
formats, for example one that replaces the SHA-1 used in "ssh-rsa" with SHA-2
while retaining the same general format.  The intent is not to prohibit any
future use of new key and signature formats but to select the subset of
currently-defined formats that are universally supported and interoperable
with other implementations.

SSH_MSG_NEWKEYS is another undefined-order message in the pattern of
SSH_MSG_KEXINIT, leading to similar problems.

The SSH rehandshake process is probably the single biggest problem area in the
specification, with things becoming so bad at one point that some major
implementations would detect when they were talking to a different
implementation and disable rehandshake completely in order to ensure
interoperability.  Many of the non-mainstream versions either don't support
rehandshake at all or if sent an SSH_MSG_KEXINIT packet in the middle of an
ongoing exchange eventually become confused, leading to communications
breaking down.  This leads to erratic, hard-to-diagnose errors (the exact
details depend on what the SSH layers are doing at the time of the problem),
typically a bad-packet error when the other side tries to interpret a
connection-layer packet as part of the rehandshake, or when the two sides
disagree on when to switch keys and one of the two decrypts with the wrong
keys and gets a garbled packet type.  To make things even messier, neither
side can avoid the problem by ignoring the SSH_MSG_KEXINIT because the lack of
SSH_MSG_CHANNEL_WINDOW_ADJUST messages will mess up flow control and lead to
deadlock.

Rehandshake is something that's really only necessary for long-running VPN
sessions (if that), the sort of thing that's left to the full SSH protocol.
Removing it in SimpleSSH acknowledges the fact that many implementations don't
support it anyway, and eliminates a significant source of interoperability
problems.

SSH Authentication
------------------

This profile simplifies the SSH authentication process by requiring a
straightforward request/response exchange beginning with an optional request
for available authentication methods followed by one or more authentication
attempts, with retries only being permitted if the authentication method is
"password".  The exact message flow is:

Step 0 (optional):

  Client sends SSH_MSG_USERAUTH_REQUEST with method "none" to query available
  authentication method types.

  Server responds with SSH_MSG_USERAUTH_FAILURE listing available methods.

Step 1:

  Client sends SSH_MSG_USERAUTH_REQUEST with method "password" or "publickey"
  and password data or a digital signature as appropriate.

Step 2, one of:

  a. Server responds with SSH_MSG_USERAUTH_SUCCESS and the authentication
  exchange terminates.

  b. Server responds to method "password" with SSH_MSG_USERAUTH_FAILURE, the
  client may retry step 1 if permitted by the server as described in the SSH
  specification.

  c. Server responds to method "publickey" with SSH_MSG_USERAUTH_FAILURE and
  the authentication exchange terminates.

The server or client may only send one message at each step in the above
exchange, after which they must stop and wait for the other side's response.
Step 0 is optional; a server must be able to process an authentication request
with method "password" or "publickey" without first requiring a wakeup call
with method "none".  Because the authentication is atomic and cannot be
performed in parts the partial_success flag is always false.

If the authentication is iterated to allow a user to retype their password
then only the authenticator (i.e. the password) but not any other portions of
the message such as the user name, service name, or method name, can change.
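
The message flow and retry rule above amount to a very small server-side state
machine, sketched below.  This is one possible reading rather than a reference
implementation: the class name, the callback signatures and the default limit
of three attempts are assumptions, as is the choice to answer any
out-of-sequence request with SSH_MSG_DISCONNECT (the profile doesn't say how
violations are to be handled).

```python
class AuthExchange:
    """Server side of the SimpleSSH authentication exchange (sketch)."""

    def __init__(self, check_password, check_signature, max_attempts=3):
        self.check_password = check_password    # callable(user, password)
        self.check_signature = check_signature  # callable(user, signature)
        self.max_attempts = max_attempts
        self.attempts = 0
        self.queried = False
        self.identity = None   # (user, service, method) of the first attempt
        self.finished = False

    def _end(self, reply):
        self.finished = True
        return reply

    def handle(self, user, service, method, authenticator=None):
        """Return the single reply to one SSH_MSG_USERAUTH_REQUEST."""
        if self.finished:
            return "SSH_MSG_DISCONNECT"
        if method == "none":
            # Step 0: only meaningful once, before any real attempt.
            if self.queried or self.identity is not None:
                return self._end("SSH_MSG_DISCONNECT")
            self.queried = True
            return "SSH_MSG_USERAUTH_FAILURE"   # carries the method list
        if method not in ("password", "publickey"):
            return self._end("SSH_MSG_DISCONNECT")
        if self.identity is None:
            self.identity = (user, service, method)
        elif self.identity != (user, service, method):
            # On a retry only the authenticator may change.
            return self._end("SSH_MSG_DISCONNECT")
        self.attempts += 1
        if method == "password":
            ok = self.check_password(user, authenticator)
        else:
            ok = self.check_signature(user, authenticator)
        if ok:
            return self._end("SSH_MSG_USERAUTH_SUCCESS")
        # A failed "publickey" attempt ends the exchange; "password" may be
        # retried until the server's limit is reached.
        if method == "publickey" or self.attempts >= self.max_attempts:
            self.finished = True
        return "SSH_MSG_USERAUTH_FAILURE"
```

Each call to handle() consumes one client message and produces exactly one
reply, which mirrors the one-message-per-step rule.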

The server may only advertise authentication methods that it supports, and the
client may only send authentication requests that it knows that the server
supports (this may sound redundant but the SSH specification specifically
allows the server to request that the client use an authentication method that
the server knows it can't support, and vice versa).

For any SSH_MSG_USERAUTH_REQUEST other than the optional initial one with
method name "none", if the server responds with SSH_MSG_USERAUTH_FAILURE it
must return an available_auth_types that matches the one used by the client if
the authentication failed, or any available_auth_types containing the
available authentication methods if the wrong authentication method was used.
In other words if the client supplied the wrong password then the
SSH_MSG_USERAUTH_FAILURE available_auth_types will be "password" and if the
client used password authentication when public-key authentication was
required then the SSH_MSG_USERAUTH_FAILURE available_auth_types will be
"publickey".
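
A minimal sketch of how a server might populate available_auth_types under
this rule is given below.  The function name is hypothetical, and the sketch
assumes that the server already knows which methods it accepts for the user.

```python
def failure_auth_types(requested_method: str, accepted_methods: list) -> list:
    """Choose available_auth_types for an SSH_MSG_USERAUTH_FAILURE reply."""
    if requested_method in accepted_methods:
        # Right method, wrong authenticator: echo the client's method.
        return [requested_method]
    # Wrong method: list the methods that the client should use instead.
    return list(accepted_methods)
```

A client can then tell the two cases apart with a single comparison: if the
returned list contains the method that it just used then the authenticator
was wrong, otherwise the method was wrong.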

If public-key authentication is used then the client must send a signature
packet directly, without any of the additional message exchanges described in
the specification.  In other words the only public key sub-message permitted
is one that corresponds to the (unnamed) flag parameter in the
SSH_MSG_USERAUTH_REQUEST message being set to true.
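
For illustration, the payload of such a request could be assembled as shown
below, following the "publickey" field layout of the SSH authentication
specification.  The helper names are hypothetical, the key and signature blobs
are assumed to have been produced elsewhere, and the default service name
"ssh-connection" is an assumption about how the profile is used.

```python
import struct

SSH_MSG_USERAUTH_REQUEST = 50


def ssh_string(data: bytes) -> bytes:
    """Encode an SSH string: uint32 length followed by the raw bytes."""
    return struct.pack(">I", len(data)) + data


def build_publickey_request(user: str, key_algorithm: str,
                            public_key_blob: bytes, signature_blob: bytes,
                            service: str = "ssh-connection") -> bytes:
    """Build the one permitted public-key request, with the flag set to true."""
    return b"".join([
        bytes([SSH_MSG_USERAUTH_REQUEST]),
        ssh_string(user.encode("utf-8")),
        ssh_string(service.encode("ascii")),
        ssh_string(b"publickey"),
        b"\x01",                                  # flag = TRUE, signature follows
        ssh_string(key_algorithm.encode("ascii")),
        ssh_string(public_key_blob),
        ssh_string(signature_blob),
    ])
```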

In addition to the standard list of available authentication methods the
server can also respond with the method "no-auth" to indicate that no
authentication is required, for example if the authentication is performed via
external means or via the protocol that's being tunnelled over the SSH link.
[How can this be handled properly? No-auth currently relies on side-effects of
implementations to work, see the rationale].

The SSH specification recommends that clients be given up to 10 minutes and 20
retries to get their password right.  Although this profile leaves issues such
as setting bounds for authentication attempts as a server configuration issue,
in the light of DoS and SSH port-scanning attacks a significantly lower
timeout and a retry limit set to the de facto industry standard of three
attempts is recommended.

Rationale
---------

As currently specified the SSH authentication exchange can be iterated
arbitrarily, and the authentication is carried out in bits and pieces with
malicious clients able to change the details at each iteration.  The client is
allowed to send requests that it knows the server can't handle (the server is
supposed to ignore them and wait for other requests), and in compensation the
server is allowed to instruct the client to use authentication methods that
the server knows it can't handle (this is explicitly stated in the
specification!).  In addition the client can spray requests at a server
without having to wait for responses, with complicated races possible if a
request at position n results in a further exchange of messages but gets
overtaken by the request already sent at position n+1, and so on.  Ref: AUTH
3.1.

From experimentation with sending permitted but slightly unexpected sequences
of authentication requests to servers it seems to be mostly coincidence that
some servers can handle authentication, since any deviation from the
stereotyped pattern set by a few widespread clients, typically P, often
results in the server becoming profoundly confused and either disconnecting
suddenly, hanging, or (in a few notable cases) allowing access when it
shouldn't.

In order to simplify the authentication exchange and reduce its considerable
attack surface this profile reduces the arbitrarily complex process to a
straightforward request/response exchange with well-defined semantics.  The
result is a standard password- or public-key based authentication without the
large amount of leeway provided by the original specification.  It should be
noted that much of the behaviour specified here is already implemented as a de
facto standard in many clients, with for example a failed password
authentication result prompting the user to retype their password and then
sending a new request with the same user name and method but with the new
password.

The SSH specification defines a somewhat schizophrenic way of indicating
authentication failures which requires complex decoding in the client to sort
out whether the failure occurred because the wrong method was used or the
wrong authenticator was supplied, which in turn leads to confusing error
messages being displayed to users.  This profile makes it easier for the
client implementation to distinguish "wrong password" (or key) from "wrong
authentication method used".

The no-authentication situation is currently handled in a rather ad-hoc manner
with the server returning a somewhat unexpected SSH_MSG_USERAUTH_SUCCESS in
response to a query for available authentication methods (although there are
creative-interpretation implementations that return an
SSH_MSG_USERAUTH_FAILURE with available_auth_types set to an empty string to
indicate that no-authentication is allowed to continue because there's no way
defined in the specification to distinguish "no authentication is allowed to
continue" from "'no-authentication' is allowed to continue").  This results in
the protocol stalling at this point if the client doesn't perform the initial
method query since there's no other way beyond this side-effect of the method
query to indicate that no authentication is required, or at least that
authentication is performed via out-of-band means (the server is explicitly
prohibited from advertising "none" as a permitted authentication method).
This profile makes the no-authentication behaviour explicit by treating it as
a standard authentication method type.

The SSH specification allows the public-key authentication process to be
broken down into further sub-exchanges in which the client can send queries to
the server and perform assorted other operations.  As with the overall
authentication process, this profile limits the messages exchanged to a basic
authentication request followed by a response.

A downside to immediately sending the signature without an additional exchange
of messages is that if for some reason the wrong key is used it may cause
extra overhead on low-powered devices that could be avoided if further steps
in the handshake were used.  However this overhead is likely to be much
smaller than the overhead of the additional message exchange (even on
low-powered ARM CPUs such a signature can be generated in a fraction of a
second), and re-adding this protocol complexity defeats the purpose of
SimpleSSH being a minimal interoperability profile.  In addition it's not
clear in which actual real-world scenario (as opposed to hypothetical
situation) this would be an issue.  Another way of looking at this is that TLS
authentication has relied on doing it this way for 15-odd years without any
real problems, and TLS implementations, because of how the protocol is used,
run on much lower-powered devices than SSH does.

SSH Connection
--------------

The client immediately follows the user authentication with all messages
related to a session/channel open.  It cannot wait an arbitrary amount of time
after the user authentication has completed before proceeding or perform the
open in parts, for example by sending the SSH_MSG_CHANNEL_OPEN on connect and
the SSH_MSG_CHANNEL_REQUEST that renders it usable at some future date.

The client only ever opens one channel, with the sender and recipient channel
having channel number 0.  Since there's only one channel, no window adjusts
are used and the initial window size is set to the special-case value
0xFFFFFFFF.
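
As an illustration, the single channel open could be encoded as shown below,
using the standard SSH_MSG_CHANNEL_OPEN field layout.  The helper names are
hypothetical, and the "session" channel type and the maximum packet size of
32768 bytes are assumptions since the profile doesn't fix either value.

```python
import struct

SSH_MSG_CHANNEL_OPEN = 90
SIMPLESSH_CHANNEL_NO = 0            # the only channel number ever used
SIMPLESSH_WINDOW_SIZE = 0xFFFFFFFF  # special-case value, no window adjusts


def ssh_string(data: bytes) -> bytes:
    """Encode an SSH string: uint32 length followed by the raw bytes."""
    return struct.pack(">I", len(data)) + data


def build_channel_open(channel_type: bytes = b"session",
                       max_packet_size: int = 32768) -> bytes:
    """Build the payload for the one channel that SimpleSSH allows."""
    return (bytes([SSH_MSG_CHANNEL_OPEN])
            + ssh_string(channel_type)
            + struct.pack(">III", SIMPLESSH_CHANNEL_NO,
                          SIMPLESSH_WINDOW_SIZE, max_packet_size))
```

Because the channel number and window size are constants, a receiver can
reject any channel open whose values differ without keeping per-channel state.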

All data exchanged is standard channel data SSH_MSG_CHANNEL_DATA, not
SSH_MSG_CHANNEL_EXTENDED_DATA.  Ref: CONN 5.2.

Rationale
---------

The SSH specification doesn't place any limits on when session-control
messages such as channel opens can be sent, and exacerbates the problem by
breaking many session control operations into multiple bits and pieces, with
implementations free to send the various messages that go into an operation at
any time they want.  This means that implementations either need to maintain a
background thread that awaits the arrival of control data or perform
continuous polling to check whether anything new has arrived.  This is
particularly problematic for SSH libraries and wrapper implementations that
provide standard BSD-sockets style semantics because alongside the standard
send() and recv() there's now also a requirement for a
check_for_and_process_control_message() which can in turn affect the semantics
of subsequent send() and recv() operations.

In order to remove this problem, this profile requires that session-control
messages that initiate sessions or channels be sent as part of the initial SSH
handshake and not an arbitrary amount of time later.  In this manner the full
session setup is processed at the initial handshake stage, leaving the session
ready for normal data exchange once the handshake has completed.

Since there's only one channel there's no need for flow control as there
aren't multiple channels multiplexed onto the SSH link, and any higher-level
flow control is handled by TCP/IP's flow-control mechanisms.

SSH_MSG_CHANNEL_EXTENDED_DATA is a Unix-ism used to handle stderr output;
other systems can't do much with this information and at the programmers' whim
either drop it or treat it as standard channel data.  This profile only allows
a single, unambiguous data type in order to eliminate this confusion.

General
-------

Neither client nor server has the HMAC keysize bug or the DSA signature format
bug or the RSA signature padding bug or the ... (this is mostly attack surface
reduction since the client or server can now disable special-case handling
for all of these bugs if it's talking to a SimpleSSH peer).