\A{authplugin} PuTTY authentication plugin protocol

This appendix contains the specification for the protocol spoken over
local IPC between PuTTY and an authentication helper plugin.

If you already have an authentication plugin and want to configure
PuTTY to use it, see \k{config-ssh-authplugin} for how to do that.
This appendix is for people writing new authentication plugins.

\H{authplugin-req} Requirements

The following requirements informed the specification of this protocol.

\s{Automate keyboard-interactive authentication.} We're motivated in
the first place by the observation that the general SSH userauth
method \cq{keyboard-interactive} (defined in \k{authplugin-ref-ki})
can be used for many kinds of challenge/response or one-time-password
styles of authentication, and in more than one of those, the necessary
responses might be obtained from an auxiliary network connection, such
as an HTTPS transaction. So it's useful if a user doesn't have to
manually copy-type or copy-paste from their web browser into their SSH
client, but instead, the process can be automated.

\s{Be able to pass prompts on to the user.} On the other hand, some
userauth methods can be only \e{partially} automated; some of the
server's prompts might still require human input. Also, the plugin
automating the authentication might need to ask its own questions that
are not provided by the SSH server. (For example, \q{please enter the
master key from which the real response will be derived by hashing}.) So
after the plugin intercepts the server's questions, it needs to be
able to ask its own questions of the user, which may or may not be the
same questions sent by the server.

\s{Allow automatic generation of the username.} Sometimes, the
authentication method comes with a mechanism for discovering the
username to be used in the SSH login. So the plugin has to start up
early enough that the client hasn't committed to a username yet.

\s{Future expansion route to other SSH userauth flavours.} The initial
motivation for this protocol is specific to keyboard-interactive. But
other SSH authentication methods exist, and they may also benefit from
automation in future. We're making no attempt here to predict what
those methods might be or how they might be automated, but we do need
to leave a space where they can be slotted in later if necessary.

\s{Minimal information loss.} Keyboard-interactive prompts and replies
should be passed to and from the plugin in a form as close as possible
to the way they look on the wire in SSH itself. Therefore, the
protocol resembles SSH in its data formats and marshalling (instead
of, for example, translating from SSH binary packet style to another
well-known format such as JSON, which would introduce edge cases in
character encoding).

\s{Half-duplex.} Simultaneously trying to read one I/O stream and
write another adds a lot of complexity to software. It becomes
necessary to have an organised event loop containing \cw{select} or
\cw{WaitForMultipleObjects} or similar, which can invoke the handler
for whichever event happens soonest. There's no need to add that
complexity in an application like this, which isn't transferring large
amounts of bulk data or multiplexing unrelated activities. So, to keep
life simple for plugin authors, we set the ground rule that it must
always be 100% clear which side is supposed to be sending a message
next. That way, the plugin can be written as sequential code
progressing through the protocol, making simple read and write calls
to receive or send each message.

\s{Communicate success/failure, to facilitate caching in the plugin.}
A plugin might want to cache recently used data for next time, but
only in the case where authentication using that data was actually
successful. So the client has to tell the plugin what the outcome was,
if it's known. (But this is best-effort only. Obviously the plugin
cannot \e{depend} on hearing the answer, because any IPC protocol at
all carries the risk that the other end might crash or be killed by
things outside its control.)

\H{authplugin-transport} Transport and configuration

Plugins are executable programs on the client platform.

The SSH client must be manually configured to use a plugin for a
particular connection. The configuration takes the form of a command
line, including the location of the plugin executable, and optionally
command-line arguments that are meaningful to the particular plugin.

The client invokes the plugin as a subprocess, passing it a pair of
8-bit-clean pipes as its standard input and output. On those pipes,
the client and plugin will communicate via the protocol specified
below.

\H{authplugin-formats} Data formats and marshalling

This protocol borrows the low-level data formatting from SSH itself,
in particular the following wire encodings from
\k{authplugin-ref-arch} section 5:

\dt \s{byte}

\dd An integer between 0 and 0xFF inclusive, transmitted as a single
byte of binary data.

\dt \s{boolean}

\dd The values \q{true} or \q{false}, transmitted as the bytes 1 and 0
respectively.

\dt \s{uint32}

\dd An integer between 0 and 0xFFFFFFFF inclusive, transmitted as 4
bytes of binary data, in big-endian (\q{network}) byte order.

\dt \s{string}

\dd A sequence of bytes, preceded by a \s{uint32} giving the number of
bytes in the sequence. The length field does not include itself. For
example, the empty string is represented by four zero bytes (the
\s{uint32} encoding of 0); the string \cq{AB} is represented by the six
bytes 0, 0, 0, 2, \cq{A}, \cq{B}.
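The encodings above can be sketched in C as follows. This is only an illustrative sketch: the function names \cw{put_uint32}, \cw{get_uint32} and \cw{put_string} are hypothetical and not part of any PuTTY API, and the caller is assumed to supply an output buffer that is large enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write a uint32 in big-endian order.
 * Returns the number of bytes written (always 4). */
size_t put_uint32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
    return 4;
}

/* Hypothetical helper: read a big-endian uint32 from 4 bytes. */
uint32_t get_uint32(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}

/* Hypothetical helper: write a string, i.e. a uint32 length followed
 * by the bytes themselves. Assumes 'out' has room for 4 + len bytes.
 * Returns the number of bytes written. */
size_t put_string(unsigned char *out, const void *data, size_t len)
{
    assert(len <= 0xFFFFFFFFu);   /* must fit in the length field */
    size_t n = put_uint32(out, (uint32_t)len);
    if (len > 0)
        memcpy(out + n, data, len);
    return n + len;
}
```

Encoding the two-byte string from the example with \cw{put_string} yields the six bytes listed above.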

Unlike SSH itself, the protocol spoken between the client and the
plugin is unencrypted, because local inter-process pipes are assumed
to be secured by the OS kernel. So the binary packet protocol is much
simpler than SSH proper, and is similar to SFTP and the OpenSSH agent
protocol.

The data sent in each direction of the conversation consists of a
sequence of \s{messages} exchanged between the SSH client and the
plugin. Each message is encoded as a \s{string}. The contents of the
string begin with a \s{byte} giving the message type, which determines
the format of the rest of the message.
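As a rough sketch of this framing, the C function below splits one message off the front of a buffer of received bytes. The names \cw{plugin_msg} and \cw{split_message} are hypothetical, and the choice to treat a zero-length message as malformed (because it has no room for the type byte) is this sketch's interpretation rather than something the specification states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded message; 'payload' points into the caller's buffer. */
typedef struct {
    unsigned char type;            /* message type code */
    const unsigned char *payload;  /* bytes after the type code */
    size_t payload_len;
} plugin_msg;

/*
 * Hypothetical helper. Returns 1 and fills in *msg and *consumed if
 * 'buf' starts with a complete message; 0 if more bytes are needed;
 * -1 if the length field is too small to hold a type byte.
 */
int split_message(const unsigned char *buf, size_t avail,
                  plugin_msg *msg, size_t *consumed)
{
    assert(buf != NULL && msg != NULL && consumed != NULL);
    if (avail < 4)
        return 0;                  /* length field incomplete */
    uint32_t len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                   ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (len == 0)
        return -1;                 /* no room for the type byte */
    if (avail - 4 < len)
        return 0;                  /* body incomplete */
    msg->type = buf[4];
    msg->payload = buf + 5;
    msg->payload_len = (size_t)len - 1;
    *consumed = 4 + (size_t)len;
    return 1;
}
```

Because the protocol is half-duplex, a plugin could equally well read exactly four bytes from its standard input, decode the length, and then read exactly that many more bytes; the buffer-based form is used here only to keep the sketch self-contained.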

\H{authplugin-version} Protocol versioning

This protocol itself is versioned. At connection setup, the client
states the highest version number it knows how to speak, and then the
plugin responds by choosing the version number that will actually be
spoken (which must not be higher than the client's value).

Including a version number makes it possible to make breaking changes
to the protocol later.

Even version numbers represent released versions of this spec. Odd
numbers represent drafts or development versions in between releases.
A client and plugin negotiating an odd version number are not
guaranteed to interoperate; the developer testing the combination is
responsible for ensuring the two are compatible.

This document describes version 2 of the protocol, the first released
version. (The initial drafts had version 1.)
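One possible plugin-side policy for this negotiation is sketched below, for a plugin that implements only version 2. The name \cw{choose_version} and the use of 0 as a \q{no usable version} result are assumptions of this sketch, as is the idea that a client announcing a higher maximum can also speak version 2.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: this hypothetical plugin implements version 2 only. */
#define PLUGIN_IMPLEMENTED_VERSION 2u

/*
 * Hypothetical helper: given the maximum version announced by the
 * client, return the version the plugin should reply with, or 0 if
 * the client is too old (the plugin would then be expected to
 * report a startup failure instead).
 */
uint32_t choose_version(uint32_t client_max)
{
    if (client_max >= PLUGIN_IMPLEMENTED_VERSION)
        return PLUGIN_IMPLEMENTED_VERSION;
    return 0;
}

/* Even numbers are released versions; odd numbers are drafts. */
int is_released_version(uint32_t v)
{
    return v != 0 && v % 2 == 0;
}
```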

\H{authplugin-overview} Overview and sequence of events

At the very beginning of the user authentication phase of SSH, the
client launches the plugin subprocess, if one is configured. It
immediately sends the \cw{PLUGIN_INIT} message, telling the plugin
some initial information about the destination of the SSH connection.

The plugin responds with \cw{PLUGIN_INIT_RESPONSE}, which may
optionally tell the SSH client what username to use.

The client begins trying to authenticate with the SSH server in the
usual way, using the username provided by the plugin (if any) or
alternatively one obtained via its normal (non-plugin) policy.

The client follows its normal policy for selecting authentication
methods to attempt. If it chooses a method that this protocol does not
cover, then the client will perform that method in its own way without
consulting the plugin.

However, if the client and server decide to attempt a method that this
protocol \e{does} cover, then the client sends \cw{PLUGIN_PROTOCOL}
specifying the SSH protocol id for the authentication method being
used. The plugin responds with \cw{PLUGIN_PROTOCOL_ACCEPT} if it's
willing to assist with this auth method, or
\cw{PLUGIN_PROTOCOL_REJECT} if it isn't.

If the plugin sends \cw{PLUGIN_PROTOCOL_REJECT}, then the client will
proceed as if the plugin were not present. Later, if another auth
method is negotiated (either because this one failed, or because it
succeeded but the server wants multiple auth methods), the client may
send a further \cw{PLUGIN_PROTOCOL} and try again.

If the plugin sends \cw{PLUGIN_PROTOCOL_ACCEPT}, then a protocol
segment begins that is specific to that auth method, terminating in
either \cw{PLUGIN_AUTH_SUCCESS} or \cw{PLUGIN_AUTH_FAILURE}. After
that, again, the client may send a further \cw{PLUGIN_PROTOCOL}.

Currently the only supported method is \cq{keyboard-interactive},
defined in \k{authplugin-ref-ki}. Once the client has announced this
method to the server, the follow-up protocol is as follows:

Each time the server sends an \cw{SSH_MSG_USERAUTH_INFO_REQUEST}
message requesting authentication responses from the user, the SSH
client translates the message into \cw{PLUGIN_KI_SERVER_REQUEST} and
passes it on to the plugin.

At this point, the plugin may optionally send back
\cw{PLUGIN_KI_USER_REQUEST} containing prompts to be presented to the
actual user. The client will reply with a matching
\cw{PLUGIN_KI_USER_RESPONSE} after asking the user to reply to the
question(s) in the request message. The plugin can repeat this cycle
multiple times.

Once the plugin has all the information it needs to respond to the
server's authentication prompts, it sends \cw{PLUGIN_KI_SERVER_RESPONSE}
back to the client, which translates it into
\cw{SSH_MSG_USERAUTH_INFO_RESPONSE} to send on to the server.

After that, as described in \k{authplugin-ref-ki}, the server is free
to accept authentication, reject it, or send another
\cw{SSH_MSG_USERAUTH_INFO_REQUEST}. Each
\cw{SSH_MSG_USERAUTH_INFO_REQUEST} is dealt with in the same way as
above.

If the server terminates keyboard-interactive authentication with
\cw{SSH_MSG_USERAUTH_SUCCESS} or \cw{SSH_MSG_USERAUTH_FAILURE}, the
client informs the plugin by sending either \cw{PLUGIN_AUTH_SUCCESS}
or \cw{PLUGIN_AUTH_FAILURE}. \cw{PLUGIN_AUTH_SUCCESS} is sent when
\e{that particular authentication method} was successful, regardless
of whether the SSH server chooses to request further authentication
afterwards: in particular, \cw{SSH_MSG_USERAUTH_FAILURE} with the
\q{partial success} flag (see \k{authplugin-ref-userauth} section 5.1) translates
into \cw{PLUGIN_AUTH_SUCCESS}.
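The translation rule just described can be sketched from the client's point of view as follows. The function name \cw{plugin_outcome} is hypothetical; the SSH message numbers 51 and 52 are those assigned in \k{authplugin-ref-userauth}, and the plugin codes are the ones listed in \k{authplugin-messages}.

```c
#include <assert.h>

#define SSH_MSG_USERAUTH_FAILURE 51   /* from RFC 4252 */
#define SSH_MSG_USERAUTH_SUCCESS 52   /* from RFC 4252 */
#define PLUGIN_AUTH_SUCCESS       6
#define PLUGIN_AUTH_FAILURE       7

/*
 * Hypothetical helper: map the server's verdict on one auth method
 * to the message the client sends to the plugin. 'partial_success'
 * is the boolean flag carried in SSH_MSG_USERAUTH_FAILURE and is
 * ignored for other messages. Returns -1 for any other message type.
 */
int plugin_outcome(int ssh_msg_type, int partial_success)
{
    if (ssh_msg_type == SSH_MSG_USERAUTH_SUCCESS)
        return PLUGIN_AUTH_SUCCESS;
    if (ssh_msg_type == SSH_MSG_USERAUTH_FAILURE)
        return partial_success ? PLUGIN_AUTH_SUCCESS
                               : PLUGIN_AUTH_FAILURE;
    return -1;
}
```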

The plugin's standard input will close when the client no longer
requires the plugin's services, for any reason. This could be because
authentication is complete (with overall success or overall failure),
or because the user has manually aborted the session in
mid-authentication, or because the client crashed.
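Because exactly one side may speak at any moment, a plugin can check its own behaviour with a small table of permitted replies. The sketch below encodes the sequence described in this section, using the type codes from \k{authplugin-messages}; the function name \cw{plugin_reply_allowed} is hypothetical.

```c
#include <assert.h>

#define PLUGIN_INIT                   1
#define PLUGIN_INIT_RESPONSE          2
#define PLUGIN_PROTOCOL               3
#define PLUGIN_PROTOCOL_ACCEPT        4
#define PLUGIN_PROTOCOL_REJECT        5
#define PLUGIN_AUTH_SUCCESS           6
#define PLUGIN_AUTH_FAILURE           7
#define PLUGIN_INIT_FAILURE           8
#define PLUGIN_KI_SERVER_REQUEST     20
#define PLUGIN_KI_SERVER_RESPONSE    21
#define PLUGIN_KI_USER_REQUEST       22
#define PLUGIN_KI_USER_RESPONSE      23

/*
 * Hypothetical helper: returns 1 if the plugin may send 'reply'
 * immediately after receiving 'received' from the client, else 0.
 */
int plugin_reply_allowed(int received, int reply)
{
    switch (received) {
      case PLUGIN_INIT:
        return reply == PLUGIN_INIT_RESPONSE ||
               reply == PLUGIN_INIT_FAILURE;
      case PLUGIN_PROTOCOL:
        return reply == PLUGIN_PROTOCOL_ACCEPT ||
               reply == PLUGIN_PROTOCOL_REJECT;
      case PLUGIN_KI_SERVER_REQUEST:
      case PLUGIN_KI_USER_RESPONSE:
        return reply == PLUGIN_KI_USER_REQUEST ||
               reply == PLUGIN_KI_SERVER_RESPONSE;
      default:
        /* After PLUGIN_AUTH_SUCCESS or PLUGIN_AUTH_FAILURE the
         * client speaks next, so the plugin sends nothing. */
        return 0;
    }
}
```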

\H{authplugin-messages} Message formats

This section describes the format of every message in the protocol.

As described in \k{authplugin-formats}, every message starts with the same two
fields:

\b \s{uint32}: overall length of the message

\b \s{byte}: message type.

The length field does not include itself, but does include the type
code.

The following subsections each give the format of the remainder of the
message, after the type code.

The type codes themselves are defined here:

\c #define PLUGIN_INIT                   1
\c #define PLUGIN_INIT_RESPONSE          2
\c #define PLUGIN_PROTOCOL               3
\c #define PLUGIN_PROTOCOL_ACCEPT        4
\c #define PLUGIN_PROTOCOL_REJECT        5
\c #define PLUGIN_AUTH_SUCCESS           6
\c #define PLUGIN_AUTH_FAILURE           7
\c #define PLUGIN_INIT_FAILURE           8
\c
\c #define PLUGIN_KI_SERVER_REQUEST     20
\c #define PLUGIN_KI_SERVER_RESPONSE    21
\c #define PLUGIN_KI_USER_REQUEST       22
\c #define PLUGIN_KI_USER_RESPONSE      23

If this protocol is extended to be able to assist with further auth
methods, their message type codes will also begin from 20, overlapping
the codes for keyboard-interactive.

\S{PLUGIN_INIT} \cw{PLUGIN_INIT}

\s{Direction}: client to plugin

\s{When}: the first message sent at connection startup

\s{What happens next}: the plugin will send \cw{PLUGIN_INIT_RESPONSE}
or \cw{PLUGIN_INIT_FAILURE}

\s{Message contents after the type code}:

\b \s{uint32}: the highest version number of this protocol that the
client knows how to speak.

\b \s{string}: the hostname of the server. This will be the \e{logical}
hostname, in cases where it differs from the physical destination of
the network connection. Whatever name would be used by the SSH client
to cache the server's host key, that's the same name passed in this
message.

\b \s{uint32}: the port number on the server. (Together with the host
name, this forms a primary key identifying a particular server. Port
numbers may be vital because a single host can run two unrelated SSH
servers with completely different authentication requirements, e.g.
system sshd on port 22 and Gerrit on port 29418.)

\b \s{string}: the username that the client will use to log in, if the
plugin chooses not to override it. An empty string means that the
client has no opinion about this (and might, for example, prompt the
user).
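A plugin might decode these fields along the following lines. This is a sketch only: the \cw{reader} cursor, the \cw{plugin_init} structure and all function names are hypothetical, strings are returned as pointers into the payload without a terminating NUL, and tolerating trailing bytes after the last field is an assumption rather than a stated rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a message payload. */
typedef struct {
    const unsigned char *p;  /* next unread byte */
    size_t left;             /* bytes remaining */
    int ok;                  /* cleared on the first short read */
} reader;

uint32_t rd_uint32(reader *r)
{
    if (!r->ok || r->left < 4) { r->ok = 0; return 0; }
    uint32_t v = ((uint32_t)r->p[0] << 24) | ((uint32_t)r->p[1] << 16) |
                 ((uint32_t)r->p[2] << 8) | (uint32_t)r->p[3];
    r->p += 4;
    r->left -= 4;
    return v;
}

/* Returns a pointer into the payload; the length goes in *len. */
const unsigned char *rd_string(reader *r, size_t *len)
{
    uint32_t n = rd_uint32(r);
    if (!r->ok || r->left < n) { r->ok = 0; *len = 0; return NULL; }
    const unsigned char *s = r->p;
    r->p += n;
    r->left -= n;
    *len = n;
    return s;
}

typedef struct {
    uint32_t max_version;
    const unsigned char *host; size_t host_len;
    uint32_t port;
    const unsigned char *user; size_t user_len;  /* may be empty */
} plugin_init;

/* Parse the bytes after the type code. Returns 1 on success and 0
 * if the payload is truncated. Trailing bytes are ignored, which is
 * an assumption of this sketch. */
int parse_plugin_init(const unsigned char *payload, size_t len,
                      plugin_init *out)
{
    assert(payload != NULL && out != NULL);
    reader r = { payload, len, 1 };
    out->max_version = rd_uint32(&r);
    out->host = rd_string(&r, &out->host_len);
    out->port = rd_uint32(&r);
    out->user = rd_string(&r, &out->user_len);
    return r.ok;
}
```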

\S{PLUGIN_INIT_RESPONSE} \cw{PLUGIN_INIT_RESPONSE}

\s{Direction}: plugin to client

\s{When}: response to \cw{PLUGIN_INIT}

\s{What happens next}: the client will send \cw{PLUGIN_PROTOCOL}, or
perhaps terminate the session (if no auth method is ever negotiated
that the plugin can help with)

\s{Message contents after the type code}:

\b \s{uint32}: the version number of this protocol that the connection
will use. Must be no greater than the max version number sent by the
client in \cw{PLUGIN_INIT}.

\b \s{string}: the username that the plugin suggests the client use. An
empty string means that the plugin has no opinion and the client
should stick with the username it already had (or prompt the user, if
it had none).
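For illustration, a complete \cw{PLUGIN_INIT_RESPONSE}, including the outer length field, might be assembled like this. The function name \cw{build_init_response} is hypothetical, and passing the username as a NUL-terminated C string is a convenience of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PLUGIN_INIT_RESPONSE 2

static void put_uint32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/*
 * Hypothetical helper: write the whole message into 'out'.
 * 'username' may be "" to express no opinion. Returns the total
 * number of bytes written, or 0 if 'cap' is too small.
 */
size_t build_init_response(unsigned char *out, size_t cap,
                           uint32_t version, const char *username)
{
    assert(out != NULL && username != NULL);
    size_t ulen = strlen(username);
    /* type byte + version + string length field + string bytes */
    size_t body = 1 + 4 + 4 + ulen;
    if (cap < 4 + body)
        return 0;
    put_uint32(out, (uint32_t)body);      /* overall length */
    out[4] = PLUGIN_INIT_RESPONSE;        /* message type */
    put_uint32(out + 5, version);         /* chosen version */
    put_uint32(out + 9, (uint32_t)ulen);  /* username as a string */
    memcpy(out + 13, username, ulen);
    return 4 + body;
}
```

The returned byte sequence would then be written to the plugin's standard output in a single sequential write.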

\S{PLUGIN_INIT_FAILURE} \cw{PLUGIN_INIT_FAILURE}

\s{Direction}: plugin to client

\s{When}: response to \cw{PLUGIN_INIT}

\s{What happens next}: the session is over

\s{Message contents after the type code}:

\b \s{string}: an error message to present to the user indicating why
the plugin was unable to start up.

\S{PLUGIN_PROTOCOL} \cw{PLUGIN_PROTOCOL}

\s{Direction}: client to plugin

\s{When}: sent after \cw{PLUGIN_INIT_RESPONSE}, or after a previous
auth phase terminates with \cw{PLUGIN_AUTH_SUCCESS} or
\cw{PLUGIN_AUTH_FAILURE}

\s{What happens next}: the plugin will send
\cw{PLUGIN_PROTOCOL_ACCEPT} or \cw{PLUGIN_PROTOCOL_REJECT}

\s{Message contents after the type code}:

\b \s{string}: the SSH protocol id of the auth method the client
intends to attempt. Currently the only method specified for use in
this protocol is \cq{keyboard-interactive}.

\S{PLUGIN_PROTOCOL_REJECT} \cw{PLUGIN_PROTOCOL_REJECT}

\s{Direction}: plugin to client

\s{When}: sent after \cw{PLUGIN_PROTOCOL}

\s{What happens next}: the client will either send another
\cw{PLUGIN_PROTOCOL} or terminate the session

\s{Message contents after the type code}:

\b \s{string}: an error message to present to the user, explaining why
the plugin cannot help with this authentication protocol.

\lcont{

An example might be \q{unable to open <config file>: <OS error
message>}, if the plugin depends on some configuration that the user
has not set up.

If the plugin does not support this particular authentication
protocol at all, this string should be left blank, so that no message
will be presented to the user.

}

\S{PLUGIN_PROTOCOL_ACCEPT} \cw{PLUGIN_PROTOCOL_ACCEPT}

\s{Direction}: plugin to client

\s{When}: sent after \cw{PLUGIN_PROTOCOL}

\s{What happens next}: depends on the auth protocol agreed on. For
keyboard-interactive, the client will send
\cw{PLUGIN_KI_SERVER_REQUEST} or \cw{PLUGIN_AUTH_SUCCESS} or
\cw{PLUGIN_AUTH_FAILURE}. No other method is specified.

\s{Message contents after the type code}: none.

\S{PLUGIN_KI_SERVER_REQUEST} \cw{PLUGIN_KI_SERVER_REQUEST}

\s{Direction}: client to plugin

\s{When}: sent after \cw{PLUGIN_PROTOCOL_ACCEPT}, or after a previous
\cw{PLUGIN_KI_SERVER_RESPONSE}, when the SSH server has sent
\cw{SSH_MSG_USERAUTH_INFO_REQUEST}

\s{What happens next}: the plugin will send either
\cw{PLUGIN_KI_USER_REQUEST} or \cw{PLUGIN_KI_SERVER_RESPONSE}

\s{Message contents after the type code}: the exact contents of the
\cw{SSH_MSG_USERAUTH_INFO_REQUEST} just sent by the server. See
\k{authplugin-ref-ki} section 3.2 for details. The summary:

\b \s{string}: name of this prompt collection (e.g. to use as a
dialog-box title)

\b \s{string}: instructions to be displayed before this prompt
collection

\b \s{string}: language tag (deprecated)

\b \s{uint32}: number of prompts in this collection

\b That many copies of:

\lcont{

\b \s{string}: prompt (in UTF-8)

\b \s{boolean}: whether the response to this prompt is safe to echo to
the screen

}
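The layout summarised above might be parsed as in the following sketch. All names here (\cw{reader}, \cw{ki_prompt}, \cw{parse_ki_request}) are hypothetical. For brevity the sketch skips over the name, instruction and language fields, which a real plugin would probably want to keep, and it treats any nonzero boolean byte as true, as \k{authplugin-ref-arch} requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a message payload. */
typedef struct {
    const unsigned char *p;  /* next unread byte */
    size_t left;             /* bytes remaining */
    int ok;                  /* cleared on the first short read */
} reader;

uint32_t rd_uint32(reader *r)
{
    if (!r->ok || r->left < 4) { r->ok = 0; return 0; }
    uint32_t v = ((uint32_t)r->p[0] << 24) | ((uint32_t)r->p[1] << 16) |
                 ((uint32_t)r->p[2] << 8) | (uint32_t)r->p[3];
    r->p += 4;
    r->left -= 4;
    return v;
}

const unsigned char *rd_string(reader *r, size_t *len)
{
    uint32_t n = rd_uint32(r);
    if (!r->ok || r->left < n) { r->ok = 0; *len = 0; return NULL; }
    const unsigned char *s = r->p;
    r->p += n;
    r->left -= n;
    *len = n;
    return s;
}

typedef struct {
    const unsigned char *text;  /* UTF-8, not NUL-terminated */
    size_t len;
    int echo;                   /* nonzero if safe to echo */
} ki_prompt;

/*
 * Parse the bytes after the type code, storing at most 'max_prompts'
 * prompts. Returns the number of prompts, or -1 if the payload is
 * truncated or contains more prompts than the caller allowed for.
 */
long parse_ki_request(const unsigned char *payload, size_t len,
                      ki_prompt *prompts, size_t max_prompts)
{
    assert(payload != NULL);
    reader r = { payload, len, 1 };
    size_t skip;
    rd_string(&r, &skip);            /* name (discarded here) */
    rd_string(&r, &skip);            /* instructions (discarded) */
    rd_string(&r, &skip);            /* language tag (deprecated) */
    uint32_t n = rd_uint32(&r);
    if (!r.ok || n > max_prompts)
        return -1;
    for (uint32_t i = 0; i < n; i++) {
        prompts[i].text = rd_string(&r, &prompts[i].len);
        if (!r.ok || r.left < 1)
            return -1;
        prompts[i].echo = (r.p[0] != 0);   /* boolean byte */
        r.p++;
        r.left--;
    }
    return (long)n;
}
```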

\S{PLUGIN_KI_SERVER_RESPONSE} \cw{PLUGIN_KI_SERVER_RESPONSE}

\s{Direction}: plugin to client

\s{When}: response to \cw{PLUGIN_KI_SERVER_REQUEST}, perhaps after one
or more intervening pairs of \cw{PLUGIN_KI_USER_REQUEST} and
\cw{PLUGIN_KI_USER_RESPONSE}

\s{What happens next}: the client will send a further
\cw{PLUGIN_KI_SERVER_REQUEST}, or \cw{PLUGIN_AUTH_SUCCESS} or
\cw{PLUGIN_AUTH_FAILURE}

\s{Message contents after the type code}: the exact contents of the
\cw{SSH_MSG_USERAUTH_INFO_RESPONSE} that the client should send back
to the server. See \k{authplugin-ref-ki} section 3.4 for details. The
summary:

\b \s{uint32}: number of responses (must match the \q{number of
prompts} field from the corresponding server request)

\b That many copies of:

\lcont{

\b \s{string}: response to the \e{n}th prompt (in UTF-8)

}
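A matching sketch for the response direction is shown below. The function name \cw{build_ki_response} is hypothetical, and taking the responses as NUL-terminated C strings is a simplification of this sketch (it cannot represent a response containing a zero byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PLUGIN_KI_SERVER_RESPONSE 21

static void put_uint32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/*
 * Hypothetical helper: write a complete PLUGIN_KI_SERVER_RESPONSE,
 * including the outer length field, carrying 'n' responses. The
 * caller must make 'n' equal to the number of prompts in the
 * corresponding request. Returns the total number of bytes written,
 * or 0 if 'cap' is too small.
 */
size_t build_ki_response(unsigned char *out, size_t cap,
                         const char *const *responses, uint32_t n)
{
    size_t body = 1 + 4;                 /* type byte + count */
    for (uint32_t i = 0; i < n; i++)
        body += 4 + strlen(responses[i]);
    if (cap < 4 + body)
        return 0;

    unsigned char *p = out;
    put_uint32(p, (uint32_t)body); p += 4;   /* overall length */
    *p++ = PLUGIN_KI_SERVER_RESPONSE;        /* message type */
    put_uint32(p, n); p += 4;                /* number of responses */
    for (uint32_t i = 0; i < n; i++) {
        size_t len = strlen(responses[i]);
        put_uint32(p, (uint32_t)len); p += 4;
        memcpy(p, responses[i], len); p += len;
    }
    assert((size_t)(p - out) == 4 + body);
    return 4 + body;
}
```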

\S{PLUGIN_KI_USER_REQUEST} \cw{PLUGIN_KI_USER_REQUEST}

\s{Direction}: plugin to client

\s{When}: response to \cw{PLUGIN_KI_SERVER_REQUEST}, if the plugin
cannot answer the server's auth prompts without presenting prompts of
its own to the user

\s{What happens next}: the client will send \cw{PLUGIN_KI_USER_RESPONSE}

\s{Message contents after the type code}: exactly the same as in
\cw{PLUGIN_KI_SERVER_REQUEST} (see \k{PLUGIN_KI_SERVER_REQUEST}).

\S{PLUGIN_KI_USER_RESPONSE} \cw{PLUGIN_KI_USER_RESPONSE}

\s{Direction}: client to plugin

\s{When}: response to \cw{PLUGIN_KI_USER_REQUEST}

\s{What happens next}: the plugin will send
\cw{PLUGIN_KI_SERVER_RESPONSE}, or another \cw{PLUGIN_KI_USER_REQUEST}

\s{Message contents after the type code}: exactly the same as in
\cw{PLUGIN_KI_SERVER_RESPONSE} (see \k{PLUGIN_KI_SERVER_RESPONSE}).

\S{PLUGIN_AUTH_SUCCESS} \cw{PLUGIN_AUTH_SUCCESS}

\s{Direction}: client to plugin

\s{When}: sent after \cw{PLUGIN_KI_SERVER_RESPONSE}, or (in unusual
cases) after \cw{PLUGIN_PROTOCOL_ACCEPT}

\s{What happens next}: the client will either send another
\cw{PLUGIN_PROTOCOL} or terminate the session

\s{Message contents after the type code}: none

\S{PLUGIN_AUTH_FAILURE} \cw{PLUGIN_AUTH_FAILURE}

\s{Direction}: client to plugin

\s{When}: sent after \cw{PLUGIN_KI_SERVER_RESPONSE}, or (in unusual
cases) after \cw{PLUGIN_PROTOCOL_ACCEPT}

\s{What happens next}: the client will either send another
\cw{PLUGIN_PROTOCOL} or terminate the session

\s{Message contents after the type code}: none

\H{authplugin-refs} References

\B{authplugin-ref-arch}
\W{https://www.rfc-editor.org/rfc/rfc4251}{RFC 4251},
\q{The Secure Shell (SSH) Protocol Architecture}.

\B{authplugin-ref-userauth}
\W{https://www.rfc-editor.org/rfc/rfc4252}{RFC 4252},
\q{The Secure Shell (SSH) Authentication Protocol}.

\B{authplugin-ref-ki}
\W{https://www.rfc-editor.org/rfc/rfc4256}{RFC 4256},
\q{Generic Message Exchange Authentication for the Secure Shell
Protocol (SSH)} (better known by its wire id
\q{keyboard-interactive}).

\BR{authplugin-ref-arch} [RFC4251]

\BR{authplugin-ref-userauth} [RFC4252]

\BR{authplugin-ref-ki} [RFC4256]