When talking to a web proxy which requires a password, our HTTP proxy
code sends an initial attempt to connect without authentication,
receives the 407 status indicating that authentication was required
(and which kind), and then closes and reopens the connection (if given
"Connection: close"). Then, on the next attempt, we try again with
authentication, and expect the first thing in the input bufchain to be
the HTTP status response to the revised request.
But what happened about the error document that followed those HTTP
headers? It - or at least some of it - would already have been in the
input bufchain.
With an HTTP/1.1 proxy, we already read it and discarded it (either
via a Content-length header or chunked transfer encoding), before we
set the 'reconnect' flag. So, assuming the proxy HTTP server is
behaving sensibly, our input bufchain ought to be empty at the point
when we start the fresh connection.
But if the proxy only speaks HTTP/1.0 (which does still happen -
'tinyproxy' is a still-current example), then we didn't get a
Content-length header either, so we didn't run any of the code that
skips over the error document. (And HTTP/1.0 implicitly has
"Connection: close" semantics, which is why that doesn't matter.) As a
result, some of it would still be in the input bufchain, and never got
cleared out, and we'd try to parse _that_ as if it was the HTTP
response from the second network connection.
The simple solution is that when we close and reopen the proxy network
connection, we also clear the input bufchain, so that the fresh
connection starts from a clean slate.
This is a simple tweak to the existing in-process SSH jump host
support, where instead of opening a direct-tcpip channel to the
destination host, we open a session channel and run a process in it to
make the connection to the destination.
So, where the existing jump host support replaced a local proxy
command along the lines of "plink %proxyhost -nc %host %port", this
one replaces "plink %proxyhost run-some-command".
Also added a corresponding option to use a subsystem to make the
connection. (Someone could configure an SSH server to support specific
subsystem names for particular destinations, or a general schema of
subsystem names that include the destination address in some standard
format.)
To avoid overflowing the already-full Proxy config panel with an extra
subtype selector, I've put these in as additional top-level proxy
types, so that instead of just PROXY_SSH we now have three
PROXY_SSH_foo.
Another awkward thing that FreeProxy does is to slam the connection
shut after sending its 407 response, at least in Basic auth mode. (It
keeps the connection alive in digest mode, which makes sense to me,
because that's a more stateful system.)
It was surprisingly easy to make the proxy code able to tolerate this!
I've set it up so that a ProxyNegotiator can just set its 'reconnect'
flag on return from the negotiation coroutine, and the effect will be
that proxy.c makes a new connection to the same proxy server before
doing anything else. In particular, you can set that flag _and_ put
data in the output bufchain, and there's no problem - the output data
will be queued directly into the new socket.
All the seat functions that request an interactive prompt of some kind
to the user - both the main seat_get_userpass_input and the various
confirmation dialogs for things like host keys - were using a simple
int return value, with the general semantics of 0 = "fail", 1 =
"proceed" (and in the case of seat_get_userpass_input, answers to the
prompts were provided), and -1 = "request in progress, wait for a
callback".
In this commit I change all those functions' return types to a new
struct called SeatPromptResult, whose primary field is an enum
replacing those simple integer values.
The main purpose is that the enum has not three but _four_ values: the
"fail" result has been split into 'user abort' and 'software abort'.
The distinction is that a user abort occurs as a result of an
interactive UI action, such as the user clicking 'cancel' in a dialog
box or hitting ^D or ^C at a terminal password prompt - and therefore,
there's no need to display an error message telling the user that the
interactive operation has failed, because the user already knows,
because they _did_ it. 'Software abort' is from any other cause, where
PuTTY is the first to know there was a problem, and has to tell the
user.
We already had this 'user abort' vs 'software abort' distinction in
other parts of the code - the SSH backend has separate termination
functions which protocol layers can call. But we assumed that any
failure from an interactive prompt request fell into the 'user abort'
category, which is not true. A couple of examples: if you configure a
host key fingerprint in your saved session via the SSH > Host keys
pane, and the server presents a host key that doesn't match it, then
verify_ssh_host_key would report that the user had aborted the
connection, and feel no need to tell the user what had gone wrong!
Similarly, if a password provided on the command line was not
accepted, then (after I fixed the semantics of that in the previous
commit) the same wrong handling would occur.
So now, those Seat prompt functions too can communicate whether the
user or the software originated a connection abort. And in the latter
case, we also provide an error message to present to the user. Result:
in those two example cases (and others), error messages should no
longer go missing.
Implementation note: to avoid the hassle of having the error message
in a SeatPromptResult being a dynamically allocated string (and hence,
every recipient of one must always check whether it's non-NULL and
free it on every exit path, plus being careful about copying the
struct around), I've instead arranged that the structure contains a
function pointer and a couple of parameters, so that the string form
of the message can be constructed on demand. That way, the only users
who need to free it are the ones who actually _asked_ for it in the
first place, which is a much smaller set.
(This is one of the rare occasions that I regret not having C++'s
extra features available in this code base - a unique_ptr or
shared_ptr to a string would have been just the thing here, and the
compiler would have done all the hard work for me of remembering where
to insert the frees!)
This will mean that platform-specific proxy types will also be able to
set themselves up as child Interactors and prompt the user
interactively for passwords and the like.
NFC: nothing uses the new parameter yet.
This fixes the Telnet proxy, which was the only one of the proxy types
I forgot to test when I pushed the previous patch series, and
therefore, naturally, the one I left a bug in: if a ProxyNegotiator
returns both some output to be transmitted _and_ the 'done' flag, we
were forgetting to do anything with the former. So the proxy command
was being carefully constructed by TelnetProxyNegotiator, and then
promptly dropped on the floor by the owning ProxySocket.
This lays all the groundwork for ProxyNegotiators to be able to issue
username and password prompts: ProxySocket now implements the
Interactor trait, it will borrow and return a Seat if one is
available, and it will present an Interactor of its own to the
ProxyNegotiator which can use it (via interactor_announce as usual) to
get a Seat to send prompts to. Also, proxy.c provides a centralised
system for making a prompts_t with an appropriate callback in it, and
dealing with the results of that callback.
No actual ProxyNegotiator implementation uses it yet, though.
Previously, the proxy negotiation functions were written as explicit
state machines, with ps->state being manually set to a sequence of
positive integer values which would be tested by if statements in the
next call to the same negotiation function.
That's not how this code base likes to do things! We have a coroutine
system to allow those state machines to be implicit rather than
explicit, so that we can use ordinary control flow statements like
while loops. Reorganised each proxy negotiation function into a
coroutine-based system like that.
While I'm at it, I've also moved each proxy negotiator out into its
own source file, to make proxy.c less overcrowded and monolithic. And
_that_ gave me the opportunity to define each negotiator as an
implementation of a trait rather than as a single function - which
means now each one can define its own local variables and have its own
cleanup function, instead of all of them having to share the variables
inside the main ProxySocket struct.
In the new coroutine system, negotiators don't have to worry about the
mechanics of actually sending data down the underlying Socket any
more. The negotiator coroutine just appends to a bufchain (via a
provided bufchain_sink), and after every call to the coroutine,
central code in proxy.c transfers the data to the Socket itself. This
avoids a lot of intermediate allocations within the negotiators, which
previously kept having to make temporary strbufs or arrays in order to
have something to point an sk_write() at; now they can just put
formatted data directly into the output bufchain via the marshal.h
interface.
In this version of the code, I've also moved most of the SOCKS5 CHAP
implementation from cproxy.c into socks5.c, so that it can sit in the
same coroutine as the rest of the proxy negotiation control flow.
That's because calling a sub-coroutine (co-subroutine?) is awkward to
set up (though it is _possible_ - we do SSH-2 kex that way), and
there's no real need to bother in this case, since the only thing that
really needs to go in cproxy.c is the actual cryptography plus a flag
to tell socks5.c whether to offer CHAP authentication in the first
place.
These were just boilerplate in all the proxy negotiation functions:
every negotiator had to contain a handler for each of these events,
and they all handled them in exactly the same way. Remove them and
centralise the handling in the shared code.
A long time ago, some of these event codes were added with purpose in
mind. PROXY_CHANGE_CLOSING was there to anticipate the possibility
that you might need to make multiple TCP connections to the proxy
server (e.g. retrying with different authentication) before
successfully getting a connection you could use to talk to the
ultimate destination. And PROXY_CHANGE_ACCEPTING was there so that we
could use the listening side of SOCKS (where you ask the proxy to open
a listening socket on your behalf). But neither of them has ever been
used, and now that the code has evolved, I think probably if we do
ever need to do either of those things then they'll want to be done
differently.
This seemed like a worthwhile cleanup to do while I was working on
this code anyway. Now all the magic numbers are defined in a header
file by macro names indicating their meaning, and used by both the
SOCKS client code in the proxy subdirectory and the SOCKS server code
in portfwd.c.
It's variously 'ps' and 'p' in functions that already receive one, and
it's 'ret' in the main function that initially constructs one. Let's
call it 'ps' consistently, so that the code idioms are the same
everywhere.
marshal.h now provides a macro put_fmt() which allows you to write
arbitrary printf-formatted data to an arbitrary BinarySink.
We already had this facility for strbufs in particular, in the form of
strbuf_catf(). That was able to take advantage of knowing the inner
structure of a strbuf to minimise memory allocation (it would snprintf
directly into the strbuf's existing buffer if possible). For a general
black-box BinarySink we can't do that, so instead we dupvprintf into a
temporary buffer.
For consistency, I've removed strbuf_catf, and converted all uses of
it into the new put_fmt - and I've also added an extra vtable method
in the BinarySink API, so that put_fmt can still use strbuf_catf's
more efficient memory management when talking to a strbuf, and fall
back to the simpler strategy when that's not available.
Passing an operating-system-specific error code to plug_closing(),
such as errno or GetLastError(), was always a bit weird, given that it
generally had to be handled by cross-platform receiving code in
backends. I had the platform.h implementations #define any error
values that the cross-platform code would have to handle specially,
but that's still not a great system, because it also doesn't leave
freedom to invent error representations of my own that don't
correspond to any OS code. (For example, the ones I just removed from
proxy.h.)
So now, the OS error code is gone from the plug_closing API, and in
its place is a custom enumeration of closure types: normal, error, and
the special case BROKEN_PIPE which is the only OS error code we have
so far needed to handle specially. (All others just mean 'abandon the
connection and print the textual message'.)
Having already centralised the handling of OS error codes in the
previous commit, we've now got a convenient place to add any further
type codes for errors needing special handling: each of Unix
plug_closing_errno(), Windows plug_closing_system_error(), and Windows
plug_closing_winsock_error() can easily grow extra special cases if
need be, and each one will only have to live in one place.
Having a single plug_closing() function covering various kinds of
closure is reasonably convenient from the point of view of Plug
implementations, but it's annoying for callers, who all have to fill
in pointless NULL and 0 parameters in the cases where they're not
used.
Added some inline helper functions in network.h alongside the main
plug_closing() dispatch wrappers, so that each kind of connection
closure can present a separate API for the Socket side of the
interface, without complicating the vtable for the Plug side.
Also, added OS-specific extra helpers in the Unix and Windows
directories, which centralise the job of taking an OS error code (of
whatever kind) and translating it into its error message.
In passing, this removes the horrible ad-hoc made-up error codes in
proxy.h, which is OK, because nothing checked for them anyway, and
also I'm about to do an API change to plug_closing proper that removes
the need for them.
Thanks to the previous commit, this new parameter can replace two of
the existing ones: instead of passing a LogPolicy and a Seat, we now
pass just an Interactor, from which any proxy implementation can
extract the LogPolicy and the Seat anyway if they need it.
There are quite a few of them already, and I'm about to make another
one, so let's start with a bit of tidying up.
The CMake build organisation is unchanged: I haven't put the proxy
object files into a separate library, just moved the locations of the
source files. (Organising proxying as a library would be tricky
anyway, because of the various overrides for tools that want to avoid
cryptography.)