I'm about to want to use the same code to check for something else.
It's only a handful of lines, but even so.
Also, since the string constants are mentioned several times, this
seems like a good moment to lift them out into reusable static const
ptrlens.
ssh2_scan_kexinits must check to see whether it's behaving as an SSH
client or server, in order to decide whether to look for "ext-info-s"
in the server's KEXINIT or "ext-info-c" in the client's, respectively.
This check was done by testing the pointer 'server_hostkeys' to see if
it was non-NULL. I think I must have imagined that a variable of that
name meant "the host keys we have available to offer the client, if we
are the server", as the similarly named parameter 'our_hostkeys' in
write_kexinit_lists in fact does mean. So I expected it to be non-NULL
for the server and NULL for the client, and coded accordingly.
But in fact it's used by the client: it collects host key types the
client has _seen_ from the server, in order to offer them as cross-
certification actions in the specials menu. Moreover, it's _always_
non-NULL, because in the server, it's easier to leave it present but
empty than to get rid of it.
So this code was always behaving as if it was the server, i.e. it was
looking for "ext-info-c" in the client KEXINIT. When it was in fact
the client, that test would always succeed, because we _sent_ that
KEXINIT ourselves!
But nobody ever noticed, because when we're the client, it doesn't
matter whether we saw "ext-info-c", because we don't have any reason
to send EXT_INFO from client to server. We're only concerned with
server-to-client EXT_INFO. So this embarrassing bug had no actual
effect.
I encountered an instance of this sequence in the log files from a
clang CI build. The payload text inside the wrapper was
"bk;t=1697630539879"; I don't know what the "bk" stood for, but the
second half appears to be a timestamp in milliseconds since the Unix
epoch.
I don't think there's anything we can (or should) actually _do_ with
this sequence, but I think it's useful to at least recognise it, so
that it can be conveniently discarded.
(cherry picked from commit 7b10e34b8f)
Shortly after the previous commit I spotted another definitely missing
display update: if you send the byte 0x7F, aka 'destructive
backspace', then the display didn't update immediately.
That was two in a row, so I did an eyeball review of the whole
terminal state machine to the best of my ability. Found a couple more
borderline ones, but also, found that the entire VT52 sub-state-
machine had a blanket seen_disp_event which really _shouldn't_ have
been there, because half the VT52 sequences aren't actually display-
modifying updates.
To make this _slightly_ less error-prone, I've sunk a number of
seen_disp_update calls into subroutines that aren't the top-level
term_out(). For example, erase_lots(), scroll(), move() and
swap_screen() now all call seen_disp_update within themselves, so
their call sites don't all have to remember to.
There are probably further bugs after this upheaval, but I think it's
moving in generally the right direction.
(cherry picked from commit 6a6efd36aa)
These escape sequences immediately change the display of the line
they're invoked on, so they need to trigger a display update. But they
weren't, and I suppose we must have never noticed before due to the
complete confusion fixed in commit bdbd5f429c.
(cherry picked from commit aa1552bc82)
Just happened to jump out at me in an eyeball inspection just now. I
carefully moved all the protocol byte-value constants into a header
file with mnemonic names, but I still hard-coded SOCKS4_REPLY_VERSION
in the text of one diagnostic, and I got the wrong one of
SOCKS5_REQUEST_VERSION and SOCKS5_REPLY_VERSION at one point in the
code. Both benign (the right value was there, juste called by the
wrong name).
Also fixed some missing whitespace, in passing. (Probably the line it
was missing from had once been squashed up closer to the right margin.)
(cherry picked from commit 1cd0f1787f)
Recently I encountered a CLI tool that took tens of seconds to run,
and produced no _visible_ output, but wrote ESC[0m to the terminal a
few times during its operation. (Probably by mistake. In other modes
it does print colourful messages, so I expect a 'reset colour' call
was accidentally outside the 'if' statement containing the rest of the
diagnostic it followed. Or something along those lines.)
I noticed this because every ESC[0m reset my pterm scrollback to the
bottom, which wasn't very helpful, and was unintentional on pterm's
part (as _well_ as on the part of the tool). But I can fix pterm!
At first glance the code _looked_ sensible: terminal.c contains calls
to seen_disp_event(term) whenever terminal output does something that
requires a redraw of the terminal window. Those are also the updates
that should count as 'reset scrollback on display activity'. And
ESC[0m, along with the rest of the SGR handler, correctly contained no
such call. So how did a display update happen at all?
The code was confusingly tangled up with the code that responds to
terminal activity by resetting the phase of the blinking cursor (if
any). term_reset_cblink() was calling seen_disp_event() (when surely
it should be the other way round!), and also, term_reset_cblink() was
called whenever _any_ terminal output data arrived. That combination
meant that any byte output to the terminal at all turned out to count
as display activity, whether or not it changed the screen contents.
Additionally, the other scrollback-reset flag, 'reset scrollback on
keypress', was handled by calling seen_disp_event() from the keyboard
handler. But display events and keyboard events are supposed to be
_independent_ potential causes of scrollback resets - it doesn't make
any sense to handle one by treating it as the other!
So I've reorganised the code completely:
- the seen_disp_event *flag* is now gone. Instead, the
seen_disp_event function tests the scroll_on_disp flag, and if set,
resets the scroll position immediately and sets the general
'scrollbar needs updating' flag.
- keyboard input is handled by doing exactly the same thing except
testing the scroll_on_key flag, so the two systems are properly
independent. That code calls term_schedule_update so that the
terminal will be redrawn as a result of the scroll, but doesn't
also call seen_disp_event() for the rest of the full treatment.
- the term_update code that does the scrollbar update is much
simpler, since now it only needs to test that one flag.
- I also had to set that flag explicitly in scroll() so that the
scrollbar would still be updated as a result of the scrollback size
changing. I think that must have been happening entirely by
accident before.
- term_reset_cblink is subsumed into seen_disp_event, so that only
_substantive_ display updates cause the cursor blink phase to reset
to the start of the solid period.
Result: if programs output no-op sequences like ESC[0m, or if you
press keys that don't echo, then the cursor will carry on blinking
normally, and (if you don't also have scroll_on_key set) the
scrollback won't be reset. And the code is slightly shorter than it
was before, and hopefully more sensible too.
(However, other classes of no-op activity _will_ still cause a cursor
blink phase change and a scrollback reset, such as sending a
cursor-positioning sequence that puts the cursor in the same place it
was already - even something as simple as ^M when already at the start
of the line. It might be nice to fix that, but it's much more
difficult: you'd have to either put a complicated and error-prone test
at every seen_disp_event call site, or else expensively diff the
entire visible terminal state against how it was before. And to avoid
a nondeterministic dependency on the terminal update cooldown, that
diff would have to be done at the granularity of individual control
sequences rather than a bounded number of times a second. I'd rather
not!)
(cherry picked from commit bdbd5f429c)
(cherry-picker's note: I also had to remove the initialisation of
the removed field term->seen_disp_event from term_init(), which didn't
need doing on main because commit 74aa3cb7fb had already replaced
all the boring parts of term_init with a big memset)
A user just reported that 0.79 doesn't build out of the box on Ubuntu
14.04 (trusty), because although its gcc (4.8.4) does _support_ C99,
it doesn't enable it without a non-default -std option. The user was
able to work around the problem by defining CMAKE_C_FLAGS=-std=gnu99,
but it would have been nicer if we'd done that automatically. Setting
CMAKE_C_STANDARD causes cmake to do so.
(This isn't a regression of 0.79 over 0.78 as far as I know; the user
in question said they had last built 0.76.)
I was surprised to find Ubuntu 14.04 still in use at all, but a quick
web search revealed that its support has been extended until next
year, so fair enough, I suppose. It's also running a cmake way older
than we support, but apparently that can be worked around via
Kitware's binary tarball downloads (which do still run on 14.04).
This is a bit unsatisfactory: I'd prefer to ask for C standards
support of _at least_ C99 level, and C11 if possible. Then I could
test for the presence of C11 features via check_c_source_compiles, and
use them opportunistically (e.g. in macro definitions). But as far as
I can see, cmake has no built-in support for asking for a standards
level of 'as new as you can get, but no older than 99'. Oh well.
(In any case, the thing I'd find most useful from C11 is _Generic, and
since that's in implementation namespace, compilers can - and do -
support it in C99 mode anyway. So it's probably fine, at least for now.)
(cherry picked from commit bd27962cd9)
Experimenting with different compile flags pointed out two instances
of typedefing the same name twice (though benignly, with the same
definition as well). PsocksDataSink was typedefed a couple of lines
above its struct definition and then again _with_ its struct
definition; cliloop_continue_t was typedefed in unix/platform.h and
didn't need defining again in unix/psocks.c.
(cherry picked from commit 3d34007889)
That's two releases running I've got most of the way through the
mechanical upload processes and suddenly realised I still have a piece
of creative writing to do. A small one, but even so, there's no reason
it couldn't have been prepared a week in advance like the rest of the
announcements and changelogs. The only reason I didn't is that the
checklist didn't remind me to. Now it does.
(cherry picked from commit da550c3158)
I encountered an instance of this sequence in the log files from a
clang CI build. The payload text inside the wrapper was
"bk;t=1697630539879"; I don't know what the "bk" stood for, but the
second half appears to be a timestamp in milliseconds since the Unix
epoch.
I don't think there's anything we can (or should) actually _do_ with
this sequence, but I think it's useful to at least recognise it, so
that it can be conveniently discarded.
Shortly after the previous commit I spotted another definitely missing
display update: if you send the byte 0x7F, aka 'destructive
backspace', then the display didn't update immediately.
That was two in a row, so I did an eyeball review of the whole
terminal state machine to the best of my ability. Found a couple more
borderline ones, but also, found that the entire VT52 sub-state-
machine had a blanket seen_disp_event which really _shouldn't_ have
been there, because half the VT52 sequences aren't actually display-
modifying updates.
To make this _slightly_ less error-prone, I've sunk a number of
seen_disp_update calls into subroutines that aren't the top-level
term_out(). For example, erase_lots(), scroll(), move() and
swap_screen() now all call seen_disp_update within themselves, so
their call sites don't all have to remember to.
There are probably further bugs after this upheaval, but I think it's
moving in generally the right direction.
These escape sequences immediately change the display of the line
they're invoked on, so they need to trigger a display update. But they
weren't, and I suppose we must have never noticed before due to the
complete confusion fixed in commit bdbd5f429c.
Just happened to jump out at me in an eyeball inspection just now. I
carefully moved all the protocol byte-value constants into a header
file with mnemonic names, but I still hard-coded SOCKS4_REPLY_VERSION
in the text of one diagnostic, and I got the wrong one of
SOCKS5_REQUEST_VERSION and SOCKS5_REPLY_VERSION at one point in the
code. Both benign (the right value was there, juste called by the
wrong name).
Also fixed some missing whitespace, in passing. (Probably the line it
was missing from had once been squashed up closer to the right margin.)
These are now specified in conf.h and filled in by automated code,
which means test_conf can make sure we didn't forget to provide them.
The default for a mapping type (not that we currently have any unsaved
ones) is expected to be empty.
Also, while adding test_conf checks, I realised I hadn't filled in the
rest of the comment in conf.h. Belatedly updated that.
This allows a couple more settings to be treated automatically on
save, which are more complicated on load because they still honour
older alternative save keywords.
In particular, CONF_proxy_type and CONF_remote_qtitle_action now have
explicit enum mappings. These were needed for the automated save code,
but also, I've rewritten the custom load code to use them too. This
decouples the storage format of those settings from the order of
values in the internal enum, which is generally an advantage of
specifying storage enums explicitly.
Those two settings weren't already tested by test_conf, because I
wasn't changing them in previous commits. Now I've added extra code
that does test them, and verified it works when backported to commit
b567c9b2b5 where I introduced test_conf before beginning the main
refactoring.
A setting can also be specified explicitly as not loaded and saved at
all. There were quite a few commented that way, but now there's a
machine-readable indication of it.
test_conf will now check that all these settings make sense together -
things shouldn't have a save keyword unless they use it, and should
have one if they don't, and shouldn't specify combinations of options
that conflict.
(For that reason, test_conf is now also running the consistency check
before the main test, so that a missing keyword will cause an error
message _before_ it causes a segfault, saving some debugging!)
It's not used! The only config items that specified it are doing their
load/save in a custom way anyway, and they _don't_ use anything
resembling that enum - instead, they map the integer values in Conf to
strings in the storage format. That enum was a total lie and an
artefact of my conversion macros. Ahem.
This is why I wrote conf.h in the form of macros that expanded to
named structure field assignments, instead of just filling it with
named structure field assignments directly. This way, I can #include
the same file again with different macro definitions, and build up a
list of what fields were set in what config options.
This new code checks that if a config option has a default, then the
type of the default matches the declared type of the option value
itself. That's what caught the two goofs in the previous commit.
This is also the part of test_conf that I _won't_ want to delete once
I've finished with the refactoring: it can stay there forever, doing
type checking at test time that the compiler isn't doing for me at
build time.
I suspected there'd be one or two mistakes introduced by that
transcription in spite of the test suite, and there were!
CONF_mouseautocopy had its default labelled as int rather than bool,
because it didn't _look_ boolean to my conversion scripts or to my
eyeballs - but its default value is actually a macro that expands to
'true' or 'false' (depending on platform), so it is really.
And CONF_supdup_ascii_set, conversely, is an int which had its default
labelled as bool, which was due to my conversion scripts faithfully
transcribing the same confusion in the original code.
Ahem. Of course I've been running it interactively until now, so I
never noticed that I'd forgotten to fill in that important point. But
now it's run as part of my build, it should make sure to fail if it
fails.
The new ConfKeyInfo structure now includes some fields indicating how
to load and save each config option: what keyword it's stored under in
the saved settings file, and what its default value should be set to
when loading a session that doesn't mention it. (Including, of course,
loading the null session at program startup.)
So far, this only applies to the saved settings that are sufficiently
simple: a single integer, string or boolean value whose internal
format matches its storage format, or an integer value consisting of a
finite enumeration with a fixed mapping between its internal and
storage formats. Anything more difficult than that - mappings,
variable defaults, config options tied together, options that still
support a legacy save format alongside the up-to-date one, things
under #ifdef - hasn't yet been tampered with.
This allows a large amount of repetitive code in settings.c to be
deleted, and replaced by simple loops over the conf_key_info array
doing all the easy work. The remaining manual load/save code per
option is all there because it's difficult in some way.
The transitional test_conf program still passes after this upheaval.
This array is planned to be exposed more widely, and used for more
purposes than just checking the types of Conf options. In this commit
it just takes over from the two previous smaller arrays, and adds no
extra data.
This aims to be a reasonably exhaustive test of what happens if you
set Conf values to various things, and then save your session, and
find out what ends up in the storage. Or vice versa.
Currently, the test program is written to match the existing
behaviour. The idea is that I can refactor the code that does the
loading and saving, and if this test still passes, I've probably done
it right.
However, in the long term, this test will be a liability: it's yet
another place you have to add every new config option. So my plan is
to get rid of it again once the refactorings I'm planning are
finished.
Or rather, I'll get rid of _that_ part of its functionality. I also
suspect I'll have added new kinds of consistency check by then, which
won't be a liability in the same way, and which I'll want to keep.
FontSpec is completely different per platform; not only is the
structure type different, not only are the behind-the-scenes
implementations of copy and free functions different, but even the API
of the constructor function is different. Cross-platform code can't
construct a FontSpec at all. This makes it hard to write a
cross-platform test program that works with them.
So, as a nasty bodge, I'll allow test programs to #define
SUPERSEDE_FONTSPEC_FOR_TESTING before including putty.h. Then they can
provide their own definition of FontSpec, and they also take
responsibility for superseding all the other functions that work with
one.
Normally you don't ever want to have a Conf structure that doesn't
have an entry for every primary key, because the code that uses Conf
to get real work done will fail assertions if lookups fail. But test
programs manipulating Conf in unusual ways are a special case.
(In particular, one thing you _can_ legally do with an empty Conf is
to call load_open_settings() to populate it. That has to be legal,
because it's how a Conf gets populated in the first place, after it's
initially created empty.)
Recently I encountered a CLI tool that took tens of seconds to run,
and produced no _visible_ output, but wrote ESC[0m to the terminal a
few times during its operation. (Probably by mistake. In other modes
it does print colourful messages, so I expect a 'reset colour' call
was accidentally outside the 'if' statement containing the rest of the
diagnostic it followed. Or something along those lines.)
I noticed this because every ESC[0m reset my pterm scrollback to the
bottom, which wasn't very helpful, and was unintentional on pterm's
part (as _well_ as on the part of the tool). But I can fix pterm!
At first glance the code _looked_ sensible: terminal.c contains calls
to seen_disp_event(term) whenever terminal output does something that
requires a redraw of the terminal window. Those are also the updates
that should count as 'reset scrollback on display activity'. And
ESC[0m, along with the rest of the SGR handler, correctly contained no
such call. So how did a display update happen at all?
The code was confusingly tangled up with the code that responds to
terminal activity by resetting the phase of the blinking cursor (if
any). term_reset_cblink() was calling seen_disp_event() (when surely
it should be the other way round!), and also, term_reset_cblink() was
called whenever _any_ terminal output data arrived. That combination
meant that any byte output to the terminal at all turned out to count
as display activity, whether or not it changed the screen contents.
Additionally, the other scrollback-reset flag, 'reset scrollback on
keypress', was handled by calling seen_disp_event() from the keyboard
handler. But display events and keyboard events are supposed to be
_independent_ potential causes of scrollback resets - it doesn't make
any sense to handle one by treating it as the other!
So I've reorganised the code completely:
- the seen_disp_event *flag* is now gone. Instead, the
seen_disp_event function tests the scroll_on_disp flag, and if set,
resets the scroll position immediately and sets the general
'scrollbar needs updating' flag.
- keyboard input is handled by doing exactly the same thing except
testing the scroll_on_key flag, so the two systems are properly
independent. That code calls term_schedule_update so that the
terminal will be redrawn as a result of the scroll, but doesn't
also call seen_disp_event() for the rest of the full treatment.
- the term_update code that does the scrollbar update is much
simpler, since now it only needs to test that one flag.
- I also had to set that flag explicitly in scroll() so that the
scrollbar would still be updated as a result of the scrollback size
changing. I think that must have been happening entirely by
accident before.
- term_reset_cblink is subsumed into seen_disp_event, so that only
_substantive_ display updates cause the cursor blink phase to reset
to the start of the solid period.
Result: if programs output no-op sequences like ESC[0m, or if you
press keys that don't echo, then the cursor will carry on blinking
normally, and (if you don't also have scroll_on_key set) the
scrollback won't be reset. And the code is slightly shorter than it
was before, and hopefully more sensible too.
(However, other classes of no-op activity _will_ still cause a cursor
blink phase change and a scrollback reset, such as sending a
cursor-positioning sequence that puts the cursor in the same place it
was already - even something as simple as ^M when already at the start
of the line. It might be nice to fix that, but it's much more
difficult: you'd have to either put a complicated and error-prone test
at every seen_disp_event call site, or else expensively diff the
entire visible terminal state against how it was before. And to avoid
a nondeterministic dependency on the terminal update cooldown, that
diff would have to be done at the granularity of individual control
sequences rather than a bounded number of times a second. I'd rather
not!)
A user just reported that 0.79 doesn't build out of the box on Ubuntu
14.04 (trusty), because although its gcc (4.8.4) does _support_ C99,
it doesn't enable it without a non-default -std option. The user was
able to work around the problem by defining CMAKE_C_FLAGS=-std=gnu99,
but it would have been nicer if we'd done that automatically. Setting
CMAKE_C_STANDARD causes cmake to do so.
(This isn't a regression of 0.79 over 0.78 as far as I know; the user
in question said they had last built 0.76.)
I was surprised to find Ubuntu 14.04 still in use at all, but a quick
web search revealed that its support has been extended until next
year, so fair enough, I suppose. It's also running a cmake way older
than we support, but apparently that can be worked around via
Kitware's binary tarball downloads (which do still run on 14.04).
This is a bit unsatisfactory: I'd prefer to ask for C standards
support of _at least_ C99 level, and C11 if possible. Then I could
test for the presence of C11 features via check_c_source_compiles, and
use them opportunistically (e.g. in macro definitions). But as far as
I can see, cmake has no built-in support for asking for a standards
level of 'as new as you can get, but no older than 99'. Oh well.
(In any case, the thing I'd find most useful from C11 is _Generic, and
since that's in implementation namespace, compilers can - and do -
support it in C99 mode anyway. So it's probably fine, at least for now.)
Experimenting with different compile flags pointed out two instances
of typedefing the same name twice (though benignly, with the same
definition as well). PsocksDataSink was typedefed a couple of lines
above its struct definition and then again _with_ its struct
definition; cliloop_continue_t was typedefed in unix/platform.h and
didn't need defining again in unix/psocks.c.
That's two releases running I've got most of the way through the
mechanical upload processes and suddenly realised I still have a piece
of creative writing to do. A small one, but even so, there's no reason
it couldn't have been prepared a week in advance like the rest of the
announcements and changelogs. The only reason I didn't is that the
checklist didn't remind me to. Now it does.
A user reports, _just_ in time to make the 0.79 release, that changes
in the Windows port of OpenSSH from 8.9.x have made it unhappy with
the use of \ as a path separator in the 'IdentityAgent' config
directive. Switch to /, which is also accepted by earlier versions, so
it should work everywhere.
This can now happen if, for instance, the CLMUL implementation of
aesgcm is compiled in, but not available at runtime because we're on
an old Intel CPU.
In this situation, testcrypt would segfault when driven by
test/cryptsuite.py, and test/list-accel.py would erroneously claim the
CLMUL implementation was available when it wasn't.
Spotted by Coverity. If PuTTY is functioning as a sharing upstream,
and a new downstream mishandles the version string exchange in any way
that provokes an error message from share_receive() (such as failing
to start the greeting with the expected protocol-name string), we were
calling share_disconnect() and then going to crFinish. But
share_disconnect is capable of actually freeing the entire
ssh_sharing_connstate which contains the coroutine state - in which
case, crFinish's zeroing out of crLine is a use-after-free.
The usual pattern elsewhere in this code is to exit a coroutine with
an ordinary 'return' when you've destroyed its state structure. Switch
to doing that here.
Apparently when I started using _wfopen in commit 8bd75b85ed, the
winegcc build (which I mostly use for Coverity Scan) stopped working,
because _wfopen isn't included in any of the libraries I explicitly
had on my link line.
Rather than mess about with cmake, it's easier to just bodge it in the
winegcc wrapper script, since we had one of those already.
If we don't have GTK enabled in the build, then lots of important
stuff never gets added to the 'guiterminal' build-time object library,
without which these terminal-using programs can't link successfully,
even though they don't actually use GTK.
I could add yet more stub functions, but I don't think that's really
necessary - it doesn't seem like a serious inconvenience that you can
only test the terminal on a platform where you can also build real
applications that include it. So I've just moved those two executable
file definitions inside the Big If that conditionalises PuTTY and
pterm themselves.
An irate user complained today that they wished we'd documented
firewalls as a possible cause of WSAECONNREFUSED, because it took them
ages to think of checking that. Fair enough.
Someone just asked us a question which suggests they might have thought
they need to supply both files in the 'Public-key authentication' box in
the config dialog, to use public-key authentication at all. I can see
why someone might think that, anyway.