Recently I encountered a CLI tool that took tens of seconds to run,
and produced no _visible_ output, but wrote ESC[0m to the terminal a
few times during its operation. (Probably by mistake. In other modes
it does print colourful messages, so I expect a 'reset colour' call
was accidentally outside the 'if' statement containing the rest of the
diagnostic it followed. Or something along those lines.)
I noticed this because every ESC[0m reset my pterm scrollback to the
bottom, which wasn't very helpful, and was unintentional on pterm's
part (as _well_ as on the part of the tool). But I can fix pterm!
At first glance the code _looked_ sensible: terminal.c contains calls
to seen_disp_event(term) whenever terminal output does something that
requires a redraw of the terminal window. Those are also the updates
that should count as 'reset scrollback on display activity'. And
ESC[0m, along with the rest of the SGR handler, correctly contained no
such call. So how did a display update happen at all?
The code was confusingly tangled up with the code that responds to
terminal activity by resetting the phase of the blinking cursor (if
any). term_reset_cblink() was calling seen_disp_event() (when surely
it should be the other way round!), and also, term_reset_cblink() was
called whenever _any_ terminal output data arrived. That combination
meant that any byte output to the terminal at all turned out to count
as display activity, whether or not it changed the screen contents.
Additionally, the other scrollback-reset flag, 'reset scrollback on
keypress', was handled by calling seen_disp_event() from the keyboard
handler. But display events and keyboard events are supposed to be
_independent_ potential causes of scrollback resets - it doesn't make
any sense to handle one by treating it as the other!
So I've reorganised the code completely:
- the seen_disp_event *flag* is now gone. Instead, the
seen_disp_event function tests the scroll_on_disp flag, and if set,
resets the scroll position immediately and sets the general
'scrollbar needs updating' flag.
- keyboard input is handled by doing exactly the same thing except
testing the scroll_on_key flag, so the two systems are properly
independent. That code calls term_schedule_update so that the
terminal will be redrawn as a result of the scroll, but doesn't
also call seen_disp_event() for the rest of the full treatment.
- the term_update code that does the scrollbar update is much
simpler, since now it only needs to test that one flag.
- I also had to set that flag explicitly in scroll() so that the
scrollbar would still be updated as a result of the scrollback size
changing. I think that must have been happening entirely by
accident before.
- term_reset_cblink is subsumed into seen_disp_event, so that only
_substantive_ display updates cause the cursor blink phase to reset
to the start of the solid period.
Result: if programs output no-op sequences like ESC[0m, or if you
press keys that don't echo, then the cursor will carry on blinking
normally, and (if you don't also have scroll_on_key set) the
scrollback won't be reset. And the code is slightly shorter than it
was before, and hopefully more sensible too.
(However, other classes of no-op activity _will_ still cause a cursor
blink phase change and a scrollback reset, such as sending a
cursor-positioning sequence that puts the cursor in the same place it
was already - even something as simple as ^M when already at the start
of the line. It might be nice to fix that, but it's much more
difficult: you'd have to either put a complicated and error-prone test
at every seen_disp_event call site, or else expensively diff the
entire visible terminal state against how it was before. And to avoid
a nondeterministic dependency on the terminal update cooldown, that
diff would have to be done at the granularity of individual control
sequences rather than a bounded number of times a second. I'd rather
not!)
A user just reported that 0.79 doesn't build out of the box on Ubuntu
14.04 (trusty), because although its gcc (4.8.4) does _support_ C99,
it doesn't enable it without a non-default -std option. The user was
able to work around the problem by defining CMAKE_C_FLAGS=-std=gnu99,
but it would have been nicer if we'd done that automatically. Setting
CMAKE_C_STANDARD causes cmake to do so.
(This isn't a regression of 0.79 over 0.78 as far as I know; the user
in question said they had last built 0.76.)
I was surprised to find Ubuntu 14.04 still in use at all, but a quick
web search revealed that its support has been extended until next
year, so fair enough, I suppose. It's also running a cmake way older
than we support, but apparently that can be worked around via
Kitware's binary tarball downloads (which do still run on 14.04).
This is a bit unsatisfactory: I'd prefer to ask for C standards
support of _at least_ C99 level, and C11 if possible. Then I could
test for the presence of C11 features via check_c_source_compiles, and
use them opportunistically (e.g. in macro definitions). But as far as
I can see, cmake has no built-in support for asking for a standards
level of 'as new as you can get, but no older than 99'. Oh well.
(In any case, the thing I'd find most useful from C11 is _Generic, and
since that's in implementation namespace, compilers can - and do -
support it in C99 mode anyway. So it's probably fine, at least for now.)
Experimenting with different compile flags pointed out two instances
of typedefing the same name twice (though benignly, with the same
definition as well). PsocksDataSink was typedefed a couple of lines
above its struct definition and then again _with_ its struct
definition; cliloop_continue_t was typedefed in unix/platform.h and
didn't need defining again in unix/psocks.c.
That's two releases running I've got most of the way through the
mechanical upload processes and suddenly realised I still have a piece
of creative writing to do. A small one, but even so, there's no reason
it couldn't have been prepared a week in advance like the rest of the
announcements and changelogs. The only reason I didn't is that the
checklist didn't remind me to. Now it does.
A user reports, _just_ in time to make the 0.79 release, that changes
in the Windows port of OpenSSH from 8.9.x have made it unhappy with
the use of \ as a path separator in the 'IdentityAgent' config
directive. Switch to /, which is also accepted by earlier versions, so
it should work everywhere.
This can now happen if, for instance, the CLMUL implementation of
aesgcm is compiled in, but not available at runtime because we're on
an old Intel CPU.
In this situation, testcrypt would segfault when driven by
test/cryptsuite.py, and test/list-accel.py would erroneously claim the
CLMUL implementation was available when it wasn't.
Spotted by Coverity. If PuTTY is functioning as a sharing upstream,
and a new downstream mishandles the version string exchange in any way
that provokes an error message from share_receive() (such as failing
to start the greeting with the expected protocol-name string), we were
calling share_disconnect() and then going to crFinish. But
share_disconnect is capable of actually freeing the entire
ssh_sharing_connstate which contains the coroutine state - in which
case, crFinish's zeroing out of crLine is a use-after-free.
The usual pattern elsewhere in this code is to exit a coroutine with
an ordinary 'return' when you've destroyed its state structure. Switch
to doing that here.
Apparently when I started using _wfopen in commit 8bd75b85ed, the
winegcc build (which I mostly use for Coverity Scan) stopped working,
because _wfopen isn't included in any of the libraries I explicitly
had on my link line.
Rather than mess about with cmake, it's easier to just bodge it in the
winegcc wrapper script, since we had one of those already.
If we don't have GTK enabled in the build, then lots of important
stuff never gets added to the 'guiterminal' build-time object library,
without which these terminal-using programs can't link successfully,
even though they don't actually use GTK.
I could add yet more stub functions, but I don't think that's really
necessary - it doesn't seem like a serious inconvenience that you can
only test the terminal on a platform where you can also build real
applications that include it. So I've just moved those two executable
file definitions inside the Big If that conditionalises PuTTY and
pterm themselves.
An irate user complained today that they wished we'd documented
firewalls as a possible cause of WSAECONNREFUSED, because it took them
ages to think of checking that. Fair enough.
Someone just asked us a question which suggests they might have thought
they need to supply both files in the 'Public-key authentication' box in
the config dialog, to use public-key authentication at all. I can see
why someone might think that, anyway.
For serial connections, &H generally expanded to the empty string.
This seems more useful.
(It so happens that &H _could_ expand to the serial line if it came from
the command-line, but that's accidental.)
The pathname of Pageant's named pipe includes the name of the user
running it. And Windows usernames are allowed to have spaces in! So
the pipe pathname may also have a space, in which case Windows OpenSSH
will interpret the spacey pathname as an invalid first half followed
by a trailing garbage word.
A user reports that quoting the filename makes this work. Since double
quotes are an illegal Windows filename character, I think it should
therefore do no harm to quote it unconditionally, which is the easiest
fix.
Index the older format as 'PEM-style', since PEM is how it's referred to
in OpenSSH's own docs; and justify why you might want to use the newer
format.
Testing the script described in the previous commit message, Leak
Sanitiser pointed out that we didn't free the LogContext from the
first connection, and overwrote the pointer variable with the one from
the second.
A user points out that it always returned failure, even if it
succeeded. As a result, a 'psftp -b' script of the form
open this.host
do stuff
close
open that.host
do stuff
close
would terminate at the first 'close', believing it to have failed, and
PSFTP would exit with a failure status.
(Not only that, but there would be no error message indicating _why_
PSFTP had closed, because when a command returns failure it's expected
to have printed an error message already.)
These are all going to be used by a test program I have in the works,
which will need to link against a lot more of the code base than any
so far. So we need a pile of new stubs.
The trickiest of these was stubs/no-network.c, which had to
conditionally define a couple of extra network functions, because
there are Windows-specific plug_closing_system_error and
plug_closing_winsock_error functions.
CONF_bold_style is a pair of bit flags rather than an enum, so its
values aren't just BOLD_STYLE_FONT and BOLD_STYLE_COLOUR but also the
bitwise OR of them. (Hopefully not neither.)
The Windows version of the Filename structure now contains three
versions of the pathname, in UTF-16, UTF-8 and the system code page.
Callers can use whichever is most convenient.
All uses of filenames for actually opening files now use the UTF-16
version, which means they can tolerate 'exotic' filenames, by which I
mean those including Unicode characters outside the host system's
CP_ACP default code page.
Other uses of Filename structures inside the 'windows' subdirectory do
something appropriate, e.g. when printing a filename inside a message
box or a console message, we use the UTF-8 version of the filename
with the UTF-8 version of the appropriate API.
There are three remaining pieces to full Unicode filename support:
One is that the cross-platform code has many calls to
filename_to_str(), embodying the assumption that a file name can be
reliably converted into the unspecified current character set; those
will all need changing in some way.
Another is that write_setting_filename(), in windows/storage.c, still
saves filenames to the Registry as an ordinary REG_SZ in the system
code page. So even if an exotic filename were stored in a Conf, that
Conf couldn't round-trip via the Registry and back without corrupting
that filename by coercing it back to a string that fits in CP_ACP and
therefore doesn't represent the same file. This can't be fixed without
a compatibility break in the storage format, and I don't want to make
a minimal change in that area: if we're going to break compatibility,
then we should break it good and hard (the Nanny Ogg principle), and
devise a completely fresh storage representation that fixes as many
other legacy problems as possible at the same time. So that's my plan,
not yet started.
The final point, much more obviously, is that we're still short of
methods to _construct_ any Filename structures using a Unicode input
string! It should now work to enter one in the GUI configurer (either
by manual text input or via the file selector), but it won't
round-trip through a save and load (as discussed above), and there's
still no way to specify one on the command line (the groundwork is
laid by commit 10e1ac7752 but not yet linked up).
But this is a start.
I think the only reason it currently takes a plain string is because
its interesting caller (in unix/x11.c) has just constructed a string
out of an environment variable, and it seemed like the path of least
effort not to bother wrapping it into a proper Filename. But when
Filename on Windows becomes more interesting, we'll need it to take
the full version.
The values of that field in a Control structure are already
platform-dependent: you're only supposed to set them in cross-platform
code by using #defined names that each platform will define
differently.
Now I need the _type_ as well as the values to be opaque, because I'm
about to make a change on Windows that turns it into a wide character
string instead of a char string.
We already had encode_wide_string_as_utf8, which treats the wide
string as UTF-16 or UTF-32 as appropriate to the size of wchar_t. I'm
about to need the inverse function, and was surprised that it didn't
already exist (even though enough component parts did to make it easy).
message_box() previously differed from the real MessageBox API
function in that it permitted the user to provide a help context to be
used for a Help button in the dialog box.
Now it adds a second unusual ability: you can specify that the text
and caption strings are in UTF-8 rather than the system code page.
Having constructed a conio object with its own 'utf8' flag, we should
be checking that flag at time of use rather than the global
conio_use_utf8 (which was already taken into account at setup time).
Otherwise we miss the whole point, which is that without the override
flag turning off UTF-8, _some_ uses of the system should use the
default code page and not UTF-8.
In the key generation step where we invert 3f in the field
Z_q/<x^p-x-1>, I was carefully checking for failure, on the grounds
that even a field does have _one_ non-invertible element, namely zero.
But I forgot that we'd generated f in such a way that it can't
possibly be zero. So that failure check is pointless.
(However, I've retained it in the form of an assertion.)
In commit f9e572595b I claimed that I'd removed very nearly all
the global and static variables from windows/window.c. It turns out
that this was wildly overoptimistic - I missed quite a few of them!
I'm not quite sure how I managed that; my best guess is that I used an
underpowered 'nm' command that failed to find some classes of
variable.
Some of the remaining function-scope statics were removed completely
by commit afb3dab1e9 just now. In this commit, I've swept up some
more and turn them into fields of WinGuiSeat, where they should have
been moved last September.
The (hopefully complete this time) list of remaining variables,
generated by running this rune in the Windows build directory:
nm windows/CMakeFiles/putty.dir/window.c.obj |
grep -E '^([^ ]+)? *[bBcCdDgGsS] [^\.]'
consists of the following variables which are legitimately global
across the whole process and not related to a particular window:
- 'hinst' and 'hprev', instance handles for Windows loadable modules
- 'classname' in the terminal_window_class_a() and
terminal_window_class_w() functions, which allocate a window class
reusably
- some pointers to Windows API functions retrieved via the
DECL_WINDOWS_FUNCTION / GET_WINDOWS_FUNCTION system, such as
p_AdjustWindowRectExForDpi and p_FlashWindowEx
- some pointers to Windows API functions set up by assigning them at
startup to the right one of the ANSI or Unicode version depending on
the Windows version, e.g. sw_DefWindowProc and sw_DispatchMessage
- 'unicode_window', a boolean flag set at the same time as those
sw_Foo function pointers
- 'sesslist', storing the last-retrieved version of the saved
sessions menu
- 'cursor_visible' in show_mouseptr() and 'forced_visible' in
update_mouse_pointer(), each of which tracks the cumulative number
of times that function has shown or hidden the mouse pointer, so as
to manage its effect on the global state updated by ShowCursor
- 'trust_icon', loaded from the executable's resources
- 'wgslisthead', the list of all active WinGuiSeats
- 'wm_mousewheel', the window-message id we use for mouse wheel
events
and the following which are nothing to do with our code:
- '_OptionsStorage' in __local_stdio_printf_options() and
__local_stdio_scanf_options(), which I'd never noticed before, but
apparently are internal to a standard library header.
lpDx_maybe was a pointer defined to point at either lpDx itself or
NULL, depending on whether the code decided it needed to pass the lpDx
array of per-character pixel offsets to various functions during
drawing (based in turn on whether the font was variable-pitch).
lpDx is reallocated as necessary, which means lpDx_maybe must be kept
up to date. This was achieved by resetting it to lpDx if it was
already non-NULL.
But lpDx starts out as NULL before the first reallocation, so that
this can't work - it'll be initialised to NULL even if we _did_ want
to use it, and then at the first realloc, it won't be updated!
Before the previous commit turned lpDx from a static into an automatic
variable, this would have been a rare bug affecting only the first
call to the function. Now it will happen all the time, which is
better, because we can notice and fix it.
Replaced lpDx_maybe completely with a boolean flag indicating whether
we should pass lpDx to drawing functions.
In windows/window.c, a few variables inside functions were declared as
static, with no particular purpose that I can see: they don't seem to
have any reason to persist between calls to the function. So it makes
more sense to have them be ordinary stack-allocated automatic
variables.
Static variables removed by this commit:
- 'RECT ss' in reset_window.
- 'WORD keys[3]' and 'BYTE keysb[3]' in TranslateKey.
- several (buffer, length) pairs in do_text_internal.
- keys_unicode[] in TranslateKey.
All of these variables were originally introduced in patches credited
to Robert de Bath, which means I can't even try to reconstruct my
original thought processes, because they weren't _my_ thoughts anyway.
The arrays in do_text_internal are the easiest to understand: they're
reallocated larger as necessary, and making them static means the
allocation from a previous call can be reused, saving a malloc (though
I don't think that's a good enough reason to bother, these days).
The fixed-size static arrays and RECT are harder to explain. I suspect
they might originally have been that way because of 1990s attitudes to
performance: in x86-32 it's probably marginally faster to give your
variables constant addresses than sp-relative ones, and in the 1990s
computers were much slower, so there's an argument for making things
static if you have no _need_ to make them automatic. These days, the
difference is negligible, and persistent state is much more widely
recognised as a risk!
But keys_unicode[] is by far the strangest, because there was code
that clearly _did_ expect it to persist between calls, namely three
assignments to keys_unicode[0] near the end of the function after it's
finished being used for any other purpose, and a conditioned-out set
of debug() calls at the top of the function that print its contents
before anything has yet written to it.
But as far as I can see, the persistent data in the array is otherwise
completely unused. In any call to the function, if keys_unicode is
used at all, then it's either written directly by a call to ToAsciiEx,
or else (for pre-NT platforms) converted from ToAsciiEx's output via
MultiByteToWideChar. In both cases, the integer variable 'r' indicates
how many array elements were written, and subsequent accesses only
ever read those elements. So the assignments to keys_unicode[0] at the
end of the previous call will be overwritten before anything at all
can depend on them - with the exception of those debug statements.
I don't really understand what was going on here. It's tempting to
guess that those final assignments must have once done something
useful, and the code that used them was later removed. But the source
control history doesn't bear that out: a static array of three
elements (under its original name 'keys') was introduced in commit
0d5d39064a, and then commits 953b7775b3 and 26f1085038
added the other two assignments. And as far as I can see, even as of
the original commit 0d5d39064a, the code already had the property
that there was a final assignment to keys[0] which would inevitably be
overwritten in the next call before it could affect anything.
So I'm totally confused about what those assignments were _ever_
useful for. But an email thread from the time suggests that some of
those patches were being rebased repeatedly past other work (or
rather, the much less reliable CVS analogue of rebasing), so my best
guess is that that's where the confusion crept in - perhaps in RDB's
original version of the code they did do something useful.
Regardless of that, I'm pretty convinced that persistent array can't
be doing anything useful _now_. So I'm taking it out. But if anyone
reports a bug resulting from this change, then I'll eat my words - and
with any luck the details of the bug report will give us a clue what's
going on, and then we can put back some equivalent functionality with
much better comments!