Now it can optionally check that output lines don't go beyond a
certain length (measured in terminal columns, via wcwidth, rather than
bytes or characters). In this mode, lines are prefixed with a
distinctive character (namely '|'), and if a line is too long, then it
is broken and the continuation line gets a different prefix ('>').
When StripCtrlChars is targeting a terminal, it asks the terminal to
call wcwidth on its behalf, so it can be sure to use the same idea as
the real terminal about which characters are wide (i.e. depending on
the configuration of ambiguous characters).
This mode isn't yet used anywhere.
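As a rough illustration of the prefixing scheme described above, here
is a minimal sketch. It is not the StripCtrlChars code: the function
name wrap_line_sketch is hypothetical, the width is assumed to count
only the columns after the prefix, and every byte is assumed to be one
column wide (the real mode measures columns via wcwidth).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of StripCtrlChars. Writes 'in' to 'out'
 * as a '|'-prefixed line, breaking it every 'width' columns and giving
 * each continuation line a '>' prefix. Returns the number of bytes
 * written, not counting the terminating NUL.
 *
 * Assumption: every byte of 'in' is one column wide. Real code would
 * ask wcwidth (or the terminal itself) for each character's width. */
static size_t wrap_line_sketch(const char *in, size_t width,
                               char *out, size_t outsize)
{
    size_t len = strlen(in), o = 0, col = 0;

    assert(width > 0);
    /* Worst case: a break and a prefix before every single character. */
    assert(outsize >= 3 * len + 3);

    out[o++] = '|';
    for (size_t i = 0; i < len; i++) {
        if (col == width) {       /* line is full: start a continuation */
            out[o++] = '\n';
            out[o++] = '>';
            col = 0;
        }
        out[o++] = in[i];
        col++;
    }
    out[o++] = '\n';
    out[o] = '\0';
    return o;
}
```

With a width of 3, the input "abcdefg" comes out as the three lines
"|abc", ">def" and ">g".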
The previous unlimited system was nicely general, but unfortunately
meant you could easily DoS a PuTTY-based terminal by sending a
printing character followed by an endless stream of identical
combining chars. (In fact, due to accidentally-quadratic linked list
management, you'd DoS it by using up all the CPU even before you got
to the point of making it allocate all the RAM.)
The new limit is chosen to be 32, more or less arbitrarily. Overlong
sequences of combining characters are signalled by turning the whole
character cell into U+FFFD REPLACEMENT CHARACTER.
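The limit can be sketched as follows. This is a simplified model, not
the real termchar representation: struct cell_sketch and both function
names are hypothetical, and the fixed-size array stands in for
whatever storage the terminal really uses.

```c
#include <assert.h>
#include <stdbool.h>

#define CC_LIMIT 32              /* the limit described above */
#define UCS_REPLACEMENT 0xFFFD   /* U+FFFD REPLACEMENT CHARACTER */

/* Hypothetical, simplified character cell. */
struct cell_sketch {
    unsigned long base;          /* the printing character */
    unsigned long cc[CC_LIMIT];  /* combining characters attached to it */
    int ncc;                     /* how many of cc[] are in use */
};

static void cell_set(struct cell_sketch *c, unsigned long ch)
{
    c->base = ch;
    c->ncc = 0;
}

/* Attach one combining character. Returns false if that would exceed
 * the limit, in which case the whole cell becomes U+FFFD. */
static bool cell_add_cc(struct cell_sketch *c, unsigned long cc)
{
    assert(c->ncc >= 0 && c->ncc <= CC_LIMIT);
    if (c->ncc == CC_LIMIT) {
        cell_set(c, UCS_REPLACEMENT);
        return false;
    }
    c->cc[c->ncc++] = cc;
    return true;
}
```

Either way the storage per cell stays bounded, which is the property
that matters for the DoS.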
If the terminal is one column wide, it's not possible to print a
double-width CJK character at all - it won't fit. Replace it with
U+FFFD to indicate that impossibility.
The previous behaviour was to notice that we're in the rightmost
column of the terminal, and invoke the LATTR_WRAPPED2 special case to
wrap to the leftmost column on the next line. But in a width-1
terminal, the rightmost column _is_ the leftmost column, so this would
leave us no better off, and we would have fallen through into the next
case while in exactly the situation we'd tried to rule out.
The REP escape (ESC [ nnn b) causes the previously printed graphic
character to be repeated another nnn times. So if it's sent as the
very first thing in a terminal session, when there _is_ no previously
printed graphic character, there's nothing sensible it can do.
In fact, in that situation, it does something decidedly _not_
sensible: it takes the uninitialised value term->last_graphic_char and
sends it directly to term_display_graphic_char, with undesirable
results if it's not actually a printing character. In particular, the
value 0 is treated as a combining char (because it has zero wcwidth),
leading to a knock-on assertion failure when compressing the
scrollback lines (which uses \0 as a terminating value for sequences
of combining characters, precisely because it expects it never to show
up in an actual cc slot!).
Turns out that my assertion that term->cols == line->cols can
sometimes fail, because if the window is shrunk, scrlineptr()
deliberately _doesn't_ shrink the line (so that the columns on the
right can be recovered if the window is then resized larger again). So
clear_line() should _make_ the line the right width, instead of
asserting that it already is.
I've factored out clear_line() (wipe out everything on a terminal line
including its line attrs) and also line_cols() (determine how many
columns are on this particular line, taking into account
LATTR_WRAPPED2 which reduces it by one).
Also, newline() and freeline() were badly named. Now they're called
newtermline() and freetermline(), names which include the full actual
type name they deal with, and which also mean that neither of them is
named the same as a control character!
SSH authentication prompts (passwords, passphrases and keyboard-
interactive) were previously sanitised to remove escape sequences by
the simplistic sanitise_term_data() in utils.c. Now they're fed
through the new mode of StripCtrlChars instead, which means they
should permit printable Unicode (if the terminal is in UTF-8 mode)
while still disallowing escape sequences. Hopefully this will be a
usability improvement to everyone whose login prompts are in a
language not representable in plain ASCII.
Also, instead of insisting on modifying the UTF-8 decoding state
inside the Terminal structure, it now takes a separate pointer to a
small struct containing that decode state. The idea is that if a
separate module wants to decode characters the same way the real
terminal would, it can pass its own mutable state structure, but the
same main Terminal pointer.
The idea of these is that they centralise the common idiom along the
lines of
    if (logical_array_len >= physical_array_size) {
        physical_array_size = logical_array_len * 5 / 4 + 256;
        array = sresize(array, physical_array_size, ElementType);
    }
which happens at a zillion call sites throughout this code base, with
different random choices of the geometric factor and additive
constant, sometimes forgetting them completely, and generally doing a
lot of repeated work.
The new macro sgrowarray(array,size,n) has the semantics: here are the
array pointer and its physical size for you to modify, now please
ensure that the nth element exists, so I can write into it. And
sgrowarrayn(array,size,n,m) is the same except that it ensures that
the array has size at least n+m (so sgrowarray is just the special
case where m=1).
Now that this is a single centralised implementation that will be used
everywhere, I've also gone to more effort in the implementation, with
careful overflow checks that would have been painful to put at all the
previous call sites.
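To give a feel for the kind of overflow checking involved, here is a
minimal sketch in the same spirit. It is not the real implementation:
grow_array_sketch and GROW_ARRAY_SKETCH are hypothetical names, the
growth policy is just the one from the idiom quoted above, and failure
handling is reduced to assertions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: make sure the array has room for at least
 * oldlen + extralen elements of eltsize bytes each, growing it
 * geometrically if not, and update *size to the new physical size. */
static void *grow_array_sketch(void *ptr, size_t *size, size_t eltsize,
                               size_t oldlen, size_t extralen)
{
    assert(eltsize > 0);
    /* Check 1: the element count itself must not wrap round. */
    assert(extralen <= SIZE_MAX - oldlen);
    size_t needed = oldlen + extralen;
    if (needed <= *size)
        return ptr;                    /* already big enough */

    /* Grow by a quarter plus a constant, unless that would overflow,
     * in which case fall back to exactly what was asked for. */
    size_t newsize = needed;
    size_t increment = needed / 4 + 256;
    if (increment <= SIZE_MAX - needed)
        newsize = needed + increment;
    /* Check 2: the size in bytes must not wrap round either. */
    if (newsize > SIZE_MAX / eltsize)
        newsize = needed;
    assert(newsize <= SIZE_MAX / eltsize);

    void *newptr = realloc(ptr, newsize * eltsize);
    assert(newptr != NULL);  /* real code would report out-of-memory */
    *size = newsize;
    return newptr;
}

/* Ensure element n exists; 'size' must be a size_t lvalue, because
 * its address is taken. */
#define GROW_ARRAY_SKETCH(array, size, n) \
    ((array) = grow_array_sketch((array), &(size), sizeof(*(array)), (n), 1))
```

A call site then shrinks to GROW_ARRAY_SKETCH(array, size, len)
followed by a write to array[len], which is also why the size variable
has to be a size_t.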
This commit also switches over every use of sresize(), apart from a
few where I really didn't think it would gain anything. A consequence
of that is that a lot of array-size variables have to have their types
changed to size_t, because the macros require that (they address-take
the size to pass to the underlying function).
Commit fec93d5e0 missed a piece: when we hand wcTo to
term_bidi_cache_store and it uses it to set up the mapping between
physical and logical character positions for cursor and selection
handling, it will assume wcTo has as many entries as there are columns
in the terminal. But in fact now wcTo may be shorter than that, so
term_bidi_cache_store also needs to pay attention to the nchars field.
The actual calls to win_draw_{text,cursor} in do_paint() were
duplicated in two places, and I may want to change them soon, so it's
convenient to centralise them.
Previously, any double-width character would break the bidi algorithm,
because of the quirk of data representation in which we store UCSWIDE
(0xDFFF) in the right-hand termchar overlapped by the character.
UCSWIDE has bidirectional character class L according to minibidi's
getType(), so it disrupted the algorithm.
Now we remove UCSWIDE from the input line before passing it to
do_bidi(), replacing it with an 'nchars' field in the bidi_char
structure indicating single or double width, and put the UCSWIDEs back
afterwards once do_bidi returns.
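The strip-and-restore step might look roughly like this. It is only a
sketch under simplifying assumptions: a line is modelled as a plain
array of character codes, bidi_char_sketch stands in for the real
bidi_char structure, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define UCSWIDE 0xDFFF   /* placeholder in the right half of a wide char */

/* Simplified stand-in for the bidi_char structure described above. */
typedef struct {
    unsigned int wc;     /* character code */
    int nchars;          /* 1 = single width, 2 = double width */
} bidi_char_sketch;

/* Copy 'cols' cells into 'out', dropping each UCSWIDE placeholder and
 * recording the width in nchars instead. Returns the number of entries
 * written, which may be less than cols. */
static size_t strip_ucswide(const unsigned int *line, size_t cols,
                            bidi_char_sketch *out)
{
    size_t n = 0;
    for (size_t i = 0; i < cols; i++) {
        out[n].wc = line[i];
        out[n].nchars = 1;
        if (i + 1 < cols && line[i + 1] == UCSWIDE) {
            out[n].nchars = 2;
            i++;                     /* skip the placeholder cell */
        }
        n++;
    }
    return n;
}

/* Inverse operation: expand entries back into cells, putting a UCSWIDE
 * after every double-width character. Returns the columns written. */
static size_t restore_ucswide(const bidi_char_sketch *in, size_t n,
                              unsigned int *line, size_t cols)
{
    size_t col = 0;
    for (size_t i = 0; i < n; i++) {
        assert(col + (size_t)in[i].nchars <= cols);
        line[col++] = in[i].wc;
        if (in[i].nchars == 2)
            line[col++] = UCSWIDE;
    }
    return col;
}
```

The bidi algorithm would run between the two calls, on an array that
no longer contains any UCSWIDE entries.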
Now that all the call sites are expecting a size_t instead of an int
length field, it's no longer particularly difficult to make it
actually return the pointer,length pair in the form of a ptrlen.
It would be nice to say that simplifies call sites because those
ptrlens can all be passed straight along to other ptrlen-consuming
functions. Actually almost none of the call sites are like that _yet_,
but this makes it possible to move them in that direction in future
(as part of my general aim to migrate ptrlen-wards as much as I can).
But also it's just nicer to keep the pointer and length together in
one variable, and not have to declare them both in advance with two
extra lines of boilerplate.
This is a general cleanup which has been overdue for some time: lots
of length fields are now the machine word type rather than the (in
practice) fixed 'int'.
Taking a leaf out of the LLVM code base: this macro still includes an
assert(false) so that the message will show up in a typical build, but
it follows it up with a call to a function explicitly marked as no-
return.
So this ought to do a better job of convincing compilers that once a
code path hits this function it _really doesn't_ have to still faff
about with making up a bogus return value or filling in a variable
that 'might be used uninitialised' in the following code that won't be
reached anyway.
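A minimal sketch of the technique, assuming a C11 compiler. The names
UNREACHABLE_SKETCH and unreachable_internal_sketch are hypothetical,
and the real macro may be spelled and structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Explicitly no-return, so the compiler knows control stops here even
 * in a build where assert() compiles to nothing. */
static _Noreturn void unreachable_internal_sketch(void)
{
    abort();
}

/* The assert provides the message in a typical build; the function
 * call provides the no-return guarantee. */
#define UNREACHABLE_SKETCH(msg) \
    (assert(false && msg), unreachable_internal_sketch())

static int sign_value(char sign)
{
    switch (sign) {
      case '+':
        return 1;
      case '-':
        return -1;
      default:
        UNREACHABLE_SKETCH("bad sign character");
        /* no bogus 'return 0' needed after this point */
    }
}
```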
I've gone through the existing code looking for the assert(false) /
assert(0) idiom and replaced all the ones I found with the new macro,
which also meant I could remove a few pointless return statements and
variable initialisations that I'd already had to put in to placate
compiler front ends.
In the big boolification commit (3214563d8) I accidentally rewrote
"term->wrap == 0" as "term->wrap" instead of as "!term->wrap", so now
sending the backspace character to the terminal at the start of a line
causes the cursor to wrap round to the end of the previous line if and
only if it _shouldn't_ have done.
A long time ago, in commit 4d77b6567, I moved the generation of the
arrow-key escape sequences into a function format_arrow_key(). Mostly
the reason for that was a special purpose I had in mind at the time
which involved auto-generating the same sequences in response to
things other than a keypress, but I always thought it would be nice to
centralise a lot more of PuTTY's complicated keyboard handling in the
same way - at least the handling of the function keys and their
numerous static and dynamic config options.
In this year's general spirit of tidying up and refactoring, I think
it's finally time. So here I introduce three more centralised
functions for dealing with the numbered function keys, the small
keypad (Ins, Home, PgUp etc) and the numeric keypad. Lots of horrible
and duplicated code from the key handling functions in window.c and
gtkwin.c is now more sensibly centralised: each platform keyboard
handler concerns itself with the local format of a keyboard event and
platform-specific enumeration of key codes, and once it's decided what
the logical key press actually _is_, it hands off to the new functions
in terminal.c to generate the appropriate escape code.
Mostly this is intended to be a refactoring without functional change,
leaving the keyboard handling how it's always been. But in cases where
the Windows and GTK handlers were accidentally inconsistent, I've
fixed the inconsistency rather than carefully keeping both sides how
they were. Known consistency fixes:
- swapping the arrow keys between normal (ESC [ A) and application
(ESC O A) is now done by pressing Ctrl with them, and _not_ by
pressing Shift. That was how it was always supposed to work, and
how it's worked on GTK all along, but on Windows it's been done by
Shift as well since 2010, due to a bug at the call site of
format_arrow_key() introduced when I originally wrote that function.
- in Xterm function key mode plus application keypad mode, the /*-
keys on the numeric keypad now send ESC O {o,j,m} in place of ESC O
{Q,R,S}. That's how the Windows keyboard handler has worked all
along (it was a deliberate behaviour tweak for the Xterm-like
function key mode, because in that mode ESC O {Q,R,S} are generated
by F2-F4). But the GTK keyboard handler omitted that particular
special case and was still sending ESC O {Q,R,S} for those keys in
all application keypad modes.
- also in Xterm function key mode plus app keypad mode, we only
generate the app-keypad escape sequences if Num Lock is on; with
Num Lock off, the numeric keypad becomes arrow keys and
Home/End/etc, just as it would in non-app-keypad mode. Windows has
done this all along, but again, GTK lacked that special case.
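The first consistency fix above can be sketched like this. It is a
simplification: format_arrow_key_sketch is a hypothetical stand-in for
format_arrow_key(), whose real signature isn't shown here, and Shift
is left out entirely because it no longer affects the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical formatter. 'xkey' is the final letter of the sequence
 * ('A' to 'D'); 'app_cursor_keys' is the terminal's current mode.
 * Holding Ctrl swaps between normal and application mode. Returns the
 * length of the sequence written to 'buf'. */
static int format_arrow_key_sketch(char *buf, size_t buflen, char xkey,
                                   bool app_cursor_keys, bool ctrl)
{
    bool app = app_cursor_keys;

    assert(xkey >= 'A' && xkey <= 'D');
    if (ctrl)
        app = !app;              /* Ctrl, and not Shift, flips the mode */
    /* ESC O x in application mode, ESC [ x in normal mode */
    return snprintf(buf, buflen, "\033%c%c", app ? 'O' : '[', xkey);
}
```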
'struct str' in terminal.c was an earlier and less good implementation
of the same concept as misc.h's strbuf, so I've replaced it with the
same strbuf we have everywhere. As a bonus, this means I can also use
put_uint{16,32} to save a bit of effort writing out the compressed
scrollback data.
On the decompression side, I've also switched to using BinarySource,
which has the advantage that now if the decoding goes wrong we can at
least be sure of not reading beyond the end of the buffer.
(The flip side of that is that now we _store_ the length of each
compressed line buffer, which costs a bit of memory. But I think it's
worth it for the safety and code consistency.)
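The safety property comes from carrying the buffer length around with
the read position. A minimal sketch of that idea follows; it is not
the BinarySource API, the names reader_sketch and read_uint16_sketch
are hypothetical, and the big-endian byte order is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounded reader: because it knows the buffer length, a
 * bad decode sets an error flag instead of reading past the end. */
typedef struct {
    const unsigned char *data;
    size_t len;                  /* total bytes available */
    size_t pos;                  /* next byte to read */
    bool err;                    /* set once any read has failed */
} reader_sketch;

static uint16_t read_uint16_sketch(reader_sketch *r)
{
    assert(r->pos <= r->len);
    if (r->err || r->len - r->pos < 2) {
        r->err = true;           /* sticky: later reads fail too */
        return 0;
    }
    /* Assumed byte order: most significant byte first. */
    uint16_t v = (uint16_t)((r->data[r->pos] << 8) | r->data[r->pos + 1]);
    r->pos += 2;
    return v;
}
```

The caller can do a whole series of reads and check the error flag
once at the end.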
This is another cleanup I felt a need for while I was doing
boolification. If you define a function or variable in one .c file and
declare it extern in another, then nothing will check you haven't got
the types of the two declarations mismatched - so when you're
_changing_ the type, it's a pain to make sure you've caught all the
copies of it.
It's better to put all those extern declarations in header files, so
that the declaration in the header is also in scope for the
definition. Then the compiler will complain if they don't match, which
is what I want.
My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a,b,c were int and TRUE was just a
macro for an integer constant. But now that a,b,c are bool and the
value assigned is the 'true' defined by stdbool.h, that idiom provokes
a warning from gcc: 'suggest parentheses around assignment used as
truth value'!
I think this is the full set of things that ought logically to be
boolean.
One annoyance is that quite a few radio-button controls in config.c
address Conf fields that are now bool rather than int, which means
that the shared handler function can't just access them all with
conf_{get,set}_int. Rather than back out the rigorous separation of
int and bool in conf.c itself, I've just added a similar alternative
handler function for the bool-typed ones.
This commit includes <stdbool.h> from defs.h and deletes my
traditional definitions of TRUE and FALSE, but other than that, it's a
100% mechanical search-and-replace transforming all uses of TRUE and
FALSE into the C99-standardised lowercase spellings.
No actual types are changed in this commit; that will come next. This
is just getting the noise out of the way, so that subsequent commits
can have a higher proportion of signal.
After the recent Seat and LogContext revamps, _nearly_ all the
remaining uses of the type 'Frontend' were in terminal.c, which needs
all sorts of interactions with the GUI window the terminal lives in,
from the obvious (actually drawing text on the window, reading and
writing the clipboard) to the obscure (minimising, maximising and
moving the window in response to particular escape sequences).
All of those functions are now provided by an abstraction called
TermWin. The few remaining uses of Frontend after _that_ are internal
to a particular platform directory, so as to spread the implementation
of that particular kind of Frontend between multiple source files; so
I've renamed all of those so that they take a more specifically named
type that refers to the particular implementation rather than the
general abstraction.
So now the name 'Frontend' no longer exists in the code base at all,
and everywhere one used to be used, it's completely clear whether it
was operating in one of Frontend's three abstract roles (and if so,
which), or whether it was specific to a particular implementation.
Another type that's disappeared is 'Context', which used to be a
typedef defined to something different on each platform, describing
whatever short-lived resources were necessary to draw on the terminal
window: the front end would provide a ready-made one when calling
term_paint, and the terminal could request one with get_ctx/free_ctx
if it wanted to do proactive window updates. Now the drawing context
lives inside the TermWin itself, because there was never any need to
have two of those contexts live at the same time.
(Another minor API change is that the window-title functions - both
reading and writing - have had a missing 'const' added to their char *
parameters / return values.)
I don't expect this change to enable any particularly interesting new
functionality (in particular, I have no plans that need more than one
implementation of TermWin in the same application). But it completes
the tidying-up that began with the Seat and LogContext rework.
In the very old days, when PuTTY was new and computers were slow, I
tried to implement a feature where scrolling the window would be
implemented using a fast rectangle-copy GDI operation, rather than an
expensive character-by-character redraw of all the changed areas.
It never quite worked right, and I ended up conditioning it out on
Windows, and never even tried to implement it on GTK. It's now been
sitting around unused for so long that I think it's no longer worth
keeping in the code at all - if I tried to put it back in, it surely
wouldn't even compile, and would need rewriting from scratch anyway.
Disturbingly, it looks as if I _tried_ to re-enable it at one point,
in that there was a '#define OPTIMISE_IS_SCROLL 1' in putty.h - but
that never had any effect, because the macro name is misspelled. All
the #ifdefs are for 'OPTIMISE_SCROLL', without the 'IS'. So despite
appearances, it really _has_ been conditioned out all along!
Now there's a centralised routine in misc.c to do the sanitisation,
which copies data on to an outgoing bufchain. This allows me to remove
from_backend_untrusted() completely from the frontend API, simplifying
code in several places.
Two use cases for untrusted-terminal-data sanitisation were in the
terminal.c prompts handler, and in the collection of SSH-2 userauth
banners. Both of those were writing output to a bufchain anyway, so
it was very convenient to just replace a bufchain_add with
sanitise_term_data and then not have to worry about it again.
There was also a simplistic sanitiser in uxcons.c, which I've now
replaced with a call to the good one - and in wincons.c there was a
FIXME saying I ought to get round to that, which now I have!
This is another major source of unexplained 'void *' parameters
throughout the code.
In particular, the currently unused testback.c actually gave the wrong
pointer type to its internal store of the frontend handle - it cast
the input void * to a Terminal *, from which it got implicitly cast
back again when calling from_backend, and nobody noticed. Now it uses
the right type internally as well as externally.
Nearly every part of the code that ever handles a full backend
structure has historically done it using a pair of pointer variables,
one pointing at a constant struct full of function pointers, and the
other pointing to a 'void *' state object that's passed to each of
those.
While I'm modernising the rest of the code, this seems like a good
time to turn that into the same more or less type-safe and less
cumbersome system as I'm using for other parts of the code, such as
Socket, Plug, BinaryPacketProtocol and so forth: the Backend structure
contains a vtable pointer, and a system of macro wrappers handles
dispatching through that vtable.
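The general shape of that system is sketched below. It is a toy, not
the real Backend interface: the structure names carry a Sketch suffix,
the vtable has a single made-up method, and CountingBackend is an
invented implementation that just counts bytes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct BackendSketch BackendSketch;

/* Constant table of function pointers, shared by all instances. */
typedef struct {
    size_t (*send)(BackendSketch *be, const char *buf, size_t len);
} BackendSketchVtable;

/* The public object holds nothing but the vtable pointer. */
struct BackendSketch {
    const BackendSketchVtable *vt;
};

/* Macro wrapper that dispatches through the vtable. */
#define backend_sketch_send(be, buf, len) ((be)->vt->send(be, buf, len))

/* One implementation, with its private state wrapped around the
 * public part instead of hidden behind a 'void *'. */
typedef struct {
    size_t total;
    BackendSketch backend;
} CountingBackend;

static size_t counting_send(BackendSketch *be, const char *buf, size_t len)
{
    /* Step back from the embedded public part to the full object. */
    CountingBackend *cb = (CountingBackend *)
        ((char *)be - offsetof(CountingBackend, backend));

    assert(buf != NULL || len == 0);
    cb->total += len;
    return cb->total;
}

static const BackendSketchVtable counting_vt = { counting_send };
```

Callers then hold a single BackendSketch pointer instead of a vtable
and state pair, and the compiler can tell it apart from other pointer
types.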
Same principle again - the more of these structures have globally
visible tags (even if the structure contents are still opaque in most
places), the fewer of them I can mistake for each other.
This is a cleanup I started to notice a need for during the BinarySink
work. It removes a lot of faffing about casting things to char * or
unsigned char * so that some API will accept them, even though lots of
such APIs really take a plain 'block of raw binary data' argument and
don't care what C thinks the signedness of that data might be - they
may well reinterpret it back and forth internally.
So I've tried to arrange for all the function call APIs that ought to
have a void * (or const void *) to have one, and those that need to do
pointer arithmetic on the parameter internally can cast it back at the
top of the function. That saves endless ad-hoc casts at the call
sites.
This causes the previous graphic character to be displayed another Pn
times (defaulting to 1, as usual). I just found out about it because
Ubuntu 18.04's ncurses expects it to be honoured.
According to all-escapes, REP is only supposed to be used when the
thing immediately preceding it in the terminal data stream _is_ a
printing character, and if not, then the behaviour is undefined. But
'undefined' is good enough for me to do the simple thing of just
remembering the last graphic character no matter whether anything else
has intervened since then.
To avoid DoS attacks using this escape sequence with a really huge Pn,
I clamp the value at the total size of the screen. There might be ways
to do that with more finesse (e.g. reduce it mod the width so that the
screen ends up looking the way it should even for huge parameters, or
reduce it even further if we notice the terminal isn't in wrapping
modes), but this will do for now.
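The clamp amounts to something like the following sketch. The helper
name rep_count_sketch is hypothetical, and it assumes an absent
parameter arrives as 0.

```c
#include <assert.h>

/* Hypothetical helper: how many times REP should repeat the last
 * graphic character, given its parameter and the screen size. */
static unsigned long rep_count_sketch(unsigned long pn,
                                      unsigned rows, unsigned cols)
{
    unsigned long limit;

    assert(rows > 0 && cols > 0);
    limit = (unsigned long)rows * cols;  /* total size of the screen */
    if (pn == 0)
        pn = 1;                          /* default to 1, as usual */
    if (pn > limit)
        pn = limit;                      /* bound the work per escape */
    return pn;
}
```

On an 80x24 screen that caps the repeat count at 1920, however large
the parameter.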
I'm about to want to implement an escape sequence that causes a
graphic character to be printed, which means I'll need the code that
does so to be in a separate routine that I can call easily, instead of
buried a few loops deep in the middle of the main state machine.
NFC for the moment, because the bufchain is always specially
constructed to hold exactly the same data that would have been passed
in to the function as a (pointer,length) pair. But this API change
allows get_userpass_input to express the idea that it consumed some
but not all of the data in the bufchain, which means that later on
I'll be able to point the same function at a longer-lived bufchain
containing the full stream of keyboard input and avoid dropping
keystrokes that arrive too quickly after the end of an interactive
password prompt.
Thanks to Jiri Kaspar for sending this patch (apart from the new docs
section, which is in my own words), which implements a feature we've
had as a wishlist item ('utf8-plus-vt100') for a long time.
I was actually surprised it was possible to implement it in so few
lines of code! I'd forgotten, or possibly never noticed in the first
place, that even in UTF-8 mode PuTTY not only accepts but still
_processes_ all the ISO 2022 control sequences and shift characters,
and keeps running track of all the same state in term->cset and
term->cset_attrs that it tracks in ISO-2022-enabled modes. It's just
that in UTF-8 mode, at the very last minute when a character+attribute
pair is about to be written into the terminal's character buffer, it
deliberately ignores the contents of those variables.
So all that was needed was a new flag checked at that last moment
which causes it not quite to ignore them after all, and bingo,
utf8-plus-vt100 is supported. And it works no matter which ISO 2022
sequences you're using; whether you're using ESC ( 0 to select the
line drawing set directly into GL and ESC ( B to get back when you're
done, or whether you send a preliminary ESC ( B ESC ) 0 to get GL/GR
to be ASCII and line drawing respectively so you can use SI and SO as
one-byte mode switches thereafter, both work just as well.
This implementation strategy has a couple of consequences, which I
don't think matter very much one way or the other but I document them
just in case they turn out to be important later:
- if an application expecting this mode has already filled your
terminal window with lqqqqqqqqk, then enabling this mode in Change
Settings won't retroactively turn them into the line drawing
characters you wanted, because no memory is preserved in the screen
buffer of what the ISO 2022 state was when they were printed. So
the application still has to do a screen refresh.
- on the other hand, if you already sent the ESC ( 0 or whatever to
put the terminal _into_ line drawing mode, and then you turn on
this mode in Change Settings, you _will_ still be in line drawing
mode, because the system _does_ remember your current ISO 2022
state at all times, whether it's currently applying it to output
printing characters or not.
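The 'last moment' check can be sketched as below. All the names are
hypothetical (the real state lives in term->cset and
term->cset_attrs), and the lookup covers only six of the VT100 line
drawing characters, enough to draw a box.

```c
#include <assert.h>
#include <stdbool.h>

enum { SK_CSET_ASCII, SK_CSET_LINEDRW };

/* Partial VT100 line drawing lookup; the rest of the set is omitted. */
static unsigned long linedraw_lookup_sketch(unsigned long ch)
{
    switch (ch) {
      case 'j': return 0x2518;   /* lower right corner */
      case 'k': return 0x2510;   /* upper right corner */
      case 'l': return 0x250C;   /* upper left corner */
      case 'm': return 0x2514;   /* lower left corner */
      case 'q': return 0x2500;   /* horizontal line */
      case 'x': return 0x2502;   /* vertical line */
      default:  return ch;
    }
}

/* Decide what goes into the character buffer. 'cset' is tracked at all
 * times; 'utf8_linedraw' is the new flag deciding whether UTF-8 mode
 * pays attention to it. */
static unsigned long output_char_sketch(unsigned long ch, int cset,
                                        bool utf8, bool utf8_linedraw)
{
    assert(cset == SK_CSET_ASCII || cset == SK_CSET_LINEDRW);
    if (utf8 && !utf8_linedraw)
        return ch;               /* old behaviour: ignore ISO 2022 state */
    if (cset == SK_CSET_LINEDRW)
        return linedraw_lookup_sketch(ch);
    return ch;
}
```

Because the character set state is tracked regardless of the flag,
turning the flag on mid-session takes effect from the next character
printed, which matches the second consequence above.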
My theory that this report was completely obsolete seems to have been
scuppered, in the most infuriating way possible: a user sent a report
from 0.70 of a null-pointer crash happening moments _before_ that
check, because the compressed line pointer passed to decompressline()
was NULL.
So there's still some need for this thing after all, and moreover, it
should be happening just before that decompressline() call as well as
after it!
This is a mild security measure against malicious clipboard-writing.
It's only mild, because of course there are situations in which even a
sanitised paste could be successfully malicious (imagine someone
managing to write the traditional 'rm -rf' command into your clipboard
when you were going to paste to a shell prompt); but it at least
allows pasting into typical text editors without also allowing the
control sequence that exits the editor UI and returns to the shell
prompt.
This is a configurable option, because there's no well defined line to
be drawn between acceptable and unacceptable pastes, and it's very
plausible that users will have sensible use cases for pasting things
outside the list of permitted characters, or cases in which they know
they trust the clipboard-writer. I for one certainly find it useful on
occasion to deliberately construct a paste containing control
sequences that automate a terminal-based UI.
While I'm at it, when bracketed paste mode is enabled, we also prevent
pasting of data that includes the 'end bracketed paste' sequence
somewhere in the middle. I really _hope_ nobody was treating bracketed
paste mode as a key part of their security boundary, but then again, I
also can't imagine that anyone had an actually sensible use case for
deliberately making a bracketed paste be only partly bracketed, and
it's an easy change while I'm messing about in this area anyway.
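Detecting the 'end bracketed paste' sequence (ESC [ 201 ~) in the
middle of a paste can be sketched as follows. This only shows the
detection step; what is then done with the offending paste isn't
modelled, the function name is hypothetical, and the paste is treated
as plain bytes for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: does the paste data contain the sequence that
 * ends a bracketed paste? */
static bool paste_has_end_bracket_sketch(const char *data, size_t len)
{
    static const char end_seq[] = "\033[201~";
    const size_t seqlen = sizeof(end_seq) - 1;   /* without the NUL */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i + seqlen <= len; i++)
        if (memcmp(data + i, end_seq, seqlen) == 0)
            return true;
    return false;
}
```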
People sometimes send in cut-and-pastes of that dialog box from very
old versions of PuTTY. This can usually be detected because the
'lineno' field in the error message refers to a line number in
terminal.c which doesn't have a call to lineptr() or scrlineptr() on
it _now_ but used to a long time ago. But that's a pretty roundabout
way to detect anything, so let's put some more reliable version
information in the error message.
(This might also provide a way to test the hypothesis that whatever
bug used to cause this dialog box to appear is now fixed, and that
_all_ remaining reports of this error message are from outdated
builds.)
This stores the last text selected in _this_ terminal, regardless of
whether any other application has since taken back whatever system
clipboard we also copied it to. It's written unconditionally whenever
text is selected in terminal.c.
The main purpose of this will be that it's also the place that you can
go and find the data you need to write to a system clipboard in
response to an explicit Copy operation. But it can also act as a data
source for pastes in its own right, so you can use it to implement an
intra-window private extra clipboard if that's useful. (OS X Terminal
has one of those, so _someone_ at least seems to like the idea.)
This lays some groundwork for making PuTTY's cut and paste handling
more flexible in the area of which clipboard(s) it reads and writes,
if more than one is available on the system.
I've introduced a system of list macros which define an enumeration of
integer clipboard ids, some defined centrally in putty.h (at present
just a CLIP_NULL which never has any text in it, because that seems
like the sort of thing that will come in useful for configuring a
given copy or paste UI action to be ignored) and some defined per
platform. All the front end functions that copy and paste take a
clipboard id, and the Terminal structure is now configured at startup
to tell it which clipboard id it should paste from on a mouse click,
and which it should copy from on a selection.
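A list-macro scheme of this kind might look like the sketch below.
CLIP_NULL and CLIP_SYSTEM are the ids mentioned in this message;
the macro names, the description strings and the lookup function are
all hypothetical.

```c
#include <assert.h>

/* Central list, of the kind that could live in a shared header. */
#define CLIPLIST_CENTRAL_SKETCH(X) \
    X(CLIP_NULL, "null clipboard")

/* Per-platform list, of the kind each front end could define. */
#define CLIPLIST_PLATFORM_SKETCH(X) \
    X(CLIP_SYSTEM, "system clipboard")

/* Expand both lists once into an enumeration of integer ids... */
#define CLIP_ENUM_SKETCH(id, name) id,
enum {
    CLIPLIST_CENTRAL_SKETCH(CLIP_ENUM_SKETCH)
    CLIPLIST_PLATFORM_SKETCH(CLIP_ENUM_SKETCH)
    N_CLIPBOARDS_SKETCH
};

/* ...and once into a parallel table of descriptions, so the two can
 * never get out of step with each other. */
#define CLIP_NAME_SKETCH(id, name) name,
static const char *const clipboard_names_sketch[] = {
    CLIPLIST_CENTRAL_SKETCH(CLIP_NAME_SKETCH)
    CLIPLIST_PLATFORM_SKETCH(CLIP_NAME_SKETCH)
};

static const char *clipboard_name_sketch(int id)
{
    assert(id >= 0 && id < N_CLIPBOARDS_SKETCH);
    return clipboard_names_sketch[id];
}
```

Adding a clipboard then means adding one line to one list, and every
table built from the lists picks it up.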
However, I haven't actually added _real_ support for multiple X11
clipboards, in that the Unix front end supports a single CLIP_SYSTEM
regardless of whether it's in OS X or GTK mode. So this is currently an
NFC refactoring which does nothing but prepare the way for real
changes to come.
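As a rough illustration of the list-macro idea (a sketch only: the macro names and the name strings are made up for this example, and only CLIP_NULL and CLIP_SYSTEM come from the text above), one list can be expanded once into the enumeration of ids and again into a parallel table:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical list macros: the names below are illustrative, not
 * necessarily the ones used in putty.h or the platform headers. */
#define CROSS_PLATFORM_CLIPBOARDS(X) \
    X(CLIP_NULL, "null clipboard")

#define PLATFORM_CLIPBOARDS(X) \
    X(CLIP_SYSTEM, "system clipboard")

/* Expand both lists into one enumeration of integer clipboard ids. */
#define CLIP_ID(id, name) id,
enum {
    CROSS_PLATFORM_CLIPBOARDS(CLIP_ID)
    PLATFORM_CLIPBOARDS(CLIP_ID)
    N_CLIPBOARDS
};
#undef CLIP_ID

/* Expand the same lists again into a parallel table of names. */
#define CLIP_NAME(id, name) name,
static const char *const clipboard_names[] = {
    CROSS_PLATFORM_CLIPBOARDS(CLIP_NAME)
    PLATFORM_CLIPBOARDS(CLIP_NAME)
};
#undef CLIP_NAME

/* Look up the descriptive name for a clipboard id. */
static const char *clipboard_name(int id)
{
    assert(id >= 0 && id < N_CLIPBOARDS);
    return clipboard_names[id];
}
```

Adding a clipboard then means adding one X(...) line to the appropriate list, and every expansion picks it up.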
Previously, both the Unix and Windows front ends would respond to a
paste action by retrieving data from the system clipboard, converting
it appropriately, _storing_ it in a persistent dynamic data block
inside the front end, and then calling term_do_paste(term), which in
turn would call back to the front end via get_clip() to retrieve the
current contents of that stored data block.
But, as far as I can tell, this was a completely pointless mechanism,
because after a data block was written into this storage area, it
would be immediately used for exactly one paste, and then never
accessed again until the next paste action caused it to be freed and
replaced with a new chunk of pasted data.
So why on earth was it stored persistently at all, and why that
callback mechanism from frontend to terminal back to frontend to
retrieve it for the actual paste action? I have no idea. This change
removes the entire system and replaces it with the completely obvious
alternative: the character-set-converted version of paste data is
allocated in a _local_ variable in the frontend paste functions,
passed directly to term_do_paste which now takes (buffer,length)
parameters, and freed immediately afterwards. get_clip() is gone.
It's an incoherent concept! There should not be any such thing as an
error box that terminates the entire program but is not modal. If it's
bad enough to terminate the whole program, i.e. _all_ currently live
connections, then there's no point in permitting progress to continue
in windows other than the affected one, because all windows are
affected anyway.
So all previous uses of fatalbox() have become modalfatalbox(), except
those which looked to me as if they shouldn't have been fatal in the
first place, e.g. lingering pieces of error handling in winnet.c which
ought to have had the severity of 'give up on this particular Socket
and close it' rather than 'give up on the ENTIRE UNIVERSE'.
After fixing the previous two bugs, I thought it was probably a good
idea to re-check _everywhere_ in terminal.c where curr_attr is used,
to make sure that if curr_truecolour also needed updating at the same
time then that was being done.
I spotted this myself while looking through the code in search of the
cause of the background-colour-erase bug: saving and restoring the
cursor via ESC 7 / ESC 8 ought to also save and restore the current
graphics rendition attributes including foreground and background
colour settings, but it was not saving and restoring the new
term->curr_truecolour along with term->curr_attr.
So there's now a term->save_truecolour to keep that in, and also a
term->alt_save_truecolour to take account of the fact that all the
saved cursor state variables get swapped out _again_ when switching
between the main and alternate screens.
(However, there is not a term->alt_truecolour to complete the cross
product, because the _active_ graphics rendition is carried over when
switching between the terminal screens; it's only the _saved_ one from
ESC 7 / ESC 8 that is saved separately. That's consistent with the
behaviour we've had all along for ordinary fg/bg colour selection.)
I've done this on a 'where possible' basis: in Windows paletted mode
(in case anyone is still using an old enough graphics card to need
that!) I simply haven't bothered, and will completely ignore the dim
flag.
Markus Gans points out that some applications which (not at all
unreasonably) don't trust $TERM to tell them the full capabilities of
their terminal will use the sequence "OSC 4 ; nn ; ? BEL" to ask for
the colour-palette value in position nn, and they may not particularly
care _what_ the results are but they will use them to decide whether
the right number of colour palette entries even exist.
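A minimal sketch of answering such a query, assuming the reply follows xterm's "rgb:RRRR/GGGG/BBBB" convention and is terminated by BEL like the query. The function name, its signature and the idea of returning 0 for a nonexistent entry are assumptions for illustration, not PuTTY's actual code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a reply to "OSC 4 ; nn ; ? BEL" for palette entry nn.
 * Returns the reply length, or 0 if the entry does not exist or the
 * buffer is too small (in which case nothing should be sent). */
static int format_palette_reply(char *buf, size_t size, unsigned nn,
                                unsigned ncolours,
                                unsigned r, unsigned g, unsigned b)
{
    int len;
    assert(r <= 255 && g <= 255 && b <= 255);
    if (nn >= ncolours)
        return 0;
    /* Scale each 8-bit channel to 16 bits by repeating the byte. */
    len = snprintf(buf, size, "\033]4;%u;rgb:%04x/%04x/%04x\007",
                   nn, r * 0x101, g * 0x101, b * 0x101);
    if (len < 0 || (size_t)len >= size)
        return 0;
    return len;
}
```

An application probing for 256-colour support would ask about a high-numbered entry and treat the absence of a reply as "no such entry".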
I know some users don't like any colour _at all_, and we have a
separate option to turn off xterm-style 256-colour sequences, so it
seems remiss not to have an option to disable true colour as well.
A mouse drag which manages to reach x < 0 (via SetCapture or
equivalent) was treated as having the coordinates of (x_max, y-1).
This is intended to be useful when the mouse drag is part of ordinary
raster-ordered selection.
But we were leaving that treatment enabled even for mouse actions that
went to xterm mouse tracking mode - thanks to Markus Gans for
reporting that - and when I investigated, I realised that this isn't a
sensible transformation in _rectangular_ selection mode either. Fixed
both.
This is a heavily rewritten version of a patch originally by Lorenz
Diener; it was tidied up somewhat by Christian Brabandt, and then
tidied up more by me. The basic idea is to add to the termchar
structure a pair of small structs encoding 24-bit RGB values, each
with a flag indicating whether it's turned on; if it is, it overrides
any other specification of fg or bg colour for that character cell.
I've added a test line to colours.txt containing a few example colours
from /usr/share/X11/rgb.txt. In fact it makes quite a good demo to run
the whole of rgb.txt through this treatment, with a command such as
    perl -pe 's!^\s*(\d+)\s+(\d+)\s+(\d+).*$!\e[38;2;$1;$2;$3m$&\e[m!' rgb.txt
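The per-cell override described above might look roughly like this. It is a sketch under assumptions: the type and function names are hypothetical, and the real termchar structure carries much more than a colour pair.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the real definitions live in PuTTY's headers. */
typedef struct {
    bool enabled;              /* does this override the palette colour? */
    unsigned char r, g, b;     /* 24-bit RGB value */
} optional_rgb;

typedef struct {
    optional_rgb fg, bg;       /* one optional override each for fg and bg */
} cell_truecolour;

typedef struct {
    unsigned char r, g, b;
} rgb;

/* Pick the colour to draw with: the true-colour value if it is turned on,
 * otherwise whatever the ordinary attribute bits selected. */
static rgb effective_colour(optional_rgb tc, rgb palette_colour)
{
    if (tc.enabled) {
        rgb out = { tc.r, tc.g, tc.b };
        return out;
    }
    return palette_colour;
}
```

Keeping the flag next to the RGB bytes means a cell with no true colour set costs only a flag test at draw time.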
A user reports that a remote window title query, if the window title
is empty or if the option to return it is disabled, fails the
assertion in ldisc_send that I introduced as part of commit c269dd013
to catch any lingering uses of ldisc_send with length 0 that should
have turned into ldisc_echoedit_update. Added a check for len > 0
guarding that ldisc_send call, and likewise at one or two others I
noticed on my way here.
(Probably at some point I should decide that the period of smoking out
lingering old-style ldisc_send(0) calls is over, and declare it safe
to remove that assertion again and get rid of all the cumbersome
safety checks at call sites like these ones. But not quite yet.)
Ilya Shipitsin sent me a list of errors reported by a tool 'cppcheck',
which I hadn't seen before, together with some fixes for things
already taken off that list. This change picks out all the things from
the remaining list that I could quickly identify as actual errors,
which it turns out are all format-string goofs along the lines of
using a %d with an unsigned int, or a %u with a signed int, or (in the
cases in charset/utf8.c) an actual _size_ mismatch which could in
principle have caused trouble on a big-endian target.
Copying large scrollback buffers to the clipboard can take a long time,
up to several minutes. Doubling the size of the clipboard copy buffer
when more space is needed, instead of just adding a small constant size,
significantly speeds up clipboard copies of large scrollback buffers.
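To show why doubling helps, here is a generic growable buffer (a sketch with hypothetical names, not the actual clipboard code). Growing by a constant amount makes the number of reallocations proportional to the data size, whereas doubling keeps it logarithmic:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, not PuTTY's actual clipboard code. */
typedef struct {
    char *data;
    size_t len, size;
    unsigned reallocs;   /* counts reallocations, for illustration only */
} growbuf;

/* Append n bytes, doubling the capacity whenever space runs out.
 * Returns 1 on success, 0 if memory could not be allocated. */
static int growbuf_append(growbuf *b, const char *src, size_t n)
{
    if (n > b->size - b->len) {
        size_t newsize = b->size ? b->size : 16;
        char *newdata;
        while (n > newsize - b->len) {
            if (newsize > (size_t)-1 / 2)
                return 0;               /* doubling would overflow */
            newsize *= 2;
        }
        newdata = realloc(b->data, newsize);
        if (!newdata)
            return 0;
        b->data = newdata;
        b->size = newsize;
        b->reallocs++;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 1;
}
```

Appending 100000 single bytes to an empty buffer this way needs only 14 reallocations, since the capacity goes 16, 32, 64 and so on up to 131072.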
An opcode for this was recently published in
https://tools.ietf.org/html/draft-sgtatham-secsh-iutf8-00 .
The default setting is conditional on frontend_is_utf8(), which is
consistent with the pty back end's policy for setting the same flag
locally. Of course, users can override the setting either way in the
GUI configurer, the same as all other tty modes.
This is a minimal fix for CVE-2015-5309, and while it's probably
unnecessary now, it seems worth committing for defence in depth and to
give downstreams something reasonably non-intrusive to cherry-pick.
Parameters are now accumulated in unsigned integers and carefully checked
for overflow (which is turned into saturation). Things that consume them
now have explicit range checks (again, saturating) to ensure that their
inputs are sane. This should make it much harder to cause overflow by
supplying ludicrously large numbers.
Fixes two bugs found with the help of afl-fuzz. One of them may be
exploitable and is CVE-2015-5309.
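The saturating accumulation described above can be sketched as follows. The helper names are hypothetical and the real parser is structured differently; the point is only that an overlong number sticks at the maximum instead of wrapping round, and that consumers clamp again before use.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: append one decimal digit to an escape-sequence
 * parameter, saturating at UINT_MAX instead of wrapping round. */
static unsigned accumulate_digit(unsigned value, char digit)
{
    unsigned d;
    assert(digit >= '0' && digit <= '9');
    d = (unsigned)(digit - '0');
    if (value > (UINT_MAX - d) / 10)
        return UINT_MAX;            /* value * 10 + d would overflow */
    return value * 10 + d;
}

/* Hypothetical consumer-side check: clamp a parameter into [lo, hi]. */
static unsigned clamp_param(unsigned value, unsigned lo, unsigned hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}
```

Once the value has saturated, further digits leave it at UINT_MAX, so a hostile sequence of any length produces a predictable result.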
I broke it as a side effect of commit 30e63c105, in which I intended
to ignore mouse drag events that hadn't been preceded by a click. I
didn't spot that right-clicks (assuming Unix-style button mappings) go
through the same code path as left-drags, and hence were being ignored
even though they _were_ their own initiating click.
On OS X GTK, when you click in a pterm that wasn't the active window,
the first click activates it but is swallowed by the windowing system
- but a subsequent tiny drag can still be taken as part of a selection
action, making it difficult to activate the window in order to paste
into it.
Fixed by ignoring mouse drags when the terminal.c mouse state was
NO_SELECTION; if we've seen one prior click then it should be
ABOUT_TO, or DRAGGING if we saw a double or triple click.
The original version of the xterm mouse tracking protocol did not
support character-cell coordinates greater than 223. If term_mouse()
got one, it would fail to construct an escape sequence for the mouse
event, and would then call ldisc_send() with a zero-length string -
which fails an assertion that I added in November (c269dd0135) on the
occasion of moving ldisc_echoedit_update() into its own function. So
the corresponding operation before that change would have done a
gratuitous ldisc_echoedit_update(), which is exactly the sort of thing
the assertion was there to catch :-)
Later extensions to the mouse tracking protocol support larger
coordinates anyway (try ESC[?1006h or ESC[?1015h in addition to the
ESC[?1000h that turns the whole system on in the first place). It's
only clients that don't use one of those extensions which would have
had the problem.
Thanks to Mirko Wolle for the report.
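For illustration, a sketch of the two encodings for a button press. The function name and signature are hypothetical, and only the original protocol and the 1006 extension are shown. The original form sends each value as a single byte offset by 32, which is where the limit of 223 comes from.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical encoder for a button-press report at 1-based cell (x, y).
 * Returns the number of bytes written, or 0 if the event cannot be
 * represented; callers should skip sending when the result is 0. */
static int encode_mouse_press(char *buf, size_t size, int button,
                              int x, int y, int sgr_mode)
{
    int len;
    assert(button >= 0 && button <= 2 && x >= 1 && y >= 1);
    if (sgr_mode) {
        /* Mode 1006: decimal parameters, so no coordinate limit. */
        len = snprintf(buf, size, "\033[<%d;%d;%dM", button, x, y);
    } else {
        /* Original protocol: each value is sent as one byte, offset
         * by 32, so nothing above 223 fits. */
        if (x > 223 || y > 223)
            return 0;
        len = snprintf(buf, size, "\033[M%c%c%c",
                       32 + button, 32 + x, 32 + y);
    }
    return (len > 0 && (size_t)len < size) ? len : 0;
}
```

A caller would check for a nonzero length before handing the buffer to ldisc_send(), which avoids the zero-length call that tripped the assertion.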
Having found a lot of unfixed constness issues in recent development,
I thought perhaps it was time to get proactive, so I compiled the
whole codebase with -Wwrite-strings. That turned up a huge load of
const problems, which I've fixed in this commit: the Unix build now
goes cleanly through with -Wwrite-strings, and the Windows build is as
close as I could get it (there are some lingering issues due to
occasional Windows API functions like AcquireCredentialsHandle not
having the right constness).
Notable fallout beyond the purely mechanical changing of types:
 - the stuff saved by cmdline_save_param() is now explicitly
   dupstr()ed, and freed in cmdline_run_saved.
 - I couldn't make both string arguments to cmdline_process_param()
   const, because it intentionally writes to one of them in the case
   where it's the argument to -pw (in the vain hope of being at least
   slightly friendly to 'ps'), so elsewhere I had to temporarily
   dupstr() something for the sake of passing it to that function.
 - I had to invent a silly parallel version of const_cmp() so I could
   pass const string literals in to lookup functions.
 - stripslashes() in pscp.c and psftp.c has the annoying strchr nature.
A minus sign is illegal at that position in a control sequence, so if
ESC[13t should report something like ESC[3;-123;234t then we won't
accept it as input. Switch to printing the numbers as unsigned, so
that negative window coordinates are output as their 32-bit two's
complement; experimentation suggests that PuTTY does accept that on
input.
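A small sketch of the unsigned formatting, assuming 32-bit coordinates (the function name is hypothetical). Converting to uint32_t before printing means a negative coordinate comes out as a large positive number, so no minus sign can appear in the report:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter for the window-position report. Each coordinate
 * is converted to uint32_t first, so a negative value appears as its
 * 32-bit two's complement and no minus sign is ever emitted. */
static int format_window_pos(char *buf, size_t size, int32_t x, int32_t y)
{
    return snprintf(buf, size, "\033[3;%" PRIu32 ";%" PRIu32 "t",
                    (uint32_t)x, (uint32_t)y);
}
```

With x = -123 and y = 234 this yields ESC[3;4294967173;234t, since 2^32 - 123 = 4294967173.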
I'm not actually sure why we've always had back ends notify ldisc of
changes to echo/edit settings by giving ldisc_send(ldisc,NULL,0,0) a
special meaning, instead of by having a separate dedicated notify
function with its own prototype and parameter set. Coverity's recent
observation that the two kinds of call don't even have the same
requirements on the ldisc (particularly, whether ldisc->term can be
NULL) makes me realise that it's really high time I separated the two
conceptually different operations into actually different functions.
While I'm here, I've renamed the confusing ldisc_update() function
which that special operation ends up feeding to, because it's not
actually a function applying to an ldisc - it applies to a front end.
So ldisc_send(ldisc,NULL,0,0) is now ldisc_echoedit_update(ldisc), and
that in turn figures out the current echo/edit settings before passing
them on to frontend_echoedit_update(). I think that should be clearer.
Now Jacob has reminded me that 'resize-no-truncate' was already on the
wishlist, I notice that it suggested Clear Scrollback should remove
the preserved information off to the right. On the basis that that's
(at least partly) a privacy feature, that seems sensible, so let's do it.
[originally from svn r10210]
We now only truncate a termline to the current terminal width if we're
actually going to modify it. As a result, resizing to a narrower
terminal width and then immediately back again, with no terminal
output in between, should restore the previous screen contents. Only
lines that are actually modified while the terminal is narrow (and
scrolling them around doesn't count as modification) should now be
truncated.
This will be a bit nicer for Unix window resizing (since X lacks the
Windows distinction between mid-drag resize events and the ultimate
drag-release, so can't defer the call to term_size until the latter as
we can on Windows), but mostly it's inspired by having played with a
tiling window manager recently and hence realised that in some
environments windows will be resized back and forth without much
control as a side effect of just moving them around - so it's
generally desirable for resizes to be non-destructive.
[originally from svn r10208]
On Windows (X mouse reporting of the mouse wheel isn't currently done
by the Unix front end, though I'm about to fix that too) a
mouse wheel event is translated into a virtual button, and we send
both a press and a release of that button to terminal.c, which encodes
both in X mouse reporting escape sequences and passes them on to the
server. This isn't consistent with what xterm does - scroll-wheel
events are encoded _like_ button presses, but differ semantically in
that they don't have matching releases. So we're updating to match
xterm.
[originally from svn r10138]
Handlers for a number of escape sequences, notably including ESC[J and
the sequences that switch to/from the alternate screen, were
unconditionally resetting the scrollback instead of first checking the
'Reset scrollback on display activity' configuration option. I've
added the missing if statements, so now 'Reset scrollback on display
activity' should actually mean what it says.
For example, this would have inconvenienced an mplayer user, who
wouldn't be able to go up and check their scrollback while mplayer was
repeatedly redisplaying its status line, because mplayer uses ESC[J to
erase each version of the status line before printing the next
version.
[originally from svn r10125]
Previously I had unthinkingly called the general-purpose
check_selection() routine to indicate that I was going to mess with n
character cells right of the cursor position, causing the selection
highlight to be removed if it intersected that region. This is all
wrong, since actually the whole region from cursor to EOL is modified
by any character insertion or deletion, so if we were going to call
check_selection it should be on that whole region. (Quick demo: select
part of the line to the right of the cursor, then emit ESC[P or ESC[@
and see the text move left or right while the highlight stays put.)
So we could just call check_selection() on that larger affected
region, and that would be correct. However, we can do something
slightly more elegant in the case where the selection is contained
entirely within the subregion that moves to one side (as opposed to
the characters that actually vanish at one or other end): we can move
the selection highlight with the text under it, to preserve the visual
reminder of which text was selected for as long as possible.
[originally from svn r10097]
In r10020 I carefully reimplemented using timing.c and callback.c the
same policy for large pastes that the previous code appeared to be
implementing ad-hoc, which included a 450ms delay between sending
successive lines of pasted text if no visible acknowledgment of the
just-sent line (in the form of a \n or \r) came back from the
application.
However, it turns out that that *wasn't* what the old code was doing.
It *would* have done that, but for the bug that it never actually set
the 'last_paste' variable, and never has done since it was first
introduced way back in r516! So the policy I thought had been in force
forever has in fact only been in force since I unwittingly fixed that
bug in r10020 - and it turns out to be a bad idea, breaking pastes
into vi in particular.
So I've removed the timed paste code completely, on the basis that
it's never actually worked and nobody seems to have been unhappy about
that. Now we still break large pastes into separate lines and send
them in successive top-level callbacks, and the user can still press a
key to interrupt a paste if they manage to catch it still going on,
but there's no attempted *delay* any more.
(It's possible that what I *really* ought to be doing is calling
back->sendbuffer() to see whether the backend is consuming the data
pasted so far, and if not, deferring the rest of the paste until the
send buffer becomes smaller. Then we could have pasting be delayed by
back-pressure from the recipient, and still manually interruptible
during that delay, but not have it delayed by anything else. But what
we have here should at least manage to be equivalent to the *actual*
rather than the intended old policy.)
[originally from svn r10041]
[r516 == 0d5d39064a]
[r10020 == 7be9af74ec]
I've removed the ad-hoc front-end bodgery in the Windows and GTK ports
to arrange for term_paste to be called at the right moments, and
instead, terminal.c itself deals with knowing when to send the next
chunk of pasted data using a combination of timers and the new
top-level callback mechanism.
As a happy side effect, it's now all in one place so I can actually
understand what it's doing! It turns out that what all that confusing
code was up to is: send a line of pasted data, and delay sending the
next line until either a CR or LF is returned from the server
(typically indicating that the pasted text has been received and
echoed) or 450ms elapse, whichever comes first.
[originally from svn r10020]
buffered in terminal.c indefinitely and only released when further
output turned up.
Arose because we suppress the call to term_out from term_data if a
drag-select is in progress, but when the drag-select ends we weren't
proactively calling term_out to release the buffered data. So if your
session generated some terminal output while you were in mid-select,
_and had stopped by the time you let go of the mouse button_, then the
output would just sit there until released by the next call to
term_data.
[originally from svn r9768]
xterm mouse tracking, both supported by the current up-to-date xterm
(288). They take the form of two new DEC terminal modes, 1006 and
1015, which do not in themselves _enable_ mouse tracking but they
modify the escape sequences sent if mouse tracking is enabled in the
usual way.
[originally from svn r9752]
attempting to call lineptr() with a y-coordinate off the bottom of the
screen and triggering the dreaded 'line==NULL' message box.
This crash can only occur if the bottommost line of the screen has the
LATTR_WRAPPED flag set, which as far as I can see you can only
contrive by constructing a LATTR_WRAPPED line further up the screen
and then moving it down using an insert-line escape sequence. That's
probably why this bug has been around forever without anyone coming
across it.
[originally from svn r9726]
window. scroll() iterates that many times, so this prevents a tedious
wait if you give a very large parameter to ESC[L or ESC[M, for
example.
A side effect is that very large requests for upward scrolling in a
context that affects the scrollback will not actually wipe out the
whole scrollback: instead they push just the current lines of the
screen into the scrollback, and don't continue on to fill it up with
endless boring blank lines. I think this is likely to be more useful
in general, since it avoids wiping out lots of useful scrollback data
by mistake. I can imagine that people might have been using it
precisely _to_ wipe the scrollback in some situations, but if so then
they should use CSI 3 J instead.
[originally from svn r9677]
First, make absolute times unsigned. This means that it's safe to
depend on their overflow behaviour (which is undefined for signed
integers). This requires a little extra care in handling comparisons,
but I think I've correctly adjusted them all.
Second, functions registered with schedule_timer() are guaranteed to be
called with precisely the time that was returned by schedule_timer().
Thus, it's only necessary to check these values for equality rather than
doing risky range checks, so do that.
The timing code still does lots that's undefined, unnecessary, or just
wrong, but this is a good start.
[originally from svn r9667]
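The two points above can be illustrated with a pair of tiny helpers. The names are hypothetical, and this is a sketch of the general technique rather than the code in timing.c. Unsigned subtraction wraps in a well-defined way, so a comparison done via the difference stays correct across a counter wraparound.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: has tick count 'now' reached 'deadline'? Both are
 * unsigned counters that wrap round. The subtraction wraps too, which is
 * well defined for unsigned types, so the result is right across a
 * wraparound provided the two values are less than half the counter
 * range apart. */
static int time_reached(unsigned long now, unsigned long deadline)
{
    return now - deadline <= ULONG_MAX / 2;
}

/* Hypothetical timer callback check: the callback is handed exactly the
 * value the scheduler returned, so equality is all that is needed. */
static int is_expected_timer(unsigned long when_called,
                             unsigned long when_scheduled)
{
    return when_called == when_scheduled;
}
```

A direct comparison such as now >= deadline would give the wrong answer whenever the deadline sits just before the wrap point and the current time just after it.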
platform manner, but which nothing ever called. It thus served only to
trip up the unwary. The live function key handling code lives in the
frontends, i.e. window.c on Windows and gtkwin.c on Unix.
[originally from svn r9579]
the offset horizontal line characters in the VT100 line-drawing set
(o,p,r,s), so that no trace of it - and hence no pointless performance
hit - is compiled into the cross-platform modules on non-Windows
platforms.
[originally from svn r9467]
which text pasted into the terminal is preceded and followed by
special function-key-like escape sequences ESC[200~ and ESC[201~ so
that the application can identify it and treat it specially (e.g.
disabling auto-indent-same-as-previous-line in text editors). Enabled
and disabled by ESC[?2004h and ESC[?2004l, and of course off by
default.
[originally from svn r9412]
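A minimal sketch of the wrapping described above, with a hypothetical helper name and a caller-supplied output buffer. The real terminal code sends paste data in chunks rather than building one buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy 'len' bytes of paste data into 'out', wrapped
 * in the bracketed-paste markers when the mode is enabled. Returns the
 * number of bytes written, or 0 if 'out' is too small. */
static size_t wrap_paste(char *out, size_t outsize, const char *data,
                         size_t len, int bracketed)
{
    static const char start[] = "\033[200~", end[] = "\033[201~";
    const size_t slen = sizeof(start) - 1, elen = sizeof(end) - 1;
    size_t pos = 0;

    if (len > outsize || (bracketed && slen + elen > outsize - len))
        return 0;
    if (bracketed) { memcpy(out + pos, start, slen); pos += slen; }
    memcpy(out + pos, data, len); pos += len;
    if (bracketed) { memcpy(out + pos, end, elen); pos += elen; }
    return pos;
}
```

The 'bracketed' flag here stands for the state toggled by ESC[?2004h and ESC[?2004l.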
UTF-16 support. High Unicode characters in the terminal are now
converted back into surrogates during copy and draw operations, and
the Windows drawing code takes account of that when splitting up the
UTF-16 string for display. Meanwhile, accidental uses of wchar_t have
been replaced with 32-bit integers in parts of the cross-platform code
which were expecting not to have to deal with UTF-16.
[originally from svn r9409]
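For reference, the surrogate conversion itself is standard UTF-16 arithmetic. The helpers below are a sketch with hypothetical names, not the functions used in the terminal or the Windows drawing code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Writes one or two 16-bit units to 'out' and returns how many. */
static int to_utf16(uint32_t cp, uint16_t out[2])
{
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));     /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));   /* low surrogate */
    return 2;
}

/* The reverse: combine a surrogate pair into a single code point. */
static uint32_t from_surrogates(uint16_t hi, uint16_t lo)
{
    assert(hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF);
    return 0x10000 + (((uint32_t)(hi - 0xD800) << 10)
                      | (uint32_t)(lo - 0xDC00));
}
```

For example, U+1F600 becomes the pair 0xD83D 0xDE00, and anything splitting a UTF-16 string for display must avoid cutting between the two halves.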
(o,p,r,s). They are displayed in Windows by actually writing the
centred one (q) with a vertical offset, in case fonts don't have the
offset versions; this requires terminal.c to separate those characters
into distinct calls to do_text(). Unfortunately, it was only breaking
up a text-drawing call _before_ one of those characters, not after
one. Spotted by Robert de Bath.
[originally from svn r9221]
'Config' in putty.h, which stores all PuTTY's settings and includes an
arbitrary length limit on every single one of those settings which is
stored in string form. In place of it is 'Conf', an opaque data type
everywhere outside the new file conf.c, which stores a list of (key,
value) pairs in which every key contains an integer identifying a
configuration setting, and for some of those integers the key also
contains extra parts (so that, for instance, CONF_environmt is a
string-to-string mapping). Everywhere that a Config was previously
used, a Conf is now; everywhere there was a Config structure copy,
conf_copy() is called; every lookup, adjustment, load and save
operation on a Config has been rewritten; and there's a mechanism for
serialising a Conf into a binary blob and back for use with Duplicate
Session.
User-visible effects of this change _should_ be minimal, though I
don't doubt I've introduced one or two bugs here and there which will
eventually be found. The _intended_ visible effects of this change are
that all arbitrary limits on configuration strings and lists (e.g.
limit on number of port forwardings) should now disappear; that list
boxes in the configuration will now be displayed in a sorted order
rather than the arbitrary order in which they were added to the list
(since the underlying data structure is now a sorted tree234 rather
than an ad-hoc comma-separated string); and one more specific change,
which is that local and dynamic port forwardings on the same port
number are now mutually exclusive in the configuration (putting 'D' in
the key rather than the value was a mistake in the first place).
One other reorganisation as a result of this is that I've moved all
the dialog.c standard handlers (dlg_stdeditbox_handler and friends)
out into config.c, because I can't really justify calling them generic
any more. When they took a pointer to an arbitrary structure type and
the offset of a field within that structure, they were independent of
whether that structure was a Config or something completely different,
but now they really do expect to talk to a Conf, which can _only_ be
used for PuTTY configuration, so I've renamed them all things like
conf_editbox_handler and moved them out of the nominally independent
dialog-box management module into the PuTTY-specific config.c.
[originally from svn r9214]
today reported an SSH2_MSG_UNIMPLEMENTED from a Cisco router which
looks as if it was triggered by SSH2_MSG_IGNORE, so I'm
experimentally putting this flag in. Currently must be manually
enabled, though if it turns out to solve the user's problem then
I'll probably add at least one version string...
[Edited commit message: actually, I also committed in error a piece
of experimental code as part of this checkin. Serve me right for not
running 'svn diff' first.]
[originally from svn r8926]
function in terminal.c, and replace the cloned-and-hacked handling
code in all our front ends with calls to that.
This was intended for code cleanliness, but a side effect is to make
the GTK arrow-key handling support disabling of application cursor
key mode in the Features panel. Previously that checkbox was
accidentally ignored, and nobody seems to have noticed before!
[originally from svn r8896]
UTF-16 when exchanging wchar_t strings with the front end. Enabled
by a #define in the platform's header file (one should not
promiscuously translate UTF-16 surrogate pairs on 32-bit wchar_t
platforms since that could give rise to redundant encoding attacks),
which is present on Windows.
[originally from svn r8495]
as unsigned char. This means that passing in a bare char is incorrect on
systems where char is signed. Sprinkle some appropriate casts to prevent
this.
[originally from svn r8406]
has width and height swapped. Since both a random xterm I have and
<http://invisible-island.net/xterm/ctlseqs/ctlseqs.txt> agree with him, I've
changed ours. (This stuff appears to originate in dtterm, but I can't check the
behaviour of that right now.)
While I'm here, the are-we-iconified report (CSI 11 t) looks to have the
wrong sense compared to the same sources, so swap that too.
(All this has been this way since it was originally implemented in r1414,
which doesn't cite a source. all-escapes is silent too.)
[originally from svn r8376]
[r1414 == bb1f5cec31]