In protocols other than PROT_RAW, the new line editing system differed
from the old one in not considering ^M or ^J (typed using the actual
Ctrl key, so distinct from pressing Return) to mean "I've finished
editing this line, please send it". This commit reinstates that
behaviour.
It turned out that a third-party tool (namely PuTTY Connection Manager),
which automatically answers prompts for the user, was terminating them
by sending ^J in place of the Return key. We don't know why (and it's
now unmaintained), but it was. So this change should make that tool
start working again.
I exclude PROT_RAW above because in that protocol the line editing has
much weirder handling for ^M and ^J, which lineedit replicated
faithfully from the old code: either control character by itself is
treated literally (displaying as "^M" or "^J" in the terminal), but if
you type the two in sequence in that order, then the ^J deletes the ^M
from the edit buffer and enters the line, so that the sequence CR LF
acts as a newline overall. I haven't changed that behaviour here, but
I have added a regression test of it to test_lineedit.
When the user pressed Return at the end of a line, we were calling the
TermLineEditor's receiver function once for each character in the line
buffer. A Telnet user reported from looking at packet traces that this
leads to each character being sent in its own TCP segment, which is
wasteful and silly, and a regression in 0.82 compared to 0.81.
You can see the SSH version of the phenomenon even more easily in
PuTTY's own SSH logs, without having to look at the TCP layer at all:
you get a separate SSH2_MSG_CHANNEL_DATA per character when sending a
line that you entered via local editing in the GUI terminal.
The fix in this commit makes lineedit_send_line() collect keystrokes
into a temporary bufchain and pass them on to the backend in chunks
the size of a bufchain block.
This is better, but still not completely ideal: lineedit_send_line()
is often followed by a call to lineedit_send_newline(), and there's no
buffering done between _those_ functions. So you'll still see a
separate SSH message / Telnet TCP segment for the newline after the
line.
I haven't fixed that in this commit, for two reasons. First, unlike
the character-by-character sending of the line content, it's not a
regression in 0.82: previous versions also sent the newline in a
separate packet and nobody complained about that. Second, it's much
more difficult, because newlines are handled specially - in particular
by the Telnet backend, which sometimes turns them into a wire sequence
CR LF that can't be generated by passing any literal byte to
backend_send. So you'd need to violate a load of layers, or else have
multiple parts of the system buffer up output and then arrange to
release it on a toplevel callback or some such. Much more code, more
risk of bugs, and less gain.
This takes over from both the implementation in ldisc.c and the one in
term_get_userpass_input, which were imperfectly duplicating each
other's functionality. The new version should be more consistent
between the two already, and also, it means further improvements can
now be made in just one place.
In the course of this, I've restructured the inside of ldisc.c by
moving the input_queue bufchain to the other side of the translation
code in ldisc_send. Previously, ldisc_send received a string, an
optional 'dedicated key' indication (bodgily signalled by a negative
length) and an 'interactive' flag, translated that somehow into a
combination of raw backend output and specials, and saved the latter
in input_queue. Now it saves the original (string, dedicated flag,
interactive flag) data in input_queue, and doesn't do the translation
until the data is pulled back _out_ of the queue. That's because the
new line editing system expects to receive something much closer to
the original data format.
The term_get_userpass_input system is also substantially restructured.
Instead of ldisc.c handing each individual keystroke to terminal.c so
that it can do line editing on it, terminal.c now just gives the Ldisc
a pointer to its instance of the new TermLineEditor object - and then
ldisc.c can put keystrokes straight into that, in the same way it
would put them into its own TermLineEditor, without having to go via
terminal.c at all. So the term_get_userpass_input edifice is only
called back when the line editor actually delivers the answer to a
username or password prompt.
(I considered not _even_ having a separate TermLineEditor for password
prompts, and just letting ldisc.c use its own. But the problem is that
some of the behaviour differences between the two line editors are
deliberate, for example the use of ^D to signal 'abort this prompt',
and the use of Escape as an alternative line-clearing command. So
TermLineEditor has a flags word that allows ldisc and terminal to set
it up differently. Also this lets me give the two TermLineEditors a
different vtable of callback functions, which is a convenient way for
terminal.c to get notified when a prompt has been answered.)
The new line editor still passes all the tests I wrote for the old
one. But it already has a couple of important improvements, both in
the area of UTF-8 handling:
Firstly, when we display a UTF-8 character on the terminal, we check
with the terminal how many character cells it occupied, and then if
the user deletes it again from the editing buffer, we can emit the
right number of backspace-space-backspace sequences. (The old ldisc
line editor incorrectly assumed all Unicode characters had terminal
with 1, partly because its buffer was byte- rather than character-
oriented and so it was more than enough work just finding where the
character _start_ was.)
Secondly, terminal.c's userpass line editor would never emit a byte in
the 80-BF range to the terminal at all, which meant that nontrivial
UTF-8 characters always came out as U+FFFD blobs!