nhyatt/putty-source

mirror of https://git.tartarus.org/simon/putty.git synced 2025-07-01 19:42:48 -05:00

Commit Graph

Author

SHA1

Message

Date

Simon Tatham

e74790003c

StripCtrlChars: option to provide a target Terminal.

If you use the new stripctrl_new_term() to construct a StripCtrlChars
instead of the existing stripctrl_new(), then the resulting object
will align itself with the character-set configuration of the Terminal
object you point it at. (In fact, it'll reuse the same actual
translation code, courtesy of the last few refactoring commits.) So it
will interpret things as control characters precisely if that Terminal
would also have done so.

The previous locale-based sanitisation is appropriate if you're
sending the sanitised output to an OS terminal device managed outside
this process - the LC_CTYPE setting has the best chance of knowing how
that terminal device will interpret a byte stream. But I want to start
using the same sanitisation system for data intended for PuTTY's own
internal terminal emulator, in which case there's no reason why
LC_CTYPE should be expected to match that terminal's configuration,
and no reason to need it to either since we can check the internal
terminal configuration directly.

One small bodge: stripctrl_new_term() is actually a macro, which
passes in the function pointer term_translate() to the underlying real
constructor. That's just so that console-only tools can link in
stripctrl.c without acquiring a dependency on terminal.c (similarly to
how we pass random_read in to the mp_random functions).
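The macro trick described above can be sketched roughly as follows. This is a minimal illustration only: every name in it (DemoTerm, DemoSanitiser, demo_term_translate, the translate_fn signature, and the simplistic control-character test) is a hypothetical stand-in, and the real signature of term_translate() is not shown in this commit message.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a terminal object. */
typedef struct DemoTerm { int charset; } DemoTerm;

/* Hypothetical translation callback: decode one byte into a code point. */
typedef bool (*translate_fn)(DemoTerm *term, unsigned char byte,
                             unsigned long *cp);

typedef struct DemoSanitiser {
    DemoTerm *term;
    translate_fn translate;   /* NULL means "use the C library's view" */
} DemoSanitiser;

/*
 * "Core" side: the real constructor. It only ever sees a function
 * pointer, so the file containing it never names a terminal symbol
 * and can be linked without the terminal code.
 */
DemoSanitiser demo_sanitiser_new_core(DemoTerm *term, translate_fn translate)
{
    DemoSanitiser s = { term, translate };
    return s;
}

bool demo_sanitiser_is_control(const DemoSanitiser *s, unsigned char byte)
{
    unsigned long cp;

    if (!s->translate)
        return iscntrl(byte) != 0;   /* locale-based fallback */
    if (!s->translate(s->term, byte, &cp))
        return false;                /* nothing decoded yet */
    return cp < 0x20 || cp == 0x7F;  /* assumed, simplified test */
}

/* "Terminal" side: a toy translator that treats each byte as a code point. */
bool demo_term_translate(DemoTerm *term, unsigned char byte,
                         unsigned long *cp)
{
    (void)term;
    *cp = byte;
    return true;
}

/*
 * The constructors callers actually use. Only code that expands
 * demo_sanitiser_new_term() references demo_term_translate, so only
 * that code needs the terminal side at link time.
 */
#define demo_sanitiser_new() \
    demo_sanitiser_new_core(NULL, NULL)
#define demo_sanitiser_new_term(term) \
    demo_sanitiser_new_core((term), demo_term_translate)
```

A console-only tool in this sketch would call demo_sanitiser_new() and never mention the translator, so the linker has no reason to pull it in.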

2019-03-06 20:31:26 +00:00

Simon Tatham

6593009b0e

New utility object, StripCtrlChars.

This is for sanitising output that's going to be sent to a terminal,
if you don't want it to be able to send arbitrary escape sequences and
thereby (for example) move the cursor back up to existing text on the
screen and overprint it confusingly.

It works using the standard C library: we convert to a wide-character
string and back, and then use wctype.h to spot control characters in
the intermediate form. This means its idea of the conversion character
set is locale-based rather than any of our own charset library's fixed
settings - which is what you want if the aim is to protect your local
terminal (which we assume the system locale represents accurately).
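As a rough sketch of the convert-and-classify idea, assuming nothing about the real stripctrl.c beyond what is described here: the hypothetical helper strip_controls() below decodes with mbrtowc(), drops anything iswcntrl() flags, and re-encodes the rest with wcrtomb(). Its name, signature, and error handling are all illustrative choices, and it drops every control character, newline included, which a real sanitiser may well not want.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper, not PuTTY's API. Copies 'in' to 'out', removing
 * every character the current LC_CTYPE locale classifies as a control
 * character. Returns the number of bytes written to 'out'.
 */
size_t strip_controls(const char *in, size_t inlen,
                      char *out, size_t outsize)
{
    mbstate_t instate, outstate;
    size_t outlen = 0;

    memset(&instate, 0, sizeof instate);
    memset(&outstate, 0, sizeof outstate);

    while (inlen > 0) {
        wchar_t wc;
        char buf[MB_LEN_MAX];
        size_t n, m;

        n = mbrtowc(&wc, in, inlen, &instate);
        if (n == (size_t)-2)
            break;                 /* truncated character at end of input */
        if (n == (size_t)-1) {
            /* Invalid byte in this locale: skip it and start afresh. */
            memset(&instate, 0, sizeof instate);
            in++;
            inlen--;
            continue;
        }
        if (n == 0)
            n = 1;                 /* a decoded NUL consumes one byte */
        in += n;
        inlen -= n;

        if (iswcntrl((wint_t)wc))
            continue;              /* drop the control character */

        m = wcrtomb(buf, wc, &outstate);
        if (m == (size_t)-1) {
            memset(&outstate, 0, sizeof outstate);
            continue;              /* not representable: drop it */
        }
        if (m > outsize - outlen)
            break;                 /* output buffer full */
        memcpy(out + outlen, buf, m);
        outlen += m;
    }
    return outlen;
}
```

Because the classification follows LC_CTYPE, a caller wanting the system's view would call setlocale(LC_CTYPE, "") first; without that call the program runs in the default "C" locale.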

This also means that the sanitiser strips things that will _act_ as
control characters when sent to the local terminal, whether or not
they were intended as control characters by a server that might have
had a different character set in mind. Since the main aim is to
protect the local terminal rather than to faithfully replicate the
server's intention, I think that's the right criterion.

It only strips control characters at the charset-independent layer,
like backspace, carriage return and the escape character: wctype.h
classifies those as control characters, but classifies as printing all
of the more Unicode-specific controls like bidirectional overrides.
But that's enough to prevent cursor repositioning, for example.

stripctrl.c comes with a test main() of its own, which I wasn't able
to fold into testcrypt and put in the test suite because of its
dependence on the system locale - it wouldn't be guaranteed to work
the same way on different test systems anyway.

A knock-on build tweak: because you can feed data into this sanitiser
in chunks of arbitrary size, including partial multibyte chars, I had
to use mbrtowc() for the decoding, and that means that in the 'old'
Win32 builds I have to link against the Visual Studio C++ library as
well as the C library, because for some reason that's where mbrtowc
lived in VS2003.
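The reason mbrtowc() suits chunked input is that it keeps its decoding progress in an mbstate_t, so a multibyte character split across two calls can be completed by the second one. A minimal sketch of that pattern, with a hypothetical ChunkDecoder type that merely counts what it decodes (none of these names come from stripctrl.c):

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical decoder that survives arbitrary chunk boundaries. */
typedef struct ChunkDecoder {
    mbstate_t state;     /* carries a partial character between calls */
    size_t nchars;       /* complete characters decoded so far */
    size_t ncontrols;    /* how many of those iswcntrl() flagged */
} ChunkDecoder;

void chunk_decoder_init(ChunkDecoder *dec)
{
    memset(dec, 0, sizeof *dec);
}

void chunk_decoder_feed(ChunkDecoder *dec, const char *data, size_t len)
{
    while (len > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, data, len, &dec->state);

        if (n == (size_t)-2) {
            /*
             * The rest of this chunk is a valid but incomplete
             * character. mbrtowc() has absorbed it into dec->state,
             * so the next call to this function can finish it.
             */
            return;
        }
        if (n == (size_t)-1) {
            /* Invalid sequence: reset the state and skip one byte. */
            memset(&dec->state, 0, sizeof dec->state);
            data++;
            len--;
            continue;
        }
        if (n == 0)
            n = 1;       /* a decoded NUL consumes one byte */
        data += n;
        len -= n;

        dec->nchars++;
        if (iswcntrl((wint_t)wc))
            dec->ncontrols++;
    }
}
```

The stateless mbtowc() would not work here, because it has no caller-visible state in which to park the first half of a character that straddles two chunks.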

2019-02-20 07:27:22 +00:00

2 Commits