putty-source

mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-09 09:27:59 +00:00

Author	SHA1	Message	Date
Simon Tatham	98200d1bfe	Arm: turn on PSTATE.DIT if available and needed. DIT, for 'Data-Independent Timing', is a bit you can set in the processor state on sufficiently new Arm CPUs, which promises that a long list of instructions will deliberately avoid varying their timing based on the input register values. Just what you want for keeping your constant-time crypto primitives constant-time. As far as I'm aware, no CPU has _yet_ implemented any data-dependent optimisations, so DIT is a safety precaution against them doing so in future. It would be embarrassing to be caught without it if a future CPU does do that, so we now turn on DIT in the PuTTY process state. I've put a call to the new enable_dit() function at the start of every main() and WinMain() belonging to a program that might do cryptography (even testcrypt, in case someone uses it for something!), and in case I missed one there, also added a second call at the first moment that any cryptography-using part of the code looks as if it might become active: when an instance of the SSH protocol object is configured, when the system PRNG is initialised, and when selecting any cryptographic authentication protocol in an HTTP or SOCKS proxy connection. With any luck those precautions between them should ensure it's on whenever we need it. Arm's own recommendation is that you should carefully choose the granularity at which you enable and disable DIT: there's a potential time cost to turning it on and off (I'm not sure what, but plausibly something of the order of a pipeline flush), so it's a performance hit to do it _inside_ each individual crypto function, but if CPUs start supporting significant data-dependent optimisation in future, then it will also become a noticeable performance hit to just leave it on across the whole process. So you'd like to do it somewhere in the middle: for example, you might turn on DIT once around the whole process of verifying and decrypting an SSH packet, instead of once for decryption and once for MAC. With all respect to that recommendation as a strategy for maximum performance, I'm not following it here. I turn on DIT at the start of the PuTTY process, and then leave it on. Rationale: 1. PuTTY is not otherwise a performance-critical application: it's not likely to max out your CPU for any purpose _other_ than cryptography. The most CPU-intensive non-cryptographic thing I can imagine a PuTTY process doing is the complicated computation of font rendering in the terminal, and that will normally be cached (you don't recompute each glyph from its outline and hints for every time you display it). 2. I think a bigger risk lies in accidental side channels from having DIT turned off when it should have been on. I can imagine lots of causes for that. Missing a crypto operation in some unswept corner of the code; confusing control flow (like my coroutine macros) jumping with DIT clear into the middle of a region of code that expected DIT to have been set at the beginning; having a reference counter of DIT requests and getting it out of sync. In a more sophisticated programming language, it might be possible to avoid the risk in #2 by cleverness with the type system. For example, in Rust, you could have a zero-sized type that acts as a proof token for DIT being enabled (it would be constructed by a function that also sets DIT, have a Drop implementation that clears DIT, and be !Send so you couldn't use it in a thread other than the one where DIT was set), and then you could require all the actual crypto functions to take a DitToken as an extra parameter, at zero runtime cost. Then "oops I forgot to set DIT around this piece of crypto" would become a compile error. Even so, you'd have to take some care with coroutine-structured code (what happens if a Rust async function yields while holding a DIT token?) and with nesting (if you have two DIT tokens, you don't want dropping the inner one to clear DIT while the outer one is still there to wrongly convince callees that it's set). Maybe in Rust you could get this all to work reliably. But not in C! DIT is an optional feature of the Arm architecture, so we must first test to see if it's supported. This is done the same way as we already do for the various Arm crypto accelerators: on ELF-based systems, check the appropriate bit in the 'hwcap' words in the ELF aux vector; on Mac, look for an appropriate sysctl flag. On Windows I don't know of a way to query the DIT feature, _or_ of a way to write the necessary enabling instruction in an MSVC-compatible way. I've _heard_ that it might not be necessary, because Windows might just turn on DIT unconditionally and leave it on, in an even more extreme version of my own strategy. I don't have a source for that - I heard it by word of mouth - but I _hope_ it's true, because that would suit me very well! Certainly I can't write code to enable DIT without knowing (a) how to do it, (b) how to know if it's safe. Nonetheless, I've put the enable_dit() call in all the right places in the Windows main programs as well as the Unix and cross-platform code, so that if I later find out that I _can_ put in an explicit enable of DIT in some way, I'll only have to arrange to set HAVE_ARM_DIT and compile the enable_dit() function appropriately.	2024-12-19 08:52:47 +00:00
Simon Tatham	74150633f1	Add and use cmdline_arg_to_filename(). Converting a CmdlineArg straight to a Filename allows us to make the filename out of the wide-character version of the string on Windows. So now filenames specified on the command line should generally be able to handle pathnames containing Unicode characters not in the system code page. This change also involves making some char pointers _into_ Filename structs where they weren't previously: for example, the 'openssh_config_file' variable in Windows Pageant's WinMain().	2024-09-26 11:30:07 +01:00
Simon Tatham	841bf321d4	New abstraction for command-line arguments. This begins the process of enabling our Windows applications to handle Unicode characters on their command lines which don't fit in the system code page. Instead of passing plain strings to cmdline_process_param, we now pass a partially opaque and platform-specific thing called a CmdlineArg. This has a method that extracts the argument word as a default-encoded string, and another one that tries to extract it as UTF-8 (though it may fail if the UTF-8 isn't available). On Windows, the command line is now constructed by calling split_into_argv_w on the Unicode command line returned by GetCommandLineW(), and the UTF-8 method returns text converted directly from that wide-character form, not going via the system code page. So it _can_ include UTF-8 characters that wouldn't have round-tripped via CP_ACP. This commit introduces the abstraction and switches over the cross-platform and Windows argv-handling code to use it, with minimal functional change. Nothing yet tries to call cmdline_arg_get_utf8(). I say 'cross-platform and Windows' because on the Unix side there's still a lot of use of plain old argv which I haven't converted. That would be a much larger project, and isn't currently needed: the _current_ aim of this abstraction is to get the right things to happen relating to Unicode on Windows, so for code that doesn't run on Windows anyway, it's not adding value. (Also there's a tension with GTK, which wants to talk to standard argv and extract arguments _it_ knows about, so at the very least we'd have to let it munge argv before importing it into this new system.)	2024-09-26 11:30:07 +01:00
Simon Tatham	4341ba6d5c	Add platform-independent fontspec_new_default() function. Constructing a FontSpec in platform-independent code is awkward, because you can't call fontspec_new() outside the platform subdirs (since its prototype varies per platform). But sometimes you just need _some_ valid FontSpec, e.g. to put in a Conf that will be used in some place where you don't actually care about font settings, such as a purely CLI program. Both Unix and Windows _have_ an idiom for this, but they're different, because their FontSpec constructors have different prototypes. The existing CLI tools have always had per-platform main source files, so they just use the locally appropriate method of constructing a boring don't-care FontSpec. But if you want a _platform-independent_ main source file, such as you might find in a test program, then that's rather awkward. Better to have a platform-independent API for making a default FontSpec.	2023-02-18 14:10:21 +00:00
Simon Tatham	d509a2dc1e	Formatting: normalise to put a space after condition keywords. 'if (thing)' is the local style here, not 'if(thing)'. Similarly with 'for' and 'while'.	2022-12-28 15:32:24 +00:00
Simon Tatham	20f818af12	Rename 'ret' variables passed from allocation to return. I mentioned recently (in commit `9e7d4c53d8`) message that I'm no longer fond of the variable name 'ret', because it's used in two quite different contexts: it's the return value from a subroutine you just called (e.g. 'int ret = read(fd, buf, len);' and then check for error or EOF), or it's the value you're preparing to return from the _containing_ routine (maybe by assigning it a default value and then conditionally modifying it, or by starting at NULL and reallocating, or setting it just before using the 'goto out' cleanup idiom). In the past I've occasionally made mistakes by forgetting which meaning the variable had, or accidentally conflating both uses. If all else fails, I now prefer 'retd' (short for 'returned') in the former situation, and 'toret' (obviously, the value 'to return') in the latter case. But even better is to pick a name that actually says something more specific about what the thing actually is. One particular bad habit throughout this codebase is to have a set of functions that deal with some object type (say 'Foo'), all but one of which take a 'Foo foo' parameter, but the foo_new() function starts with 'Foo ret = snew(Foo)'. If all the rest of them think the canonical name for the ambient Foo is 'foo', so should foo_new()! So here's a no-brainer start on cutting down on the uses of 'ret': I looked for all the cases where it was being assigned the result of an allocation, and renamed the variable to be a description of the thing being allocated. In the case of a new() function belonging to a family, I picked the same name as the rest of the functions in its own family, for consistency. In other cases I picked something sensible. One case where it _does_ make sense not to use your usual name for the variable type is when you're cloning an existing object. In that case, _neither_ of the Foo objects involved should be called 'foo', because it's ambiguous! They should be named so you can see which is which. In the two cases I found here, I've called them 'orig' and 'copy'. As in the previous refactoring, many thanks to clang-rename for the help.	2022-09-14 16:10:29 +01:00
Simon Tatham	c1a2114b28	Implement AES-GCM using the @openssh.com protocol IDs. I only recently found out that OpenSSH defined their own protocol IDs for AES-GCM, defined to work the same as the standard ones except that they fixed the semantics for how you select the linked cipher+MAC pair during key exchange. (RFC 5647 defines protocol ids for AES-GCM in both the cipher and MAC namespaces, and requires that you MUST select both or neither - but this contradicts the selection policy set out in the base SSH RFCs, and there's no discussion of how you resolve a conflict between them! OpenSSH's answer is to do it the same way ChaCha20-Poly1305 works, because that will ensure the two suites don't fight.) People do occasionally ask us for this linked cipher/MAC pair, and now I know it's actually feasible, I've implemented it, including a pair of vector implementations for x86 and Arm using their respective architecture extensions for multiplying polynomials over GF(2). Unlike ChaCha20-Poly1305, I've kept the cipher and MAC implementations in separate objects, with an arm's-length link between them that the MAC uses when it needs to encrypt single cipher blocks to use as the inputs to the MAC algorithm. That enables the cipher and the MAC to be independently selected from their hardware-accelerated versions, just in case someone runs on a system that has polynomial multiplication instructions but not AES acceleration, or vice versa. There's a fourth implementation of the GCM MAC, which is a pure software implementation of the same algorithm used in the vectorised versions. It's too slow to use live, but I've kept it in the code for future testing needs, and because it's a convenient place to dump my design comments. The vectorised implementations are fairly crude as far as optimisation goes. I'm sure serious x86 _or_ Arm optimisation engineers would look at them and laugh. But GCM is a fast MAC compared to HMAC-SHA-256 (indeed compared to HMAC-anything-at-all), so it should at least be good enough to use. And we've got a working version with some tests now, so if someone else wants to improve them, they can.	2022-08-16 20:33:58 +01:00
Simon Tatham	fd840f0dfe	Add CPU feature checks on M1 macOS. I booted my M1 Mac into macOS rather than Asahi for the first time in a while, and discovered that an OS update seems to have added some sysctl flags indicating the presence of the CPU extensions that I previously knew of no way to check for! Added them checks to arm_arch_queries.c, though I've also retained backwards compat with the previous OS version which didn't have them at all.	2022-08-16 18:39:12 +01:00
Simon Tatham	14203bc54f	Formatting: standardise on "func(\n", not "func\n(". If the function name (or expression) in a function call or declaration is itself so long that even the first argument doesn't fit after it on the same line, or if that would leave so little space that it would be silly to try to wrap all the run-on lines into a tall thin column, then I used to do this ludicrously_long_function_name (arg1, arg2, arg3); and now prefer this ludicrously_long_function_name( arg1, arg2, arg3); I picked up the habit from Python, where the latter idiom is required by Python's syntactic significance of newlines (you can write the former if you use a backslash-continuation, but pretty much everyone seems to agree that that's much uglier). But I've found it works well in C as well: it makes it more obvious that the previous line is incomplete, it gives you a tiny bit more space to wrap the following lines into (the old idiom indents the _third_ line one space beyond the second), and I generally turn out to agree with the knock-on indentation decisions made by at least Emacs if you do it in the middle of a complex expression. Plus, of course, using the _same_ idiom between C and Python means less state-switching. So, while I'm making annoying indentation changes in general, this seems like a good time to dig out all the cases of the old idiom in this code, and switch them over to the new.	2022-08-03 20:48:46 +01:00
Simon Tatham	5935c68288	Update source file names in comments and docs. Correcting a source file name in the docs just now reminded me that I've seen a lot of outdated source file names elsewhere in the code, due to all the reorganisation since we moved to cmake. Here's a giant pass of trying to make them all accurate again.	2022-01-22 15:51:31 +00:00
Simon Tatham	a2ff884512	Richer data type for interactive prompt results. All the seat functions that request an interactive prompt of some kind to the user - both the main seat_get_userpass_input and the various confirmation dialogs for things like host keys - were using a simple int return value, with the general semantics of 0 = "fail", 1 = "proceed" (and in the case of seat_get_userpass_input, answers to the prompts were provided), and -1 = "request in progress, wait for a callback". In this commit I change all those functions' return types to a new struct called SeatPromptResult, whose primary field is an enum replacing those simple integer values. The main purpose is that the enum has not three but _four_ values: the "fail" result has been split into 'user abort' and 'software abort'. The distinction is that a user abort occurs as a result of an interactive UI action, such as the user clicking 'cancel' in a dialog box or hitting ^D or ^C at a terminal password prompt - and therefore, there's no need to display an error message telling the user that the interactive operation has failed, because the user already knows, because they _did_ it. 'Software abort' is from any other cause, where PuTTY is the first to know there was a problem, and has to tell the user. We already had this 'user abort' vs 'software abort' distinction in other parts of the code - the SSH backend has separate termination functions which protocol layers can call. But we assumed that any failure from an interactive prompt request fell into the 'user abort' category, which is not true. A couple of examples: if you configure a host key fingerprint in your saved session via the SSH > Host keys pane, and the server presents a host key that doesn't match it, then verify_ssh_host_key would report that the user had aborted the connection, and feel no need to tell the user what had gone wrong! Similarly, if a password provided on the command line was not accepted, then (after I fixed the semantics of that in the previous commit) the same wrong handling would occur. So now, those Seat prompt functions too can communicate whether the user or the software originated a connection abort. And in the latter case, we also provide an error message to present to the user. Result: in those two example cases (and others), error messages should no longer go missing. Implementation note: to avoid the hassle of having the error message in a SeatPromptResult being a dynamically allocated string (and hence, every recipient of one must always check whether it's non-NULL and free it on every exit path, plus being careful about copying the struct around), I've instead arranged that the structure contains a function pointer and a couple of parameters, so that the string form of the message can be constructed on demand. That way, the only users who need to free it are the ones who actually _asked_ for it in the first place, which is a much smaller set. (This is one of the rare occasions that I regret not having C++'s extra features available in this code base - a unique_ptr or shared_ptr to a string would have been just the thing here, and the compiler would have done all the hard work for me of remembering where to insert the frees!)	2021-12-28 18:08:31 +00:00
Simon Tatham	d9f217323e	Break up gtkmisc.c. It's another file that should have been subdivided into lots of tiny separate things in the utils library - especially since for some reason I made a completely separate 'guimisc' cmake-level library for it when there was no need.	2021-04-26 18:00:01 +01:00
Simon Tatham	fca13a17b1	Break up crypto modules containing HW acceleration. This applies to all of AES, SHA-1, SHA-256 and SHA-512. All those source files previously contained multiple implementations of the algorithm, enabled or disabled by ifdefs detecting whether they would work on a given compiler. And in order to get advanced machine instructions like AES-NI or NEON crypto into the output file when the compile flags hadn't enabled them, we had to do nasty stuff with compiler-specific pragmas or attributes. Now we can do the detection at cmake time, and enable advanced instructions in the more sensible way, by compile-time flags. So I've broken up each of these modules into lots of sub-pieces: a file called (e.g.) 'foo-common.c' containing common definitions across all implementations (such as round constants), one called 'foo-select.c' containing the top-level vtable(s), and a separate file for each implementation exporting just the vtable(s) for that implementation. One advantage of this is that it depends a lot less on compiler- specific bodgery. My particular least favourite part of the previous setup was the part where I had to _manually_ define some Arm ACLE feature macros before including <arm_neon.h>, so that it would define the intrinsics I wanted. Now I'm enabling interesting architecture features in the normal way, on the compiler command line, there's no need for that kind of trick: the right feature macros are already defined and <arm_neon.h> does the right thing. Another change in this reorganisation is that I've stopped assuming there's just one hardware implementation per platform. Previously, the accelerated vtables were called things like sha256_hw, and varied between FOO-NI and NEON depending on platform; and the selection code would simply ask 'is hw available? if so, use hw, else sw'. Now, each HW acceleration strategy names its vtable its own way, and the selection vtable has a whole list of possibilities to iterate over looking for a supported one. So if someone feels like writing a second accelerated implementation of something for a given platform - for example, I've heard you can use plain NEON to speed up AES somewhat even without the crypto extension - then it will now have somewhere to drop in alongside the existing ones.	2021-04-21 21:55:26 +01:00
Simon Tatham	3396c97da9	New library-style 'utils' subdirectories. Now that the new CMake build system is encouraging us to lay out the code like a set of libraries, it seems like a good idea to make them look more _like_ libraries, by putting things into separate modules as far as possible. This fixes several previous annoyances in which you had to link against some object in order to get a function you needed, but that object also contained other functions you didn't need which included link-time symbol references you didn't want to have to deal with. The usual offender was subsidiary supporting programs including misc.c for some innocuous function and then finding they had to deal with the requirements of buildinfo(). This big reorganisation introduces three new subdirectories called 'utils', one at the top level and one in each platform subdir. In each case, the directory contains basically the same files that were previously placed in the 'utils' build-time library, except that the ones that were extremely miscellaneous (misc.c, utils.c, uxmisc.c, winmisc.c, winmiscs.c, winutils.c) have been split up into much smaller pieces.	2021-04-18 08:18:27 +01:00

14 Commits