My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a,b,c were int and TRUE was just a
the 'true' defined by stdbool.h, that idiom provokes a warning from
gcc: 'suggest parentheses around assignment used as truth value'!
Ilya Shipitsin sent me a list of errors reported by a tool 'cppcheck',
which I hadn't seen before, together with some fixes for things
already taken off that list. This change picks out all the things from
the remaining list that I could quickly identify as actual errors,
which it turns out are all format-string goofs along the lines of
using a %d with an unsigned int, or a %u with a signed int, or (in the
cases in charset/utf8.c) an actual _size_ mismatch which could in
principle have caused trouble on a big-endian target.
I've shifted away from using the SVN revision number as a monotonic
version identifier (replacing it in the Windows version resource with
a count of days since an arbitrary epoch), and I've removed all uses
of SVN keyword expansion (replacing them with version information
written out by Buildscr).
While I'm at it, I've done a major rewrite of the affected code which
centralises all the computation of the assorted version numbers and
strings into Buildscr, so that they're all more or less alongside each
other rather than scattered across multiple source files.
I've also retired the MD5-based manifest file system. A long time ago,
it seemed like a good idea to arrange that binaries of PuTTY would
automatically cease to identify themselves as a particular upstream
version number if any changes were made to the source code, so that if
someone made a local tweak and distributed the result then I wouldn't
get blamed for the results. Since then I've decided the whole idea is
more trouble than it's worth, so now distribution tarballs will have
version information baked in and people can just cope with that.
[originally from svn r10262]
bidi_char from wchar_t to unsigned int, but omitted to similarly
adjust the parameter to doMirror which is passed a pointer to that
field.
[originally from svn r9426]
[r9409 == 053d2ba6d1]
UTF-16 support. High Unicode characters in the terminal are now
converted back into surrogates during copy and draw operations, and
the Windows drawing code takes account of that when splitting up the
UTF-16 string for display. Meanwhile, accidental uses of wchar_t have
been replaced with 32-bit integers in parts of the cross-platform code
which were expecting not to have to deal with UTF-16.
[originally from svn r9409]
easily manage, by adopting a hybrid approach to Unicode text
display. The old approach of simply calling ExtTextOutW provided
font linking without us having to lift a finger, but didn't do the
right thing when it came to bidirectional or Arabic-shaped text.
Arabeyes' replacement exact_textout() supported the latter, but
turned out to break the former (with no warning from the Windows API
documentation, so it's not their fault).
So now I've got a second wrapper layer called general_textout(),
which splits the input string into substrings based on bidi
character class. Any character liable to cause bidi or shaping
behaviour if fed straight to ExtTextOutW is instead fed through
Arabeyes' exact_textout(), but the rest is fed straight to
ExtTextOutW as it used to be.
The effect appears to be that font linking is restored for all
characters _except_ Arabic and other bidi scripts, which means in
particular that we are no longer in a state of regression over 0.57.
(0.57 would have done font linking on Arabic as well, but would also
have misbidied it, so we've merely exchanged one failure mode for
another slightly less harmful one in that situation.)
[originally from svn r6910]
incorrect. I must have written that binary search idiom a hundred
times, so it's rather embarrassing that I can't _automatically_ get
it right! This was causing all kinds of characters to be classified
as ON when they should have been various other classes.
Also while I'm here, I've added another test case to utf8.txt (a
small piece of Arabic within a predominantly L->R line), and also
supplied a means to compile minibidi.c with -DTEST_GETTYPE to
produce a command-line character class lookup tool. (Not sure what
use that'll be _other_ than debugging this precise problem, but I
don't like to throw it away now I've written it :-)
[originally from svn r5016]
searches a list of (start,end,type) tuples. This increases data size
by about 5Kb, which is a shame; but on the plus side, it boosts
performance from O(N) to O(log N). As an added bonus, the table now
covers _all_ of Unicode, not just the BMP.
[originally from svn r4964]
- rewrote the reversal loop in flipThisRun to be considerably clearer
- rewrote leastGreaterOdd and leastGreaterEven as bit-twiddling macros
- replaced malloc/free with snewn/sfree
- lost some gratuitous repeat calls of getType on the same character
And most noticeably:
- got rid of minibidi.h, since it was entirely full of minibidi.c
internals (including constant data definitions!) and wasn't used
to provide an external interface at all. Everything in it has
been folded into minibidi.c.
[originally from svn r4963]