/*
 * Split a complete command line into argc/argv, attempting to do it
 * exactly the same way the Visual Studio C library would do it (so
 * that our console utilities, which receive argc and argv already
 * broken apart by the C library, will have their command lines
 * processed in the same way as the GUI utilities which get a whole
 * command line and must call this function).
 *
 * Does not modify the input command line.
 *
 * The final parameter (argstart) is used to return a second array
 * of char * pointers, the same length as argv, each one pointing
 * at the start of the corresponding element of argv in the
 * original command line. So if you get half way through processing
 * your command line in argc/argv form and then decide you want to
 * treat the rest as a raw string, you can. If you don't want to,
 * `argstart' can be safely left NULL.
 */

#include "putty.h"

void split_into_argv(char *cmdline, int *argc, char ***argv,
                     char ***argstart)
{
    char *p;
    char *outputline, *q;
    char **outputargv, **outputargstart;
    int outputargc;

    /*
     * These argument-breaking rules apply to Visual Studio 7, which
     * is currently the compiler expected to be used for PuTTY. Visual
     * Studio 10 has different rules, lacking the curious mod 3
     * behaviour of consecutive quotes described below; I presume they
     * fixed a bug. As and when we migrate to a newer compiler, we'll
     * have to adjust this to match; however, for the moment we
     * faithfully imitate in our GUI utilities what our CLI utilities
     * can't be prevented from doing.
     *
     * When I investigated this, at first glance the rules appeared to
     * be:
     *
     *  - Single quotes are not special characters.
     *
     *  - Double quotes are removed, but within them spaces cease
     *    to be special.
     *
     *  - Backslashes are _only_ special when a sequence of them
     *    appears just before a double quote. In this situation,
     *    they are treated like C backslashes: so \" just gives a
     *    literal quote, \\" gives a literal backslash and then
     *    opens or closes a double-quoted segment, \\\" gives a
     *    literal backslash and then a literal quote, \\\\" gives
     *    two literal backslashes and then opens/closes a
     *    double-quoted segment, and so forth. Note that this
     *    behaviour is identical inside and outside double quotes.
     *
     *  - Two successive double quotes become one literal double
     *    quote, but only _inside_ a double-quoted segment.
     *    Outside, they just form an empty double-quoted segment
     *    (which may cause an empty argument word).
     *
     *  - That only leaves the interesting question of what happens
     *    when one or more backslashes precede two or more double
     *    quotes, starting inside a double-quoted string. And the
     *    answer to that appears somewhat bizarre. Here I tabulate
     *    number of backslashes (across the top) against number of
     *    quotes (down the left), and indicate how many backslashes
     *    are output, how many quotes are output, and whether a
     *    quoted segment is open at the end of the sequence:
     *
     *                backslashes
     *
     *               0         1      2      3      4
     *
     *         0   0,0,y  |  1,0,y  2,0,y  3,0,y  4,0,y
     *            --------+-----------------------------
     *         1   0,0,n  |  0,1,y  1,0,n  1,1,y  2,0,n
     *    q    2   0,1,n  |  0,1,n  1,1,n  1,1,n  2,1,n
     *    u    3   0,1,y  |  0,2,n  1,1,y  1,2,n  2,1,y
     *    o    4   0,1,n  |  0,2,y  1,1,n  1,2,y  2,1,n
     *    t    5   0,2,n  |  0,2,n  1,2,n  1,2,n  2,2,n
     *    e    6   0,2,y  |  0,3,n  1,2,y  1,3,n  2,2,y
     *    s    7   0,2,n  |  0,3,y  1,2,n  1,3,y  2,2,n
     *         8   0,3,n  |  0,3,n  1,3,n  1,3,n  2,3,n
     *         9   0,3,y  |  0,4,n  1,3,y  1,4,n  2,3,y
     *        10   0,3,n  |  0,4,y  1,3,n  1,4,y  2,3,n
     *        11   0,4,n  |  0,4,n  1,4,n  1,4,n  2,4,n
     *
     * [Test fragment was of the form "a\\\"""b c" d.]
     *
     * There is very weird mod-3 behaviour going on here in the
     * number of quotes, and it even applies when there aren't any
     * backslashes! How ghastly.
     *
     * With a bit of thought, this extremely odd diagram suddenly
     * coalesced itself into a coherent, if still ghastly, model of
     * how things work:
     *
     *  - As before, backslashes are only special when one or more
     *    of them appear contiguously before at least one double
     *    quote. In this situation the backslashes do exactly what
     *    you'd expect: each one quotes the next thing in front of
     *    it, so you end up with n/2 literal backslashes (if n is
     *    even) or (n-1)/2 literal backslashes and a literal quote
     *    (if n is odd). In the latter case the double quote
     *    character right after the backslashes is used up.
     *
     *  - After that, any remaining double quotes are processed. A
     *    string of contiguous unescaped double quotes has a mod-3
     *    behaviour:
     *
     *     * inside a quoted segment, a quote ends the segment.
     *     * _immediately_ after ending a quoted segment, a quote
     *       simply produces a literal quote.
     *     * otherwise, outside a quoted segment, a quote begins a
     *       quoted segment.
     *
     *    So, for example, if we started inside a quoted segment
     *    then two contiguous quotes would close the segment and
     *    produce a literal quote; three would close the segment,
     *    produce a literal quote, and open a new segment. If we
     *    started outside a quoted segment, then two contiguous
     *    quotes would open and then close a segment, producing no
     *    output (but potentially creating a zero-length argument);
     *    but three quotes would open and close a segment and then
     *    produce a literal quote.
     */

    /*
     * First deal with the simplest of all special cases: if there
     * aren't any arguments, return 0,NULL,NULL.
     *
     * (The cast to unsigned char keeps isspace() well defined for
     * characters with the top bit set.)
     */
    while (*cmdline && isspace((unsigned char)*cmdline)) cmdline++;
    if (!*cmdline) {
        if (argc) *argc = 0;
        if (argv) *argv = NULL;
        if (argstart) *argstart = NULL;
        return;
    }

    /*
     * These are guaranteed to be big enough; we can realloc them
     * down later. The output text is never longer than the input,
     * and a command line of n characters holds at most (n+1)/2
     * arguments, because each argument needs at least one character
     * and at least one whitespace character separates it from the
     * next.
     */
    outputline = snewn(1+strlen(cmdline), char);
    outputargv = snewn((strlen(cmdline)+1) / 2, char *);
    outputargstart = snewn((strlen(cmdline)+1) / 2, char *);

    p = cmdline; q = outputline; outputargc = 0;

    while (*p) {
        bool quote;

        /* Skip whitespace searching for start of argument. */
        while (*p && isspace((unsigned char)*p)) p++;
        if (!*p) break;

        /* We have an argument; start it. */
        outputargv[outputargc] = q;
        outputargstart[outputargc] = p;
        outputargc++;
        quote = false;

        /* Copy data into the argument until it's finished. */
        while (*p) {
            if (!quote && isspace((unsigned char)*p))
                break;                 /* argument is finished */

            if (*p == '"' || *p == '\\') {
                /*
                 * We have a sequence of zero or more backslashes
                 * followed by a sequence of zero or more quotes.
                 * Count up how many of each, and then deal with
                 * them as appropriate.
                 */
                int i, slashes = 0, quotes = 0;
                while (*p == '\\') slashes++, p++;
                while (*p == '"') quotes++, p++;

                if (!quotes) {
                    /*
                     * Special case: if there are no quotes,
                     * slashes are not special at all, so just copy
                     * n slashes to the output string.
                     */
                    while (slashes--) *q++ = '\\';
                } else {
                    /* Slashes annihilate in pairs. */
                    while (slashes >= 2) slashes -= 2, *q++ = '\\';

                    /* One remaining slash takes out the first quote. */
                    if (slashes) quotes--, *q++ = '"';

                    if (quotes > 0) {
                        /* Outside a quote segment, a quote starts one. */
                        if (!quote) quotes--;

                        /* Now we produce (n+1)/3 literal quotes... */
                        for (i = 3; i <= quotes+1; i += 3) *q++ = '"';

                        /* ... and end in a quote segment iff 3 divides n. */
                        quote = (quotes % 3 == 0);
                    }
                }
            } else {
                *q++ = *p++;
            }
        }

        /* At the end of an argument, just append a trailing NUL. */
        *q++ = '\0';
    }

    outputargv = sresize(outputargv, outputargc, char *);
    outputargstart = sresize(outputargstart, outputargc, char *);

    if (argc) *argc = outputargc;
    if (argv) *argv = outputargv; else sfree(outputargv);
    if (argstart) *argstart = outputargstart; else sfree(outputargstart);
}
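
/*
 * The backslash/quote rule in the inner loop above can be sketched on
 * its own, without any buffers: given one run of backslashes followed
 * by quotes, report how many literal backslashes and quotes come out
 * and whether a quoted segment is open afterwards. This is only an
 * illustrative sketch; struct run_result and process_run() are
 * hypothetical names that do not exist in PuTTY.
 */

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type for this sketch; not part of PuTTY. */
struct run_result {
    int backslashes;   /* literal backslashes emitted */
    int quotes;        /* literal double quotes emitted */
    bool in_quote;     /* is a quoted segment open afterwards? */
};

/*
 * Apply the rules to one run of `slashes' backslashes followed by
 * `quotes' double quotes, starting inside (in_quote == true) or
 * outside (in_quote == false) a quoted segment.
 */
static struct run_result process_run(int slashes, int quotes, bool in_quote)
{
    struct run_result r = {0, 0, in_quote};

    assert(slashes >= 0 && quotes >= 0);

    if (quotes == 0) {
        /* No quotes follow, so the backslashes are not special. */
        r.backslashes = slashes;
        return r;
    }

    /* Backslashes annihilate in pairs... */
    r.backslashes = slashes / 2;

    /* ...and an odd one left over escapes the first quote. */
    if (slashes % 2) {
        quotes--;
        r.quotes++;
    }

    if (quotes > 0) {
        /* Outside a quoted segment, the first quote opens one. */
        if (!in_quote)
            quotes--;

        /* (n+1)/3 literal quotes; segment open iff 3 divides n. */
        r.quotes += (quotes + 1) / 3;
        r.in_quote = (quotes % 3 == 0);
    }
    return r;
}
```

/*
 * Each cell of the table in the comment inside split_into_argv()
 * corresponds to one call with in_quote == true: for instance,
 * process_run(3, 2, true) gives 1 backslash, 1 quote and no open
 * segment, matching the entry "1,1,n".
 */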

#ifdef TEST

const struct argv_test {
    const char *cmdline;
    const char *argv[10];
} argv_tests[] = {
    /*
     * We generate this set of tests by invoking ourself with
     * `-generate'.
     */
    {"ab c\" d", {"ab", "c d", NULL}},
    {"a\"b c\" d", {"ab c", "d", NULL}},
    {"a\"\"b c\" d", {"ab", "c d", NULL}},
    {"a\"\"\"b c\" d", {"a\"b", "c d", NULL}},
    {"a\"\"\"\"b c\" d", {"a\"b c", "d", NULL}},
    {"a\"\"\"\"\"b c\" d", {"a\"b", "c d", NULL}},
    {"a\"\"\"\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"a\"\"\"\"\"\"\"b c\" d", {"a\"\"b c", "d", NULL}},
    {"a\"\"\"\"\"\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"a\\b c\" d", {"a\\b", "c d", NULL}},
    {"a\\\"b c\" d", {"a\"b", "c d", NULL}},
    {"a\\\"\"b c\" d", {"a\"b c", "d", NULL}},
    {"a\\\"\"\"b c\" d", {"a\"b", "c d", NULL}},
    {"a\\\"\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"a\\\"\"\"\"\"b c\" d", {"a\"\"b c", "d", NULL}},
    {"a\\\"\"\"\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"a\\\"\"\"\"\"\"\"b c\" d", {"a\"\"\"b", "c d", NULL}},
    {"a\\\"\"\"\"\"\"\"\"b c\" d", {"a\"\"\"b c", "d", NULL}},
    {"a\\\\b c\" d", {"a\\\\b", "c d", NULL}},
    {"a\\\\\"b c\" d", {"a\\b c", "d", NULL}},
    {"a\\\\\"\"b c\" d", {"a\\b", "c d", NULL}},
    {"a\\\\\"\"\"b c\" d", {"a\\\"b", "c d", NULL}},
    {"a\\\\\"\"\"\"b c\" d", {"a\\\"b c", "d", NULL}},
    {"a\\\\\"\"\"\"\"b c\" d", {"a\\\"b", "c d", NULL}},
    {"a\\\\\"\"\"\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"a\\\\\"\"\"\"\"\"\"b c\" d", {"a\\\"\"b c", "d", NULL}},
    {"a\\\\\"\"\"\"\"\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"a\\\\\\b c\" d", {"a\\\\\\b", "c d", NULL}},
    {"a\\\\\\\"b c\" d", {"a\\\"b", "c d", NULL}},
    {"a\\\\\\\"\"b c\" d", {"a\\\"b c", "d", NULL}},
    {"a\\\\\\\"\"\"b c\" d", {"a\\\"b", "c d", NULL}},
    {"a\\\\\\\"\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"a\\\\\\\"\"\"\"\"b c\" d", {"a\\\"\"b c", "d", NULL}},
    {"a\\\\\\\"\"\"\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"a\\\\\\\"\"\"\"\"\"\"b c\" d", {"a\\\"\"\"b", "c d", NULL}},
    {"a\\\\\\\"\"\"\"\"\"\"\"b c\" d", {"a\\\"\"\"b c", "d", NULL}},
    {"a\\\\\\\\b c\" d", {"a\\\\\\\\b", "c d", NULL}},
    {"a\\\\\\\\\"b c\" d", {"a\\\\b c", "d", NULL}},
    {"a\\\\\\\\\"\"b c\" d", {"a\\\\b", "c d", NULL}},
    {"a\\\\\\\\\"\"\"b c\" d", {"a\\\\\"b", "c d", NULL}},
    {"a\\\\\\\\\"\"\"\"b c\" d", {"a\\\\\"b c", "d", NULL}},
    {"a\\\\\\\\\"\"\"\"\"b c\" d", {"a\\\\\"b", "c d", NULL}},
    {"a\\\\\\\\\"\"\"\"\"\"b c\" d", {"a\\\\\"\"b", "c d", NULL}},
    {"a\\\\\\\\\"\"\"\"\"\"\"b c\" d", {"a\\\\\"\"b c", "d", NULL}},
    {"a\\\\\\\\\"\"\"\"\"\"\"\"b c\" d", {"a\\\\\"\"b", "c d", NULL}},
    {"\"ab c\" d", {"ab c", "d", NULL}},
    {"\"a\"b c\" d", {"ab", "c d", NULL}},
    {"\"a\"\"b c\" d", {"a\"b", "c d", NULL}},
    {"\"a\"\"\"b c\" d", {"a\"b c", "d", NULL}},
    {"\"a\"\"\"\"b c\" d", {"a\"b", "c d", NULL}},
    {"\"a\"\"\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"\"a\"\"\"\"\"\"b c\" d", {"a\"\"b c", "d", NULL}},
    {"\"a\"\"\"\"\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"\"a\"\"\"\"\"\"\"\"b c\" d", {"a\"\"\"b", "c d", NULL}},
    {"\"a\\b c\" d", {"a\\b c", "d", NULL}},
    {"\"a\\\"b c\" d", {"a\"b c", "d", NULL}},
    {"\"a\\\"\"b c\" d", {"a\"b", "c d", NULL}},
    {"\"a\\\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"\"a\\\"\"\"\"b c\" d", {"a\"\"b c", "d", NULL}},
    {"\"a\\\"\"\"\"\"b c\" d", {"a\"\"b", "c d", NULL}},
    {"\"a\\\"\"\"\"\"\"b c\" d", {"a\"\"\"b", "c d", NULL}},
    {"\"a\\\"\"\"\"\"\"\"b c\" d", {"a\"\"\"b c", "d", NULL}},
    {"\"a\\\"\"\"\"\"\"\"\"b c\" d", {"a\"\"\"b", "c d", NULL}},
    {"\"a\\\\b c\" d", {"a\\\\b c", "d", NULL}},
    {"\"a\\\\\"b c\" d", {"a\\b", "c d", NULL}},
    {"\"a\\\\\"\"b c\" d", {"a\\\"b", "c d", NULL}},
    {"\"a\\\\\"\"\"b c\" d", {"a\\\"b c", "d", NULL}},
    {"\"a\\\\\"\"\"\"b c\" d", {"a\\\"b", "c d", NULL}},
    {"\"a\\\\\"\"\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"\"a\\\\\"\"\"\"\"\"b c\" d", {"a\\\"\"b c", "d", NULL}},
    {"\"a\\\\\"\"\"\"\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"\"a\\\\\"\"\"\"\"\"\"\"b c\" d", {"a\\\"\"\"b", "c d", NULL}},
    {"\"a\\\\\\b c\" d", {"a\\\\\\b c", "d", NULL}},
    {"\"a\\\\\\\"b c\" d", {"a\\\"b c", "d", NULL}},
    {"\"a\\\\\\\"\"b c\" d", {"a\\\"b", "c d", NULL}},
    {"\"a\\\\\\\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"\"a\\\\\\\"\"\"\"b c\" d", {"a\\\"\"b c", "d", NULL}},
    {"\"a\\\\\\\"\"\"\"\"b c\" d", {"a\\\"\"b", "c d", NULL}},
    {"\"a\\\\\\\"\"\"\"\"\"b c\" d", {"a\\\"\"\"b", "c d", NULL}},
    {"\"a\\\\\\\"\"\"\"\"\"\"b c\" d", {"a\\\"\"\"b c", "d", NULL}},
    {"\"a\\\\\\\"\"\"\"\"\"\"\"b c\" d", {"a\\\"\"\"b", "c d", NULL}},
    {"\"a\\\\\\\\b c\" d", {"a\\\\\\\\b c", "d", NULL}},
    {"\"a\\\\\\\\\"b c\" d", {"a\\\\b", "c d", NULL}},
    {"\"a\\\\\\\\\"\"b c\" d", {"a\\\\\"b", "c d", NULL}},
    {"\"a\\\\\\\\\"\"\"b c\" d", {"a\\\\\"b c", "d", NULL}},
    {"\"a\\\\\\\\\"\"\"\"b c\" d", {"a\\\\\"b", "c d", NULL}},
    {"\"a\\\\\\\\\"\"\"\"\"b c\" d", {"a\\\\\"\"b", "c d", NULL}},
    {"\"a\\\\\\\\\"\"\"\"\"\"b c\" d", {"a\\\\\"\"b c", "d", NULL}},
    {"\"a\\\\\\\\\"\"\"\"\"\"\"b c\" d", {"a\\\\\"\"b", "c d", NULL}},
    {"\"a\\\\\\\\\"\"\"\"\"\"\"\"b c\" d", {"a\\\\\"\"\"b", "c d", NULL}},
};

void out_of_memory(void)
{
    fprintf(stderr, "out of memory!\n");
    exit(2);
}

int main(int argc, char **argv)
{
    int i, j;

    if (argc > 1) {
        /*
         * Generation of tests.
         *
         * Given `-splat <args>', we print out a C-style
         * representation of each argument (in the form "a", "b",
         * NULL), backslash-escaping each backslash and double
         * quote.
         *
         * Given `-split <string>', we first doctor `string' by
         * turning forward slashes into backslashes, single quotes
         * into double quotes and underscores into spaces; and then
         * we feed the resulting string to ourself with `-splat'.
         *
         * Given `-generate', we concoct a variety of fun test
         * cases, encode them in quote-safe form (mapping \, " and
         * space to /, ' and _ respectively) and feed each one to
         * `-split'.
         *
         * Given `-tabulate', we rebuild the table in the comment
         * inside split_into_argv from the entries of argv_tests
         * that start inside a quoted string.
         */
        if (!strcmp(argv[1], "-splat")) {
            int i;
            char *p;
            for (i = 2; i < argc; i++) {
                putchar('"');
                for (p = argv[i]; *p; p++) {
                    if (*p == '\\' || *p == '"')
                        putchar('\\');
                    putchar(*p);
                }
                printf("\", ");
            }
            printf("NULL");
            return 0;
        }

        if (!strcmp(argv[1], "-split") && argc > 2) {
            strbuf *cmdline = strbuf_new();
            char *p;

            strbuf_catf(cmdline, "%s -splat ", argv[0]);
            printf(" {\"");
            size_t args_start = cmdline->len;
            for (p = argv[2]; *p; p++) {
                char c = (*p == '/' ? '\\' :
                          *p == '\'' ? '"' :
                          *p == '_' ? ' ' :
                          *p);
                put_byte(cmdline, c);
            }
            write_c_string_literal(stdout, ptrlen_from_asciz(
                                       cmdline->s + args_start));
            printf("\", {");
            fflush(stdout);

            system(cmdline->s);

            printf("}},\n");

            strbuf_free(cmdline);
            return 0;
        }

        if (!strcmp(argv[1], "-generate")) {
            char *teststr, *p;
            int i, initialquote, backslashes, quotes;

            teststr = malloc(200 + strlen(argv[0]));
            if (!teststr)
                out_of_memory();

            for (initialquote = 0; initialquote <= 1; initialquote++) {
                for (backslashes = 0; backslashes < 5; backslashes++) {
                    for (quotes = 0; quotes < 9; quotes++) {
                        p = teststr + sprintf(teststr, "%s -split ", argv[0]);
                        if (initialquote) *p++ = '\'';
                        *p++ = 'a';
                        for (i = 0; i < backslashes; i++) *p++ = '/';
                        for (i = 0; i < quotes; i++) *p++ = '\'';
                        *p++ = 'b';
                        *p++ = '_';
                        *p++ = 'c';
                        *p++ = '\'';
                        *p++ = '_';
                        *p++ = 'd';
                        *p = '\0';

                        system(teststr);
                    }
                }
            }
            free(teststr);
            return 0;
        }

        if (!strcmp(argv[1], "-tabulate")) {
            /*
             * Template for the table. Its lines start at column 0 of
             * the source file so that the x offsets computed below
             * (15, 25, 32, 39, 46) land on the empty cells.
             */
            char table[] = "\
 *                backslashes                       \n\
 *                                                  \n\
 *               0         1      2      3      4   \n\
 *                                                  \n\
 *         0          |                             \n\
 *            --------+-----------------------------\n\
 *         1          |                             \n\
 *    q    2          |                             \n\
 *    u    3          |                             \n\
 *    o    4          |                             \n\
 *    t    5          |                             \n\
 *    e    6          |                             \n\
 *    s    7          |                             \n\
 *         8          |                             \n\
";
            char *linestarts[14];
            char *p = table;
            for (i = 0; i < lenof(linestarts); i++) {
                linestarts[i] = p;
                p += strcspn(p, "\n");
                if (*p) p++;
            }

            for (i = 0; i < lenof(argv_tests); i++) {
                const struct argv_test *test = &argv_tests[i];
                const char *q = test->cmdline;

                /* Skip tests that aren't telling us something about
                 * the behaviour _inside_ a quoted string */
                if (*q != '"')
                    continue;

                q++;

                assert(*q == 'a');
                q++;
                int backslashes_in = 0, quotes_in = 0;
                while (*q == '\\') {
                    q++;
                    backslashes_in++;
                }
                while (*q == '"') {
                    q++;
                    quotes_in++;
                }

                q = test->argv[0];
                assert(*q == 'a');
                q++;
                int backslashes_out = 0, quotes_out = 0;
                while (*q == '\\') {
                    q++;
                    backslashes_out++;
                }
                while (*q == '"') {
                    q++;
                    quotes_out++;
                }
                assert(*q == 'b');
                q++;
                bool in_quoted_string = (*q == ' ');

                int x = (backslashes_in == 0 ? 15 : 18 + 7 * backslashes_in);
                int y = (quotes_in == 0 ? 4 : 5 + quotes_in);
                char *buf = dupprintf("%d,%d,%c",
                                      backslashes_out, quotes_out,
                                      in_quoted_string ? 'y' : 'n');
                memcpy(linestarts[y] + x, buf, strlen(buf));
                sfree(buf);
            }

            fputs(table, stdout);
            return 0;
        }

        fprintf(stderr, "unrecognised option: \"%s\"\n", argv[1]);
        return 1;
    }

    /*
     * If we get here, we were invoked with no arguments, so just
     * run the tests.
     */

    for (i = 0; i < lenof(argv_tests); i++) {
        int ac;
        char **av;

        split_into_argv((char *)argv_tests[i].cmdline, &ac, &av, NULL);

        for (j = 0; j < ac && argv_tests[i].argv[j]; j++) {
            if (strcmp(av[j], argv_tests[i].argv[j])) {
                printf("failed test %d (|%s|) arg %d: |%s| should be |%s|\n",
                       i, argv_tests[i].cmdline,
                       j, av[j], argv_tests[i].argv[j]);
            }
#ifdef VERBOSE
            else {
                printf("test %d (|%s|) arg %d: |%s| == |%s|\n",
                       i, argv_tests[i].cmdline,
                       j, av[j], argv_tests[i].argv[j]);
            }
#endif
        }
        if (j < ac)
            printf("failed test %d (|%s|): %d args returned, should be %d\n",
                   i, argv_tests[i].cmdline, ac, j);
        if (argv_tests[i].argv[j])
            printf("failed test %d (|%s|): %d args returned, should be more\n",
                   i, argv_tests[i].cmdline, ac);
    }

    return 0;
}

#endif /* TEST */
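
/*
 * The quote-safe encoding used by `-split' and `-generate' in the
 * test harness above can be illustrated in isolation. The sketch
 * below decodes a string by mapping '/' to backslash, single quote to
 * double quote and underscore to space. decode_quote_safe() is a
 * hypothetical helper written for this illustration; the harness
 * itself does the same mapping inline while building a strbuf.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode `in' into `out', which must have room for strlen(in) + 1
 * bytes. Returns the number of characters written, not counting the
 * trailing NUL. (Hypothetical helper; not part of PuTTY.)
 */
static size_t decode_quote_safe(const char *in, char *out)
{
    size_t i, n;

    assert(in != NULL && out != NULL);

    n = strlen(in);
    for (i = 0; i < n; i++) {
        char c = (in[i] == '/' ? '\\' :
                  in[i] == '\'' ? '"' :
                  in[i] == '_' ? ' ' :
                  in[i]);
        out[i] = c;
    }
    out[n] = '\0';
    return n;
}
```

/*
 * For example, the encoded test case 'a//'''b_c'_d decodes to the
 * command line "a\\"""b c" d, which can be passed through a shell or
 * system() in its encoded form without any further quoting.
 */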