C - char splitting and parsing
So I'm attempting to set up a function to correctly parse the following type of input (note that this input is gibberish, just to try and illustrate my example):
"./test script";ls -a -o;mkdir "te>st";ls > somefile.txt
where each command is separated by ';' and each argument is separated by whitespace ' ', except in cases where they are wrapped in "", in which case they should be treated as a literal or whole. I.e. the output I want is:
cmd : "./test script"
cmd : ls args[2] {-a, -o}
cmd : mkdir args[1] { "te>st" }
cmd : ls args[2] {>, somefile.txt}
I've tried splitting it via ; first, then via ' ', but the first example fails (being wrapped in "" it should be considered whole). I'm having some trouble with C as I'm not very familiar with the language. Could someone help? This is what I have so far:
// Commands are split by ;
char *cmdsplittoken = strtok(srcpointer, ";");

// Store the commands separately so we can deal with them one by one
while (cmdsplittoken != NULL) {
    cmds[cmdcount++] = cmdsplittoken;
    cmdsplittoken = strtok(NULL, ";");
}

// Loop over the commands and gather the arguments
for (int i = 0; i < cmdcount; i++) {
    // Args are split by ' '
    char *argsplittoken = strtok(cmds[i], " ");
    int argcount = 0;
    while (argsplittoken != NULL) {
        printf("arg %s\n", argsplittoken);
        argcount++;
        argsplittoken = strtok(NULL, " ");
    }
}
Roll your own strtok, and check for quotes in there. (Your example string doesn't contain any ';' inside quotes, so perhaps I am misunderstanding the entire issue :)
Anyway, here is my take on a rough version of strtok. It works similarly, except that it accepts only one single separator character instead of a string (but if necessary, that's easily added), and it does some meta-parsing on the following:
- strings that start with " are matched up to a closing "
- strings that start with ' are matched up to a closing '
- any individual character can be escaped by prepending \

An unmatched " or ' simply matches up to the end of the string.
#include <stdio.h>

/* strtok-like tokenizer: splits on a single separator character, but
   ignores separators inside "..." or '...' and right after a backslash. */
char *get_token (char *input_str, char separator)
{
    static char *last_pos;  /* position saved between calls */

    if (input_str)
        last_pos = input_str;
    else
        input_str = last_pos;

    if (last_pos && *last_pos)
    {
        while (*last_pos)
        {
            if (*last_pos == separator)
            {
                /* terminate the token and remember where the next one starts */
                *last_pos = 0;
                last_pos++;
                return input_str;
            }
            if (*last_pos == '\"')
            {
                last_pos++;
                while (*last_pos && *last_pos != '\"')
                    last_pos++;
            } else if (*last_pos == '\'')
            {
                last_pos++;
                while (*last_pos && *last_pos != '\'')
                    last_pos++;
            } else if (*last_pos == '\\' && last_pos[1])
            {
                last_pos++;
            }
            /* an unmatched quote leaves us on the terminator: do not step past it */
            if (*last_pos)
                last_pos++;
        }
        return input_str;
    }
    return NULL;
}

int main (void)
{
    char str[] = "\"./test; script\";ls -a\\;b -o;mkdir \"te>st\";ls > 'some;file.txt'";
    char *cmdsplittoken = get_token (str, ';');

    while (cmdsplittoken != NULL)
    {
        printf("arg %s\n", cmdsplittoken);
        cmdsplittoken = get_token (NULL, ';');
    }
    return 0;
}
This fixes the first half of your command parsing. The second part can be handled with the same routine or, as far as I understand it, with bog standard strtok.
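As a rough sketch of that second pass using the same routine: the block below carries a compact copy of get_token so that it compiles on its own, and adds a helper called split_args. That helper name and its signature are made up for illustration and are not part of the answer. It assumes the commands have already been separated, as in the question's cmds array.

```c
#include <stddef.h>

/* Compact copy of get_token, so the sketch compiles on its own. */
static char *get_token(char *input_str, char separator)
{
    static char *last_pos;

    if (input_str)
        last_pos = input_str;
    else
        input_str = last_pos;

    if (!last_pos || !*last_pos)
        return NULL;

    while (*last_pos) {
        if (*last_pos == separator) {
            *last_pos++ = 0;        /* terminate token, step to the next one */
            return input_str;
        }
        if (*last_pos == '"' || *last_pos == '\'') {
            char quote = *last_pos++;
            while (*last_pos && *last_pos != quote)
                last_pos++;         /* skip to the matching quote */
        } else if (*last_pos == '\\' && last_pos[1]) {
            last_pos++;             /* skip the escaped character */
        }
        if (*last_pos)              /* stay on the terminator after an unmatched quote */
            last_pos++;
    }
    return input_str;
}

/* Hypothetical helper: split one command in place into at most
   max_args arguments and return how many were stored. */
static int split_args(char *cmd, char **args, int max_args)
{
    int count = 0;
    char *tok = get_token(cmd, ' ');

    while (tok != NULL && count < max_args) {
        if (*tok)                   /* skip empty tokens from repeated spaces */
            args[count++] = tok;
        tok = get_token(NULL, ' ');
    }
    return count;
}
```

Unlike strtok, get_token returns an empty token when two separators are adjacent, which is why split_args skips them. Because of the static pointer, the ';' pass has to be finished before split_args is called on each stored command.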
By the way, the static char inside the routine makes it not re-entrant, so do not use it with alternating strings. (Possibly you knew that already, because you avoid it in your own code.)
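If alternating strings are needed, one possible way around the static pointer is to let the caller hold the saved position, in the style of POSIX strtok_r. The sketch below is an assumption along those lines, not something the answer provides: get_token_r and count_args are hypothetical names, and count_args only exists to show the two passes nested.

```c
#include <stddef.h>

/* Hypothetical re-entrant variant: the caller owns the saved position. */
char *get_token_r(char *input_str, char separator, char **save_ptr)
{
    char *pos;

    if (input_str == NULL)
        input_str = *save_ptr;      /* continue where the last call stopped */
    if (input_str == NULL || *input_str == '\0') {
        *save_ptr = input_str;
        return NULL;
    }

    for (pos = input_str; *pos; ) {
        if (*pos == separator) {
            *pos = '\0';            /* terminate token */
            *save_ptr = pos + 1;    /* next token starts after the separator */
            return input_str;
        }
        if (*pos == '"' || *pos == '\'') {
            char quote = *pos++;
            while (*pos && *pos != quote)
                pos++;              /* skip to the matching quote */
        } else if (*pos == '\\' && pos[1]) {
            pos++;                  /* skip the escaped character */
        }
        if (*pos)                   /* stay on the terminator after an unmatched quote */
            pos++;
    }
    *save_ptr = pos;                /* at the terminator: the next call returns NULL */
    return input_str;
}

/* Hypothetical demo: nest the ';' pass and the ' ' pass, counting
   every non-empty argument on the line. Modifies line in place. */
int count_args(char *line)
{
    char *cmd_state, *arg_state;
    char *cmd, *arg;
    int total = 0;

    for (cmd = get_token_r(line, ';', &cmd_state); cmd != NULL;
         cmd = get_token_r(NULL, ';', &cmd_state)) {
        for (arg = get_token_r(cmd, ' ', &arg_state); arg != NULL;
             arg = get_token_r(NULL, ' ', &arg_state)) {
            if (*arg)               /* ignore empty tokens from repeated spaces */
                total++;
        }
    }
    return total;
}
```

With separate state variables for the outer and inner loops, the commands no longer have to be stored in an array before their arguments are split.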