C# - Is it possible to use regex to find all matches including its own delimiter?


I am trying to write an email subject line parser where the user defines their own parsing rules. The rules match member names on the subject line and use them up. The catch is that a member name might itself contain the parsing rule's delimiter.

    using System;
    using System.Text.RegularExpressions;

    // The rule is defined so that the text between the > characters matches a member name.
    // Note that the user can make up any parsing rule; this is just an example.
    string sampleRule = ">{member}>";

    // I left out the rule-parsing code. Assume I have already figured out that I am
    // looking for a member and what the prefix/postfix delimiters are.
    string prefix = ">";
    string postfix = ">";

    // Note that member>name3 is a valid member name.
    string subject = "subject>membername1>membername2>member>name3>endsubject";
    string pattern = "(?=" + prefix + "([a-z].+?)" + postfix + ")";

    Match m = Regex.Match(subject, pattern);

    while (m.Success)
    {
        // Possible member name
        Console.WriteLine(m.Groups[1].ToString());
        m = m.NextMatch();
    }

    // The output needs to be:
    // membername1
    // membername2
    // member>name3

    // But the actual output is:
    // membername1
    // membername2
    // member

    // Note that spanning bad matches are OK, for example
    // membername1>membername2 or membername1>membername2>member>name3

Here's a fragile attempt that uses regular expressions and recursion:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;

    static class Program
    {
        static void Main(string[] args)
        {
            string prefix = ">";
            string suffix = ">";
            string subject =
                "subject>membername1>membername2>member>name3>endsubject";
            var result = Find(subject, true, prefix, suffix).ToList();
            result.ForEach(item =>
            {
                Console.WriteLine(item);
            });
            /* output is:
            membername1>membername2
            member>name3                *match
            membername1                 *match
            membername2                 *match
            member
            name3
             */
        }

        // Alternates between two patterns on each level of recursion:
        // r1 finds delimited spans, r2 breaks a span into its word parts.
        private static IEnumerable<string> Find(
            string subject,
            bool toggle,
            string prefix,
            string suffix)
        {
            string
                r1 = @"(?<=" + prefix + @")(?>([\w]*(" + prefix +
                "|" + suffix + @")[\w]*))(?=" + suffix + ")",
                r2 = @"[\w]*";
            var temp = Regex.Matches(subject, toggle ?
                r1 : r2
                )
                .Cast<Match>()
                .ToList();

            return temp.SelectMany(m =>
                temp
                .Select(i => i.Value)
                .Union(Find(m.Value, !toggle, prefix, suffix)))
                .Where(s => !string.IsNullOrEmpty(s))
                .Distinct();
        }
    }

Note: I'm not sure whether, in your example, the > in member>name3 is considered a prefix or a suffix.
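One caveat when building a pattern from user-defined delimiters: a delimiter such as [ or . is a regex metacharacter, so concatenating it into the pattern unescaped changes the pattern's meaning or makes it invalid. A minimal sketch, assuming the lookahead pattern from the question is kept as-is, is to pass each delimiter through Regex.Escape first. The class and method names (EscapedRule, MatchMembers) and the bracketed sample subject are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class EscapedRule
{
    // Hypothetical helper: builds the question's lookahead pattern, but escapes
    // the user-supplied delimiters so that regex metacharacters match literally.
    public static List<string> MatchMembers(string subject, string prefix, string postfix)
    {
        string pattern = "(?=" + Regex.Escape(prefix) + "([a-z].+?)"
                       + Regex.Escape(postfix) + ")";
        var names = new List<string>();
        foreach (Match m in Regex.Matches(subject, pattern))
        {
            // Group 1 holds the text between the delimiters.
            names.Add(m.Groups[1].Value);
        }
        return names;
    }

    static void Main()
    {
        // An unescaped "[" would open a character class instead of matching a bracket.
        foreach (var name in MatchMembers("subject[membername1]rest", "[", "]"))
        {
            Console.WriteLine(name); // prints "membername1"
        }
    }
}
```

This only makes the delimiters safe to embed. The lazy quantifier still stops at the first postfix it sees, so a name such as member>name3 is still cut short.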

[Edit] Here's another approach, which doesn't use regular expressions. It takes into account that the > in member>name3 may be either a prefix or a suffix:

    var separators = new[] { prefix, suffix };

    var firstResult = separators
        .SelectMany(s => subject
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Skip(1)
            .Reverse()
            .Skip(1)
            .Reverse())
        .Distinct()
        .ToList();

    var result = firstResult
        .Zip(firstResult.Skip(1), (a, b) =>
        {
            var l = new List<string>();
            separators.ToList().ForEach(s =>
            {
                l.Add(string.Format("{0}{1}{2}", a, s, b));
            });
            return l;
        })
        .SelectMany(s => s)
        .Union(firstResult)
        .ToList();
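Since the question accepts spanning matches as long as the real member names are among the results, a related idea can be sketched without regular expressions or splitting: collect every substring that starts right after some prefix occurrence and ends right before some later suffix occurrence. This is not the code above but a different, brute-force technique, and the names CandidateFinder and Candidates are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class CandidateFinder
{
    // Hypothetical helper: returns every non-empty substring that begins right
    // after an occurrence of prefix and ends right before a later occurrence
    // of suffix. The number of candidates grows quadratically with the number
    // of delimiters in the subject.
    public static List<string> Candidates(string subject, string prefix, string suffix)
    {
        if (string.IsNullOrEmpty(prefix) || string.IsNullOrEmpty(suffix))
            throw new ArgumentException("Delimiters must be non-empty.");

        var results = new List<string>();
        int start = subject.IndexOf(prefix, StringComparison.Ordinal);
        while (start >= 0)
        {
            int from = start + prefix.Length;
            int end = subject.IndexOf(suffix, from, StringComparison.Ordinal);
            while (end >= 0)
            {
                if (end > from)
                    results.Add(subject.Substring(from, end - from));
                // Try the next suffix occurrence for the same starting point.
                end = subject.IndexOf(suffix, end + 1, StringComparison.Ordinal);
            }
            // Move on to the next prefix occurrence.
            start = subject.IndexOf(prefix, start + 1, StringComparison.Ordinal);
        }
        return results;
    }

    static void Main()
    {
        string subject = "subject>membername1>membername2>member>name3>endsubject";
        foreach (var candidate in Candidates(subject, ">", ">"))
        {
            Console.WriteLine(candidate);
        }
    }
}
```

For the sample subject this yields ten candidates, including membername1, membername2 and member>name3 alongside spanning ones such as membername1>membername2. Each candidate would then be checked against the list of known member names.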
