Thursday, July 30, 2009

C++ and regular expressions?

I want to search through a file and identify three kinds of tokens: words, new lines, and spaces, using regular expressions.

How do I do this in C++ (Visual Studio 2005)?

Specific examples of code would be appreciated!

Please be as specific as possible (e.g. please don't just say "use regex"... I have no idea how to use regex or where to get it, and my Google searches are proving fruitless...)

The bad news is that the C++ standard library that ships with Visual Studio 2005 has no regex support.

The good news is that using regex to split a string into words is like traveling around the corner by private jet.


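Just go through your string and look at the characters. Here "input" is your writable, null-terminated buffer, and process_word and process_blank are whatever handlers you supply:

char *word = 0;                          // start of the current word, 0 if none
for (char *c = input; ; c++)
{
    char ch = *c;                        // saved because *c is overwritten below
    bool letter = (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
    if (letter) { if (!word) word = c; continue; }   // start or middle of a word
    if (word) { *c = 0; process_word (word, word - input); *c = ch; }   // terminate, report, restore
    word = 0;
    if (ch == ' ' || ch == '\n') process_blank (c, c - input);
    if (!ch) break;                      // end of input, the last word has been reported
}

This assumes that by "word" you meant any sequence of letters. If you want to allow more characters, like digits or underscores, just extend the "letter" test accordingly.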
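If overwriting the input buffer is not an option, the same character-by-character idea can be written without modifying the string. The following is only a minimal sketch under that assumption: the names TokenKind, Token, is_word_char and tokenize are made up for illustration and do not come from the answer above, and a "word" is still taken to mean a run of ASCII letters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for this sketch (not from the original answer).
enum TokenKind { TOK_WORD, TOK_SPACE, TOK_NEWLINE, TOK_OTHER };

struct Token {
    TokenKind kind;
    std::string text;    // the characters that make up the token
    std::size_t offset;  // position of the token's first character in the input
};

// True for ASCII letters only; extend this to accept digits or underscores.
static bool is_word_char(char ch) {
    return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}

// Splits the input into words, single spaces, single newlines and
// single "other" characters, without modifying the input.
std::vector<Token> tokenize(const std::string &input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        Token t;
        t.offset = i;
        if (is_word_char(input[i])) {
            // Consume the whole run of letters as one word.
            std::size_t start = i;
            while (i < input.size() && is_word_char(input[i])) ++i;
            t.kind = TOK_WORD;
            t.text = input.substr(start, i - start);
        } else {
            // Every non-letter character becomes a one-character token.
            t.kind = input[i] == ' ' ? TOK_SPACE
                   : input[i] == '\n' ? TOK_NEWLINE
                   : TOK_OTHER;
            t.text = std::string(1, input[i]);
            ++i;
        }
        tokens.push_back(t);
    }
    return tokens;
}
```

Calling tokenize on the contents of the file and looping over the returned vector gives one entry per word, space or new line, each with its offset, which is roughly what process_word and process_blank receive in the loop above.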
Reply: I do regex in Perl and ksh, but not C++. My Google search was for "c++ regular expression for finding newline", and what I found on one site is below.

A note on new lines: sometimes you have to search for both a newline character and a carriage return character.
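As a side illustration of that note (this is not part of the material found on the site), one way to avoid searching for two characters is to convert every line ending to a plain '\n' before tokenizing. This is only a sketch: normalize_newlines is a hypothetical helper name, and it assumes the file contents are already in a std::string. It mostly matters when the file was opened in binary mode, since text mode on Windows already translates "\r\n" to '\n' on reading.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts "\r\n" pairs and lone '\r' characters
// to a single '\n', leaving everything else unchanged.
std::string normalize_newlines(const std::string &input) {
    std::string out;
    out.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] == '\r') {
            out += '\n';
            // Skip the '\n' of a "\r\n" pair so it is not counted twice.
            if (i + 1 < input.size() && input[i + 1] == '\n') ++i;
        } else {
            out += input[i];
        }
    }
    return out;
}
```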




Trying It Out


Since searching and matching are so common, the Boost.Regex library has functions for both. The following code demonstrates both, using the preceding test strings as well as another, abc.





#include <iostream>
#include <string>
#include <boost/regex.hpp>

using namespace std;

// Reports whether the whole string matches the expression.
void testMatch(const boost::regex &ex, const string &st) {
    cout << "Matching " << st << endl;
    if (boost::regex_match(st, ex)) {
        cout << " matches" << endl;
    }
    else {
        cout << " doesn't match" << endl;
    }
}

// Prints every substring that matches the expression.
void testSearch(const boost::regex &ex, const string &st) {
    cout << "Searching " << st << endl;
    string::const_iterator start, end;
    start = st.begin();
    end = st.end();
    boost::match_results<std::string::const_... what;
    boost::match_flag_type flags = boost::match_default;
    while(boost::regex_search(start, end, what, ex, flags))
    {
        cout << " " << what.str() << endl;
        start = what[0].second; // continue after the end of this match
    }
}

int main(int argc, char *argv[])
{
    // "Reg" or "reg", then any three characters, then "r"
    static const boost::regex ex("[Rr]eg...r");
    testMatch(ex, "regular");
    testMatch(ex, "abc");
    testMatch(ex, "some regular expressions are Regxyzr");
    testMatch(ex, "RegULarexpressionstring");
    testSearch(ex, "regular");
    testSearch(ex, "abc");
    testSearch(ex, "some regular expressions are Regxyzr");
    testSearch(ex, "RegULarexpressionstring");
    return 0;
}
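
For the original question (words, spaces and new lines), the search loop above can be turned into a tokenizer by giving each kind of token its own capture group. The sketch below assumes a C++11 compiler, whose <regex> header offers std::regex and std::sregex_iterator. Visual Studio 2005 does not ship that header, so there the Boost.Regex counterparts (boost::regex, boost::sregex_iterator) would be the thing to try. The function name regex_tokens and the string labels "word", "space" and "newline" are hypothetical, chosen only for this example.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Returns (kind, text) pairs, where kind is "word", "space" or "newline".
// Characters that match none of the three alternatives are skipped.
std::vector<std::pair<std::string, std::string> >
regex_tokens(const std::string &input) {
    // Group 1: a run of letters; group 2: one space; group 3: one line break
    // (either "\r\n" or a lone "\n").
    static const std::regex token("([A-Za-z]+)|( )|(\\r\\n|\\n)");
    std::vector<std::pair<std::string, std::string> > out;
    std::sregex_iterator it(input.begin(), input.end(), token), end;
    for (; it != end; ++it) {
        const std::smatch &m = *it;
        // Whichever group took part in the match tells us the token kind.
        const char *kind = m[1].matched ? "word"
                         : m[2].matched ? "space"
                         : "newline";
        out.push_back(std::make_pair(std::string(kind), m.str()));
    }
    return out;
}
```

Each alternative in the pattern is tried at the current position, so the order of the groups only matters if the alternatives can overlap, which they do not here.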
Reply: Good luck. I've never heard of using regex in C, but there is a C library called PCRE (Perl Compatible Regular Expressions) that is used in Apache!

