manipulating a string

ma740988

I'm interested in displaying the variable names for methods with
single arguments. The code below does just that and produces the right
output, i.e.
arg_a
arg_b
arg_c

Something tells me the parse function could be a lot simpler.
Critiques welcome. Thanks


#include <sstream>
#include <string>
#include <iostream>
#include <iomanip>
#include <algorithm>

void parse ( std::string& to_parse ) {
    std::string::size_type const posb = to_parse.find_first_of ( '(' ) ;
    std::string::size_type const pose = to_parse.find_last_of ( ')' ) ;
    if ( posb == std::string::npos || pose == std::string::npos ) {
        return ;
    }
    to_parse = to_parse.substr( posb , pose );
    std::replace( to_parse.begin(), to_parse.end(), ' ', '+' );
    std::string::iterator it ;
    int end = 0; int beg = 0 ;
    int sz = to_parse.size();
    for( it = to_parse.end() - 1, sz; it != to_parse.begin() ; --it, --sz ) {
        if ( *it == ')' ) {
            continue ;
        } else if ( *it == '+' ) {
            if ( end ) { break ; }
            else { continue; }
        } else {
            if ( !end ) {
                end = sz ;
            } else {
                beg = sz - 1;
            }
        }
    }
    to_parse = to_parse.substr( beg, end - beg );
}


int main() {

    const std::string strr( "void Set_Whatever( int arg_a )\n"
                            "void Set_This( unsigned int arg_b )\n"
                            "void Set_That( double arg_c )\n" );
    std::istringstream isss( strr );
    std::string mline ;
    while ( std::getline ( isss, mline ) ) {
        parse ( mline ) ;
        std::cout << mline << std::endl;
    }
    std::cin.get() ;
}
 
Cédric Baudry

ma740988 said:
I'm interested in displaying the variable names for methods with
single arguments. The code below does just that and produces the right
output, i.e.
arg_a
arg_b
arg_c
Something tells me the parse function could be a lot simpler.
Critiques welcome. Thanks

Parsing C++ code is not a trivial task. Note that your code attempts to
parse the line backwards, whereas C++ is only parsable forwards, I
believe. Thus it will fail in the presence of default argument values in
declarations, and probably in other cases.

If the source code follows strict formatting rules, then it might be doable,
however.

The work you do in the example code could be accomplished with a
single regex. I would write a Perl one-liner for that. If you insist
on learning C++ string manipulation, then you should stick either to
loops or to the find* functions; mixing them only creates a mess. E.g.
(untested!):

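#include <cctype>
#include <string>

typedef std::string::size_type pos_t;

std::string parse_last_word(const std::string& to_parse) {
    // Locate the closing parenthesis of the parameter list.
    pos_t endparen = to_parse.rfind(')');
    if (endparen==to_parse.npos || endparen==0) return "";

    // Skip whitespace before ')' to find the last character of the name.
    pos_t word_end = to_parse.find_last_not_of(" \t", endparen-1);
    if (word_end==to_parse.npos) return "";
    // Cast to unsigned char: isalnum is undefined for negative char values.
    const unsigned char last = static_cast<unsigned char>(to_parse[word_end]);
    if (!std::isalnum(last) && last != '_') return "";

    // The name starts just after the nearest separator to its left;
    // '(' is included so that "f(x)" yields "x".
    pos_t word_start = to_parse.find_last_of(" \t*&/()", word_end);
    if (word_start==to_parse.npos) return "";

    return to_parse.substr(word_start+1, word_end-word_start);
}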

This will fail for default argument values as well, and probably in other
cases.
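
For comparison, the single-regex idea mentioned above could look roughly like this in C++ itself. This is only a sketch: it assumes a C++11 compiler (std::regex and raw string literals), and the name last_param_name is made up for illustration. Like the find*-based version, it does not cope with default argument values.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): returns the identifier that
// directly precedes the last ')' on the line, or "" if there is none.
std::string last_param_name(const std::string& line) {
    // ([A-Za-z_]\w*) : an identifier, captured
    // \s*\)          : optional whitespace, then a closing parenthesis
    // [^)]*$         : no further ')' up to the end of the line
    static const std::regex re(R"(([A-Za-z_]\w*)\s*\)[^)]*$)");
    std::smatch m;
    if (std::regex_search(line, m, re)) {
        return m[1].str();
    }
    return "";
}
```

Calling last_param_name("void Set_Whatever( int arg_a )") gives "arg_a", and a line without a closing parenthesis gives an empty string.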

hth
Paavo



It works great in this case (one argument with no default value and
some spaces), but you would definitely benefit from using regular
expressions.


For instance, the following code:
    std::string::size_type const posb = to_parse.find_first_of ( '(' ) ;
    std::string::size_type const pose = to_parse.find_last_of ( ')' ) ;
    if ( posb == std::string::npos || pose == std::string::npos ) {
        return ;
    }
    to_parse = to_parse.substr( posb , pose );

Would be done using Perl regular expressions as:

    if ( /.+\((.*)\).*/ )
    {
        $to_parse=trim($1);
    }
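
If you would rather stay in C++, the same "take what is between the parentheses and trim it" step can be sketched with the find* functions alone. The helper name inside_parens is invented for this example. Note that substr takes a position and a length, not a begin and an end position, so the length has to be computed from the two positions.

```cpp
#include <string>

// Hypothetical helper: returns the text between the first '(' and the last
// ')' with surrounding spaces and tabs removed, or "" if there is no such pair.
std::string inside_parens(const std::string& line) {
    const std::string::size_type open = line.find('(');
    const std::string::size_type close = line.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open) {
        return "";
    }
    // substr(position, length): the length excludes both parentheses.
    const std::string inner = line.substr(open + 1, close - open - 1);
    const std::string::size_type first = inner.find_first_not_of(" \t");
    if (first == std::string::npos) {
        return "";  // nothing but whitespace between the parentheses
    }
    const std::string::size_type last = inner.find_last_not_of(" \t");
    return inner.substr(first, last - first + 1);
}
```

For "void Set_This( unsigned int arg_b )" this returns "unsigned int arg_b".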

What are you working on? What is the purpose of this?


I ask because the algorithm would need to be refactored to handle more
than one parameter and to allow default values. I don't see other
obvious limitations apart from the occasional comment between variables.

However, sometimes there is no space character to rely on.
Consider the following valid declarations:


void f(int*x);
const int * const g (int&y);

They break your algorithm, and one can't really forbid that practice.
My main idea is to handle the &, * and , characters separately.
But using a source code beautifier like AStyle could do the trick. ;-)
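
One possible reading of that idea, as a rough sketch only: split the parameter text on commas, cut off anything after an '=' (a default value), and take the trailing run of identifier characters, so that & and * end the name even without spaces. The function name param_names is made up here, and the sketch assumes there are no commas nested inside templates, parentheses or default values. An unnamed parameter such as "int" would be reported as if "int" were its name.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: given the text between the parentheses, returns the
// trailing identifier of each comma-separated parameter.
std::vector<std::string> param_names(const std::string& params) {
    std::vector<std::string> names;
    std::string::size_type start = 0;
    while (start <= params.size()) {
        std::string::size_type comma = params.find(',', start);
        if (comma == std::string::npos) comma = params.size();
        std::string piece = params.substr(start, comma - start);

        // Drop a default value such as "= 1.5".
        const std::string::size_type eq = piece.find('=');
        if (eq != std::string::npos) piece.erase(eq);

        // Walk back over trailing whitespace, then over identifier characters.
        std::string::size_type end = piece.size();
        while (end > 0 &&
               std::isspace(static_cast<unsigned char>(piece[end - 1]))) {
            --end;
        }
        std::string::size_type beg = end;
        while (beg > 0 &&
               (std::isalnum(static_cast<unsigned char>(piece[beg - 1])) ||
                piece[beg - 1] == '_')) {
            --beg;
        }
        if (beg < end) names.push_back(piece.substr(beg, end - beg));

        start = comma + 1;
    }
    return names;
}
```

With the input "int*x, const int&y, double z = 1.5" this yields x, y and z.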

Good luck, and tell us if you have improvements.
 
