Economics should be open

March 26, 2009

Sample regular expressions for stata

Filed under: coding, Stata — howardchong @ 10:22 pm

Just a note to show how to use regular expressions in stata for text processing.

PROBLEM: I had a lot of codes in variable dsmnem the had “:” and “.” characters. I wanted to do a reshape my data and use these strings as the j variable, i.e. “reshape … j( dsmnem) string”
SOLUTION: regular expressions

replace dsmnem=lower(regexr(dsmnem,”:”,”_”))
replace dsmnem=regexr(dsmnem,”\.”,””)

These two lines replace periods and colons with emptytext and underscore respectively. Note that I have to use the escape character to specify the period character; otherwise the period has a special meaning in the regular expression.

Weird how they call it “regexr” and not “regexp” or “regexpr”, but whatever.

By the way, dsmnem is datastream mnemomic

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: