Economics should be open

March 26, 2009

Sample regular expressions for stata

Filed under: coding, Stata — howardchong @ 10:22 pm

Just a note to show how to use regular expressions in stata for text processing.

PROBLEM: I had a lot of codes in variable dsmnem the had “:” and “.” characters. I wanted to do a reshape my data and use these strings as the j variable, i.e. “reshape … j( dsmnem) string”
SOLUTION: regular expressions

replace dsmnem=lower(regexr(dsmnem,”:”,”_”))
replace dsmnem=regexr(dsmnem,”\.”,””)

These two lines replace periods and colons with emptytext and underscore respectively. Note that I have to use the escape character to specify the period character; otherwise the period has a special meaning in the regular expression.

Weird how they call it “regexr” and not “regexp” or “regexpr”, but whatever.

By the way, dsmnem is datastream mnemomic


Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Create a free website or blog at

%d bloggers like this: