Economics should be open

August 14, 2009

Octave cell-arrays are pretty slow

Filed under: coding — howardchong @ 5:50 pm

I’m trying to figure out which open source statistical/computation package to use.  I used to use Matlab. It’s good, but expensive, and it has WAY more features than I need.

I know I should be running things on Unix, but right now I’m on Windows XP. I sometimes putty into a Unix server and run things.

R looks very good. That’s my next language to learn.

Octave is pretty good. Its syntax is almost identical to Matlab’s. As of 3.0, it supports multidimensional cell arrays. These are arrays that can hold any data type. The most common case for me is an array of strings. If you load data that mixes text and numbers, it will probably be read as a cell array.

One thing I have noticed is that the cell-arrays are really quite slow.

I had a ~10000 x 10 csv file.

Column 1 had mixed numeric and string values. They were 6-character codes, and about 2/3 of them contained no alphabetical characters. I needed to convert them all to strings, and then do a sort and some other processing. I basically had to traverse each element of the first column and change its datatype individually.

The process was VERY slow. In fact, I think Excel would be better at such tasks.
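
To make the task concrete, here is a rough sketch of the convert-then-sort step. It is written in Python rather than Octave, purely as an illustration; the helper name `to_strings` and the sample codes are made up, and it says nothing about Octave's speed.

```python
def to_strings(column):
    """Convert every entry of a mixed numeric/string column to a string."""
    return [str(value) for value in column]


# Hypothetical sample: 6-character codes, some read as numbers, some as text.
column = [654321, "AB1234", 123456, "12C456"]

# Sorting the mixed column directly raises a TypeError in Python 3,
# which is why the conversion has to happen first.
codes = to_strings(column)
codes_sorted = sorted(codes)
print(codes_sorted)
```

The point is that the conversion is a single pass over one column, so doing it element by element should not be expensive in principle.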

Here are a few tips:

  • If you can, remove all strings from your CSV file.
  • If you read a large dataset as one large cell array, separate each column into its own variable. Then pack the numeric data together into a matrix (if needed).
  • Stata has an “encode” command that converts strings into numeric codes. For example, if your variable holds car makes, it will give each make a number and also generate a lookup table that tells you what the numbers mean.
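
The encode idea in the last tip is easy to reproduce outside Stata. Below is a minimal sketch in Python (again only for illustration, not Octave or Stata code). The function name `encode` and the sample car makes are hypothetical. I am assuming codes start at 1 and follow alphabetical order, which is how I understand Stata's behavior.

```python
def encode(values):
    """Map each distinct string to an integer code.

    Returns (codes, lookup), where lookup[code] gives back the original string.
    Codes start at 1 and follow the sorted order of the distinct strings
    (an assumption made for this sketch).
    """
    labels = sorted(set(values))
    code_of = {label: i for i, label in enumerate(labels, start=1)}
    codes = [code_of[v] for v in values]
    lookup = {i: label for label, i in code_of.items()}
    return codes, lookup


makes = ["Toyota", "Ford", "Honda", "Ford", "Toyota"]  # hypothetical sample data
codes, lookup = encode(makes)
print(codes)   # numeric column that fits in an ordinary matrix
print(lookup)  # table for turning the numbers back into names
```

Once the strings are replaced by codes, the whole dataset can live in a plain numeric matrix, and the lookup table is only needed when printing results.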

Also check out this page that benchmarks the math/science packages with a set of standard routines:

http://www.sciviews.org/benchmark/index.html
