Dataframe package

From Octave
Revision as of 16:19, 27 February 2015 by CdeMills (talk | contribs)

Dataframe, Data manipulation toolbox similar to R data.frame

At a mature development stage. hg

  • Maintainer: Pascal Dupuis
  • Contributors:

This package makes it possible to handle complex data (both in the sense of complex numbers and of high complexity) as if they were ordinary arrays, except that each column may have a different type. It also provides a fairly complete interface to CSV files and copes with a number of oddities, for instance CSV files that start with a header spread over several lines. The resulting object tries as far as it can to mimic an array, so that binary operators and the usual functions work as expected.

Meta-information is also handled. Rows and columns may have names, and these names are searchable. If for whatever reason the ordering of a CSV file changes, a search by column name still returns the expected information.

A simple example:

>> truc = {"Id", "Name", "Type"; 1, "onestring", "bla"; 2, "somestring", "foobar"}
truc =
{
  [1,1] = Id
  [2,1] =  1
  [3,1] =  2
  [1,2] = Name
  [2,2] = onestring
  [3,2] = somestring
  [1,3] = Type
  [2,3] = bla
  [3,3] = foobar
}
>> tt=dataframe(truc)
tt = dataframe with 2 rows and 3 columns
_1     Id       Name   Type
Nr double       char   char
 1      1  onestring    bla
 2      2 somestring foobar

The first row of the cell array is expected to contain the column names; the remaining rows hold the column contents. The type of each column is inferred automatically from the cell contents. Now let us select one column by its name:

>> tt(:, 'Name')
ans = dataframe with 2 rows and 1 columns
_1       Name
Nr       char
1  onestring
2 somestring

In this case, a sub-dataframe is returned. Struct-like indexing is also implemented:

>> tt.Id
ans =
  1
  2

When the result is a single vector that can be reduced to a plain array, as here, it is returned as a plain array rather than as a dataframe.
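
The claim that name-based lookup survives a change in column order can be checked with a small sketch. It uses only the calls shown above (dataframe on a cell array and struct-like indexing); the variable names and the sample data are made up for illustration, and it assumes the dataframe package is installed.

```octave
pkg load dataframe   % assumption: the dataframe package is installed

% The same two records, with the columns stored in two different orders
% (hypothetical data, modelled on the example above).
a = dataframe({"Id", "Name"; 1, "onestring"; 2, "somestring"});
b = dataframe({"Name", "Id"; "onestring", 1; "somestring", 2});

% Access by name does not depend on where the column sits.
ids_a = a.Id;
ids_b = b.Id;

% Compare the two results; both should hold the same Id values.
disp(isequal(ids_a, ids_b))
```

Code that refers to columns by name in this way should keep working if the columns of the input are later rearranged, whereas code that uses numeric column indices would need to be updated.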