Difference between revisions of "User:Josiah425:TISEAN Package"

Revision as of 20:57, 1 April 2015

TISEAN Package Porting Project

Goal of the project

The goal of this project is not to port the entire TISEAN package to Octave. That would be a desirable outcome, though it might not be feasible within the time constraints. The goal is to give the TISEAN port a solid start and to port as many functions as possible, creating a solid foundation for future work.

General division

The TISEAN package consists of 74 programs, which I have divided into three classes:

  1. FORTRAN programs that can easily be re-implemented as m-files (a good example is 'henon') -- there are 5 programs in this class
  2. C programs that need to be linked to oct-files (an example is 'ghkss') -- there are 41 programs in this class
  3. FORTRAN programs that need to be linked to oct-files (an example is 'project') -- there are 28 programs in this class

They are ordered so that, by my estimate, the difficulty rises with the number. This is because typecasting and implicit typing (which is used in most of the FORTRAN files in the TISEAN library) can be problematic.

Apart from this qualitative division, I propose a work-oriented division in which each subpart can be tackled separately and forms a self-contained unit. I chose to follow the structure of the article on implementations of nonlinear time series methods included in the documentation. This article discusses the various algorithms and what certain programs do. It can be found at http://www.mpipks-dresden.mpg.de/~tisean/Tisean_3.0.1/docs/chaospaper/TiseanHTML.html. Below I discuss the order in which I would like to port the various topics and where my work currently stands.

Nonlinear noise reduction

I chose this topic first because it contains programs from all three categories. It is also relatively small: it contains 3 programs (project, lazy, ghkss). I have also chosen to implement addnoise and henon, to demonstrate how project and ghkss work. Thus this topic contains programs from each category:

  • Re-implementable as an m-file (henon)
  • Linkable to FORTRAN (project, addnoise, lazy)
  • Linkable to C (ghkss)

I have already started work on this stage. My progress can be viewed at https://bitbucket.org/josiah425/tisean. So far I have linked addnoise and project and re-implemented henon as an m-file. As most of the work on this topic has been completed, I estimate that finishing it will take around 2 days, at 1 day per function (including documentation and testing).
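To illustrate the kind of program that falls into the "re-implementable" class, here is a minimal sketch of a Hénon map generator in plain C++. It is not the code from the repository above. It iterates the classical Hénon map with the standard parameters a = 1.4 and b = 0.3; the function name henon_series, the argument names, the default initial condition and the default transient length are all assumptions made for this sketch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One point (x, y) of the trajectory.
using Point = std::pair<double, double>;

// Iterate the classical Henon map
//   x[n+1] = 1 - a * x[n]^2 + b * y[n]
//   y[n+1] = x[n]
// and return `length` points, discarding the first `transient` iterations.
// Names and default values are assumptions made for this sketch.
std::vector<Point> henon_series(std::size_t length,
                                double a = 1.4, double b = 0.3,
                                double x0 = 0.0, double y0 = 0.0,
                                std::size_t transient = 1000)
{
    double x = x0, y = y0;
    std::vector<Point> out;
    out.reserve(length);
    for (std::size_t i = 0; i < transient + length; ++i) {
        const double x_next = 1.0 - a * x * x + b * y;
        y = x;       // the new y is the previous x
        x = x_next;
        if (i >= transient)
            out.emplace_back(x, y);
    }
    return out;
}
```

A program this small has no dependencies on the rest of the library, which is why rewriting it is simpler than linking the original source.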

Phase space representation

This is the next topic to implement, because it contains programs (especially 'delay') that are used to visualize data. Whenever an example is given in the package, the resulting data is routed through 'delay' before it is plotted. Apart from delay, it also contains other functions that can be divided into the following categories:

  • Linkable to FORTRAN (autocorr, pc)
  • Linkable to c (delay, corr, mutual, false_nearest, pca)

Assuming around a day for each function (including testing and documenting the usage), I estimate this stage will take a little over a week.
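To make the role of 'delay' concrete: a delay embedding turns a scalar series into vectors of lagged values, which can then be plotted against each other. The following is a minimal sketch in plain C++, not the TISEAN implementation. The function name delay_embed, its arguments and the column order (most recent value first) are assumptions made for this sketch and may differ from what the original program prints.

```cpp
#include <cstddef>
#include <vector>

// Build delay vectors (x[n], x[n - tau], ..., x[n - (dim - 1) * tau])
// from a scalar series. One row is produced for every n that has enough
// history; the result is empty if the series is too short or if
// dim or tau is zero.
std::vector<std::vector<double>> delay_embed(const std::vector<double>& x,
                                             std::size_t dim, std::size_t tau)
{
    std::vector<std::vector<double>> rows;
    if (dim == 0 || tau == 0)
        return rows;
    const std::size_t span = (dim - 1) * tau;  // history needed per vector
    if (x.size() <= span)
        return rows;
    rows.reserve(x.size() - span);
    for (std::size_t n = span; n < x.size(); ++n) {
        std::vector<double> row(dim);
        for (std::size_t k = 0; k < dim; ++k)
            row[k] = x[n - k * tau];
        rows.push_back(row);
    }
    return rows;
}
```

For example, with dim = 2 and tau = 1 each row is the pair (x[n], x[n - 1]), which is the usual two-dimensional phase portrait of a series.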

Nonlinear prediction

This seems like a reasonable next step. It consists of the following programs:

  • Linkable to FORTRAN (predict, upo)
  • Linkable to C (lzo-test, lzo-gm, lzo-run, lfo-ar, lfo-gm, lfo-run, rbf, polynom, xzero)

Again assuming around a day for each program (including testing, documenting usage and writing examples), I estimate this stage will take about two weeks.

Lyapunov exponents

This stage will include:

  • Linkable to C (lyap_r, lyap_k, lyap_spec)

It will take about 2-3 days to complete.

Dimensions and entropies

This topic is next on the list. It includes the following programs:

  • Linkable to FORTRAN (c2naive, c2, c2t, c2d, c2g, c1)
  • Linkable to C (d2, boxcount)

This stage should take a little over a week. Together with the previous stage, I expect it to take about two weeks.

Testing for nonlinearity

This is the last topic I intend to tackle. The following programs are included here:

  • Linkable to FORTRAN (surrogates, randomize, timerev)

This stage should take me about 3 days to complete.

Notes on time estimates

Totaling up the above estimates, it should take me 6 weeks to complete the task as outlined above. I think my time estimates are rather conservative, but I would rather have time left over to work on other programs (the documentation contains another article, located at http://www.mpipks-dresden.mpg.de/~tisean/Tisean_3.0.1/docs/surropaper/Surrogates.html) than be overwhelmed with the work and try to rush through it. As I am not fully familiar with the mathematical concepts discussed in these articles, I want to minimize the possibility of error when linking programs to Octave. If I have vastly overestimated the time needed to port these functions, I intend to finish the 'Visualization, non-stationary' section of the article on nonlinear time series and then proceed to programs from the 'Surrogate time series' article.

Details of work on each program

  • FORTRAN linking

For each FORTRAN program that I link to an oct-file, I intend to:

  1. Strip the program of its input validation and transform it into a subroutine
  2. Create a .cc program (compiled into an oct-file) that will launch the stripped FORTRAN subroutine; this .cc program will not contain input validation either, as it will be for internal use only
  3. Create an m-file that will perform input validation, launch the .cc program and contain the usage documentation
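The split between an unvalidated core and a validating front end can be sketched as follows. This is plain C++ for illustration only: center_core stands in for a stripped FORTRAN subroutine and center stands in for the user-facing layer (the m-file in the plan above). Both names, and the mean-subtraction operation itself, are hypothetical and do not correspond to any TISEAN program.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for a stripped FORTRAN subroutine: works on a raw buffer,
// performs no input validation and is meant for internal use only.
// (Hypothetical operation: subtract the mean from the series in place.)
void center_core(double* data, std::size_t n)
{
    assert(data != nullptr && n > 0);  // precondition guaranteed by the caller
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += data[i];
    const double mean = sum / static_cast<double>(n);
    for (std::size_t i = 0; i < n; ++i)
        data[i] -= mean;
}

// Stand-in for the user-facing layer: validates the input and only then
// hands clean data to the unvalidated core.
std::vector<double> center(const std::vector<double>& series)
{
    if (series.empty())
        throw std::invalid_argument("center: series must not be empty");
    for (double v : series)
        if (!std::isfinite(v))
            throw std::invalid_argument("center: series must contain only finite values");
    std::vector<double> out = series;  // the core modifies its buffer in place
    center_core(out.data(), out.size());
    return out;
}
```

Keeping all checks in one outer layer means the core can stay close to the original numerical code, at the cost that calling the core directly with bad data is undefined.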
  • C linking

Here I intend to do something similar to the FORTRAN programs, although it might be better not to create any extra m-files and instead incorporate the program's existing input validation into the .cc file. I will make a decision once I have completed one such linking program.

  • Re-implementing as an m-file

This is quite straightforward, although care is needed to make sure the m-file reproduces the behavior of the original program.

Where I intend to start

I will start with the small step of porting all of the functions needed for Nonlinear noise reduction. The functions I will need are: henon (for generating data), addnoise, ghkss and project. They cover all three of the categories I described in the first section. I have already re-implemented henon as an m-file; it is accessible here. Both addnoise and project are written in FORTRAN and need to be linked to C++ files and compiled into oct-files. Lastly, ghkss is implemented in C and needs to be linked to a C++ oct-file.

Where I am at

I have already ported henon. I have also linked the FORTRAN programs addnoise and project. My progress can be viewed here.

Explanation of what I want to do with each file

Each FORTRAN file that is to be linked to an oct-file needs some work first. I plan to take the following steps with each FORTRAN program:

  1. Change the FORTRAN program into a subroutine. The arguments of this subroutine will be the parameters that the program would normally have read from the user during execution.
  2. Move input parsing and validation from the FORTRAN file to the .cc file that links to it. This will make the FORTRAN subroutines 'dumb' and unable to distinguish between good and bad data.
  3. Eliminate all file input and output. The FORTRAN programs read data from and write data to files. This is unnecessary in Octave, as data can be passed to and retrieved from these subroutines directly via oct-files.
  4. Test the oct-file against the original library to ensure I didn't make mistakes.

I plan to take similar steps for the C files. I believe this stage will be easier, as the C code is much better organized, so eliminating input validation and parsing as well as file input and output should be a much easier task.
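The testing step (step 4 above) amounts to comparing the output of a ported function with the output of the original program on the same input. Since the original programs write their results as text, exact equality is probably too strict, so a comparison within a tolerance seems more practical. The following is a minimal sketch in plain C++; the function name outputs_match and the default tolerances are assumptions made for this sketch, not values taken from TISEAN.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Return true if both series have the same length and agree element-wise
// within a mixed absolute/relative tolerance. Any NaN makes the check fail.
bool outputs_match(const std::vector<double>& ported,
                   const std::vector<double>& reference,
                   double abs_tol = 1e-12, double rel_tol = 1e-6)
{
    if (ported.size() != reference.size())
        return false;
    for (std::size_t i = 0; i < ported.size(); ++i) {
        const double diff = std::fabs(ported[i] - reference[i]);
        const double scale = std::max(std::fabs(ported[i]),
                                      std::fabs(reference[i]));
        // Written as a negated <= so that a NaN difference is rejected.
        if (!(diff <= abs_tol + rel_tol * scale))
            return false;
    }
    return true;
}
```

In practice the reference values would be parsed from a file written by the original program, and the relative tolerance would be chosen to match the number of digits that program prints.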