NDpar package

Revision as of 17:16, 21 September 2014 by Huntj

Introduction

ndpar_cellfun and ndpar_arrayfun are the parcellfun and pararrayfun functions from the parallel package (version 2.2.0), extended to handle N-dimensional arrays as input and output. The hope is that they will eventually be merged into the parallel package once their stability is proven. See also the parallel package, maintained by Olaf Till.

Installation

The ndpar package can be downloaded from SourceForge.

To install it, run pkg install <path to the downloaded file> at the Octave prompt.


Usage

To load the package, run pkg load ndpar.

setting CatDimensions and IdxDimensions - vectorized

Suppose we have a function that works along different dimensions of its input arguments and returns arrays of different shapes, such as:

function [x, y] = f(u, v)
  x = u + v.';
  y = x.';
endfunction

This function only illustrates the syntax, not the usefulness of the extension. Admittedly, such a simple case could be handled easily by changing the function or by writing a small wrapper around it. The extension becomes really interesting when dealing with arrays that have many dimensions, and different numbers of dimensions (not shown here yet).

Applying this function to the arrays u and v can be parallelized in a straightforward way:

Code: setting CatDimensions and IdxDimensions - vectorized
u = [1:10; 2:11];
v = u.';
[x, y] = ndpar_arrayfun(2, @f, u, v, ...
                        "Vectorized", true, ...
                        "ChunksPerProc", 2, ...
                        "VerboseLevel", 1, ...
                        "CatDimensions", [2 1], ...
                        "IdxDimensions", [2 1]);
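
Because f is already vectorized, calling it directly on the full arrays gives a reference result to compare against. The sketch below uses only core Octave, not ndpar itself; the names x_ref and y_ref are introduced here for illustration. Assuming the parallel call slices and concatenates as the options describe, its outputs should match these.

```octave
1;  % script-file marker: lets a script file contain a function definition

function [x, y] = f(u, v)
  x = u + v.';
  y = x.';
endfunction

u = [1:10; 2:11];   % 2x10
v = u.';            % 10x2

% Direct, non-parallel call on the full arrays.
[x_ref, y_ref] = f(u, v);

% x_ref is 2x10 and equals 2*u; y_ref is 10x2 and equals 2*v.
disp(size(x_ref));
disp(size(y_ref));
```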

Here is the meaning of the options:

  • "IdxDimensions", [2 1]: the parallelization (or slicing, or indexing) is done along the 2nd dimension of u and the 1st dimension of v. A value of 0 means no indexing (no slicing), so the corresponding argument is passed "as is".
  • "CatDimensions", [2 1]: the outputs from each slice are concatenated along the 2nd dimension for the first output and along the 1st dimension for the second output.
  • "Vectorized", true: use only if the function is vectorized along the "indexing" dimensions.
  • "ChunksPerProc", 2: each process makes 2 chunks (that is, 2 calls to f when "Vectorized" is true). Increase this number to reduce memory usage, for instance. A higher value is also useful when function executions can have very different durations: a process that has finished can then take over jobs from another process that is still busy.
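
The slicing and concatenation described by these options can be emulated serially in core Octave. This is only a sketch of the idea, not the package's implementation: the chunk count (4, as if 2 processes each took 2 chunks) and the way the chunk boundaries are chosen are assumptions made for illustration, and nothing runs in parallel.

```octave
1;  % script-file marker: lets a script file contain a function definition

function [x, y] = f(u, v)
  x = u + v.';
  y = x.';
endfunction

u = [1:10; 2:11];   % 2x10
v = u.';            % 10x2

nchunks = 4;                                  % assumed: 2 processes x 2 chunks
n = size(u, 2);                               % length of the indexed dimension
edges = round(linspace(0, n, nchunks + 1));   % assumed chunk boundaries

x = [];
y = [];
for k = 1:nchunks
  idx = (edges(k) + 1):edges(k + 1);
  % Index u along its 2nd dimension and v along its 1st ("IdxDimensions", [2 1]).
  [xk, yk] = f(u(:, idx), v(idx, :));
  % Concatenate the 1st output along dimension 2 and the 2nd output
  % along dimension 1 ("CatDimensions", [2 1]).
  x = cat(2, x, xk);
  y = cat(1, y, yk);
endfor

% The chunked result matches a single call on the full arrays.
[x_ref, y_ref] = f(u, v);
disp(isequal(x, x_ref) && isequal(y, y_ref));   % prints 1
```

Each chunk is one vectorized call to f on several indices at once, which is why the function must be vectorized along the indexed dimensions for this to give the same result as the unsliced call.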