Parallel package

From Octave
Revision as of 03:18, 10 June 2019 by Siko1056 (talk | contribs) (Remove redundant Category:Packages. Categories at bottom.)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

The Parallel execution package provides utilities to work with clusters, but also functions to parallelize work among cores of a single machine.

To install: pkg install -forge parallel

And then, once on each octave session, pkg load parallel

multicore parallelization (parcellfun, pararrayfun)[edit]

See also the NDpar package, for an extension of these functions to N-dimensional arrays

calculation on a single array[edit]

Code: simple
# fun is the function to apply 
fun = @(x) x^2;

vector_x = 1:10;

vector_y = pararrayfun(nproc, fun, vector_x)

should output

parcellfun: 10/10 jobs done

vector_y =

     1     4     9    16    25    36    49    64    81   100

nproc returns the number of cpus available (number of cores or twice as much with hyperthreading). One can use nproc - 1 instead, in order to leave one cpu free for instance.

fun can be replaced by @myfun if the function resides in the myfun.m file.

In the previous example, the function was executed once for each element of the input vector_x. If the function is vectorized (can act on a vector and not just on scalar input), then it can be much more efficient to use the "Vectorized", true option.

Code: vectorized
# fun is the function to apply, vectorized (see the dot)
fun = @(x) x.^2;

vector_x = 1:10;

vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 1)

should output

parcellfun: 4/4 jobs done
vector_y =

     1     4     9    16    25    36    49    64    81   100

The "ChunksPerProc" option is mandatory with "Vectorized", true. 1 means that each proc will do its job in one shot (chunk). This number can be increased to use less memory for instance. A higher number of "ChunksPerProc" allows also more flexibility in case of long calculations on a busy machine. If one cpu has finished all its jobs, it can take over the pending jobs of another.

Output in cell arrays[edit]

The following sample code was an answer to this question. The goal was to diagonalize 2x2 matrices contained as rows of a 2d array (each row of the array being a flattened 2x2 matrix).

Code: diagonalize NxN matrices contained in an array
A = [0.6060168 0.8340029 0.0064574 0.7133187;
0.6325375 0.0919912 0.5692567 0.7432627;
0.8292699 0.5136958 0.4171895 0.2530783;
0.7966113 0.1975865 0.6687064 0.3226548;
0.0163615 0.2123476 0.9868179 0.1478827];

N = 2;
[eigenvectors, eigenvalues] = pararrayfun(nproc, 
                                @(row_idx) eig(reshape(A(row_idx, :), N, N)), 
                                1:rows(A), "UniformOutput", false)

With "UniformOutput", false, the outputs are contained in cell arrays (one cell per slice). In the sample above, both eigenvectors and eigenvalues are 1x5 cell arrays.

cluster operation[edit]

Documentation can be found in the README.parallel or files, located inside the doc directory of the parallel package.