Parallel package

Revision as of 11:09, 22 August 2014

The Parallel execution package provides utilities for working with clusters, as well as functions for parallelizing work across the cores of a single machine.

To install, run: pkg install -forge parallel

Then, once in each Octave session, run: pkg load parallel

multicore parallelization (parcellfun, pararrayfun)

calculation on a single array

Code: simple
# fun is the function to apply 
fun = @(x) x^2;

vector_x = 1:10;

vector_y = pararrayfun(nproc, fun, vector_x)

should output

parcellfun: 10/10 jobs done

vector_y =

     1     4     9    16    25    36    49    64    81   100

nproc returns the number of CPUs available (the number of cores, or twice that number with hyperthreading). One can pass nproc - 1 instead, for instance to leave one CPU free.
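
As a minimal sketch of the nproc - 1 idea, the snippet below computes the worker count separately before calling pararrayfun. The variable name nworkers and the max(1, ...) guard are assumptions added for illustration (so that a single-CPU machine still gets one worker); they are not part of the package.

```octave
pkg load parallel

# Leave one CPU free, but never request fewer than one worker.
# "nworkers" is a hypothetical name used only in this sketch.
nworkers = max(1, nproc - 1);

# Same scalar function as in the simple example above.
fun = @(x) x^2;

vector_x = 1:10;

vector_y = pararrayfun(nworkers, fun, vector_x)
```

The result should be the same as in the simple example; only the number of processes used changes.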

fun can be replaced by @myfun if the function resides in the myfun.m file.
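
A short sketch of the named-function variant follows. Here myfun is a hypothetical function that squares its input; it is defined inside the script only to keep the example self-contained, whereas in practice it would live in its own myfun.m file on the load path.

```octave
pkg load parallel

# Hypothetical function for illustration; normally stored in myfun.m.
function y = myfun (x)
  y = x^2;
endfunction

vector_x = 1:10;

# Pass a handle to the named function instead of an anonymous function.
vector_y = pararrayfun(nproc, @myfun, vector_x)
```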

In the previous example, the function was executed once for each element of the input vector_x. If the function is vectorized (that is, it can act on a whole vector and not just on a scalar input), then it can be much more efficient to use the "Vectorized", true option.

Code: vectorized
# fun is the function to apply, vectorized (see the dot)
fun = @(x) x.^2;

vector_x = 1:10;

vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 1)

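should output (the job count depends on the machine; the 4 jobs shown here would correspond to nproc returning 4)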

parcellfun: 4/4 jobs done
vector_y =

     1     4     9    16    25    36    49    64    81   100

The "ChunksPerProc" option is mandatory with "Vectorized", true. A value of 1 means that each process does its share of the work in one shot (a single chunk). This number can be increased, for instance to use less memory. A higher "ChunksPerProc" value also allows more flexibility for long calculations on a busy machine: if one CPU has finished all its jobs, it can take over the pending jobs of another.
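
To illustrate a higher "ChunksPerProc" value, here is a hedged variation of the vectorized example. The input size (1000 elements) and the value 4 are arbitrary choices made for this sketch, not recommendations from the package.

```octave
pkg load parallel

# Vectorized function (note the dot), as in the example above.
fun = @(x) x.^2;

# A longer input, so that splitting it into several chunks is meaningful.
vector_x = 1:1000;

# Ask for 4 chunks per process instead of 1: each job handles a smaller
# piece of the input, and a process that finishes early can pick up
# chunks that are still pending.
vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 4);
```

The result is the same as with "ChunksPerProc", 1; only the way the work is divided among the processes differs.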