The Parallel execution package provides utilities to work with clusters, but also functions to parallelize work among cores of a single machine.
multicore parallelization (parcellfun, pararrayfun)
calculation on a single array
# fun is the function to apply
fun = @(x) x^2;
vector_x = 1:10;
vector_y = pararrayfun(nproc, fun, vector_x)
parcellfun: 10/10 jobs done
vector_y =
   1   4   9   16   25   36   49   64   81   100
nproc returns the number of CPUs available (the number of cores, or twice that with hyperthreading). One can use nproc - 1 instead, for instance to leave one CPU free.
fun can be replaced by @myfun if the function resides in the
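As a minimal sketch of the nproc - 1 suggestion above, the worker count can be computed once and passed as the first argument. The variable name nworkers and the max() guard (so that a single-core machine still gets one worker) are additions for illustration, not part of the package; the example also assumes the parallel package is installed.

```octave
# Assumes the parallel package is installed.
pkg load parallel

# fun is the function to apply
fun = @(x) x^2;
vector_x = 1:10;

# Leave one CPU free, but never ask for fewer than one worker.
# nworkers and the max() guard are illustrative additions.
nworkers = max(nproc - 1, 1);
vector_y = pararrayfun(nworkers, fun, vector_x);
```

The result should match the earlier example; only the number of processes used changes.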
In the previous example, the function was executed once for each element of the input. If the function is vectorized (it can act on a vector and not just on a scalar input), it can be much more efficient to use the "Vectorized", true option.
# fun is the function to apply, vectorized (see the dot)
fun = @(x) x.^2;
vector_x = 1:10;
vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 1)
parcellfun: 4/4 jobs done
vector_y =
   1   4   9   16   25   36   49   64   81   100
The "ChunksPerProc" option is mandatory with "Vectorized", true. A value of 1 means that each process does its job in one shot (chunk). This number can be increased, for instance to use less memory. A higher "ChunksPerProc" also gives more flexibility for long calculations on a busy machine: if one CPU has finished all its jobs, it can take over the pending jobs of another.
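The effect of a larger "ChunksPerProc" can be sketched as follows. The input size of 1000 and the value 4 are arbitrary choices for illustration, and the example assumes the parallel package is installed. Each process then receives its share of the input in several smaller chunks instead of one large one.

```octave
# Assumes the parallel package is installed.
pkg load parallel

# fun is the function to apply, vectorized (see the dot)
fun = @(x) x.^2;
vector_x = 1:1000;   # arbitrary input size for illustration

# Split the work into 4 chunks per process instead of 1.
# Smaller chunks need less memory at a time and let an idle
# process pick up chunks that are still pending.
vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 4);
```

The values in vector_y do not depend on the number of chunks; only the way the work is divided among processes changes.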