This package provides utilities to work with clusters, but also functions to parallelize work among cores of a single machine.
pkg install -forge parallel
pkg load parallel
Multicore parallelization (parcellfun, pararrayfun)
Calculation on a single array
# fun is the function to apply
fun = @(x) x^2;
vector_x = 1:10;
vector_y = pararrayfun(nproc, fun, vector_x)
parcellfun: 10/10 jobs done
vector_y =
     1     4     9    16    25    36    49    64    81   100
nproc returns the number of CPUs available (the number of cores, or twice as many with hyperthreading). One can use nproc - 1 instead, for instance to leave one CPU free.
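As a minimal sketch of the nproc - 1 idea, the snippet below reserves one CPU and guards against a single-core machine, where nproc - 1 would be zero. The variable name n_workers and the max() guard are illustrative choices, not something the package requires.

```octave
pkg load parallel

# Leave one CPU free, but never request fewer than one process
# (n_workers is a hypothetical name used only for this sketch).
n_workers = max(1, nproc - 1);

fun = @(x) x^2;
vector_x = 1:10;
vector_y = pararrayfun(n_workers, fun, vector_x);
```

The result should be the same as with nproc; only the number of worker processes changes.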
fun can be replaced by
@myfun if the function resides in the
In the previous example, the function was executed once for each element of the input vector. If the function is vectorized (it can act on a whole vector and not just on scalar input), it can be much more efficient to use the "Vectorized", true option.
# fun is the function to apply, vectorized (see the dot)
fun = @(x) x.^2;
vector_x = 1:10;
vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 1)
parcellfun: 4/4 jobs done
vector_y =
     1     4     9    16    25    36    49    64    81   100
The "ChunksPerProc" option is mandatory with "Vectorized", true. A value of 1 means that each process does its job in one shot (one chunk). This number can be increased, for instance to use less memory. A higher "ChunksPerProc" value also allows more flexibility for long calculations on a busy machine: if one CPU has finished all its jobs, it can take over the pending jobs of another.
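A short sketch of a larger "ChunksPerProc" value follows. The input size of 1000 and the value 4 are arbitrary choices for illustration; the expectation that the input is split into roughly nproc * 4 jobs is inferred from the "4/4 jobs done" message above and may differ in detail.

```octave
pkg load parallel

# Vectorized function: it receives a whole chunk of the input at once.
fun = @(x) x.^2;
vector_x = 1:1000;

# Assumption for illustration: 4 chunks per process, so each job is
# smaller and an idle process can pick up chunks that are still pending.
vector_y = pararrayfun(nproc, fun, vector_x, ...
                       "Vectorized", true, "ChunksPerProc", 4);
```

The result is the same as with "ChunksPerProc", 1; only the granularity of the work changes.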
Output in cell arrays
The following sample code was an answer to this question. The goal was to diagonalize 2x2 matrices contained as rows of a 2d array (each row of the array being a flattened 2x2 matrix).
A = [0.6060168 0.8340029 0.0064574 0.7133187;
     0.6325375 0.0919912 0.5692567 0.7432627;
     0.8292699 0.5136958 0.4171895 0.2530783;
     0.7966113 0.1975865 0.6687064 0.3226548;
     0.0163615 0.2123476 0.9868179 0.1478827];
N = 2;
[eigenvectors, eigenvalues] = pararrayfun(nproc, ...
    @(row_idx) eig(reshape(A(row_idx, :), N, N)), ...
    1:rows(A), ...
    "UniformOutput", false)
With "UniformOutput", false, the outputs are contained in cell arrays (one cell per slice). In the sample above, both eigenvectors and eigenvalues are 1x5 cell arrays.
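To show how such cell outputs might be used afterwards, here is a small sketch on a hypothetical two-row input (the values of A_demo are made up for this illustration, chosen so the matrices are symmetric). It extracts one result with brace indexing, checks the decomposition, and gathers the eigenvalues of all matrices into a plain array.

```octave
pkg load parallel

# Hypothetical input: each row is a flattened, symmetric 2x2 matrix.
A_demo = [2 0 0 3;
          4 1 1 4];
N = 2;

[eigenvectors, eigenvalues] = pararrayfun(nproc, ...
    @(row_idx) eig(reshape(A_demo(row_idx, :), N, N)), ...
    1:rows(A_demo), "UniformOutput", false);

# Brace indexing returns the content of one cell, here a 2x2 matrix.
V1 = eigenvectors{1};
D1 = eigenvalues{1};

# Check the decomposition M1 * V1 = V1 * D1 for the first matrix.
M1 = reshape(A_demo(1, :), N, N);
residual = norm(M1 * V1 - V1 * D1);

# Gather the eigenvalues of every matrix: one column per row of A_demo.
all_eigs = cell2mat(cellfun(@diag, eigenvalues, "UniformOutput", false));
```

Here residual should be close to zero, and each column of all_eigs holds the eigenvalues of the matrix stored in the corresponding row of A_demo.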