Line 1: |
Line 1: |
− | The Parallel execution package provides utilities to work with clusters, but also functions to parallelize work among cores of a single machine. | + | The {{Forge|parallel|parallel package}} is part of the Octave Forge project. See its {{Forge|parallel|homepage}} for the latest release. |
| | | |
− | To install: {{Codeline|pkg install -forge parallel}}
| + | This package provides utilities to work with clusters<ref>[https://octave.sourceforge.io/parallel/package_doc/ Package documentation]</ref>, but also functions to parallelize work among cores of a single machine. |
| | | |
− | And then, once on each octave session, {{Codeline|pkg load parallel}}
| + | * Install: {{Codeline|pkg install -forge parallel}} |
| + | * Load: {{Codeline|pkg load parallel}} |
| | | |
− | == multicore parallelization (parcellfun, pararrayfun) == | + | == Multicore parallelization (parcellfun, pararrayfun) == |
| | | |
| + | === Calculation on a single array === |
| | | |
− | See also the [[NDpar package]], for an extension of these functions to N-dimensional arrays
| + | <syntaxhighlight lang="octave"> |
− | | |
− | === calculation on a single array ===
| |
− | | |
− | {{Code|simple|<pre>
| |
| # fun is the function to apply | | # fun is the function to apply |
| fun = @(x) x^2; | | fun = @(x) x^2; |
Line 19: |
Line 17: |
| | | |
| vector_y = pararrayfun(nproc, fun, vector_x) | | vector_y = pararrayfun(nproc, fun, vector_x) |
− | </pre> | + | </syntaxhighlight> |
− | }}
| |
| | | |
| should output | | should output |
| | | |
− | <code><pre> | + | <syntaxhighlight lang="plain"> |
| parcellfun: 10/10 jobs done | | parcellfun: 10/10 jobs done |
| | | |
Line 30: |
Line 27: |
| | | |
| 1 4 9 16 25 36 49 64 81 100 | | 1 4 9 16 25 36 49 64 81 100 |
− | </pre></code> | + | </syntaxhighlight> |
| | | |
| {{Codeline|nproc}} returns the number of CPUs available (the number of cores, or twice that with hyperthreading). To leave one CPU free, for instance, use {{Codeline|nproc - 1}} instead. | | {{Codeline|nproc}} returns the number of CPUs available (the number of cores, or twice that with hyperthreading). To leave one CPU free, for instance, use {{Codeline|nproc - 1}} instead. |
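As a minimal sketch of the {{Codeline|nproc - 1}} suggestion: the snippet below caps the worker count so that one CPU stays free, while still requesting at least one process on a single-core machine. The variable name {{Codeline|num_workers}} and the {{Codeline|max}} guard are illustrative assumptions, not part of the package.

```octave
pkg load parallel

# Leave one CPU free, but never request fewer than one process
# (num_workers is a hypothetical name used for illustration).
num_workers = max(1, nproc - 1);

# Same scalar function and input as in the single-array example.
fun = @(x) x^2;
vector_x = 1:10;

vector_y = pararrayfun(num_workers, fun, vector_x);
```

The guard only matters when {{Codeline|nproc}} returns 1; on multicore machines it reduces to {{Codeline|nproc - 1}}.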
Line 39: |
Line 36: |
| If the function is vectorized (can act on a vector and not just on scalar input), then it can be much more efficient to use the {{Codeline|"Vectorized", true}} option. | | If the function is vectorized (can act on a vector and not just on scalar input), then it can be much more efficient to use the {{Codeline|"Vectorized", true}} option. |
| | | |
− | {{Code|vectorized|<pre>
| + | <syntaxhighlight lang="octave"> |
| # fun is the function to apply, vectorized (see the dot) | | # fun is the function to apply, vectorized (see the dot) |
| fun = @(x) x.^2; | | fun = @(x) x.^2; |
Line 46: |
Line 43: |
| | | |
| vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 1) | | vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 1) |
− | </pre> | + | </syntaxhighlight> |
− | }}
| |
| should output | | should output |
| | | |
− | <code><pre> | + | <syntaxhighlight lang="plain"> |
| parcellfun: 4/4 jobs done | | parcellfun: 4/4 jobs done |
| vector_y = | | vector_y = |
| | | |
| 1 4 9 16 25 36 49 64 81 100 | | 1 4 9 16 25 36 49 64 81 100 |
− | </pre></code> | + | </syntaxhighlight> |
| | | |
| The {{Codeline|"ChunksPerProc"}} option is mandatory with {{Codeline|"Vectorized", true}}. A value of {{Codeline|1}} means that each process does its job in one shot (chunk). Increasing this number can reduce memory use, for instance. A higher {{Codeline|"ChunksPerProc"}} value also gives more flexibility for long calculations on a busy machine: if one CPU has finished all its jobs, it can take over the pending jobs of another. | | The {{Codeline|"ChunksPerProc"}} option is mandatory with {{Codeline|"Vectorized", true}}. A value of {{Codeline|1}} means that each process does its job in one shot (chunk). Increasing this number can reduce memory use, for instance. A higher {{Codeline|"ChunksPerProc"}} value also gives more flexibility for long calculations on a busy machine: if one CPU has finished all its jobs, it can take over the pending jobs of another. |
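A sketch of the same vectorized call with a larger {{Codeline|"ChunksPerProc"}} value may help here. The input length of 1000 and the value {{Codeline|4}} are arbitrary choices for illustration; the best value depends on the workload and the machine.

```octave
pkg load parallel

# Vectorized function: accepts a whole chunk of the input at once.
fun = @(x) x.^2;
vector_x = 1:1000;

# Each process receives its share in 4 smaller chunks instead of 1,
# so a process that finishes early can pick up pending chunks.
vector_y = pararrayfun(nproc, fun, vector_x, ...
                       "Vectorized", true, "ChunksPerProc", 4);
```

The result should be the same as with {{Codeline|"ChunksPerProc", 1}}; only the way the work is split among processes changes.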
Line 61: |
Line 57: |
| === Output in cell arrays === | | === Output in cell arrays === |
| | | |
− | The following sample code was an answer to [http://stackoverflow.com/questions/27422219/for-every-row-reshape-and-calculate-eigenvectors-in-a-vectorized-way this question]. The goal was to diagonalize 2x2 matrices contained as rows of a 2d array (each row of the array being a flattened 2x2 matrix). | + | The following sample code was an answer to [https://stackoverflow.com/questions/27422219/for-every-row-reshape-and-calculate-eigenvectors-in-a-vectorized-way this question]. The goal was to diagonalize 2x2 matrices contained as rows of a 2d array (each row of the array being a flattened 2x2 matrix). |
| | | |
− | {{code|diagonalize NxN matrices contained in an array|
| + | <syntaxhighlight lang="octave"> |
− | <pre> | |
| A = [0.6060168 0.8340029 0.0064574 0.7133187; | | A = [0.6060168 0.8340029 0.0064574 0.7133187; |
− | 0.6325375 0.0919912 0.5692567 0.7432627; | + | 0.6325375 0.0919912 0.5692567 0.7432627; |
− | 0.8292699 0.5136958 0.4171895 0.2530783; | + | 0.8292699 0.5136958 0.4171895 0.2530783; |
− | 0.7966113 0.1975865 0.6687064 0.3226548; | + | 0.7966113 0.1975865 0.6687064 0.3226548; |
− | 0.0163615 0.2123476 0.9868179 0.1478827]; | + | 0.0163615 0.2123476 0.9868179 0.1478827]; |
| | | |
| N = 2; | | N = 2; |
Line 75: |
Line 70: |
| @(row_idx) eig(reshape(A(row_idx, :), N, N)), | | @(row_idx) eig(reshape(A(row_idx, :), N, N)), |
| 1:rows(A), "UniformOutput", false) | | 1:rows(A), "UniformOutput", false) |
− | </pre> | + | </syntaxhighlight> |
− | }}
| |
| | | |
| With {{codeline|"UniformOutput", false}}, the outputs are contained in cell arrays (one cell per slice). In the sample above, both {{codeline|eigenvectors}} and {{codeline|eigenvalues}} are {{codeline|1x5}} cell arrays. | | With {{codeline|"UniformOutput", false}}, the outputs are contained in cell arrays (one cell per slice). In the sample above, both {{codeline|eigenvectors}} and {{codeline|eigenvalues}} are {{codeline|1x5}} cell arrays. |
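To illustrate how such cell-array outputs can be read back, here is a small self-contained sketch. It uses a hypothetical two-row input instead of the matrix above, and the names {{codeline|k}}, {{codeline|M}}, {{codeline|V}}, {{codeline|D}} and {{codeline|residual}} are introduced only for this example.

```octave
pkg load parallel

# Hypothetical input: each row is a flattened 2x2 matrix.
A = [2 0 0 3;
     1 2 2 1];
N = 2;

[eigenvectors, eigenvalues] = pararrayfun(nproc,
  @(row_idx) eig(reshape(A(row_idx, :), N, N)),
  1:rows(A), "UniformOutput", false);

# Results for row k are stored in the k-th cell of each output.
k = 2;
M = reshape(A(k, :), N, N);
V = eigenvectors{k};   # columns are the eigenvectors of M
D = eigenvalues{k};    # diagonal matrix of the eigenvalues of M

# For a correct decomposition, M*V equals V*D up to rounding error.
residual = norm(M * V - V * D);
```

Indexing with curly braces returns the content of a cell, so {{codeline|eigenvalues{k}}} is an ordinary {{codeline|NxN}} matrix.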
| | | |
− | == cluster operation == | + | == References == |
| + | |
| + | <references /> |
| + | |
| + | == See also == |
| | | |
− | Documentation can be found in the {{codeline|README.parallel}} or {{codeline|README.bw}} files, located inside the {{codeline|doc}} directory of the parallel package.
| + | * [[File:]] - examples of how to use <code>pararrayfun</code> |
| + | * [[NDpar package]] - an extension of these functions to N-dimensional arrays |
| | | |
| [[Category:Octave Forge]] | | [[Category:Octave Forge]] |