# Difference between revisions of "Parallel package"

m (Remove redundant Category:Packages. Categories at bottom.) |
(Overhaul page.) |
||


## Revision as of 23:04, 3 March 2021

The parallel package is part of the Octave Forge project. See its homepage for the latest release.

This package provides utilities to work with clusters [1], but also functions to parallelize work among the cores of a single machine.

- Install: `pkg install -forge parallel`
- Load: `pkg load parallel`

## Multicore parallelization (parcellfun, pararrayfun)

### Calculation on a single array

```
# fun is the function to apply
fun = @(x) x^2;
vector_x = 1:10;
vector_y = pararrayfun(nproc, fun, vector_x)
```

should output

```
parcellfun: 10/10 jobs done
vector_y =
1 4 9 16 25 36 49 64 81 100
```

`nproc` returns the number of CPUs available (the number of cores, or twice as many with hyperthreading). One can use `nproc - 1` instead, for instance to leave one CPU free.

`fun` can be replaced by `@myfun` if the function resides in the file `myfun.m`.
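As a minimal sketch of the two remarks above, the following uses a named function through a handle and leaves one CPU free. The function `myfun` and the variable `nworkers` are hypothetical names chosen for illustration; here `myfun` is defined inline so the snippet is self-contained, whereas it would normally live in its own `myfun.m` file.

```octave
pkg load parallel

# Hypothetical helper; in practice this would be the content of myfun.m
function y = myfun (x)
  y = x^2;
endfunction

# Leave one CPU free, but never request fewer than one process
nworkers = max (nproc - 1, 1);

vector_x = 1:10;
vector_y = pararrayfun (nworkers, @myfun, vector_x);
```

The `max (..., 1)` guard is only there so that the snippet also works on a single-core machine, where `nproc - 1` would be zero.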

In the previous example, the function was executed once for each element of the input `vector_x`. If the function is vectorized (it can act on a whole vector and not just on scalar input), it can be much more efficient to use the `"Vectorized", true` option.

```
# fun is the function to apply, vectorized (see the dot)
fun = @(x) x.^2;
vector_x = 1:10;
vector_y = pararrayfun(nproc, fun, vector_x, "Vectorized", true, "ChunksPerProc", 1)
```

should output

```
parcellfun: 4/4 jobs done
vector_y =
1 4 9 16 25 36 49 64 81 100
```

The `"ChunksPerProc"` option is mandatory with `"Vectorized", true`. A value of `1` means that each process does its job in one shot (chunk). This number can be increased, for instance to use less memory. A higher `"ChunksPerProc"` value also allows more flexibility for long calculations on a busy machine: if one CPU has finished all its jobs, it can take over the pending jobs of another.
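The effect of a larger `"ChunksPerProc"` can be sketched as follows. The input size and the value `4` are arbitrary choices for illustration, not recommendations from the package documentation; the result should be the same as with a single chunk per process, only the way the work is split changes.

```octave
pkg load parallel

# Vectorized function: operates element-wise on a whole chunk at once
fun = @(x) x.^2;
vector_x = 1:1000;

# Several smaller chunks per process instead of a single large one
vector_y = pararrayfun (nproc, fun, vector_x, ...
                        "Vectorized", true, "ChunksPerProc", 4);
```

With more chunks than processes, a process that finishes early can pick up a remaining chunk, at the cost of some extra scheduling overhead.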

### Output in cell arrays

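The following sample code was an answer to this Stack Overflow question: https://stackoverflow.com/questions/27422219/for-every-row-reshape-and-calculate-eigenvectors-in-a-vectorized-way. The goal was to diagonalize 2x2 matrices stored as rows of a 2-D array (each row of the array being a flattened 2x2 matrix).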

```
A = [0.6060168 0.8340029 0.0064574 0.7133187;
     0.6325375 0.0919912 0.5692567 0.7432627;
     0.8292699 0.5136958 0.4171895 0.2530783;
     0.7966113 0.1975865 0.6687064 0.3226548;
     0.0163615 0.2123476 0.9868179 0.1478827];
N = 2;
[eigenvectors, eigenvalues] = pararrayfun(nproc,
    @(row_idx) eig(reshape(A(row_idx, :), N, N)),
    1:rows(A), "UniformOutput", false)
```

With `"UniformOutput", false`, the outputs are contained in cell arrays (one cell per slice). In the sample above, both `eigenvectors` and `eigenvalues` are `1x5` cell arrays.
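To illustrate how such cell-array outputs might be consumed, here is a small self-contained sketch. The matrix `A` below holds made-up values (two flattened diagonal 2x2 matrices) so that the eigenvalues are easy to check by hand; the names `first_vals`, `all_vals`, and `nworkers` are introduced only for this example.

```octave
pkg load parallel

# Two flattened 2x2 matrices (illustrative values): diag([2 3]) and eye(2)
A = [2 0 0 3;
     1 0 0 1];
N = 2;

# Do not request more processes than there are rows to work on
nworkers = min (nproc, rows (A));

[eigenvectors, eigenvalues] = pararrayfun (nworkers,
    @(row_idx) eig (reshape (A(row_idx, :), N, N)),
    1:rows (A), "UniformOutput", false);

# Each cell holds one N x N matrix; eig returns the eigenvalues on the
# diagonal of its second output, so diag() extracts them as a column
first_vals = diag (eigenvalues{1});

# Collect the eigenvalues of all matrices, one column per input row
all_vals = cell2mat (cellfun (@diag, eigenvalues, "UniformOutput", false));
```

Indexing with braces (`eigenvalues{k}`) returns the matrix stored in cell `k`, while parentheses would return a 1x1 cell.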

## References

1. Package documentation: https://octave.sourceforge.io/parallel/package_doc/

## See also

- [[File:]] - examples of how to use `pararrayfun`

- NDpar package - an extension of these functions to N-dimensional arrays