This OEP refers to Octave's design of the pkg system. The purpose of this system is to handle the installation, loading, and removal of Octave packages.
The current implementation of pkg has problems, mainly when there are both local and global installations of packages, and when multiple Octave and package versions try to coexist. This document attempts to design a solution for these problems.
The main idea is to have multiple databases with information from installed packages in different locations in the filesystem. While this is similar to the current implementation, we plan to design solutions for when package installations clash.
This proposal also suggests keeping the source of the packages. This will allow easy reinstallation of packages (after an Octave upgrade) and testing of .oct files from packages (since their tests are in the .cc sources).
This design is meant to allow the following:
- keep multiple versions of the same package installed side-by-side
- keep multiple versions of Octave in a system using the same installed packages
- deal with dependencies correctly when multiple Octave and package versions coexist
- allow use of packages that may have been installed anywhere
- reinstall a package
- test installed packages
See the user cases section below for several examples.
Typical functions of a package manager, according to Wikipedia:
- Verifying file checksums to ensure correct and complete packages;
- Verifying digital signatures to authenticate the origin of packages;
- Applying file archivers to manage encapsulated files;
- Upgrading software with latest versions, typically from a software repository;
- Grouping of packages by function to reduce user confusion;
- Managing dependencies to ensure a package is installed with all packages it requires. This resolves the problem known as dependency hell.
Available vs Loaded
To avoid confusion when reading this document, the distinction between available and loaded packages should be made early.
An available package is a package that is currently available to pkg for loading, unloading or reinstalling. It is already installed but not necessarily loaded.
A loaded package is an installed package whose functions have been added to Octave's function search path.
Types of package installs
This design supports 3 types of package installations: global (relative to the Octave installation), local (user specific) and external (in any other place).
- global install
  - available from startup to everyone.
- local install
  - available from startup only for the user that installed it.
- external install
  - needs to be made available first. Octave install has no information about it.
Note that Octave itself can be installed in several different ways. It might be a system-wide installation (located somewhere in /usr/local/ for example), a local installation of a normal user (/home/user/anywhere), or installed in the home directory of a system user (anywhere really).
Packages installed globally will be available to everyone from startup. This is the type of package installation that a system administrator would most likely do. The meaning of global here is relative to the Octave installation though. If an Octave installation is local (installed by a user in ~/my-builds), a global installation of a package will still place its files in the home directory of the user (in ~/my-builds).
A global installation is performed automatically if the user installing the package has write permissions to the relevant directories (localfcnfiledir and localapioctfiledir). If the user lacks those permissions, a local package installation is performed instead.
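The permission check described above can be sketched as follows. This is an illustrative Python sketch of the decision only, not pkg code: the function name choose_install_type is hypothetical, and its two arguments stand in for the values of localfcnfiledir and localapioctfiledir.

```python
import os

def choose_install_type(localfcnfiledir, localapioctfiledir):
    """Return "global" if both directories exist and are writable, else "local".

    Hypothetical helper: the arguments stand in for the two Octave
    directories named in the text.
    """
    dirs = (localfcnfiledir, localapioctfiledir)
    if all(os.path.isdir(d) and os.access(d, os.W_OK) for d in dirs):
        return "global"
    # No write permission (or no such directory): fall back to a local install.
    return "local"
```

A user who built Octave under their own home directory would get "global" here, which matches the point that global is relative to the Octave installation.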
Local packages are specific to a user. They are located in an .octave directory in that user's home directory. As with global package installations, they are available from startup. Unlike global installations, they are user specific, available only to the user that installed them. A local install for one user can be an external install for another user.
This is the type of package installation done by users who want the latest package version before it is available in their system repository, but who are not going to build Octave themselves. It is also meant for those who run Octave on a system that they do not maintain, where Octave is installed but the packages are not.
External packages are like local packages but in a non-standard location. Octave does not know about these installations at startup, even though they might have been installed with the same Octave that is running at the moment. These can be packages installed in a filesystem that is not always mounted, local package installs from another user on the same system, or anything else really.
An external package is still installed with pkg, the difference being that Octave keeps no record of it afterwards. An external package install will have an associated db file just like the db files for the local installs. To load an external package, the path to the db file needs to be passed to pkg and the db given a name (because there may be more than one db).
These are most likely the least used type of package install and will require a bit more knowledge (users will need to point pkg to a .db file, that is all). They will mostly be used by places that develop their own packages and by people who don't want to install the package themselves, instead simply using someone else's local install as an external package.
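One way to picture the bookkeeping for the three install types is a per-session table of named databases. The Python sketch below is only an illustration under assumptions: the class name DatabaseRegistry, its methods, and the idea of keeping the table purely in memory are all hypothetical, chosen to reflect that external databases are not remembered by Octave between sessions.

```python
class DatabaseRegistry:
    """In-memory record of the package databases known to one session."""

    def __init__(self, global_db, local_db):
        # Global and local databases are known from startup.
        self._dbs = {"global": global_db, "local": local_db}

    def load_db(self, name, path):
        """Register an external database under a user-chosen name."""
        if name in self._dbs:
            raise ValueError(f"database name already in use: {name}")
        self._dbs[name] = path

    def names(self):
        """Database names in the order they became known."""
        return list(self._dbs)

    def path(self, name):
        return self._dbs[name]
```

Because nothing is written to disk, a new session starts again with only the global and local entries, and any external database has to be registered once more.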
Playing nice with downstream packagers
The recommended method for installing Octave and its packages is to use the OS packaging system. Downstream packagers should have their packaging systems make global installs of the packages. If a user wants to install a new version of a package that is not yet available in the system repository, they should make a local package install (the default, since a normal user won't have write permissions to the Octave directory).
If the user decides to make a global package install (installing the package using pkg while running Octave with sudo), then they are acting as system administrator and should know what they are doing. If they break it, it's their own fault. Installation of system-wide software is meant to be handled by the system packaging tool. It is just not possible to make pkg cover all of them.
For the parsing of the commands and files, some limitations on package names are required. This will limit what pkg commands can do. For example, if a package name is allowed to use a hyphen, then commands such as "pkg load image-2.0.0" can no longer be used to load a specific package version. Something such as "pkg load image::2.0.0" would have to be used. Using this alternative syntax means that package names cannot have colons.
This is not limited to package versions. As pkg is to be expanded to load pkg databases from other files (packages in a directory that is not always mounted, for example), it becomes possible to have more than one package with the same version available to "pkg load". This means that it becomes necessary to specify which package to load. Something like "pkg load image-lab-2.0.0" can be used. A nice alternative would be "pkg load image-2.0.0 from lab" but that would add one of the following two limitations: either no package can be named "from", or pkg load becomes limited to loading only one package at a time.
Also, supporting multiple package versions means that the word "all", used to refer to all packages, has new limitations. Should we load only the latest version of each package? And if there are multiple packages with the same version in various dbs, which one should be loaded? I'd propose the default to be:
- load the latest version available
- load the local install of the package
- load the global install of the package
- load the package from the external .db, starting from the latest added in case there's more than one.
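Read as a priority list, the proposed default amounts to a sort key. The Python sketch below shows one possible reading, not a settled design: the function name default_choice and the candidate fields ("version", "db", "added") are hypothetical, and versions are assumed to be x.y.z tuples of equal length.

```python
def default_choice(candidates):
    """Pick one candidate of a package following the proposed default order.

    Each candidate is a dict with "version" (tuple of ints), "db" ("local",
    "global" or "external") and, for external databases, "added" (a counter
    that grows each time a database is loaded). All field names are
    illustrative assumptions.
    """
    db_rank = {"local": 0, "global": 1, "external": 2}

    def sort_key(c):
        # Higher versions first, then local before global before external,
        # then the most recently added external database first.
        negated_version = tuple(-part for part in c["version"])
        return (negated_version, db_rank[c["db"]], -c.get("added", 0))

    return min(candidates, key=sort_key)
```

With this reading, a local 2.0.0 wins over a global 2.0.0, and both win over any 1.x install regardless of where it lives.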
For package names, the proposal is to apply the same rules as for variable names (which makes it even easier to check validity with isvarname). So a package name must start with a letter and otherwise consist only of alphanumeric characters and underscores. Unlike variable names, package names will not be case sensitive, since case sensitivity would create problems when installing packages in filesystems that are not case sensitive (creating directories named Image and image would not be possible in FAT systems).
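The naming rule can be written down as a short check. This is a Python sketch of the rule as stated in the text (letter first, then letters, digits and underscores, compared without regard to case); the function names are hypothetical, and restricting letters to ASCII is an assumption.

```python
import re

# Letter first, then letters, digits or underscores (ASCII assumed).
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_valid_package_name(name):
    """True if the name follows the proposed package naming rule."""
    return _NAME_RE.fullmatch(name) is not None

def canonical_package_name(name):
    """Lower-case form used to compare names case-insensitively."""
    if not is_valid_package_name(name):
        raise ValueError(f"invalid package name: {name!r}")
    return name.lower()
```

Under this rule "image-2.0.0" is not a valid name, which is what leaves the hyphen free to separate a name from a version.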
Actions dependent on a package version can be specified with a -version modifier for that action. It is however necessary to define the default order. Comparison operators should be used to specify versions. If no comparison is used then greater than or equal is assumed. For example:
- pkg load image
  - load the latest version of the image package. If the package is not installed, give an error
- pkg load -version 1.0.5 image
  - load the latest version greater than or equal to 1.0.5. If no such version is found, give an error
- pkg load -version >=1.0.5 image
  - same as not specifying a comparison
- pkg load -version >1.0.5 image
  - load anything above that version (does it make sense to support this? It's not a lot of trouble...)
- pkg load -version =1.0.5 image
  - load the image package only if it is that exact version (should we use == instead? Why not only =? We should not support both syntaxes)
- pkg load -version !1.0.5 image
  - load any available image package except 1.0.5 (because regressions do exist)
For the remaining two comparisons (< and <=), the question is the same as for > and >=: does it make sense to support both? For greater than, the only thing that makes sense is greater than or equal, and for less than, the only thing that makes sense is strictly less than, since people will mark versions as the first release that implemented, or the first release that no longer had, a specific feature.
Whatever code is used in this section should also be used for solving package dependencies.
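To make the -version semantics concrete, here is a small Python sketch of parsing a version specification and picking the latest match. It is not pkg code: all function names are hypothetical, every operator discussed above is accepted purely for illustration (which ones to support is still an open question), and only plain x.y.z versions with the same number of components are handled.

```python
import re

def parse_version(text):
    """Turn "1.0.5" into (1, 0, 5); only dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))

_OPERATORS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "=": lambda a, b: a == b,
    "!": lambda a, b: a != b,
}

def parse_spec(spec):
    """Split a spec such as ">1.0.5" into (operator, version tuple)."""
    m = re.fullmatch(r"(>=|<=|>|<|=|!)?(\d+(?:\.\d+)*)", spec)
    if m is None:
        raise ValueError(f"bad version specification: {spec!r}")
    # No operator means "greater than or equal", as described in the text.
    return m.group(1) or ">=", parse_version(m.group(2))

def satisfies(version, spec):
    op, wanted = parse_spec(spec)
    return _OPERATORS[op](parse_version(version), wanted)

def latest_matching(installed, spec):
    """Return the latest installed version that satisfies spec."""
    matches = [v for v in installed if satisfies(v, spec)]
    if not matches:
        raise LookupError(f"no installed version satisfies {spec!r}")
    return max(matches, key=parse_version)
```

The same satisfies check could serve dependency resolution, in line with the remark that one piece of code should handle both.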
Should versions take precedence over the database for loading order? For example, if there is a global installation of image 1.0.5 and a 2.0.0 version on an external database named labdev, what version should be loaded?
- pkg load image
  - load version 1.0.5 from global (database takes precedence over version)
- pkg load -version >1.0.0 image
  - load version 1.0.5 from global (database takes precedence over version)
- pkg load -version >=2.0.0 image
  - load version 2.0.0 from labdev (only version that meets the requirements)
- pkg load -version >1.0.0 -db labdev image
  - load version 2.0.0 from labdev (while database takes precedence, labdev was specified so we load the latest)
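If the answer is that the database takes precedence, as the examples above assume, the lookup could work roughly as in this Python sketch. The function name resolve, its parameters, and the representation of candidates as (database, version) pairs are all hypothetical; the version constraint is passed in as a plain predicate to keep the sketch short.

```python
def resolve(candidates, db_order, accept=lambda version: True, db=None):
    """Return the (db, version) pair to load, or None if nothing qualifies.

    candidates: list of (db name, version tuple) pairs for one package.
    db_order:   database names in order of precedence.
    accept:     predicate implementing the -version constraint.
    db:         if given, only that database is searched (the -db modifier).
    """
    search = [db] if db is not None else db_order
    for name in search:
        versions = [v for d, v in candidates if d == name and accept(v)]
        if versions:
            # Within the chosen database, take the latest acceptable version.
            return name, max(versions)
    return None

# The image example: 1.0.5 in global, 2.0.0 in the external database labdev.
image = [("global", (1, 0, 5)), ("labdev", (2, 0, 0))]
order = ["global", "labdev"]
print(resolve(image, order))                                 # global wins
print(resolve(image, order, accept=lambda v: v > (1, 0, 0)))  # still global
print(resolve(image, order, db="labdev"))                    # labdev forced
```

Letting the version take precedence instead would mean collecting acceptable versions from every database first and only then breaking ties by database.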
Should the -db modifier make pkg ignore the version completely? If a system has signal version 1.0.0 in an external database named labdev, and 1.2.0 in the global one, what should be loaded?
- pkg load signal
  - load version 1.2.0 from global
- pkg load -db labdev signal
  - load latest version from global or from labdev?
The current implementation only accepts versions in the format x.y.z. This does not allow for dev versions, betas or release candidates such as x.y.z-rc0, x.y.z+, etc.
We have compare_versions in core to check version numbers. Whatever is decided should be usable with compare_versions (or compare_versions should be made to support it).
User case #1: global, local and external
Jenny is using Octave on the department cluster. She is not the administrator but there's already a system-wide installation of Octave with the general and signal packages installed. She starts Octave and has these 2 packages available to her. These are globally installed packages, available to everyone that starts Octave.
But Jenny also requires the image package and she installs it with "pkg install -forge image". She does not have permissions to administer the system so the image package is installed locally in her home directory. When she starts Octave, she now has 3 packages available: the general and signal packages, which are global (available to everyone that starts Octave), and the image package, which is local (available only to her).
Jenny's supervisor is working on a new package (img_analysis) that he makes available to all his students, and he wants Jenny to use it. Rather than sending them the package, he wants them to use the one he has installed in his own home directory and tells them to load it as an external package. Jenny uses "pkg load-db boss /home/supervisor/.octave/octave_packages.db" to make her supervisor's packages available to her. She now has 4 available packages, the new one (img_analysis) being an external package. However, relative to her supervisor, the same package is a local installation.
The next time she starts Octave, there is no trace of the external packages. pkg still only has 3 available packages, so she adds the "pkg load-db" command to her .octaverc file.
In this case however, her supervisor would do better to install his img_analysis package in some other place to avoid clashes with his own local packages. For example, he could have installed it at /home/supervisor/group/octave. Or he could have a filesystem on the network that his students could mount whenever they needed it.
User case #2: keeping tarball
Denise installs Octave 3.4.3 and then the latest versions of the financial (1.0.4) and image (2.0.0) packages with "pkg install -forge financial image". After installing the packages, pkg keeps the tarballs in a cache on the system for future use. The financial package consists only of .m function files while the image package is a mixture of .m and .oct files. After installation, she runs `pkg test financial test` which runs all tests in the package (using the cached package to run the tests in the .cc files).
different package versions
Later, Denise installs Octave 3.6.2 but keeps the previous version of Octave on the system since some of her old code no longer runs correctly. Loading the financial package is no problem but loading the image package returns the error
pkg: image package not built for current version of Octave. Run `pkg reinstall image`
Denise runs `pkg reinstall image` which reinstalls the package (effectively keeping the .m files, but simply rebuilding the .oct files for the new version). Depending on the Octave version she runs, different paths will be loaded even though the package is the same.
A new version of the financial package (1.2.0) is released which depends on Octave 3.6.0. While using Octave 3.6.2, Denise installs the new version of the package with "pkg install -forge financial". The files for the previous version of the package are kept, although "pkg load financial" will only load the latest version. However, when Denise is using Octave 3.4.3, as financial 1.2.0 requires Octave 3.6.0, pkg load will only load financial 1.0.4.
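The behaviour in this user case, where the same "pkg load financial" picks different versions under different Octave versions, can be sketched in Python as below. The function name loadable_version and the data layout are hypothetical, and the sketch assumes that financial 1.0.4 declares no minimum Octave version, which the text does not state.

```python
def loadable_version(installed, octave_version):
    """Return the latest installed version usable with the running Octave.

    installed maps a package version tuple to the minimum Octave version
    it requires (a tuple), or None when it declares no requirement.
    Returns None if no installed version is usable.
    """
    compatible = [
        version
        for version, min_octave in installed.items()
        if min_octave is None or octave_version >= min_octave
    ]
    return max(compatible) if compatible else None

# Denise's financial installs; the None for 1.0.4 is an assumption.
financial = {(1, 0, 4): None, (1, 2, 0): (3, 6, 0)}
print(loadable_version(financial, (3, 6, 2)))  # picks 1.2.0
print(loadable_version(financial, (3, 4, 3)))  # falls back to 1.0.4
```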
User case #3: installing and loading different package versions
Owen is stuck using the financial package 1.0.4 because some of his code no longer works in the latest versions. However the latest version of financial is 1.2.0 and pkg install -forge would install that version instead. He installs the old version of the package with "pkg install -forge financial-1.0.4".
But Owen wants to fix his code for the new version, so he also installs the new version of the package to experiment. In his code, he then uses "pkg load financial-1.0.4", while "pkg load financial" always loads the latest version of the package.
User case #4: Local installation of packages and Octave
Lisa is using Octave on a remote machine in the biochemistry department. The system administrator installed Octave 3.6.2, signal package 1.2.0, and general 1.0.0. Lisa uses all of them but she also requires the image package. However, the system administrator does not have time to assess security issues with the package and tells her to install that package locally. She runs "pkg install -forge image" which installs the package in her home directory. When she runs "pkg list" she sees both the global packages and her own packages.
When Octave 3.6.3 is released, Lisa wants to use the new version since it fixes one bug that has been annoying her for a long time, but the system administrator does not want to make the update and tells her to build it herself locally.
User case #5: users (no sudo) sharing Octave installation with local & global packages
Diana is a student who wants to run her code on the departmental cluster. However, the system does not have an installation of Octave and she needs to install it in her home directory. When she installs packages, these installations are global (to her home directory) since she has write permissions on the directory where Octave is installed. She installs the signal and image packages.
Lígia is a colleague of Diana who wants to use the same cluster but wants to save herself the trouble of building Octave. So she uses Diana's install of Octave. Since all packages were installed globally, Lígia has no trouble using the same packages. However, Lígia also needs to use the struct package and installs it with "pkg install -forge struct". Since she does not have permissions to write to Diana's home directory, her install of the struct package is local. When Diana runs Octave she does not see the struct package installed; it only shows up for Lígia.
Diana wants to use the same version of the struct package that Lígia already installed, but that package was installed locally in Lígia's home directory. She uses "pkg load-list /home/ligia/.octave/packages.db" to add the list of Lígia's packages to her own list of available packages, which she can then load.
User case #6: Automatic dependency tracking
John is a professor of biomechanics and uses Octave in his classes. Most of the exercises he gives to the class require the use of multiple packages from Octave Forge. Depending on the class, the required packages are different. He creates a metapackage for his students listing all required packages. The students install it with "pkg install -url path-to-his-metapackage". The metapackage has no files; it simply lists a number of packages as dependencies. Since pkg solves these dependencies automatically, a message showing which packages will be installed is displayed before doing it.
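Solving the dependencies of such a metapackage amounts to ordering packages so that each one is installed after the packages it requires. The Python sketch below uses a depth-first traversal for this; it is only an illustration, the function name install_order is hypothetical, and the package names and dependency lists in the example are made up.

```python
def install_order(package, dependencies):
    """Return packages in an order where dependencies precede dependents.

    dependencies maps a package name to the list of names it depends on.
    Raises ValueError on a circular dependency.
    """
    order = []
    state = {}  # name -> "visiting" or "done"

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"circular dependency involving {name}")
        state[name] = "visiting"
        for dep in dependencies.get(name, []):
            visit(dep)
        state[name] = "done"
        order.append(name)

    visit(package)
    return order

# Hypothetical metapackage and dependency lists, for illustration only.
deps = {
    "biomech_class": ["signal", "image"],
    "signal": ["general"],
    "image": ["general"],
}
print(install_order("biomech_class", deps))
```

The resulting list is also what pkg could show to the user as the set of packages about to be installed.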
User case #7: Package testing
A "pkg test" command would run all tests for a given package.
Where to install things
These should not be hardcoded but taken from octave_config_info. There are many paths there whose purpose is explained in the Octave sources, in build-aux/common.mk (see the Where To Install Things and Octave-specific directories sections of that file).