Summer of Code - Getting Started: Difference between revisions

(→‎YAML encoding/decoding: reduce project size)
(→‎Symbolic package: tweaks and formatting)
(3 intermediate revisions by 2 users not shown)
Revision as of 18:09, 15 April 2022


Since 2011 the GNU Octave project has successfully mentored participants in Summer of Code (SoC) programs by Google and ESA.

Those SoC programs aim to advertise open-source software development and to attract potential new Octave developers.

Steps toward a successful application

  1. 😉💬 We want to get to know you (before the deadline). Communicate with us.
    • Join Octave Discourse or IRC. Using a nickname is fine.
    • Show us that you're motivated to work on Octave 💻. There is no need to present an overwhelming CV 🏆; evidence of involvement with Octave is more important.
    • If you have never talked to us, we will likely reject your proposal, even if it looks good 🚮
  2. 👩‍🔬 Get your hands dirty.
    • We are curious about your programming skills 🚀
    • Use Octave!
      • If you come across something that does not work the way you like ➡️ try to fix that 🔧
      • Or if you find a missing function ➡️ try to implement it.
  3. 📝💡 Tell us what you are going to do.
    • Do not write just to say what project you're interested in. Be specific about what you are going to do, include links 🔗, show us you know what you are talking about 💡, and ask many smart questions 🤓
    • Remember, we are volunteer developers and not your boss 🙂
  4. 📔 Prepare your proposal with us.
    • Try to show us as early as possible a draft of your proposal 📑
    • If we see your proposal for the first time after the application deadline, it might easily contain some paragraphs not fully clear to us. Ongoing interaction will give us more confidence that you are capable of working on your project 🙂👍
    • Then submit the proposal following the applicable rules, e.g. for GSoC. 📨

How do we judge your application?

The details vary with the mentors and the SoC program, but the main factors considered are typically:

  • You have demonstrated interest in Octave and an ability to make substantial modifications to Octave
    The most important thing is that you've contributed some interesting code samples by which we can judge your skills. During the application period it's fine to ask for help with formatting these code samples, which are normally Mercurial patches.
  • You showed understanding of your topic
    Your proposal should make it clear that you're reasonably well versed in the subject area and won't need all summer just to read up on it.
  • Well thought out, adequately detailed, realistic project plan
    "I'm good at this, so trust me" isn't enough. In your proposal, you should describe which algorithms you'll use and how you'll integrate with existing Octave code. You should also prepare a project timeline and goals for the midterm and final evaluations.

What you should know about Octave

GNU Octave is mostly written in C++ and its own scripting language, which is mostly compatible with Matlab. There are bits and pieces of Fortran, Perl, C, awk, and Unix shell scripts here and there. In addition to being familiar with C++ and Octave's scripting language, a successful applicant will be familiar with, or able to quickly learn about, Octave's infrastructure. You can't spend the whole summer learning how to build Octave or prepare a changeset and still successfully complete your project 😇

You should know:

  1. How to build Octave from its source code using the GNU build system.
  2. How to submit patches (changesets).

Suggested projects

The following suggested projects are distilled from the Projects page for the benefit of potential SoC participants. You can also look at our completed past projects for more inspiration.

Do you use Octave at your working place or university? Do you have some numerical project in mind? You are always welcome to propose your own projects. If you are passionate about your project, it will be easy to find an Octave developer to mentor and guide you.

openlibm

Over the years Octave has faced many issues (see the openlibm page in this wiki for examples) caused by differing implementations of the C mathematical functions library (in short: "libm") on various systems. To overcome similar issues, developers of the Julia Programming Language started the openlibm project "to have a good libm [ ...] that work[s] consistently across compilers and operating systems, and in 32-bit and 64-bit environments". openlibm is supported by major Linux distributions (e.g. Debian/Ubuntu, RHEL/Fedora, SLES/openSUSE, ...), and an MS Windows MXE package has been added as well.

This project starts with learning how GNU Autotools is used in Octave and how openlibm can be detected at configure time. Next, the Octave code base has to be reviewed under the guidance of a mentor and the relevant code changes made. Finally, the corresponding changes to the Octave test suite are made and tested on various Linux, MS Windows, and macOS machines with the help of the Octave community.

  • Project size [?] and Difficulty
~175 hours (easy)
  • Required skills
Octave, C/C++, Autotools
  • Potential mentors
Carlo de Falco, Kai

ode15{i,s} : Matlab Compatible DAE solvers

An initial implementation of Matlab compatible Differential Algebraic Equations (DAE) solvers, ode15i and ode15s, based on SUNDIALS, was done by Francesco Faccio during GSoC 2016. The code is maintained in the main Octave repository and consists mainly of the following three files: libinterp/dldfcn/__ode15__.cc, scripts/ode/ode15i.m and scripts/ode/ode15s.m.

The decic function, which selects consistent initial conditions for ode15i, can be made more Matlab compatible by using another algorithm. Other useful extensions are making ode15{i,s} work with datatypes other than double and improving interpolation at intermediate time steps.
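
To make "consistent initial conditions" concrete: ode15i works on problems in the fully implicit form f(t, y, y') = 0, and initial values y0 and y0' are consistent when f(t0, y0, y0') = 0. The following Python sketch only checks that condition for a toy DAE; the system, the helper names `residual` and `is_consistent`, and the tolerance are assumptions made for illustration, and this is not the algorithm used by decic.

```python
def residual(t, y, yp):
    """Implicit form f(t, y, y') of a toy DAE (chosen only for illustration):
         y1' + y1     = 0   (differential equation)
         y1  + y2 - 1 = 0   (algebraic constraint)
    """
    return [yp[0] + y[0], y[0] + y[1] - 1.0]


def is_consistent(t0, y0, yp0, tol=1e-12):
    """True when every component of f(t0, y0, y0') vanishes up to tol."""
    return all(abs(r) <= tol for r in residual(t0, y0, yp0))


# y1(0) = 1 forces y2(0) = 0 and y1'(0) = -1; differentiating the
# constraint gives y2'(0) = -y1'(0) = 1.
consistent = is_consistent(0.0, [1.0, 0.0], [-1.0, 1.0])

# A guess that violates both equations.
inconsistent = is_consistent(0.0, [1.0, 0.5], [0.0, 0.0])

print(consistent, inconsistent)
```

A routine like decic has the harder job of starting from an inconsistent guess and adjusting the free components until this residual vanishes.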

  • Project size [?] and Difficulty
~350 hours (medium)
  • Required skills
Octave, C/C++; familiarity with numerical methods for DAEs
  • Potential mentors
Francesco Faccio, Carlo de Falco, Marco Caliari, Jacopo Corno, Sebastian Schöps

Symbolic package

The Symbolic package provides symbolic computing and other computer algebra system tools. The main component of Symbolic is a pure m-file class "@sym" which uses the Python package SymPy to do (most of) the actual computations. The package aims to expose much of the functionality of SymPy while also providing a high level of compatibility with the Matlab Symbolic Math Toolbox. The Symbolic package requires communication between Octave and Python. In 2016 another GSoC project successfully re-implemented this communication using the new Pythonic package.

This project proposes to take this work further while also improving the long-term viability of the Symbolic package. Some goals include:

  • the possibility of using Pythonic directly rather than as one possible communication layer. For example, we might make "@sym" a subclass of "@pyobject". We also could stop using the "pycall_sympy__" interface and use Pythonic directly from methods. Note: there are open questions about how to do this during a transition time when we still support other IPC mechanisms.
  • exposing more functionality of SymPy with less glue in between. For example, we could allow OO-style method calls such as f.diff(x) as well as diff(f, x).
  • Improvements to the Pythonic package and its long-term maintenance.
  • fixing up Symbolic to work with the latest releases of SymPy and Octave. The project has lagged for a few years and needs some effort to adapt to recent and upcoming changes in SymPy code.
  • making Symbolic easier to maintain. The project currently has a low bus factor. Helpful steps include improving the CI, making regular releases easier, improving other aspects of maintenance, and making the project more welcoming to newcomers.
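
The "less glue" idea in the list above can be sketched in plain Python as a thin wrapper that forwards any unknown method to the wrapped object, so that both the method spelling and the function spelling work without a hand-written wrapper per method. Everything here is hypothetical: `Poly` is a stand-in for a SymPy expression, and `Sym` and `diff` only mimic the roles of "@sym" and `diff(f, x)`; the real project would do this with Octave classes and Pythonic.

```python
class Poly:
    """Stand-in for a SymPy expression: coefficients [c0, c1, c2, ...]."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)

    def diff(self):
        # d/dx of sum(c_k * x**k) is sum(k * c_k * x**(k - 1)).
        return Poly([k * c for k, c in enumerate(self.coeffs)][1:])

    def degree(self):
        return len(self.coeffs) - 1


class Sym:
    """Thin wrapper that forwards unknown attributes to the wrapped object."""

    def __init__(self, obj):
        self._obj = obj

    def __getattr__(self, name):
        # Only called when normal lookup fails, so every method of the
        # wrapped object is reachable without writing glue for it.
        attr = getattr(self._obj, name)
        if not callable(attr):
            return attr

        def method(*args, **kwargs):
            result = attr(*args, **kwargs)
            # Re-wrap results so that calls can be chained.
            return Sym(result) if isinstance(result, Poly) else result

        return method


def diff(f):
    """Function-style spelling that simply delegates to the method."""
    return f.diff()


f = Sym(Poly([1, 2, 3]))   # 1 + 2x + 3x^2
print(f.diff().coeffs)     # method style
print(diff(f).coeffs)      # function style
print(f.degree())          # forwarded without any dedicated wrapper
```

The design trade-off is the one the project description hints at: forwarding gives broad coverage for free, while Matlab-compatible behaviour still needs explicit methods where the two APIs differ.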

Working on this project involves an interesting and challenging mix of m-file code, Python code, and in the case of Pythonic, perhaps some lower-level C code.

  • Project size [?] and Difficulty
~350 hours (medium)
  • Required skills
Octave, C/C++, Python; object-oriented programming (OOP) in Octave
  • Potential mentors
Colin B. Macdonald, Mike Miller, Abhinav Tripathi

Improve TIFF image support

Tag Image File Format (TIFF) is the de facto standard for scientific images. Octave uses the GraphicsMagick (GM) C++ library to handle TIFF and many other image formats. However, GM still has several limitations:

  • GM has build option quantum which defines the bitdepth to use when reading an image:
    • Building GM with high quantum means that images of smaller bitdepth will take a lot more memory when reading.
    • Building GM with low quantum will make it impossible to read images of higher bitdepth. It also means that the image needs to always be rescaled to the correct range.
  • GM supports unsigned integers only, thus incorrectly reading files such as TIFF with floating-point data.
  • GM hides details of the image such as whether the image file is indexed. This makes it hard to access the real data stored on file.

This project aims to implement better TIFF image support using libtiff, while leaving GM to handle all other image formats. After writing a classdef interface to libtiff, improve the Octave functions imread, imwrite, and imfinfo to make use of it.

  • Project size [?] and Difficulty
~175 hours (medium)
  • Required skills
Octave, C/C++
  • Potential mentors
Carnë Draug

PolarAxes and Plotting Improvements

Octave currently provides support for polar axes by using Cartesian 2-D axes and adding a significant number of properties and callback listeners to get things to work. What is needed is the implementation of a dedicated "polaraxes" object in C++. This will require creating a new fundamental graphics object type, and programming in C++/OpenGL to render the object. Once "polaraxes" exists as an object type, m-files will need to be written to access it, including polaraxes.m, polarplot.m, rticks.m, rticklabels.m, thetaticks.m, thetaticklabels.m, rlim.m, and thetalim.m. This relates to bug #49804.
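
Any polar plot drawn on Cartesian axes relies on the mapping x = r cos(theta), y = r sin(theta), both for the data and for decorations such as circular grid lines. The Python sketch below shows only that mapping; the helper names are hypothetical (`pol2cart` is named after the Octave function with the same argument order), and this is not how Octave's graphics code is organized.

```python
import math


def pol2cart(theta, r):
    """Map polar coordinates (angle in radians, radius) to Cartesian (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


def circular_gridline(r, n=64):
    """Cartesian points approximating the circular grid line at radius r."""
    return [pol2cart(2.0 * math.pi * k / n, r) for k in range(n + 1)]


x, y = pol2cart(math.pi / 2, 2.0)    # a point straight "up" at radius 2
ring = circular_gridline(1.0, n=4)   # 5 points; the last one closes the circle

print(x, y)
print(len(ring))
```

A dedicated "polaraxes" object would let the renderer own this transformation instead of emulating it through extra properties and listeners on a Cartesian axes.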

  • Project size [?] and Difficulty
~350 hours (medium)
  • Required skills
Octave, C/C++; optional experience with OpenGL programming
  • Potential mentors
Rik

Table datatype

In 2013, Matlab introduced a new table datatype to conveniently organize and access data in tabular form. This datatype has not been introduced to Octave yet (see bug #44571). However, there are two initial implementations: https://github.com/apjanke/octave-tablicious and https://github.com/gnu-octave/table.

Based upon the existing approaches, the goal of this project is to define an initial subset of table functions, covering sorting, splitting, merging, and file I/O, and to implement that subset within the given time frame.

  • Project size [?] and Difficulty
~350 hours (hard)
  • Required skills
Octave, C/C++
  • Potential mentors
Kai, Abdallah

YAML encoding/decoding

YAML is a very common human-readable, structured data format. Unfortunately, GNU Octave (and Matlab) still lacks built-in support for this omnipresent data format. With YAML support, Octave could easily read and write config files, which often use YAML or JSON. Support for the latter, JSON, was successfully implemented for Octave during GSoC 2020.

The goal of this project is to repeat the GSoC 2020 success story with Rapid YAML or another fast C/C++ library.

The first step is to research existing Octave/Matlab and C/C++ implementations.

Then evaluate (and cherry-pick from) those existing implementations, comparing their strengths and weaknesses. After this, an Octave package containing encoding and decoding functions (for example yamlencode and yamldecode) shall be created. This involves proper documentation of the work and unit tests to ensure the correctness of the implementation.

Finally, the package may be considered for merging into core Octave, probably after the GSoC project. In the meantime, it can be used immediately from Octave as a package and is backwards-compatible with older Octave versions.

  • Project size [?] and Difficulty
~175 hours (easy)
  • Required skills
Octave, C/C++
  • Potential mentors
Kai, Abdallah

TISEAN package

The TISEAN package provides an Octave interface to TISEAN, a suite of code for nonlinear time series analysis. In 2015, another GSoC project started creating interfaces to many TISEAN functions, but there is still work left to do. Functions are still missing for computations on spike trains, for simulating autoregressive models, for creating specialized plots, etc. These are of importance for many scientific disciplines involving statistical computations and signal processing.

  • Project size [?] and Difficulty
~350 hours (medium)
  • Required skills
Octave, C/C++; FORTRAN API knowledge
  • Potential mentors
KaKiLa

Project sizes

Since GSoC 2022 there exist two project sizes[1][2]:

  • ~175 hours (~12 weeks, Jun 13 - Sept 12)
  • ~350 hours (~22 weeks, Jun 13 - Nov 21)
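
As a rough guide to the weekly commitment these sizes imply, the figures above can simply be divided out. This is plain arithmetic on the numbers in the list, assuming the hours are spread evenly over the listed weeks.

```python
# (hours, weeks) taken from the two project sizes listed above.
sizes = {"~175 hours": (175, 12), "~350 hours": (350, 22)}

weekly = {name: round(hours / weeks, 1) for name, (hours, weeks) in sizes.items()}

for name, per_week in weekly.items():
    print(f"{name}: about {per_week} hours per week")
```

Both sizes come out at roughly 15 hours per week, so the larger project is longer rather than more intense.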
