Summer of Code - Getting Started: Difference between revisions

(minor edits and conformity to shorter GSoC)
Line 1: Line 1:
The following is distilled from the [[Projects]] page for the benefit of potential [https://summerofcode.withgoogle.com Google] and [https://socis.esa.int/ ESA] Summer of Code (SoC) students. Although students are welcome to attempt any of the projects in that page or any of their own choosing, here we offer some suggestions on what good student projects might be.
The following is largely distilled from the [[Projects]] page for the benefit of potential [https://summerofcode.withgoogle.com Google] and [https://socis.esa.int/ ESA] Summer of Code (SoC) students. Although students are welcome to attempt any of the projects in that page or any of their own choosing, here we offer some suggestions on what good student projects might be.


You can also take a look at last year's [[Summer of Code]] projects for inspiration.
You can also take a look at last year's [[Summer of Code]] projects for inspiration.
Line 8: Line 8:
* If you aren't communicating with us before the application is due, your application will not be accepted.
* If you aren't communicating with us before the application is due, your application will not be accepted.
*:* '''Join the [https://lists.gnu.org/mailman/listinfo/octave-maintainers maintainers mailing list]''' or read the archives and see what topics we discuss and how the developers interact with each other.
*:* '''Join the [https://lists.gnu.org/mailman/listinfo/octave-maintainers maintainers mailing list]''' or read the archives and see what topics we discuss and how the developers interact with each other.
*:* '''Hang out in our [https://webchat.freenode.net/?channels=#octave IRC channel]'''. Ask questions, answer questions from users, show us that you are motivated, and well-prepared. There will be more applicants than we can effectively mentor, so do ask for feedback on your public application to increase the strength of your proposal!
*:* '''Hang out in our [https://webchat.freenode.net/?channels=#octave IRC channel]'''. Ask questions, answer questions from users, show us that you are motivated and well-prepared. There will be more applicants than we can effectively mentor, so do ask for feedback on your public application to increase the strength of your proposal!
* '''Do not wait for us to tell you what to do'''
* '''Do not wait for us to tell you what to do'''
*: You should be doing something that interests you, and should not need us to tell you what to do.  Similarly, you shouldn't ask us what to do either.
*: You should be doing something that interests you, and should not need us to tell you what to do.  Similarly, you shouldn't ask us what to do either.
*:* When you email the list and mentors, do not write it to say in what project you're interested. Be specific about your questions and clear on the email subject. For example, do not write an email with the subject "GSoC student interested in the ND images projects".  Such email is likely be ignored.  Instead, show you are already working on the topic, and email "Problem implementing morphological operators with bitpacked ND images".
*:* When you email the list and mentors, do not write just to say in what project you're interested. Be specific about your questions and clear on the email subject. For example, do not write an email with the subject "GSoC student interested in the ND images projects".  Such email is likely to be ignored.  Instead, show you are already working on the topic, and email "Problem implementing morphological operators with bitpacked ND images".
*:* It is good to ask advice on how to solve something you can't but you must show some work done.  Remember, we are mentors and not your boss.  Read [http://www.catb.org/esr/faqs/smart-questions.html How to ask questions the smart way]: <blockquote>''Prepare your question. Think it through. Hasty-sounding questions get hasty answers, or none at all. The more you do to demonstrate that having put thought and effort into solving your problem before seeking help, the more likely you are to actually get help.''</blockquote>
*:* It is good to ask advice on how to solve something you can't, but you must show some work done.  Remember, we are mentors and not your boss.  Read [http://www.catb.org/esr/faqs/smart-questions.html How to ask questions the smart way]: <blockquote>''Prepare your question. Think it through. Hasty-sounding questions get hasty answers, or none at all. The more you do to demonstrate that having put thought and effort into solving your problem before seeking help, the more likely you are to actually get help.''</blockquote>
*:* It can be difficult at the beginning to think of something to do.  This is the nature of free and open source software development.  You will need to break the mental barrier that prevents you from thinking of what can be done.  Once you do that, you will have no lack of ideas for what to do next.
*:* It can be difficult at the beginning to think of something to do.  This is the nature of free and open source software development.  You will need to break the mental barrier that prevents you from thinking of what can be done.  Once you do that, you will have no lack of ideas for what to do next.
*:* Use Octave.  Eventually you will come across something that does not work the way you like.  Fix that.  Or you will come across a missing function.  Implement it.  It may be a hard problem (they usually are). While solving that problem, you may find other missing capabilities or smaller bug fixes.  Implement and contribute those to Octave.
*:* Use Octave.  Eventually you will come across something that does not work the way you like.  Fix that.  Or you will come across a missing function.  Implement it.  It may be a hard problem (they usually are). While solving that problem, you may find other missing capabilities or smaller bug fixes.  Implement and contribute those to Octave.
Line 18: Line 18:


==  Find Something That Interests You ==  
==  Find Something That Interests You ==  
*: It's '''critical''' that you '''find a project that excites you'''.  You'll be spending most of the summer working on it (we expect you to treat the SoC as a full-time job).
*: It's '''critical''' that you '''find a project that excites you'''.  You'll be spending most of the summer working on it (we expect you to treat the SoC as a job).
*: Don't just tell us how interested you are, show us that you're willing and able to '''contribute''' to Octave. You can do that by [https://savannah.gnu.org/bugs/?group=octave fixing a few bugs] or [https://savannah.gnu.org/patch/?group=octave submitting patches] well before the deadline, in addition to regularly interacting with Octave maintainers and users on the mailing list and IRC. Our experience shows us that successful SoC students demonstrate their interest early and often.
*: Don't just tell us how interested you are, show us that you're willing and able to '''contribute''' to Octave. You can do that by [https://savannah.gnu.org/bugs/?group=octave fixing a few bugs] or [https://savannah.gnu.org/patch/?group=octave submitting patches] well before the deadline, in addition to regularly interacting with Octave maintainers and users on the mailing list and IRC. Our experience shows us that successful SoC students demonstrate their interest early and often.
== Prepare Your Proposal With Us ==
== Prepare Your Proposal With Us ==
Line 28: Line 28:
*: Fill out our '''''private''''' application template.
*: Fill out our '''''private''''' application template.
*:* This is best done by copying the '''[[Template:Student_application_template_private|template]]''' from its page and '''adding the required information to your application at Google (melange)''' or at '''ESA'''.<br>
*:* This is best done by copying the '''[[Template:Student_application_template_private|template]]''' from its page and '''adding the required information to your application at Google (melange)''' or at '''ESA'''.<br>
*:* Only the organization admin and the possible mentors will see this data.  You can still edit it after submitting until the deadline!
*:* Only the organization admin and the possible mentors will see this data.  You can still edit it after submitting, until the deadline!


== Things You'll be Expected to Know or Quickly Learn On Your Own ==
== Things You'll be Expected to Know or Quickly Learn On Your Own ==
Line 55: Line 55:
*: We also have [http://webchat.freenode.net?channels=octave the #octave IRC channel in Freenode].
*: We also have [http://webchat.freenode.net?channels=octave the #octave IRC channel in Freenode].
*: You should be familiar with the IRC channel.  It's very helpful for new contributors (you) to get immediate feedback on ideas and code.
*: You should be familiar with the IRC channel.  It's very helpful for new contributors (you) to get immediate feedback on ideas and code.
*: Unless your primary mentor has a strong preference for some other method of communication, the IRC channel will likely be your primary means of communicating with your mentor and Octave developers.
*: Unless your primary mentor has a strong preference for some other method of communication, the IRC channel might be your primary means of communicating with your mentor and Octave developers.
* '''The Octave Forge Project'''
* '''The Octave Forge Project'''
*: [https://octave.sourceforge.io/ Octave Forge] is a collection of contributed packages that enhance the capabilities of core Octave. They are somewhat analogous to Matlab's toolboxes.
*: [https://octave.sourceforge.io/ Octave Forge] is a collection of contributed packages that enhance the capabilities of core Octave. They are somewhat analogous to Matlab's toolboxes.
Line 138: Line 138:
The resulting code has been pushed into the main Octave repository in the development branch and
The resulting code has been pushed into the main Octave repository in the development branch and
consists mainly of the following three files
consists mainly of the following three files
[http://hg.savannah.gnu.org/hgweb/octave/file/4890b1c4a6bd/libinterp/dldfcn/__ode15__.cc __ode15__.cc],
[https://hg.savannah.gnu.org/hgweb/octave/file/tip/libinterp/dldfcn/__ode15__.cc __ode15__.cc],
[http://hg.savannah.gnu.org/hgweb/octave/file/4890b1c4a6bd/scripts/ode/ode15i.m ode15i.m] and
[https://hg.savannah.gnu.org/hgweb/octave/file/tip/scripts/ode/ode15i.m ode15i.m] and
[http://hg.savannah.gnu.org/hgweb/octave/file/4890b1c4a6bd/scripts/ode/ode15s.m ode15s.m].
[https://hg.savannah.gnu.org/hgweb/octave/file/tip/scripts/ode/ode15s.m ode15s.m].
The list of outstanding tracker tickets concerning this implementation can be found  
The list of outstanding tracker tickets concerning this implementation can be found  
[https://savannah.gnu.org/search/?Search=Search&words=ode15&type_of_search=bugs&only_group_id=1925&exact=1&max_rows=25#options here]
[https://savannah.gnu.org/search/?Search=Search&words=ode15&type_of_search=bugs&only_group_id=1925&exact=1&max_rows=25#options here]
Line 148: Line 148:
* Implement a better function for selecting consistent initial conditions compatible with Matlab's decic.m. The algorithm to use is described [http://faculty.smu.edu/shampine/cic.pdf here]
* Implement a better function for selecting consistent initial conditions compatible with Matlab's decic.m. The algorithm to use is described [http://faculty.smu.edu/shampine/cic.pdf here]


* make ode15{i,s} with datatypes other than double
* make ode15{i,s} work with datatypes other than double


* improve interpolation at intermediate time steps.
* improve interpolation at intermediate time steps.
Line 154: Line 154:
* general code profiling and optimization  
* general code profiling and optimization  


Other tasks, not strictly connected to ode15{i,s} but closely related that could be added  
Other tasks, not strictly connected to ode15{i,s} but closely related, that could be added  
to a possible project plan would be improving documentation and tests in odepkg and removing  
to a possible project plan would be improving documentation and tests in odepkg and removing  
overlaps with the documentation in core Octave.
overlaps with the documentation in core Octave.
Line 179: Line 179:


GNU Octave currently has the following Krylov subspace methods for sparse linear systems: pcg (spd matrices) and pcr (Hermitian matrices), bicg,
GNU Octave currently has the following Krylov subspace methods for sparse linear systems: pcg (spd matrices) and pcr (Hermitian matrices), bicg,
bicgstab, cgs, gmres, and qmr (general matrices). The description of some of them (pcr, qmr) and their error messages are not aligned. Moreover, they have similar blocks of code (input check for instance) which can be written once and for all in common functions. The first step in this project could be a revision and a synchronization of the codes, starting from the project [https://socis16octave-improveiterativemethods.blogspot.com/ SOCIS2016] which is already merged into Octave (cset {{cset|6266e321ef22}}).
bicgstab, cgs, gmres, and qmr (general matrices). The description of some of them (pcr, qmr) and their error messages are not aligned. Moreover, they have similar blocks of code (input check for instance) which can be written once and for all in common functions. The first step in this project could be a revision and a synchronization of the codes, starting from the [https://socis16octave-improveiterativemethods.blogspot.com/ SOCIS2016] project, which is already merged into Octave (cset {{cset|6266e321ef22}}).
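
The idea of writing the input check once and sharing it between solvers can be sketched as follows. This is only an illustration in Python, not Octave's implementation: the helper name <code>check_system</code>, its error messages, and the default iteration limit are assumptions made for the example, and the solver is a plain unpreconditioned conjugate gradient on dense lists.

```python
from math import sqrt


def check_system(A, b, tol, maxit):
    """Hypothetical shared input check that every solver would call first."""
    n = len(b)
    if n == 0:
        raise ValueError("b must not be empty")
    if len(A) != n or any(len(row) != n for row in A):
        raise ValueError("A must be square and match the length of b")
    if tol <= 0:
        raise ValueError("tol must be positive")
    if maxit is None:
        maxit = min(n, 20)  # assumed default, chosen only for this sketch
    elif maxit < 1:
        raise ValueError("maxit must be at least 1")
    return n, maxit


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def cg(A, b, tol=1e-6, maxit=None):
    """Unpreconditioned conjugate gradient for a symmetric positive definite A."""
    n, maxit = check_system(A, b, tol, maxit)  # same call in every solver
    x = [0.0] * n
    r = list(b)          # residual b - A*x for the zero initial guess
    p = list(r)          # first search direction
    rs = dot(r, r)
    b_norm = sqrt(dot(b, b)) or 1.0
    for _ in range(maxit):
        if sqrt(rs) <= tol * b_norm:
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

With a helper of this kind, every method reports the same error message for the same mistake, which is the alignment the paragraph above asks for.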


In Matlab, some additional methods are available: minres and symmlq (symmetric matrices), bicgstabl (general matrices), lsqr (least
In Matlab, some additional methods are available: minres and symmlq (symmetric matrices), bicgstabl (general matrices), lsqr (least
Line 280: Line 280:
=== Symbolic package ===
=== Symbolic package ===


Octave's [https://github.com/cbm755/octsympy Symbolic package] handles symbolic computing and other CAS tools.  The main component of Symbolic is a pure m-file class "@sym" which uses the Python package [https://www.sympy.org SymPy] to do (most of) the actual computations.  The package aims to expose the full functionality of SymPy while also providing a high-level of compatibility with the Matlab Symbolic Math Toolbox.  The Symbolic package requires communication between Octave and Python.  Recently, a GSoC2016 project successfully re-implemented this communication using the new [[Pythonic|Pythonic package]].
Octave's [https://github.com/cbm755/octsympy Symbolic package] provides symbolic computing and other [https://en.wikipedia.org/wiki/Computer_algebra_system computer algebra system] tools.  The main component of Symbolic is a pure m-file class "@sym" which uses the Python package [https://www.sympy.org SymPy] to do (most of) the actual computations.  The package aims to expose the full functionality of SymPy while also providing a high level of compatibility with the Matlab Symbolic Math Toolbox.  The Symbolic package requires communication between Octave and Python.  A GSoC2016 project successfully re-implemented this communication using the new [[Pythonic|Pythonic package]].


This project proposes to go further: instead of using Pythonic only for the communication layer, we'll use it throughout the Symbolic project.  For example, we might make "@sym" a subclass of "@pyobject".  We also could stop using the "python_cmd" interface and use Pythonic directly from methods.  The main goal was already mentioned: to expose the *full functionality* of SymPy.  For example, we would allow OO-style method calls such as "f.diff(x)" instead of "diff(f, x)".
This project proposes to go further: instead of using Pythonic only for the communication layer, we'll use it throughout the Symbolic project.  For example, we might make "@sym" a subclass of "@pyobject".  We also could stop using the "python_cmd" interface and use Pythonic directly from methods.  The main goal was already mentioned: to expose the *full functionality* of SymPy.  For example, we would allow OO-style method calls such as "f.diff(x)" instead of "diff(f, x)".
Line 308: Line 308:
=== Jupyter Notebook Integration ===
=== Jupyter Notebook Integration ===


[http://jupyter.org Jupyter Notebook] is a web-based worksheet interface for computing.  There is a [https://github.com/Calysto/octave_kernel Octave kernel for Jupyter].  This project seeks in first place to improve that kernel to make Octave a first-class experience within the Jupyter Notebook.
[http://jupyter.org Jupyter Notebook] is a web-based worksheet interface for computing.  There is an [https://github.com/Calysto/octave_kernel Octave kernel for Jupyter].  This project seeks to improve that kernel to make Octave a first-class experience within the Jupyter Notebook.


In general the [https://nbformat.readthedocs.io/en/latest/ Jupyter Notebook Format] is a plain JSON document, which is supported since Octave 7 (current development version).  Another valuable project outcome was to run (and fill) those Jupyter Notebooks from within Octave.  This would enable Jupyter Notebook users to evaluate long running Octave Notebooks on a computing server without permanent browser connection, which is [https://github.com/jupyter/notebook/issues/1647 still a pending issue].
In general the [https://nbformat.readthedocs.io/en/latest/ Jupyter Notebook Format] is a plain JSON document, which has been supported since Octave 7 (current development version).  Another valuable project outcome is to run (and fill) those Jupyter Notebooks from within Octave.  This would enable Jupyter Notebook users to evaluate long-running Octave Notebooks on a computing server without a permanent browser connection, which is [https://github.com/jupyter/notebook/issues/1647 still a pending issue].
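
To make "run and fill" concrete, the following Python sketch builds a minimal notebook as plain JSON-compatible data and fills the outputs of its code cells. It is an illustration under assumptions, not the planned Octave implementation: the functions <code>make_notebook</code> and <code>fill_notebook</code> are hypothetical, the <code>evaluate</code> callback stands in for an Octave interpreter, and only a small subset of the notebook format fields is used.

```python
import copy
import json


def make_notebook(code_cells):
    """Build a minimal notebook (format version 4) as a plain dict."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {
            "kernelspec": {"name": "octave", "display_name": "Octave"}
        },
        "cells": [
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "source": source,
                "outputs": [],
            }
            for source in code_cells
        ],
    }


def fill_notebook(notebook, evaluate):
    """Return a copy whose code cells carry the text produced by evaluate().

    evaluate is a hypothetical callback that takes the cell source as a
    string and returns the printed output as a string.
    """
    filled = copy.deepcopy(notebook)
    count = 0
    for cell in filled["cells"]:
        if cell["cell_type"] != "code":
            continue
        source = cell["source"]
        if isinstance(source, list):  # the format allows a list of lines
            source = "".join(source)
        count += 1
        cell["execution_count"] = count
        cell["outputs"] = [
            {"output_type": "stream", "name": "stdout", "text": evaluate(source)}
        ]
    return filled


if __name__ == "__main__":
    nb = make_notebook(["1 + 1"])
    done = fill_notebook(nb, lambda src: "ans = 2\n")  # fake interpreter
    print(json.dumps(done, indent=1))
```

Because the filled notebook is still plain JSON, it can be written back to disk on the computing server and opened later in the browser.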


* '''Minimum requirements'''
* '''Minimum requirements'''
Line 357: Line 357:
Familiarity with how other languages handle this problem will be useful to come up with elegant solutions.
Familiarity with how other languages handle this problem will be useful to come up with elegant solutions.
In some cases, there are standards to follow.
In some cases, there are standards to follow.
For example, there are specifications published by freedesktop.org about where files should go ([http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html base directory spec]) and Windows seems to have its own standards.
For example, there are specifications published by freedesktop.org about where files should go ([http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html base directory spec]), and Windows seems to have its own standards.
See bugs {{bug|36477}} and {{bug|40444}} for more details.
See bugs {{bug|36477}} and {{bug|40444}} for more details.


In addition, package names may start to collide very easily.
In addition, package names may start to collide very easily.
One horrible way to workaround this by is choosing increasingly complex package names that give no hint on the package purpose.
One horrible way to work around this is by choosing increasingly complex package names that give no hint of the package's purpose.
A much better option is providing an Authority category like Perl 6 does.
A much better option is providing an Authority category like Perl 6 does.
Nested packages are also an easy way to provide packages for specialized subjects (think {{codeline|image::morphology}}).
Nested packages are also an easy way to provide packages for specialized subjects (think {{codeline|image::morphology}}).
Line 380: Line 380:
The image package has partial functionality for N-dimensional images. These images exist for example in medical imaging where slices from scans are assembled to form anatomical 3D images. If taken over time and at different laser wavelengths or light filters, they can also result in 5D images. Albeit less common, images with even more dimensions also exist. However, their existence is irrelevant since most of the image processing operations are mathematical operations which are independent of the number of dimensions.
The image package has partial functionality for N-dimensional images. These images exist for example in medical imaging where slices from scans are assembled to form anatomical 3D images. If taken over time and at different laser wavelengths or light filters, they can also result in 5D images. Albeit less common, images with even more dimensions also exist. However, their existence is irrelevant since most of the image processing operations are mathematical operations which are independent of the number of dimensions.
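
As an illustration of an operation that does not depend on the number of dimensions, here is a small Python sketch of binary erosion written once for any N. It is not the image package's code: the function names are made up for the example, the image is stored as a flat row-major list (Octave itself is column-major), and pixels outside the image are treated as background, a border convention chosen for simplicity that is not necessarily what {{codeline|imerode}} does.

```python
from itertools import product


def cross(ndim):
    """Structuring element: the origin plus its two neighbours along each axis."""
    offsets = [tuple([0] * ndim)]
    for axis in range(ndim):
        for step in (-1, 1):
            offset = [0] * ndim
            offset[axis] = step
            offsets.append(tuple(offset))
    return offsets


def erode(image, shape, offsets):
    """Binary erosion of an N-D image stored as a flat row-major list."""
    ndim = len(shape)
    # Row-major strides: the last axis varies fastest.
    strides = [1] * ndim
    for axis in range(ndim - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]

    def at(index):
        # Out-of-bounds pixels count as background in this sketch.
        if any(i < 0 or i >= n for i, n in zip(index, shape)):
            return False
        return bool(image[sum(i * s for i, s in zip(index, strides))])

    result = []
    # product() also iterates with the last axis fastest, matching the strides.
    for index in product(*(range(n) for n in shape)):
        result.append(
            all(at(tuple(i + d for i, d in zip(index, off))) for off in offsets)
        )
    return result
```

The same function handles a 1-D signal, a 2-D picture, or a 5-D acquisition; only <code>shape</code> and the structuring element change.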


As part of GSoC 2013, the core functions for image IO, {{codeline|imwrite}} and {{codeline|imread}}, were extended to better support this type of images. Likewise, many functions in the image package, mostly morphology operators, were expanded to deal with this type of image. Since then, many other functions have been improved, sometimes completely rewritten, to abstract from the number of dimensions. In a certain way, supporting ND images is also related to choosing good algorithms since such large images tend to be quite large.
As part of GSoC 2013, the core functions for image IO, {{codeline|imwrite}} and {{codeline|imread}}, were extended to better support this type of image. Likewise, many functions in the image package, mostly morphology operators, were expanded to deal with this type of image. Since then, many other functions have been improved, sometimes completely rewritten, to abstract from the number of dimensions. In a certain way, supporting ND images is also related to choosing good algorithms since such images tend to be quite large.


This project will continue on the previous work, and be mentored by the previous GSoC student and current image package maintainer. Planning the project requires selection of functions lacking ND support and identifying their dependencies. For example, supporting {{codeline|imclose}} and {{codeline|imopen}} was better implemented by supporting {{codeline|imerode}} and {{codeline|imdilate}} which then propagated ND support to all of its dependencies. These dependencies need to be discovered first since often they are not being used yet, and may even be missing function. This project can also be about implementing functions that have [[Image package#Missing functions | not yet been implemented]]. Also note that while some functions in the image package will accept ND images as input, they are actually not correctly implemented and will give incorrect results.
This project will build on the previous work, and be mentored by the previous GSoC student and current image package maintainer. Planning the project requires selection of functions lacking ND support and identifying their dependencies. For example, supporting {{codeline|imclose}} and {{codeline|imopen}} was better implemented by supporting {{codeline|imerode}} and {{codeline|imdilate}} which then propagated ND support to all of its dependencies. These dependencies need to be discovered first since often they are not being used yet, and may even be missing functions. This project can also be about implementing functions that have [[Image package#Missing functions | not yet been implemented]]. Also note that while some functions in the image package will accept ND images as input, they are actually not correctly implemented and will give incorrect results.


* '''Required skills'''
* '''Required skills'''
: m-file scripting, and a fair amount of C++ since a lot of image analysis cannot be vectorized. Familiarity with common CS algorithms and willingness to read literature describing new algorithms will be useful.  
: m-file scripting, and a fair amount of C++ since a lot of image analysis cannot be vectorized. Familiarity with common computer science algorithms and willingness to read literature describing new algorithms will be useful.  
* '''Difficulty'''
* '''Difficulty'''
: Difficult.
: Difficult.
Line 393: Line 393:
=== Improve Octave's image IO ===
=== Improve Octave's image IO ===


There are a lot of image formats. To handle this, Octave uses [http://www.graphicsmagick.org/ GraphicsMagic] (GM), a library capable of handling [http://www.graphicsmagick.org/formats.html a lot of them] in a single C++ interface. However, GraphicsMagick still has its limitations. The most important are:
There are a lot of image formats. Octave uses [http://www.graphicsmagick.org/ GraphicsMagick] (GM), a library capable of handling [http://www.graphicsmagick.org/formats.html a lot of them] in a single C++ interface. However, GraphicsMagick still has its limitations. The most important are:


* GM has build option {{codeline|quantum}} which defines the bitdepth to use when reading an image. Building GM with high quantum means that images of smaller bitdepth will take a lot more memory when reading, but building it too low will make it impossible to read images of higher bitdepth. It also means that the image needs to always be rescaled to the correct range.
* GM has build option {{codeline|quantum}} which defines the bitdepth to use when reading an image. Building GM with high quantum means that images of smaller bitdepth will take a lot more memory when reading, but building it too low will make it impossible to read images of higher bitdepth. It also means that the image needs to always be rescaled to the correct range.
* GM supports unsigned integers only thus incorrectly reading files such as TIFF with floating point data
* GM supports unsigned integers only, thus incorrectly reading files such as TIFF with floating point data
* GM hides away details of the image such as whether the image file is indexed.  This makes it hard to access the real data stored on file.
* GM hides details of the image such as whether the image file is indexed.  This makes it hard to access the real data stored on file.
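
The rescaling mentioned in the first point can be illustrated with a short Python sketch. It is not GraphicsMagick's or Octave's code: the function name and the rounding rule are assumptions for the example. It maps samples stored at one bit depth (for instance a 16-bit quantum) onto the full range of another (for instance the 8 bits actually used by the file).

```python
def rescale_depth(values, from_bits, to_bits):
    """Map unsigned samples from one bit depth to the full range of another.

    Uses integer arithmetic with rounding to nearest, so the minimum and
    maximum of the source range map exactly to those of the target range.
    """
    src_max = (1 << from_bits) - 1
    dst_max = (1 << to_bits) - 1
    return [(v * dst_max + src_max // 2) // src_max for v in values]


if __name__ == "__main__":
    # Samples read with a 16-bit quantum from a file that stores 8 bits.
    print(rescale_depth([0, 257, 32768, 65535], 16, 8))
```

The memory cost described above follows from the same numbers: an 8-bit image read with a 16-bit quantum needs two bytes per sample until it is rescaled.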


This project would implement better image IO for scientific file formats while leaving GM to handle the others. Since TIFF is the de facto standard for scientific images, this should be done first. Among the targets for the project are:
This project would implement better image IO for scientific file formats while leaving GM to handle the others. Since TIFF is the de facto standard for scientific images, this should be done first. Among the targets for the project are:


* implement the Tiff class which is a wrap around libtiff, using classdef. To avoid creating too many private __oct functions, this project could also create a C++ interface to declare new Octave classdef functions.
* implement the Tiff class, which is a wrapper around libtiff, using classdef. To avoid creating too many private __oct functions, this project could also create a C++ interface to declare new Octave classdef functions.
* improve imread, imwrite, and imfinfo for tiff files using the newly created Tiff class
* improve imread, imwrite, and imfinfo for tiff files using the newly created Tiff class
* port the bioformats into Octave and prepare a package for it
* port bioformats into Octave and prepare a package for it
* investigate other image IO libraries
* investigate other image IO libraries
* clean up and finish the dicom package to include into Octave core
* clean up and finish the dicom package to include into Octave core
* prepare a matlab compatible implementation of the FITS package for inclusion in Octave core
* prepare a Matlab-compatible implementation of the FITS package for inclusion in Octave core


* '''Required skills'''
* '''Required skills'''
: Knowledge of C++ and C since most libraries are written in those languages.
: Knowledge of C++ and C, since most libraries are written in those languages.
* '''Difficulty'''
* '''Difficulty'''
: Medium.
: Medium.
Line 419: Line 419:
=== PolarAxes and Plotting Improvements ===
=== PolarAxes and Plotting Improvements ===


Octave currently provides supports for polar axes by using a Cartesian 2-D axes and adding a significant number of properties and callback listerners to get things to work.  What is needed is a first class implementation of a "polaraxes" object in C++.  This will require creating a new fundamental graphics object type, and programming in C++/OpenGL to render the object.  When "polaraxes" exist as an object type then m-files will be written to access them including polaraxes.m, polarplot.m, rticks.m, rticklabels.m, thetaticks, thetaticklabels.m, rlim.m, thetalim.m.  relates to {{bug|35565}}, {{bug|49804}}, {{bug|52643}}.
Octave currently provides support for polar axes by using Cartesian 2-D axes and adding a significant number of properties and callback listeners to get things to work.  What is needed is the implementation of a dedicated "polaraxes" object in C++.  This will require creating a new fundamental graphics object type, and programming in C++/OpenGL to render the object.  When "polaraxes" exists as an object type, m-files will be written to access it, including polaraxes.m, polarplot.m, rticks.m, rticklabels.m, thetaticks.m, thetaticklabels.m, rlim.m, and thetalim.m.  This relates to {{bug|35565}}, {{bug|49804}}, and {{bug|52643}}.


* '''Minimum requirements'''
* '''Minimum requirements'''