JWE Project Ideas

From Octave
<!-- This file should be edited at https://wiki.octave.org/JWE_Project_Ideas -->
  

Latest revision as of 08:20, 6 June 2020


Language and functions

classdef issues

Compatibility issues

Make a list here, pointing to individual bug reports?

Load/save for classdef

See also general load/save issues.

Improve / simplify implementation

Although the basic features that are implemented now appear to mostly work, the implementation seems overly complicated, making it difficult to debug and modify. There seems to be quite a bit of room for improvement here.

Syntax, semantics, and data types

Function handle refactoring

  • Load/save for all types of function handles and all data formats (ascii, binary, hdf5, mat5)
  • Use std::shared_ptr for function objects instead of a bare pointer to octave_function.
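
The second item can be illustrated with a minimal sketch. The names fcn_object and fcn_handle are hypothetical stand-ins, not Octave's actual classes; the point is only that a handle holding a std::shared_ptr shares ownership of the function object, so no manual reference counting or delete is needed.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for octave_function; not the real class.
class fcn_object
{
public:
  explicit fcn_object (std::string name) : m_name (std::move (name)) { }

  const std::string& name () const { return m_name; }

private:
  std::string m_name;
};

// A handle that shares ownership of the function object.  Copying the
// handle copies the shared_ptr, so the function stays alive as long as
// any handle refers to it.
class fcn_handle
{
public:
  explicit fcn_handle (std::shared_ptr<fcn_object> fcn)
    : m_fcn (std::move (fcn)) { }

  const std::string& name () const { return m_fcn->name (); }

  // Number of owners currently sharing the function object.
  long use_count () const { return m_fcn.use_count (); }

private:
  std::shared_ptr<fcn_object> m_fcn;
};
```

With this layout, a handle created from std::make_shared<fcn_object> ("sin") can be copied freely, and the function object is destroyed automatically when the last owner goes away.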

String class

Matlab now uses "" to create string objects that behave differently from Octave double-quoted strings. We could start by creating a compatible string class, then hooking it up to the "" syntax. No matter what, the transition will be difficult because Matlab's "" strings still treat "\n" as two characters (backslash and n) rather than a single character (newline).
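
The escape-handling difference can be sketched as follows. Both function names are hypothetical, and only a small subset of escapes (\n, \t, \\) is handled; this is an illustration of the two interpretations, not Octave's lexer.

```cpp
#include <cassert>
#include <string>

// Octave-style double-quoted text: convert a small subset of backslash
// escapes (\n, \t, \\) to single characters.  Any other escape is left
// untouched in this sketch.
std::string octave_dq_text (const std::string& src)
{
  std::string out;

  for (std::size_t i = 0; i < src.size (); i++)
    {
      if (src[i] == '\\' && i + 1 < src.size ())
        {
          char next = src[i+1];

          if (next == 'n') { out += '\n'; i++; continue; }
          if (next == 't') { out += '\t'; i++; continue; }
          if (next == '\\') { out += '\\'; i++; continue; }
        }

      out += src[i];
    }

  return out;
}

// Matlab-style "" string object: the source characters are kept
// verbatim, so backslash and n remain two separate characters.
std::string matlab_string_text (const std::string& src)
{
  return src;
}
```

For the four source characters a, backslash, n, b, the first function yields three characters and the second yields four, which is why the same "" literal cannot mean the same thing under both rules.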

Other new data types

Andrew Janke has implementations of these classes (FIX: link to repos here)

  • table
  • datetime, duration, calendarDuration
  • categorical
  • timetable
  • timeseries

single / integer valued ranges

This is a compatibility issue.

Refactor load-path

  • Directories are not properly removed from load path (FIX: link to bug report here)
  • Should we really have ADD_PKG and DEL_PKG files? If so, how can we make them safe?

Eliminate special matrix types

Although the special range, diagonal matrix, and permutation matrix data types in Octave require less memory than storing full matrices, they tend to cause trouble when people expect full compatibility or exactly the same results when performing arithmetic on Ranges vs. Matrices. Now that we have broadcasting operators, the need for diagonal matrices is not as great.

Special case FOR loop limits

Currently, "for i = 1:N ..." uses a Range object for the "1:N" loop bounds. If we eliminate Ranges as a special space-saving type, then we should handle this syntax as a special case. Even if we don't eliminate Ranges, that might be a good idea, as we could handle "for i = 1:Inf ..." easily without having to worry about how to deal with that in an ordinary Range object vs. FOR loop bounds.
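
A minimal sketch of the special case, assuming a hypothetical loop_bounds class with the increment fixed at 1: the loop asks for one value at a time, so an infinite limit needs no element count and no storage.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical loop-bounds object for "for i = base:limit".  The
// increment is fixed at 1 to keep the sketch short.
class loop_bounds
{
public:
  loop_bounds (double base, double limit)
    : m_base (base), m_limit (limit) { }

  // True if iteration number k (counting from 0) is inside the bounds.
  bool has_value (unsigned long k) const
  {
    return m_base + static_cast<double> (k) <= m_limit;
  }

  // Loop variable value for iteration number k.
  double value (unsigned long k) const
  {
    return m_base + static_cast<double> (k);
  }

  // With an infinite limit there is no finite element count, which is
  // the awkward case for an ordinary range object.
  bool is_unbounded () const
  {
    return std::isinf (m_limit) && m_limit > 0;
  }

private:
  double m_base;
  double m_limit;
};

// Sum loop values until one exceeds stop_at.  This models a loop body
// that leaves the loop with "break".
double sum_until (const loop_bounds& b, double stop_at)
{
  double sum = 0;

  for (unsigned long k = 0; b.has_value (k); k++)
    {
      double i = b.value (k);

      if (i > stop_at)
        break;

      sum += i;
    }

  return sum;
}
```

The same object serves finite and infinite limits, so the FOR loop code would not need to know whether Ranges still exist as a separate data type.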

Local functions

The semantics of local functions in Matlab scripts differ from the way Octave currently handles functions that are defined in script files.

Matlab packages

Matlab packages are +DIR directories in the load path. This item is related to classdef.

Octave already searches for files in package directories and understands the PKG.fcn syntax and functionality. The big missing piece is implementation of the "import" functionality and handling it efficiently and in a way that is compatible with Matlab.

Refactor broadcasting

Are there better ways to use templates to handle function calls rather than using macros to define a set of functions for array/array, array/scalar, and scalar/array ops as in DEFMXBINOP in mx-inlines.cc?
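
One possible direction, sketched with std::vector rather than Octave's array classes: a single overloaded function template covers the array/array, array/scalar, and scalar/array cases, with the operation passed as a function object. The name binary_op is hypothetical and this does not mirror the existing macro-generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Array/array: apply op element by element.  The two arrays are
// assumed to have the same size.
template <typename T, typename Op>
std::vector<T>
binary_op (const std::vector<T>& a, const std::vector<T>& b, Op op)
{
  std::vector<T> r (a.size ());

  for (std::size_t i = 0; i < a.size (); i++)
    r[i] = op (a[i], b[i]);

  return r;
}

// Array/scalar: the scalar is the right-hand operand for every element.
template <typename T, typename Op>
std::vector<T>
binary_op (const std::vector<T>& a, const T& s, Op op)
{
  std::vector<T> r (a.size ());

  for (std::size_t i = 0; i < a.size (); i++)
    r[i] = op (a[i], s);

  return r;
}

// Scalar/array: the scalar is the left-hand operand for every element.
template <typename T, typename Op>
std::vector<T>
binary_op (const T& s, const std::vector<T>& b, Op op)
{
  std::vector<T> r (b.size ());

  for (std::size_t i = 0; i < b.size (); i++)
    r[i] = op (s, b[i]);

  return r;
}
```

A call such as binary_op (a, 2.0, std::multiplies<double> ()) selects the array/scalar overload through ordinary template argument deduction, so no per-operator macro expansion is needed.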

Sparse matrix issues

Broadcasting

Broadcasting does not work for sparse matrices. This seems like a big missing feature.

Structural zeros

Octave currently skips structural zeros for most (all?) sparse matrix operations. Matlab returns a sparse matrix filled with NaNs for something like "sprand (5, 5, 0.1) .^ NaN". Should we go for full compatibility? Mathematical correctness? Traditional behavior of sparse matrix libraries? It seems no one really agrees on what is correct or best. Maybe compatibility should win?
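
The two behaviors can be contrasted with a toy sparse vector. The sparse_vec type and both function names are hypothetical and much simpler than Octave's sparse classes; the sketch only shows why the results differ when the operation maps zero to something other than zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Toy sparse vector: a length plus a map from index to stored value.
// Elements that are not stored are structural zeros.
struct sparse_vec
{
  std::size_t length;
  std::map<std::size_t, double> stored;
};

// Apply pow only to the stored elements.  Structural zeros are skipped
// and remain zero, so the result is still sparse.
sparse_vec pow_skip_zeros (const sparse_vec& v, double p)
{
  sparse_vec r {v.length, {}};

  for (const auto& kv : v.stored)
    r.stored[kv.first] = std::pow (kv.second, p);

  return r;
}

// Apply pow to every element, including structural zeros.  The result
// is returned as a dense vector because every element may be nonzero.
std::vector<double> pow_all (const sparse_vec& v, double p)
{
  std::vector<double> r (v.length, std::pow (0.0, p));

  for (const auto& kv : v.stored)
    r[kv.first] = std::pow (kv.second, p);

  return r;
}
```

With a NaN exponent, pow_skip_zeros leaves four of five elements at zero while pow_all makes every element NaN, which corresponds to the filled result described above.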

Indexed assignment

In an assignment like Sparse_object(idx) = GrB_object(idx), Octave does not attempt to apply a conversion operator to transform the RHS type to the LHS type. Is this also a problem for assignments of objects with conversion operators to full matrix objects?

graph and digraph

Would it be difficult to provide these objects?

GUI

Communication with interpreter

Currently, communication between the GUI and the interpreter mostly happens when the interpreter is otherwise idle and waiting for user input at the command prompt, and the implementation is somewhat complicated. We need to determine whether this is the best we can do, or if there is some other implementation that would be more flexible and reliable.

GUI command window

The implementation of the GUI command window for Unix-like systems is completely separate from the one used on Windows systems. There should be only one, and the GUI should be completely in charge of user input and output. This will probably require implementing some kind of simple output pager internally instead of using an external program, but overall user interaction could be improved.

GUI code editor

Make it possible to use external editors such as Emacs, vim, or others with the GUI in addition to Octave's built-in code editor.

Graphics

Publication-quality figures

Generating EPS or PDF versions of figures needs improvement.

OpenGL graphics

  • Scaling plot data values/ranges to fit in single-precision OpenGL values
  • Performance issues
  • Lack of WYSIWYG
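
The first item can be illustrated with a small sketch. The function name normalize_for_gl is hypothetical and the approach shown (mapping data onto [0, 1] in double precision before converting to float) is only one possible strategy, not a description of what Octave does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Map double-precision data onto [0, 1] before converting to float, so
// that values with a large common offset stay distinguishable in single
// precision.  Assumes non-empty input containing only finite values.
std::vector<float> normalize_for_gl (const std::vector<double>& data)
{
  auto mm = std::minmax_element (data.begin (), data.end ());
  double lo = *mm.first;
  double range = *mm.second - lo;

  std::vector<float> out;
  out.reserve (data.size ());

  for (double x : data)
    {
      // All values equal: place every point at 0 to avoid dividing by 0.
      float scaled = (range > 0
                      ? static_cast<float> ((x - lo) / range) : 0.0f);
      out.push_back (scaled);
    }

  return out;
}
```

For data such as 1e9, 1e9 + 0.5, and 1e9 + 1, a direct conversion to float collapses all three values to the same number, while the normalized values remain distinct.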

FLTK widget

With the rest of the GUI using Qt widgets, we should eliminate the FLTK plotting widget. It duplicates functionality and requires additional effort to maintain. Maybe we no longer need the octave-cli binary (the one that is not linked with Qt libraries)?

Qt toolkit threading

It seems likely that the locking of the gh_manager object is insufficient or even incorrect in some cases.

classdef graphics objects

This is a large project, but one that will likely have to be tackled at some point.

Miscellaneous

Handle UTF-8

We need to handle UTF-8 (or whatever) characters properly in all parts of Octave. Try to do this in a Matlab-compatible way.

Load / save for large files

  • Make the load and save commands compatible with Matlab's HDF5-based file format. Matlab users expect this and we need something like this to support large arrays anyway.
  • Phase out Octave's own text and binary formats. Too much effort is required to maintain the code to support all the various formats.

RNG issues

RandStream and other RNG issues

This is likely a large project, but it would be nice to have updated, compatible interfaces.

MEX Interface

Implement mxMakeReal and mxMakeComplex functions.

JIT compiler

A proof-of-concept implementation was done several years ago by a Google Summer of Code student. It was never complete and little work has been done since. It also depends on an old version of LLVM. In addition to LLVM, we should consider the JIT library features of GCC.

This is probably the most difficult item (at least for me) since it will require fairly advanced knowledge of compiler infrastructure and Octave internals.

loadlibrary

This feature might be nice to have but it has a low priority.

Complex integers

Should we support this feature? Should we refactor the implementation of array objects to make this job easier?

who -file option

The -file option of who should just read the file and list its contents, not create a dummy scope. Likewise for the whos function.

Maintenance and packaging

General code quality

  • Use C++11 features where possible.
  • Better and more complete use of C++ namespaces.
  • Make better use of C++ features, especially standard library features as their implementations become more widely available. For example, we might be able to simplify some things in Octave by using the C++17 filesystem and special functions libraries, if they provide results that are at least as good as what we are using now.
  • Eliminate C preprocessor macros where possible
  • added_static must go! (not sure about this now)
  • Should not expose symbol_record in call_stack functions if possible
  • Remove unused symbol_table/scope/record functions
  • Use const in more parse tree functions
  • Do recursive functions work properly with load/save now?
  • Use enums for options internally (typically to replace bool values)
  • Audit global variables and eliminate them where possible
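
The item about enums can be sketched as follows. The enumerations and the describe_save function are hypothetical examples, not existing Octave code; they show how scoped enums make the meaning of each option visible at the call site.

```cpp
#include <cassert>
#include <string>

// Hypothetical options that might otherwise be passed as two bool
// parameters, for example save_file (name, true, false).
enum class save_format { text, binary };
enum class overwrite_mode { keep_existing, replace };

// Because the parameter types differ, swapping the two arguments is a
// compile-time error instead of a silent bug.
std::string describe_save (save_format fmt, overwrite_mode mode)
{
  std::string s = (fmt == save_format::binary ? "binary" : "text");

  s += (mode == overwrite_mode::replace ? ", replace" : ", keep existing");

  return s;
}
```

A call then reads describe_save (save_format::binary, overwrite_mode::keep_existing), which documents itself in a way that a pair of bool values does not.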

Symbol visibility

We really should be tagging the functions that we wish to export from shared libraries.
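
A common way to do this is an export macro, sketched below. The macro name EXAMPLE_API and both functions are hypothetical. The sketch covers only the exporting side: on Windows, code that uses the library would normally see __declspec (dllimport) instead, which is omitted here.

```cpp
#include <cassert>

// Hypothetical export macro.  On Windows the tag is dllexport; with GCC
// and compatible compilers it sets default visibility, which matters
// when the library is built with -fvisibility=hidden.
#if defined (_WIN32)
#  define EXAMPLE_API __declspec (dllexport)
#elif defined (__GNUC__)
#  define EXAMPLE_API __attribute__ ((visibility ("default")))
#else
#  define EXAMPLE_API
#endif

// Tagged: intended to be visible to users of the shared library.
EXAMPLE_API int example_public_fcn (int x);

// Untagged: hidden from library users when the default visibility is
// set to hidden at build time.
int example_internal_helper (int x);

int example_internal_helper (int x)
{
  return 2 * x;
}

int example_public_fcn (int x)
{
  return example_internal_helper (x) + 1;
}
```

Tagging only the intended interface keeps internal helpers out of the exported symbol table.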

Dispatch types for functions

Search for "classes:" in sources to find the few current examples.

min/max nargin values

Should we record minimum and maximum nargin values for functions and allow the interpreter to raise an error automatically when a function is given too few or too many arguments?
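
A minimal sketch of such a check, assuming hypothetical names (nargin_limits, check_nargin) and a convention that a negative maximum means "no upper limit":

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical per-function metadata.  A negative max_nargin means the
// function accepts any number of arguments (as with varargin).
struct nargin_limits
{
  int min_nargin;
  int max_nargin;
};

// Check that an interpreter could run before calling any function, so
// individual functions no longer need their own argument-count tests.
void check_nargin (const std::string& fcn_name, const nargin_limits& lim,
                   int nargin)
{
  if (nargin < lim.min_nargin)
    throw std::invalid_argument (fcn_name + ": too few arguments");

  if (lim.max_nargin >= 0 && nargin > lim.max_nargin)
    throw std::invalid_argument (fcn_name + ": too many arguments");
}
```

With limits {1, 2}, calls with one or two arguments pass, while calls with zero or three arguments throw before the function body would run.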

Toolboxes

Move some core toolboxes (communications, control systems, image processing, optimization, signal processing, and statistics) to core Octave so that development is managed along with Octave. Core Octave developers are already responsible for these packages anyway, and users don't seem to understand why they need to install them separately. Core parts of the ordinary differential equations package have already been moved to Octave.

Documentation

  • Docs for call stack with examples and illustrations
  • Docs for lexer and parser with examples and illustrations
  • Docs for fcn_info object
  • Docs for load_path object
  • Docs for classdef internals
  • Docs for Qt graphics toolkit internals
  • Docs for Qt GUI and communication with interpreter
  • Improve other Doxygen docs for internals to make it easier for new contributors to understand the Octave code base.

Windows distribution

Eliminate the following msys packages. Some might be removed entirely if they are unnecessary for running Octave or building Octave Forge packages. Otherwise, we should be building them from source as we do all other tools and libraries that are distributed with Octave. The difficulty is that although the msys packages are typically based on old versions of these packages, they sometimes have fixes that are needed to allow them to run properly on Windows systems. Note also that we distribute a termcap library, but the msys version of less depends on the msys termcap library.

  • bash
  • coreutils
  • diffutils
  • dos2unix
  • file
  • findutils
  • gawk
  • grep
  • gzip
  • less
  • libcrypt
  • libiconv
  • libintl
  • libmagic
  • libopenssl
  • make
  • msys-core
  • patch
  • perl
  • regex
  • sed
  • tar
  • termcap
  • unzip
  • wget
  • zip
  • zlib