Sudeepam

Joined 11 March 2018
1 byte removed, 22 March 2018
Line 162:
::Yes, I have decided to work on the '''command line suggestion feature''' [https://savannah.gnu.org/bugs/?46881]. This feature is essentially a complex decision-making problem, and I will therefore approach it with Neural Networks implemented in Octave itself (as m-scripts).


::''My special focus will be to keep the trade-off between the accuracy and the speed of the feature minimal.'' Please take a look at the last, additional section of 'Project Description' for technical details. I apologize for creating this extra part, but it describes some important technicalities of this project, and I believed they should be present here.


*'''Please provide a rough estimated timeline for your work on the task.'''


:'''Preparations for the project (pre-community bonding)'''
::While this application is being reviewed, I have started working on an m-script which will be used to catch the most common typographic errors that users make. This list of errors could then be...


::-Uploaded to a secure server directly.
::-Stored as a text file, which we can then ask the users to share with us.


:'''Community Bonding period'''
Line 176:
::I will use the community bonding period to...


::-Persuade the community to use our data extraction script and help us collect training data. This will be done by discussing the benefits of a command line suggestion feature and sharing my rough, small-scale implementation of this feature [https://github.com/Sudeepam97/Did_You_Mean] (please see the 'Project description' section).


::-Ask the community to report issues with the m-script containing the current implementation. I’ll move the current implementation to Mercurial if required.
Line 182:
::-Discuss how we should receive the data generated by the users, work on the approach, and start the collection of data.


::-Organize the data as it is received and divide it into proper training, cross-validation, and test sets.
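::A minimal sketch of how such a division could look is given below. It is written in Python purely for illustration (the project itself targets Octave m-scripts), and the function name <code>split_dataset</code>, the 60/20/20 proportions, and the sample pairs are all assumptions, not part of the current implementation.

```python
import random


def split_dataset(samples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle samples and split them into training, cross-validation,
    and test lists. The fractions are illustrative assumptions."""
    if train_frac <= 0 or cv_frac < 0 or train_frac + cv_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_cv = int(len(shuffled) * cv_frac)
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]  # whatever remains
    return train, cv, test


# Hypothetical (mistyped command, intended command) pairs.
pairs = [
    ("pritnf", "printf"), ("plto", "plot"), ("zeors", "zeros"),
    ("lenght", "length"), ("szie", "size"), ("dips", "disp"),
    ("figrue", "figure"), ("onse", "ones"), ("srot", "sort"),
    ("mena", "mean"),
]
train_set, cv_set, test_set = split_dataset(pairs)
print(len(train_set), len(cv_set), len(test_set))
```

::Shuffling before splitting avoids having all the samples from one source (for example, a single user's log) end up in the same set.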


:'''May 14 – June 10 (4 weeks)'''


::'''Week 1 (May 14 – May 21):''' I will not be able to do much work this week because I have my final examinations during this time. I will treat this week as an extension of the community bonding period and use it to collect issues, collect more data, and divide it into proper datasets.
::'''Week 2 and Week 3 (May 21 – June 3):''' Most of the Neural Network code will be identical to my current implementation, so I’ll start by making my current implementation bug-free (some known issues can be found here: [https://github.com/Sudeepam97/Did_You_Mean/issues]) and by bringing it in line with the Octave coding standards. I plan to keep collecting user data during these weeks as well, so I’ll leave room for network parameters such as the number of hidden layers and the number of neurons per hidden layer, because these are data-dependent parameters. If all this work is completed ahead of schedule, I’ll move straight on to the next week’s work.
::'''Week 4 (June 4 – June 10):''' By now we will have sufficient data: data from octave-online.net and from approximately 6 weeks of the extraction script’s usage. I’ll take a quick final look at the data and start training the Neural Network with it. I will choose values of the data-dependent network parameters that fit the learning parameters (weights) of the Neural Network to our data with a high level of accuracy while keeping the Neural Network fast. I will then measure the accuracy of the network on the cross-validation and test sets to see how well it generalizes to unknown typographic errors. I will also write some additional tests for the various m-scripts used.
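::One possible way to balance accuracy against speed when choosing a data-dependent parameter is sketched below, in Python for illustration only (the project itself targets Octave m-scripts). The rule used here (take the smallest hidden layer whose cross-validation accuracy is within a tolerance of the best) is an assumption, as are the function name <code>pick_hidden_size</code> and the accuracy figures, which are made-up placeholders and not measured results.

```python
def pick_hidden_size(cv_accuracy_by_size, tolerance=0.01):
    """Return the smallest hidden-layer size whose cross-validation
    accuracy is within `tolerance` of the best one observed.

    Smaller layers mean fewer weights and faster suggestions, so a
    small loss of accuracy is traded for speed.
    """
    if not cv_accuracy_by_size:
        raise ValueError("no candidate sizes given")
    best = max(cv_accuracy_by_size.values())
    good_enough = [
        size
        for size, accuracy in cv_accuracy_by_size.items()
        if accuracy >= best - tolerance
    ]
    return min(good_enough)


# Placeholder numbers: neurons in the hidden layer -> CV accuracy.
cv_accuracy_by_size = {10: 0.81, 25: 0.90, 50: 0.925, 100: 0.93}
chosen = pick_hidden_size(cv_accuracy_by_size)
print(chosen)
```

::The test set is deliberately not consulted here; it would only be used once, after the parameters are fixed, to estimate how the network generalizes.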