User:Sudeepam: Difference between revisions

9 bytes added, 22 March 2018
Line 212: Line 212:
Let me first describe the three kinds of Neural Networks that we could end up making, depending on the training data available.


:'''A network trained with only the correct spellings of the inbuilt functions'''
:'''1) A network trained with only the correct spellings of the inbuilt functions'''
This type of network would be very easy to make because it requires only a list of the existing GNU Octave functions and no additional data. With this approach, we would end up with a Neural Network that easily recognizes typographic errors caused by '''letter substitutions''' and '''transposition of adjacent letters.''' In fact, this network would handle multiple substitutions and transpositions, not just single ones. I say this with confidence because I have already made a working neural network of this type [https://github.com/Sudeepam97/Did_You_Mean]. This network would, however, perform poorly when an error is caused by the '''accidental inclusion''' or '''accidental deletion of letters.'''
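
One possible way to see why substitutions and transpositions are easier than inclusions and deletions is to compare strings position by position. The sketch below assumes the network sees each name as a fixed-width sequence of character slots; that encoding is an assumption made for illustration and is not taken from the linked repository. The helper name <code>positional_mismatches</code> is hypothetical.

```python
def positional_mismatches(typed: str, correct: str) -> int:
    """Count the character slots at which two names differ.

    Both names are right-padded with spaces to the same width, mimicking
    a fixed-width positional input (an assumption, see the text above).
    """
    width = max(len(typed), len(correct))
    a = typed.ljust(width)
    b = correct.ljust(width)
    return sum(1 for x, y in zip(a, b) if x != y)


correct = "linspace"
print(positional_mismatches("linsoace", correct))   # substitution  -> 1
print(positional_mismatches("linsapce", correct))   # transposition -> 2
print(positional_mismatches("inspace", correct))    # deletion      -> 8
print(positional_mismatches("llinspace", correct))  # inclusion     -> 8
```

Under this assumed encoding, a substitution or transposition disturbs one or two slots, while a single inserted or deleted letter shifts every later letter and disturbs almost all of them.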


:'''A network trained with the correct spellings of the functions and self created errors'''
:'''2) A network trained with the correct spellings of the functions and self created errors'''
This would be slightly harder to make but should give us improved performance. I will '''create some misspellings''' for all the functions by inserting, deleting, substituting, and transposing one or two letters, and then add all of these generated misspellings to the dataset used to train the network. Such a network would understand what '''correct spellings and random typographic errors''' look like. It would handle substitutions and transpositions as easily as the previous network but would also be more accurate on errors caused by additions or deletions. However, it is worth mentioning that we may introduce mistakes of our own while generating the misspellings. Because the training data will be modified randomly, the Neural Network may, in rare cases, behave unpredictably.
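
A minimal sketch of how such misspellings could be generated is shown below. It is an illustration under stated assumptions, not the proposal's actual implementation: the function names <code>random_edit</code> and <code>make_misspellings</code>, the alphabet, and the parameters <code>per_function</code>, <code>max_edits</code>, and <code>seed</code> are all hypothetical.

```python
import random
import string

# Assumed character set for generated typos (hypothetical choice).
ALPHABET = string.ascii_lowercase + string.digits + "_"


def random_edit(word: str, rng: random.Random) -> str:
    """Apply one random insertion, deletion, substitution, or transposition."""
    ops = ["insert"]
    if word:
        ops.append("substitute")
    if len(word) > 1:
        ops += ["delete", "transpose"]
    op = rng.choice(ops)
    if op == "insert":
        i = rng.randrange(len(word) + 1)
        return word[:i] + rng.choice(ALPHABET) + word[i:]
    if op == "substitute":
        i = rng.randrange(len(word))
        return word[:i] + rng.choice(ALPHABET) + word[i + 1:]
    if op == "delete":
        i = rng.randrange(len(word))
        return word[:i] + word[i + 1:]
    # Transpose two adjacent letters.
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def make_misspellings(functions, per_function=5, max_edits=2, seed=0):
    """Return (misspelling, correct_name) pairs with one or two edits each."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    valid = set(functions)
    dataset = []
    for name in functions:
        seen = set()
        attempts = 0
        while len(seen) < per_function and attempts < 100 * per_function:
            attempts += 1
            candidate = name
            for _ in range(rng.randint(1, max_edits)):
                candidate = random_edit(candidate, rng)
            # Skip no-op edits, repeats, and typos that are themselves
            # valid function names, which would mislabel the example.
            if candidate not in valid and candidate not in seen:
                seen.add(candidate)
                dataset.append((candidate, name))
    return dataset


pairs = make_misspellings(["plot", "linspace", "zeros"], per_function=4)
for typo, name in pairs:
    print(typo, "->", name)
```

The filter only rejects typos that exactly match another function name. A generated typo can still end up closer to a different function than to the one it was derived from, which is one way the random data could mislead the network.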


:'''A network trained with the correct spellings of the functions and the most common typographic errors'''
:'''3) A network trained with the correct spellings of the functions and the most common typographic errors'''
To make this kind of Neural Network, we need to know what common typographic errors look like. With that goal in mind, I have already contacted the people behind octave-online.net [https://octave-online.net/], who say that they are happy to support the development of GNU Octave and have shared a list of the top 1000 misspellings with me by email. However, the users of octave-online.net are only one part of the entire user base. '''For best results''', we would require the involvement of the entire Octave community, which also implies that this will be the hardest and the most fun Neural Network to make.
By creating a script that catches typographic errors, asking the users of GNU Octave to run it and share their most common spelling errors with us, and training the network on the resulting dataset, we will create a Neural Network that understands what '''correct spellings and the most common typographic errors''' look like. Such a network would give good results almost every time and with all kinds of errors. This is because a network that has seen the common errors will, most of the time, '''know the answer''' beforehand. For the remaining cases, the network would be able to '''predict the correct answer'''.
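
The dataset-building step described above could look roughly like the following sketch. The report format (pairs of what the user typed and the function they meant) and the names <code>most_common_errors</code> and <code>build_training_set</code> are assumptions for illustration, and the sample reports are made up; they are not taken from the octave-online.net list.

```python
from collections import Counter


def most_common_errors(reports, top_n=1000):
    """Aggregate (typed, intended) reports and keep the top_n most frequent."""
    counts = Counter(
        (typed.strip(), intended.strip()) for typed, intended in reports
    )
    return counts.most_common(top_n)


def build_training_set(functions, common_errors):
    """Combine correct spellings with the reported common misspellings."""
    valid = set(functions)
    # Every correct spelling is labelled with itself.
    examples = [(name, name) for name in functions]
    for (typed, intended), _count in common_errors:
        # Keep only reports that point at a known function and whose
        # typed text is not itself a valid function name.
        if intended in valid and typed not in valid:
            examples.append((typed, intended))
    return examples


# Hypothetical reports, as a community-run script might submit them.
reports = [
    ("lenght", "length"),
    ("lenght", "length"),
    ("pritnf", "printf"),
    ("zeros", "zeros"),
]
functions = ["length", "printf", "zeros"]
top = most_common_errors(reports, top_n=2)
print(top)
print(build_training_set(functions, top))
```

Keeping the counts alongside each pair would also allow frequent errors to be weighted more heavily during training, if that turns out to help.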