Editing Statistics package
Latest revision | Your text | ||
Line 3: | Line 3: | ||
<code>pkg install -forge statistics</code> | <code>pkg install -forge statistics</code> | ||
The following sections provide an overview of the functions available in the statistics package sorted alphabetically and arranged in groups similarly to the package's INDEX file | The following sections provide an overview of the functions available in the statistics package sorted alphabetically and arranged in groups similarly to the package's INDEX file. | ||
== Clustering == | == Clustering == | ||
Line 15: | Line 15: | ||
! Description | ! Description | ||
|- | |- | ||
| | | cluster | ||
| Define clusters from an agglomerative hierarchical cluster tree. | | Define clusters from an agglomerative hierarchical cluster tree. | ||
|- | |- | ||
| | | cmdscale | ||
| Classical multidimensional scaling of a matrix. | | Classical multidimensional scaling of a matrix. | ||
|- | |- | ||
| | | confusionmat | ||
| Compute a confusion matrix for classification problems. | | Compute a confusion matrix for classification problems. | ||
|- | |- | ||
| | | cophenet | ||
| Compute the cophenetic correlation coefficient. | | Compute the cophenetic correlation coefficient. | ||
|- | |- | ||
| | | evalclusters | ||
| Create a clustering evaluation object to find the optimal number of clusters. | | Create a clustering evaluation object to find the optimal number of clusters. | ||
|- | |- | ||
| | | inconsistent | ||
| Compute the inconsistency coefficient for each link of a hierarchical cluster tree. | | Compute the inconsistency coefficient for each link of a hierarchical cluster tree. | ||
|- | |- | ||
| | | kmeans | ||
| Perform a K-means clustering of an NxD matrix. | | Perform a K-means clustering of an NxD matrix. | ||
|- | |- | ||
| | | linkage | ||
| Produce a hierarchical clustering dendrogram. | | Produce a hierarchical clustering dendrogram. | ||
|- | |- | ||
| | | mahal | ||
| Mahalanobis' D-square distance. | | Mahalanobis' D-square distance. | ||
|- | |- | ||
| | | mhsample | ||
| Draw NSAMPLES samples from a target stationary distribution PDF using the Metropolis-Hastings algorithm. | | Draw NSAMPLES samples from a target stationary distribution PDF using the Metropolis-Hastings algorithm. | ||
|- | |- | ||
| | | optimalleaforder | ||
| Compute the optimal leaf ordering of a hierarchical binary cluster tree. | | Compute the optimal leaf ordering of a hierarchical binary cluster tree. | ||
|- | |- | ||
| | | pdist | ||
| Return the distance between any two rows in X. | | Return the distance between any two rows in X. | ||
|- | |- | ||
| | | pdist2 | ||
| Compute pairwise distance between two sets of vectors. | | Compute pairwise distance between two sets of vectors. | ||
|- | |- | ||
| | | slicesample | ||
| Draw NSAMPLES samples from a target stationary distribution PDF using the slice sampling method of Radford M. Neal. | | Draw NSAMPLES samples from a target stationary distribution PDF using the slice sampling method of Radford M. Neal. | ||
|- | |- | ||
| | | squareform | ||
| Interchange between distance matrix and distance vector formats. | | Interchange between distance matrix and distance vector formats. | ||
|} | |} | ||
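As a rough illustration of how several of the clustering functions listed above fit together, the following sketch computes pairwise distances, converts them to matrix form, builds a cluster tree, and cuts it into two clusters. The data points and the choice of the <code>"average"</code> linkage method are made up for the example, and the <code>"MaxClust"</code> option of <code>cluster</code> is assumed to be available in the installed package version.

```octave
pkg load statistics

% Four 2-D points forming two obvious groups (made-up example data).
X = [0 0; 0 1; 5 5; 5 6];

d = pdist (X);                    % vector of the 6 pairwise Euclidean distances
D = squareform (d);               % same distances as a symmetric 4x4 matrix
Z = linkage (d, "average");       % agglomerative hierarchical cluster tree
T = cluster (Z, "MaxClust", 2);   % cut the tree into at most two clusters

disp (D);
disp (T');
```

The cluster labels in <code>T</code> are arbitrary integers; only the grouping they induce is meaningful.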
Line 88: | Line 79: | ||
! Description | ! Description | ||
|- | |- | ||
| | | combnk | ||
| Return all combinations of K elements in DATA. | | Return all combinations of K elements in DATA. | ||
|- | |- | ||
| | | crosstab | ||
| Create a cross-tabulation (contingency table) T from data vectors. | | Create a cross-tabulation (contingency table) T from data vectors. | ||
|- | |- | ||
| | | datasample | ||
| Randomly sample data. | | Randomly sample data. | ||
|- | |- | ||
| | | grp2idx | ||
| Get index for group variables. | | Get index for group variables. | ||
|- | |- | ||
| | | tabulate | ||
| Compute a frequency table. | | Compute a frequency table. | ||
|} | |} | ||
Line 129: | Line 105: | ||
! Description | ! Description | ||
|- | |- | ||
| | | geomean | ||
| Compute the geometric mean. | | Compute the geometric mean. | ||
|- | |- | ||
| | | grpstats | ||
| Compute summary statistics by group. Fully MATLAB compatible. | | Compute summary statistics by group. Fully MATLAB compatible. | ||
|- | |- | ||
| | | harmmean | ||
| Compute the harmonic mean. | | Compute the harmonic mean. | ||
|- | |- | ||
| | | jackknife | ||
| Compute jackknife estimates of a parameter taking one or more given samples as parameters. | | Compute jackknife estimates of a parameter taking one or more given samples as parameters. | ||
|- | |- | ||
| | | mean | ||
| Compute the mean. Fully MATLAB compatible. | | Compute the mean. Fully MATLAB compatible. | ||
|- | |- | ||
| | | median | ||
| Compute the median. Fully MATLAB compatible. | | Compute the median. Fully MATLAB compatible. | ||
|- | |- | ||
| | | nanmax | ||
| Find the maximal element while ignoring NaN values. | | Find the maximal element while ignoring NaN values. | ||
|- | |- | ||
| | | nanmin | ||
| Find the minimal element while ignoring NaN values. | | Find the minimal element while ignoring NaN values. | ||
|- | |- | ||
| | | nansum | ||
| Compute the sum while ignoring NaN values. | | Compute the sum while ignoring NaN values. | ||
|- | |- | ||
| | | std | ||
| Compute the standard deviation. Fully MATLAB compatible. | | Compute the standard deviation. Fully MATLAB compatible. | ||
|- | |- | ||
| | | trimmean | ||
| Compute the trimmed mean. | | Compute the trimmed mean. | ||
|- | |- | ||
| | | var | ||
| Compute the variance. Fully MATLAB compatible. | | Compute the variance. Fully MATLAB compatible. | ||
|} | |} | ||
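A short sketch of a few of the descriptive statistics functions listed above, using a small made-up vector. It assumes that <code>geomean</code>, <code>harmmean</code>, and <code>nansum</code> operate on a row vector as a whole, which is the usual behaviour for vector input.

```octave
pkg load statistics

x = [1 2 4 8];            % made-up example data

g = geomean (x);          % 4th root of 1*2*4*8 = 64^(1/4), about 2.8284
h = harmmean (x);         % 4 / (1 + 1/2 + 1/4 + 1/8), about 2.1333
s = nansum ([1 NaN 3]);   % NaN is ignored, so the sum is 4

printf ("geomean = %.4f, harmmean = %.4f, nansum = %g\n", g, h, s);
```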
Line 171: | Line 144: | ||
=== In external packages === | === In external packages === | ||
<code>bootci</code>, <code>bootstrp</code> are implemented in the [https://gnu-octave.github.io/packages/statistics- | <code>bootci</code>, <code>bootstrp</code> are implemented in the [https://gnu-octave.github.io/packages/statistics-bootstrap statistics-bootstrap] package. | ||
=== Shadowing Octave core functions === | === Shadowing Octave core functions === | ||
Line 186: | Line 159: | ||
=== TODO list === | === TODO list === | ||
Update <code>trimmean</code> | Update <code>geomean</code>, <code>harmmean</code>, and <code>trimmean</code> functions to be fully MATLAB compatible. | ||
Re-introduce the <code>nan*</code> functions implemented in C++ with the <code>"all"</code> and <code>"vecdim"</code> options. | Re-introduce the <code>nan*</code> functions implemented in C++ with the <code>"all"</code> and <code>"vecdim"</code> options. | ||
Line 233: | Line 206: | ||
| binornd | | binornd | ||
|- | |- | ||
| | | Bivariate | ||
| bvncdf | | bvncdf | ||
| | | | ||
| | | | ||
Line 263: | Line 230: | ||
| chi2rnd | | chi2rnd | ||
|- | |- | ||
| | | Copula Family | ||
| copulacdf | | copulacdf | ||
| copulainv | | copulainv | ||
Line 269: | Line 236: | ||
| copularnd | | copularnd | ||
|- | |- | ||
| | | Extreme Value | ||
| evcdf | | evcdf | ||
| evinv | | evinv | ||
Line 359: | Line 326: | ||
| mvnrnd | | mvnrnd | ||
|- | |- | ||
| [https://en.wikipedia.org/wiki/Multivariate_t-distribution Multivariate Student's | | [https://en.wikipedia.org/wiki/Multivariate_t-distribution Multivariate Student's T] | ||
| mvtcdf mvtcdfqmc | | mvtcdf mvtcdfqmc | ||
| mvtinv | | mvtinv | ||
Line 383: | Line 350: | ||
| ncfrnd | | ncfrnd | ||
|- | |- | ||
| [https://en.wikipedia.org/wiki/Noncentral_t-distribution Noncentral Student's | | [https://en.wikipedia.org/wiki/Noncentral_t-distribution Noncentral Student's T] | ||
| nctcdf | | nctcdf | ||
| nctinv | | nctinv | ||
Line 419: | Line 386: | ||
| stdnormal_rnd | | stdnormal_rnd | ||
|- | |- | ||
| [https://en.wikipedia.org/wiki/Student%27s_t-distribution Student's | | [https://en.wikipedia.org/wiki/Student%27s_t-distribution Student's T] | ||
| tcdf | | tcdf | ||
| tinv | | tinv | ||
Line 467: | Line 434: | ||
| wishrnd | | wishrnd | ||
|} | |} | ||
=== Distribution Fitting === | === Distribution Fitting === | ||
Line 534: | Line 502: | ||
== Experimental Design == | == Experimental Design == | ||
Functions available for computing design matrices. | Functions available for computing design matrices. | ||
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1"> | <div style="column-count:1;-moz-column-count:1;-webkit-column-count:1"> | ||
* <code> | * <code>fullfact</code> | ||
* <code> | * <code>ff2n</code> | ||
* <code>sigma_pts</code> | |||
* <code>x2fx</code> | |||
</div> | </div> | ||
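As a minimal sketch of what a design matrix function returns, the example below calls <code>fullfact</code> with a made-up specification of two factors having 2 and 3 levels. It assumes the vector form of the argument, where each entry gives the number of levels of one factor.

```octave
pkg load statistics

% Full factorial design for two factors with 2 and 3 levels (made-up sizes).
A = fullfact ([2 3]);

disp (size (A));   % one row per combination of levels, one column per factor
disp (A);
```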
== Model Fitting == | == Model Fitting == | ||
Functions available for fitting or evaluating statistical models. | Functions available for fitting or evaluating statistical models. | ||
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1"> | |||
* <code>crossval</code> | |||
* <code>fitgmdist</code> | |||
* <code>fitlm</code> | |||
</div> | |||
=== Cross Validation === | === Cross Validation === | ||
Line 633: | Line 542: | ||
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1"> | <div style="column-count:1;-moz-column-count:1;-webkit-column-count:1"> | ||
* <code>anova</code> | * <code>anova</code> | ||
* <code>glmfit</code> | |||
* <code>glmval</code> | |||
* <code>manova</code> | * <code>manova</code> | ||
* <code>mnrfit</code> | |||
* <code>mnrval</code> | |||
</div> | </div> | ||
== Hypothesis Testing == | == Hypothesis Testing == | ||
Functions available for hypothesis testing. | Functions available for hypothesis testing. | ||
Line 646: | Line 558: | ||
! Description | ! Description | ||
|- | |- | ||
| | | adtest | ||
| Anderson-Darling goodness-of-fit hypothesis test. | | Anderson-Darling goodness-of-fit hypothesis test. | ||
|- | |- | ||
| | | anova1 | ||
| Perform a one-way analysis of variance (ANOVA). | | Perform a one-way analysis of variance (ANOVA). | ||
|- | |- | ||
| | | anova2 | ||
| Perform a two-way factorial (crossed) or nested analysis of variance (ANOVA) for balanced designs. | | Perform a two-way factorial (crossed) or nested analysis of variance (ANOVA) for balanced designs. | ||
|- | |- | ||
| | | anovan | ||
| Perform a multi (N)-way analysis of (co)variance (ANOVA or ANCOVA) to evaluate the effect of one or more categorical or continuous predictors (i.e. independent variables) on a continuous outcome (i.e. dependent variable). | | Perform a multi (N)-way analysis of (co)variance (ANOVA or ANCOVA) to evaluate the effect of one or more categorical or continuous predictors (i.e. independent variables) on a continuous outcome (i.e. dependent variable). | ||
|- | |- | ||
| | | bartlett_test | ||
| Perform a Bartlett test for the homogeneity of variances. | | Perform a Bartlett test for the homogeneity of variances. | ||
|- | |- | ||
| | | barttest | ||
| Bartlett's test of sphericity for correlation. | | Bartlett's test of sphericity for correlation. | ||
|- | |- | ||
| | | binotest | ||
| Test for probability P of a binomial sample. | | Test for probability P of a binomial sample. | ||
|- | |- | ||
| | | chi2gof | ||
| Chi-square goodness-of-fit test. | | Chi-square goodness-of-fit test. | ||
|- | |- | ||
| | | chi2test | ||
| Perform a chi-squared test (for independence or homogeneity). | | Perform a chi-squared test (for independence or homogeneity). | ||
|- | |- | ||
| | | friedman | ||
| Perform the nonparametric Friedman's test to compare column effects in a two-way layout. | | Perform the nonparametric Friedman's test to compare column effects in a two-way layout. | ||
|- | |- | ||
| | | hotelling_t2test | ||
| Compute Hotelling's T^2 ("T-squared") test for a single sample or two dependent samples (paired-samples). | | Compute Hotelling's T^2 ("T-squared") test for a single sample or two dependent samples (paired-samples). | ||
|- | |- | ||
| | | hotelling_t2test2 | ||
| Compute Hotelling's T^2 ("T-squared") test for two independent samples. | | Compute Hotelling's T^2 ("T-squared") test for two independent samples. | ||
|- | |- | ||
| | | kruskalwallis | ||
| Perform a Kruskal-Wallis test, the non-parametric alternative of a one-way analysis of variance (ANOVA). | | Perform a Kruskal-Wallis test, the non-parametric alternative of a one-way analysis of variance (ANOVA). | ||
|- | |- | ||
| | | kstest | ||
| Single sample Kolmogorov-Smirnov (K-S) goodness-of-fit hypothesis test. | | Single sample Kolmogorov-Smirnov (K-S) goodness-of-fit hypothesis test. | ||
|- | |- | ||
| | | kstest2 | ||
| Two-sample Kolmogorov-Smirnov goodness-of-fit hypothesis test. | | Two-sample Kolmogorov-Smirnov goodness-of-fit hypothesis test. | ||
|- | |- | ||
| | | levene_test | ||
| Perform a Levene's test for the homogeneity of variances. | | Perform a Levene's test for the homogeneity of variances. | ||
|- | |- | ||
| | | manova1 | ||
| One-way multivariate analysis of variance (MANOVA). | | One-way multivariate analysis of variance (MANOVA). | ||
|- | |- | ||
| | | multcompare | ||
| Perform posthoc multiple comparison tests or p-value adjustments to control the family-wise error rate (FWER) or false discovery rate (FDR). | | Perform posthoc multiple comparison tests or p-value adjustments to control the family-wise error rate (FWER) or false discovery rate (FDR). | ||
|- | |- | ||
| | | ranksum | ||
| Wilcoxon rank sum test for equal medians. This test is equivalent to a Mann-Whitney U-test. | | Wilcoxon rank sum test for equal medians. This test is equivalent to a Mann-Whitney U-test. | ||
|- | |- | ||
| | | regression_ftest | ||
| F-test for general linear regression analysis. | | F-test for general linear regression analysis. | ||
|- | |- | ||
| | | regression_ttest | ||
| Perform a linear regression t-test. | | Perform a linear regression t-test for the null hypothesis ''RR * B = R'' in a classical normal regression model ''Y = X * B + E''. | ||
|- | |- | ||
| | | runstest | ||
| Runs test for detecting serial correlation in the vector X. | | Runs test for detecting serial correlation in the vector X. | ||
|- | |- | ||
| | | sampsizepwr | ||
| Sample size and power calculation for hypothesis test. | | Sample size and power calculation for hypothesis test. | ||
|- | |- | ||
| | | signtest | ||
| Test for median. | | Test for median. | ||
|- | |- | ||
| | | ttest | ||
| Test for mean of a normal sample with unknown variance or a paired-sample t-test. | | Test for mean of a normal sample with unknown variance or a paired-sample t-test. | ||
|- | |- | ||
| | | ttest2 | ||
| Perform a two independent samples t-test. | | Perform a two independent samples t-test. | ||
|- | |- | ||
| | | vartest | ||
| One-sample test of variance. | | One-sample test of variance. | ||
|- | |- | ||
| | | vartest2 | ||
| Two-sample F test for equal variances. | | Two-sample F test for equal variances. | ||
|- | |- | ||
| | | vartestn | ||
| Test for equal variances across multiple groups. | | Test for equal variances across multiple groups. | ||
|- | |- | ||
| | | ztest | ||
| One-sample Z-test. | | One-sample Z-test. | ||
|} | |} | ||
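The hypothesis tests listed above generally return a decision flag and a p-value. The following sketch runs a one-sample t-test on made-up measurements against a hypothesized mean of 5; the data and the hypothesized mean are assumptions chosen for illustration only.

```octave
pkg load statistics

% Made-up sample; test the null hypothesis that the population mean is 5.
x = [5.1; 4.9; 5.3; 5.0; 5.2; 4.8; 5.1];

[h, pval] = ttest (x, 5);

% h is 1 if the null hypothesis is rejected at the default 5% level, else 0.
printf ("h = %d, p-value = %.3f\n", h, pval);
```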
Line 753: | Line 656: | ||
* <code>fishertest</code> | * <code>fishertest</code> | ||
* <code>meanEffectSize</code> | * <code>meanEffectSize</code> | ||
</div> | |||
== Machine Learning == | |||
=== Available functions === | |||
The following table lists the available functions. | |||
{| class="wikitable" | |||
! Function | |||
! Description | |||
|- | |||
| hmmestimate | |||
| Estimation of a hidden Markov model for a given sequence. | |||
|- | |||
| hmmgenerate | |||
| Output sequence and hidden states of a hidden Markov model. | |||
|- | |||
| hmmviterbi | |||
| Viterbi path of a hidden Markov model. | |||
|- | |||
| svmpredict | |||
| Predict labels for data using a trained support vector machine (SVM) model. |||
|- | |||
| svmtrain | |||
| Train a support vector machine (SVM) model. |||
|} | |||
=== TODO list === | |||
Update <code>svmpredict</code> and <code>svmtrain</code> to libsvm 3.0. | |||
Missing functions: | |||
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1"> | |||
* <code>hmmdecode</code> | |||
* <code>hmmtrain</code> | |||
</div> | </div> | ||
Line 765: | Line 705: | ||
! Description | ! Description | ||
|- | |- | ||
| | | boxplot | ||
| Produce a box plot. | | Produce a box plot. | ||
|- | |- | ||
| | | cdfplot | ||
| Display an empirical cumulative distribution function. | | Display an empirical cumulative distribution function. | ||
|- | |- | ||
| | | confusionchart | ||
| Display a chart of a confusion matrix. | | Display a chart of a confusion matrix. | ||
|- | |- | ||
| | | dendrogram | ||
| Plot a dendrogram of a hierarchical binary cluster tree. | | Plot a dendrogram of a hierarchical binary cluster tree. | ||
|- | |- | ||
| | | ecdf | ||
| Empirical (Kaplan-Meier) cumulative distribution function. | | Empirical (Kaplan-Meier) cumulative distribution function. | ||
|- | |- | ||
| | | gscatter | ||
| Draw a scatter plot with grouped data. | | Draw a scatter plot with grouped data. | ||
|- | |- | ||
| | | histfit | ||
| Plot histogram with superimposed fitted normal density. | | Plot histogram with superimposed fitted normal density. | ||
|- | |- | ||
| | | hist3 | ||
| Produce bivariate (2D) histogram counts or plots. | | Produce bivariate (2D) histogram counts or plots. | ||
|- | |- | ||
| | | manovacluster | ||
| Cluster group means using manova1 output. | | Cluster group means using manova1 output. | ||
|- | |- | ||
| | | normplot | ||
| Produce normal probability plot of the data. | | Produce normal probability plot of the data. | ||
|- | |- | ||
| | | ppplot | ||
| | | Produce a probability plot. | ||
|- | |- | ||
| | | qqplot | ||
| | | Produce an empirical quantile-quantile plot. | ||
|- | |- | ||
| | | silhouette | ||
| Compute the silhouette values of clustered data and show them on a plot. | | Compute the silhouette values of clustered data and show them on a plot. | ||
|- | |- | ||
| | | violin | ||
| Produce a violin plot of the data. | | Produce a violin plot of the data. | ||
|- | |- | ||
| | | wblplot | ||
| Plot a column vector DATA on a Weibull probability plot using rank regression. | | Plot a column vector DATA on a Weibull probability plot using rank regression. | ||
|} | |} | ||
Line 834: | Line 774: | ||
! Description | ! Description | ||
|- | |- | ||
| | | canoncorr | ||
| Canonical correlation analysis. | | Canonical correlation analysis. | ||
|- | |- | ||
| | | cholcov | ||
| Cholesky-like decomposition for covariance matrix. | | Cholesky-like decomposition for covariance matrix. | ||
|- | |- | ||
| | | dcov | ||
| Distance correlation, covariance and correlation statistics. | | Distance correlation, covariance and correlation statistics. | ||
|- | |- | ||
| | | logistic_regression | ||
| Perform ordinal logistic regression. | | Perform ordinal logistic regression. | ||
|- | |- | ||
| | | monotone_smooth | ||
| Produce a smooth monotone increasing approximation to a sampled functional dependence. | | Produce a smooth monotone increasing approximation to a sampled functional dependence. | ||
|- | |- | ||
| | | pca | ||
| Perform a principal component analysis on a data matrix. | | Perform a principal component analysis on a data matrix. | ||
|- | |- | ||
| | | pcacov | ||
| Perform principal component analysis on the NxN covariance matrix X. | | Perform principal component analysis on the NxN covariance matrix X. | ||
|- | |- | ||
| | | pcares | ||
| Calculate residuals from principal component analysis. | | Calculate residuals from principal component analysis. | ||
|- | |- | ||
| | | plsregress | ||
| Calculate partial least squares regression using SIMPLS algorithm. | | Calculate partial least squares regression using SIMPLS algorithm. | ||
|- | |- | ||
| | | princomp | ||
| Perform a principal component analysis on an NxP data matrix. | | Perform a principal component analysis on an NxP data matrix. | ||
|- | |- | ||
| | | regress | ||
| Multiple Linear Regression using Least Squares Fit. | | Multiple Linear Regression using Least Squares Fit. | ||
|- | |- | ||
| | | regress_gp | ||
| Linear scalar regression using Gaussian processes. | | Linear scalar regression using Gaussian processes. | ||
|- | |- | ||
| | | stepwisefit | ||
| Linear regression with stepwise variable selection. | | Linear regression with stepwise variable selection. | ||
|} | |} | ||
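As a minimal sketch of the regression functions listed above, the example below fits a straight line with <code>regress</code>. The data are made up and noise-free, and the design matrix is assumed to need an explicit column of ones for the intercept.

```octave
pkg load statistics

x = (1:5)';
y = 1 + 2 * x;               % made-up data lying exactly on a line

X = [ones(size (x)), x];     % design matrix: intercept column plus predictor
b = regress (y, X);          % least-squares coefficients [intercept; slope]

printf ("intercept = %.4f, slope = %.4f\n", b(1), b(2));
```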
Line 886: | Line 826: | ||
== Wrappers == | == Wrappers == | ||
Functions available for wrapping other functions or groups of functions. | Functions available for wrapping other functions or groups of functions. | ||
Line 895: | Line 833: | ||
! Description | ! Description | ||
|- | |- | ||
| | | cdf | ||
| This is a wrapper | | This is a wrapper around various NAMEcdf and NAME_cdf functions. | ||
|- | |- | ||
| | | clusterdata | ||
| | | Wrapper function for 'linkage' and 'cluster'. | ||
|- | |- | ||
| | | pdf | ||
| This is a wrapper | | This is a wrapper around various NAMEpdf and NAME_pdf functions. | ||
|- | |- | ||
| | | random | ||
| Generates pseudo-random numbers from a given one-, two-, or three-parameter distribution. | | Generates pseudo-random numbers from a given one-, two-, or three-parameter distribution. | ||
|} | |} | ||
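To illustrate what the wrappers above do, the sketch below evaluates the standard normal CDF at zero both through <code>cdf</code> and through <code>normcdf</code> directly. The distribution name <code>"norm"</code> is an assumption here; the set of accepted names can differ between package versions.

```octave
pkg load statistics

% Standard normal CDF at 0, via the wrapper and via the specific function.
p_wrapper = cdf ("norm", 0, 0, 1);
p_direct  = normcdf (0, 0, 1);

printf ("wrapper: %.4f, direct: %.4f\n", p_wrapper, p_direct);
```

Both calls should agree, since the wrapper only dispatches to the distribution-specific function.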
=== TODO list === | |||
Update <code>cdf</code>, <code>pdf</code>, and <code>random</code> to include the latest changes in distribution functions available in statistics-1.5.3. | |||
Missing functions: | |||
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1"> | |||
* <code>icdf</code> | |||
</div> | |||
[[Category:Packages]] | [[Category:Packages]] | ||
[[Category:Missing functions]] | [[Category:Missing functions]] |