NTSYSpc ver. 2.2 details

Technical details about NTSYSpc ver. 2.2

Computational modules

bullet AUTOREGR - Fits data using the pure autoregressive model. Used in spatial and phylogenetic autocorrelation analyses.
bullet CANPLS - Performs canonical correlation and partial least-squares analyses. Used to study correlations between two sets of variables. A permutation test is used to test for associations.
bullet COMBINE - Combines two or more matrices into one.
bullet CONSEN - Computes a consensus tree for two or more trees (such as multiple tied trees from SAHN clustering or trees from two different methods). Several consensus indices are also computed to measure the degree of agreement between trees. Can read NEXUS tree files.
bullet COPH - Produces a cophenetic value matrix (matrix of ultrametric values) from a tree matrix (produced, e.g., by the SAHN clustering program). This matrix can be used by the MXCOMP program to test the goodness of fit of a cluster analysis to the similarity or dissimilarity matrix on which it was based. Can also produce path length distance matrices and phylogenetic covariance matrices.
bullet CORRESP - Correspondence analysis. This is a useful way to investigate the structure of a 2-way contingency table.
bullet CPCA - Common principal components analysis of Flury (1984, 1988). Fits a single set of eigenvectors to a series of variance-covariance matrices.
bullet CVA - Performs a canonical vectors analysis (a generalization of discriminant function analysis). Can also test closeness of each specimen to each group mean.
bullet DCENTER - Performs a "double-centering" of a matrix of similarities or dissimilarities among the objects. The resulting matrix can then be factored to perform a principal coordinates analysis (a method for displaying relationships among objects in terms of their positions along a set of axes based on a dissimilarity matrix, see Gower 1966).
bullet EIGEN - Computes eigenvector and eigenvalue matrices of a real symmetric similarity matrix. This program can be used to perform a principal components or a principal coordinates analysis by extracting eigenvectors (factors) from a correlation or variance-covariance matrix.
bullet FACTOR - Performs the initial step (factor extraction) for a factor analysis of a correlation or a covariance matrix. The principal factor and maximum likelihood methods are included. The max diagonal, squared multiple correlation, and Jöreskog (1963) methods are available for initial communality estimation.
bullet FOURIER - Computes Fourier and elliptic Fourier transformations and their inverses. Can be used on both 2D and 3D outline curves.
bullet FOURPLOT - Plots outlines and their estimates based on Fourier coefficients.
bullet FREQ - Computes matrices of gene frequencies for input to the SIMINT or SIMGEND modules.
bullet FROTATE - Performs the orthogonal or oblique factor rotation step in a factor analysis. The function plane, primary product function plane, Harris-Kaiser independent cluster, Varimax, and Promax methods are included.
bullet FSCORES - Computes factor scores. The Anderson-Rubin, Bartlett, Least-squares, Regression, multigroup, and Thompson methods are included.
bullet MDSCALE - Nonmetric and linear multidimensional scaling analysis. This can be used as an alternative to PCA.
bullet MOD3D - Plots a 3-way scatter diagram as an interactive 3D perspective view of a model with n "objects" at tops of wires attached from a base plane. The view can be rotated interactively. This program is often used to view the results of a principal components or principal coordinates analysis.
bullet MST - Computes a minimum-length spanning tree from a similarity or dissimilarity matrix. This is useful for showing the nearest neighbors of objects based on their positions in a multidimensional space.
bullet MXCOMP - Compares two symmetric matrices by computing their matrix correlation and then plotting a scatter diagram. Can also compare two matrices with the effects of a third held constant (the Smouse, Long, Sokal test). The statistics for a 2-way Mantel test are also computed. It can be used to compute the goodness of fit of a cluster analysis to a dataset (by comparing a cophenetic value matrix with a dissimilarity matrix).
bullet MXPLOT - Plots 2-way scatter diagrams of rows or columns of a matrix.
bullet NJOIN - Computes Saitou and Nei's (1987) neighbor-joining method trees as estimated phylogenetic trees. Unweighted neighbor-joining clustering trees can also be computed. As in the SAHN clustering module, checks can be made for the effects of ties.
bullet OUTPUT - Formats matrices into pages for printing. Results can be pasted into most word processors. This formatted output is also useful for checking to make sure that an input file has been prepared in the correct format for NTSYSpc.
bullet PLOT - Plots one or more variables against another.
bullet POOLVC - Computes a pooled within-groups variance-covariance matrix from two or more data matrices. Can also perform a test for homogeneity of covariance matrices.
bullet PROCRUSTES - Performs a Procrustes superimposition or a generalized Procrustes analysis to compute an average configuration of points and to align configurations to the average. Useful for comparing ordinations and in geometric morphometrics. Analyses can be performed for data in two or more dimensions.
bullet PROCPLOT - Plots the results of a Procrustes analysis.
bullet PROJ - Projects a set of objects onto one or more vectors—or onto a space orthogonal to a set of vectors. In principal components analysis one will project standardized data onto the eigenvectors of the correlation matrix in order to see the best (in a least-squares sense) low-dimensional view of a data set. The orthogonal projection option can be used to implement Burnaby's (1966) method for size adjustment. Can also compute predictions using the results of a regression analysis.
bullet MULREGR - Performs regression, multivariate regression, multiple regression, and generalized least-squares regression.
bullet RESAMPLE - Creates samples using bootstrap, jackknife, random permutation, or random normal deviates.
bullet SAHN clustering - Performs the sequential, agglomerative, hierarchical, and nested clustering methods as defined by Sneath and Sokal (1973). These include the commonly used hierarchical clustering methods listed below. The program can find alternative trees when there are ties in the input matrix.
bullet complete-link (maximum method)
bullet single-link (minimum method)
bullet flexible clustering
bullet UPGMA (unweighted pair-group method)
bullet WPGMA (weighted pair-group method)
bullet WPGM using centroid clustering (either similarities or dissimilarities)
bullet WPGM using Spearman's average
bullet SIMGEND - Computes matrices of genetic distance coefficients from gene-frequency and DNA sequence data. The following coefficients can be selected.
bullet Cavalli-Sforza and Edwards (1967) arc distance
bullet Balakrishnan and Sanghvi (1968) distance.
bullet Cavalli-Sforza and Edwards (1967) chord distance.
bullet Hillis (1984) distance
bullet Swofford and Olsen's (1990) suggestion to unbias the distance by using the same correction as in Nei's distance.
bullet Nei's (1972) distance (default).
bullet Nei's (1978) unbiased distance. Formula as above but with denominator:
bullet Prevosti (Wright, 1978) distance.
bullet Rogers (1972) distance
bullet Rogers distance as modified by Wright (1978)
bullet Jukes and Cantor (1969) distance modified for DNA sequence data.
bullet SIMINT - Computes various similarity or dissimilarity coefficients for interval measure (continuous) data, such as correlation and distance coefficients. The following coefficients can be selected.
bullet Bray-Curtis distance
bullet Canberra metric
bullet Chi-squared distance
bullet Average taxonomic distance
bullet Squared average distance
bullet Euclidean distance
bullet Euclidean distance squared
bullet Manhattan distance
bullet Penrose's shape coefficient
bullet Penrose's size coefficient
bullet Product-moment correlation
bullet Cosine of angle
bullet Sample size
bullet Morisita (1959) index
bullet Horn's (1966) modification of Morisita index
bullet Renkonen (1938) similarity
bullet Variances and covariances
bullet Inner product
bullet SIMQUAL - Computes various association coefficients for qualitative data, that is, data with unordered states (e.g., the simple matching, Jaccard, and phi coefficients).
bullet Hamann (1961) coefficient
bullet Rogers and Tanimoto (1960) distance
bullet Simple matching coefficient
bullet Dice (1945) coefficient
bullet Jaccard (1908) coefficient
bullet Kulczynski (1927) coefficients 1 and 2
bullet Phi coefficient
bullet Russell and Rao (1940) coefficient
bullet Ochiai coefficient
bullet Yule (1911) coefficient
bullet also several unnamed coefficients from Sokal and Sneath (1961)
bullet SPLIT - Divides a matrix into two or more matrices.
bullet STAND - Performs a linear transformation of a data matrix so as to eliminate the effects of different scales of measurement. Several options are available for what is subtracted and what is used as the divisor.
bullet SUMMARY - Summarizes results of a resampling experiment (bootstrap, jackknife, etc.).
bullet SVD - Computes a singular-value decomposition of a rectangular matrix. It allows you to compute principal axes and projections in a single step.
bullet TPSWTS - Computes projections of the 2D or 3D coordinates of objects onto the principal warps of a thin-plate spline bending energy matrix (see Bookstein, 1991). This is done to enable a statistical analysis of the components of shape variation. Includes both 2D and 3D estimates of the uniform shape component.
bullet TRANSF - Performs various linear and non-linear transformations of the rows or columns of a matrix. Computes Bookstein shape coordinates (both scaled and unscaled). Can also be used to delete rows or columns and alter the form of storage of some matrices.
bullet TREE - Displays a tree (e.g., from a cluster analysis) as a phenogram or the results of the neighbor-joining method as a phylogenetic tree with branch lengths. Options are provided for scaling and scrolling through a tree interactively.
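
The DCENTER entry above describes double-centering a dissimilarity matrix before factoring it for a principal coordinates analysis. The following is a minimal sketch of that step in plain Python, assuming the standard Gower (1966) transformation (each dissimilarity d is first converted to -0.5 * d^2). The function name `double_center` and the example data are hypothetical, and the options NTSYSpc itself offers are not described here.

```python
def double_center(dist):
    """Double-center a square, symmetric dissimilarity matrix (list of lists).

    Assumes the Gower transformation a_ij = -0.5 * d_ij**2; this is an
    illustration, not a description of NTSYSpc's internal code.
    """
    n = len(dist)
    # Transform each dissimilarity before centering.
    a = [[-0.5 * d * d for d in row] for row in dist]
    row_means = [sum(row) / n for row in a]
    col_means = [sum(a[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    # Subtract row and column means, then add back the grand mean.
    return [
        [a[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


# Hypothetical example: three points on a line at positions 0, 3, and 4.
dist = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 1.0],
    [4.0, 1.0, 0.0],
]
b = double_center(dist)
```

Every row and column of the result sums to zero, and for Euclidean input the matrix equals the inner products of the mean-centered coordinates. Its eigenvectors, scaled by the square roots of the eigenvalues, give the principal coordinates.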
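
Several of the association coefficients listed under SIMQUAL are simple functions of a 2 x 2 table of matches and mismatches between two objects. The sketch below computes three of them for binary (0/1) data using their textbook definitions. The function names are hypothetical, and how NTSYSpc handles missing values or undefined cases is not covered.

```python
def match_counts(x, y):
    """Return (a, b, c, d): counts of 1-1, 1-0, 0-1, and 0-0 pairs."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


def simple_matching(x, y):
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)


def jaccard(x, y):
    a, b, c, _ = match_counts(x, y)
    # Undefined when neither object has any 1s; NaN is an assumption here.
    return a / (a + b + c) if (a + b + c) else float("nan")


def dice(x, y):
    a, b, c, _ = match_counts(x, y)
    return 2 * a / (2 * a + b + c) if (a + b + c) else float("nan")


# Hypothetical presence/absence data for two objects scored on five characters.
x = [1, 1, 0, 0, 1]
y = [1, 0, 0, 1, 1]
sm, jc, dc = simple_matching(x, y), jaccard(x, y), dice(x, y)
```

For this example the table is a = 2, b = 1, c = 1, d = 1, so the simple matching coefficient is 0.6, Jaccard is 0.5, and Dice is 2/3. The difference comes from whether joint absences (d) count as agreement.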
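
The MXCOMP entry above mentions the matrix correlation between two symmetric matrices and a Mantel test. As a rough illustration, the sketch below correlates the off-diagonal elements of two matrices and builds a permutation distribution by shuffling the rows and columns of one matrix together. The one-tailed test, the permutation count, the seed, and all function names are assumptions made for the example, not NTSYSpc's documented behavior.

```python
import random
from math import sqrt


def lower_triangle(m, order=None):
    """Below-diagonal elements, optionally with rows/columns reordered."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(1, n) for j in range(i)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / sqrt(sxx * syy)


def mantel(m1, m2, permutations=999, seed=0):
    """Return (matrix correlation, one-tailed permutation p-value)."""
    rng = random.Random(seed)
    x = lower_triangle(m1)
    observed = pearson(x, lower_triangle(m2))
    order = list(range(len(m2)))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)  # permute objects, i.e. rows and columns jointly
        if pearson(x, lower_triangle(m2, order)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Hypothetical data: distances among four points, and the same distances doubled.
m1 = [[0, 1, 3, 6], [1, 0, 2, 5], [3, 2, 0, 3], [6, 5, 3, 0]]
m2 = [[2 * v for v in row] for row in m1]
r, p = mantel(m1, m2)
```

Because the second matrix is a rescaled copy of the first, the matrix correlation here is 1 (up to rounding). Comparing a cophenetic value matrix with the dissimilarity matrix it was derived from follows the same pattern.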

Data size limitations

Most modules in NTSYSpc do not have explicit dimension limits for objects or variables. The practical limits are disk space and time. Larger amounts of RAM will speed up computations for very large datasets. With the capacity and power of modern PCs, a dataset with a few hundred samples or variables is considered small for most computations. However, the MDSCALE module must manipulate many matrices simultaneously and hence is more limited in the size of the matrix it can handle (512 variables is the maximum).




 

This file was last modified on 9 June 2023.