Multidimensional scaling

What is multidimensional scaling in machine learning?

Multidimensional scaling (MDS) is a family of mathematical models for analyzing and visualizing the distances between objects when the distance between each pair of objects is known.

It is a way to visualize the level of similarity between individual cases in a dataset. It translates “information about the pairwise 'distances' among a set of n objects or individuals” into a configuration of n points mapped into an abstract Cartesian space.

MDS refers to a set of related ordination techniques used in information visualization, especially for displaying the information contained in a distance matrix. It is a form of non-linear dimensionality reduction.

James O. Ramsay, who is considered by many to be the founder of functional data analysis, made many core theoretical contributions to multidimensional scaling.

Multidimensional scaling is based on similarity or dissimilarity data.

What is classical multidimensional scaling?

Classical multidimensional scaling (CMDS) is a method that represents the structure of distance-like data as a geometrical picture. CMDS is also known as principal coordinates analysis (PCoA), Torgerson scaling, or Torgerson–Gower scaling.

It takes an input matrix displaying dissimilarities between pairs of items and generates an output in the form of a coordinate matrix whose configuration minimizes a loss function known as strain.

This is how a classical MDS algorithm works:

It relies on the fact that the coordinate matrix $X$ can be derived by eigenvalue decomposition from $B = XX^{\top}$. The matrix $B$ can in turn be computed from the proximity matrix $D$ by double centering.

The steps involved are:

Set up the squared proximity matrix $D^{(2)} = [d_{ij}^{2}]$.
Apply double centering: $B = -\frac{1}{2} C D^{(2)} C$, using the centering matrix $C = I - \frac{1}{n} J_n$. Here, $n$ is the number of objects, $I$ is the $n \times n$ identity matrix, and $J_n$ is an $n \times n$ matrix of all ones.
Determine the $m$ largest eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ of $B$, along with the corresponding eigenvectors $e_1, e_2, \ldots, e_m$. Here, $m$ is the number of dimensions desired for the output.
Compute $X = E_m \Lambda_m^{1/2}$. Here, $E_m$ is the matrix of the $m$ eigenvectors and $\Lambda_m$ is the diagonal matrix of the $m$ eigenvalues of $B$.

Because classical multidimensional scaling assumes Euclidean distances, this procedure is not applicable to direct dissimilarity ratings.
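The classical MDS steps can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `classical_mds`, the unit-square example data, and the clipping of slightly negative eigenvalues are assumptions made for the sketch.

```python
import numpy as np

def classical_mds(D, m=2):
    """Embed n objects in m dimensions from an n x n distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    D2 = D ** 2                                # squared proximity matrix
    C = np.eye(n) - np.ones((n, n)) / n        # centering matrix C = I - (1/n) J_n
    B = -0.5 * C @ D2 @ C                      # double centering
    eigvals, eigvecs = np.linalg.eigh(B)       # B is symmetric; eigenvalues come back ascending
    top = np.argsort(eigvals)[::-1][:m]        # indices of the m largest eigenvalues
    lam = np.clip(eigvals[top], 0.0, None)     # assumption: clip tiny negative values to zero
    return eigvecs[:, top] * np.sqrt(lam)      # X = E_m Lambda_m^(1/2)

# Hypothetical example: distances between the corners of a unit square.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

X = classical_mds(D, m=2)
D_hat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
```

For Euclidean input such as this, the distances between the recovered points `D_hat` should match `D`, although the recovered coordinates may be rotated or reflected relative to the original points.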

What is metric multidimensional scaling (mMDS)?

mMDS is a superset of classical multidimensional scaling. It generalizes the optimization procedure to a variety of loss functions and to input matrices of known distances with weights, among other extensions.
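As one illustration of such a loss function, a commonly used choice is the weighted raw stress. The notation here is an assumption for this example: $\delta_{ij}$ is the given dissimilarity between objects $i$ and $j$, $w_{ij} \ge 0$ is its weight, and $x_i$ is the row of the coordinate matrix $X$ for object $i$.

```latex
\sigma(X) = \sum_{i<j} w_{ij}\,\bigl(d_{ij}(X) - \delta_{ij}\bigr)^{2},
\qquad
d_{ij}(X) = \lVert x_i - x_j \rVert_2
```

Setting $w_{ij} = 0$ drops a pair from the sum, which is one way to handle missing dissimilarities.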

What is non-metric multidimensional scaling (nMDS)?

Unlike metric multidimensional scaling, non-metric multidimensional scaling finds both a non-parametric monotonic relationship between the dissimilarities in the item-item matrix and the Euclidean distances between items, and the location of every item in the low-dimensional space.

The basic steps of an nMDS algorithm are:

Find a random configuration of points.
Calculate the distances d between the points.
Find the optimal monotonic transformation of the proximities to obtain optimally scaled data f(x).
Minimize the stress between the optimally scaled data and the distances by finding a new configuration of points.
Compare the stress to some criterion. If the stress is small enough, exit the algorithm; otherwise, return to the second step.
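The monotonic transformation in the third step is often computed with isotonic regression. The sketch below shows only that step, using the pool-adjacent-violators idea in plain Python. The function name, the variable names, and the five example pairs are hypothetical, and ties between dissimilarities are not given any special treatment.

```python
def pool_adjacent_violators(values):
    """Return the non-decreasing sequence closest to `values` in the least-squares sense."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every member of a block gets the block mean
    return fitted

# Hypothetical input: one entry per pair of items.
dissimilarities = [1.0, 2.0, 3.0, 4.0, 5.0]   # observed proximities
distances = [0.5, 1.5, 1.0, 2.0, 3.0]         # distances in the current configuration

# Order pairs by dissimilarity, fit a monotone sequence to the distances,
# then map the fitted values back to the original pair order.
order = sorted(range(len(dissimilarities)), key=lambda k: dissimilarities[k])
fitted_sorted = pool_adjacent_violators([distances[k] for k in order])
disparities = [0.0] * len(distances)
for rank, k in enumerate(order):
    disparities[k] = fitted_sorted[rank]
```

Here the second and third distances violate the ordering implied by the dissimilarities, so both are replaced by their mean. The resulting `disparities` play the role of the optimally scaled data in the next step.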

What is generalized multidimensional scaling (GMDS)?

This is an extension of metric multidimensional scaling. In GMDS, an arbitrary smooth non-Euclidean space becomes the target space. Generalized multidimensional scaling allows you to find the minimum-distortion embedding of one surface into another when the dissimilarities are distances on a surface and the target space is another surface.

What is stress in multidimensional scaling?

Stress is the measure of goodness-of-fit in multidimensional scaling. It is based on the differences between the predicted and the actual distances.

It measures the difference between the observed (dis)similarity matrix and the matrix estimated from one or more estimated stimulus dimensions.

Lower stress indicates a better fit.
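One common formulation is Kruskal's stress-1, which normalizes the squared differences by the squared embedded distances. The sketch below assumes that formulation; the function name and the three example pairs are hypothetical.

```python
import math

def kruskal_stress_1(targets, distances):
    """Stress-1: sqrt( sum (d - d_hat)^2 / sum d^2 ), taken over all pairs.

    `targets` holds the observed (or optimally scaled) dissimilarities d_hat,
    `distances` holds the distances d in the low-dimensional configuration.
    """
    numerator = sum((d - d_hat) ** 2 for d_hat, d in zip(targets, distances))
    denominator = sum(d ** 2 for d in distances)
    return math.sqrt(numerator / denominator)

observed = [1.0, 2.0, 3.0]                                 # hypothetical dissimilarities
perfect = kruskal_stress_1(observed, [1.0, 2.0, 3.0])      # embedding reproduces them exactly
worse = kruskal_stress_1(observed, [1.5, 1.5, 3.5])        # embedding is off by 0.5 per pair
```

A configuration that reproduces the dissimilarities exactly has a stress of zero, and the value grows as the embedded distances drift from their targets.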

What is the purpose of multidimensional scaling?

The key purpose of multidimensional scaling is to map the relative location of objects using data that show how the objects differ. Torgerson undertook seminal work on this method.

What is multidimensional scaling used for?

Multidimensional scaling (MDS) is used to represent high-dimensional data in a low-dimensional space while preserving the similarities between data points. This helps in analyzing and revealing the structure hidden in the data.

It can be used on noisy data to reduce the effect of the noise on the embedded structure.

MDS can also lower information retrieval complexity in large datasets. It has many applications in data mining, gene network research, genomics research, and other domains.

How is multidimensional scaling calculated?

Here are the basic steps for calculating a multidimensional scaling solution:

Start by assigning the points to initial coordinates in n-dimensional space.
Next, calculate the Euclidean distances for all pairs of points. In two dimensions, this follows from the Pythagorean theorem ($a^2 + b^2 = c^2$); in n-dimensional space, the distance is the square root of the sum of the squared coordinate differences. This gives the distance matrix of the current configuration.
Then compare this distance matrix with the initial input matrix by evaluating the stress function.
Finally, adjust the coordinates, if need be, to minimize stress.
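The loop described by these steps can be sketched with plain gradient descent on the raw stress (the sum of squared differences between embedded and input distances). This is one of several possible optimizers and not necessarily the one a given library uses. The function names, learning rate, iteration count, random seed, and unit-square input are all assumptions chosen for illustration.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def raw_stress(target, X):
    """Sum of squared differences between embedded and target distances over pairs i < j."""
    upper = np.triu_indices(len(X), k=1)
    return float(((pairwise_distances(X)[upper] - target[upper]) ** 2).sum())

def fit_mds(target, n_components=2, n_iter=1000, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(target.shape[0], n_components))    # step 1: initial coordinates
    start = X.copy()
    for _ in range(n_iter):
        diff = X[:, None, :] - X[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))             # step 2: current distances
        np.fill_diagonal(dist, 1.0)                          # avoid dividing by zero on the diagonal
        coef = (dist - target) / dist                        # step 3: mismatch for each pair
        np.fill_diagonal(coef, 0.0)
        grad = 2.0 * (coef[:, :, None] * diff).sum(axis=1)   # gradient of the raw stress
        X = X - lr * grad                                    # step 4: adjust the coordinates
    return start, X

# Hypothetical input: distances between the corners of a unit square.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
target = pairwise_distances(corners)

X_start, X_fit = fit_mds(target)
stress_before = raw_stress(target, X_start)
stress_after = raw_stress(target, X_fit)
```

Comparing `stress_before` with `stress_after` shows how much the adjustment reduced the mismatch. Gradient descent can stop in a local minimum, so practical implementations often restart from several initial configurations.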

What is the difference between MDS and PCA?

Classical (Torgerson) metric MDS transforms the distances into similarities and then, in effect, performs PCA on those similarities. So PCA, or PCoA (principal coordinates analysis), can technically be called the algorithm for the simplest form of multidimensional scaling.
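A small numerical check can make this relationship concrete. The sketch below assumes the input distances are Euclidean distances computed from raw data; the random data set and all variable names are hypothetical. It compares PCA scores with classical MDS coordinates computed from the distances alone.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(6, 3))                 # hypothetical data: 6 points in 3 dimensions
centered = data - data.mean(axis=0)

# PCA scores on the first two components, via SVD of the centered data.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
pca_scores = U[:, :2] * S[:2]

# Classical MDS using only the squared Euclidean distances between the points.
D2 = ((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
n = len(data)
C = np.eye(n) - np.ones((n, n)) / n            # centering matrix
B = -0.5 * C @ D2 @ C                          # double centering gives the similarity matrix
eigvals, eigvecs = np.linalg.eigh(B)
top = np.argsort(eigvals)[::-1][:2]            # two largest eigenvalues
mds_coords = eigvecs[:, top] * np.sqrt(eigvals[top])

# Each axis may be flipped in sign, so compare absolute values.
same_up_to_sign = np.allclose(np.abs(pca_scores), np.abs(mds_coords))
```

With Euclidean input, the two sets of coordinates agree up to the sign of each axis. With other dissimilarities, classical MDS still runs, but there is no underlying data matrix for PCA to operate on.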

nMDS is based on iterative algorithms such as ALSCAL or PROXSCAL. It is more versatile as a mapping technique than PCA.

PCA is simply a method, while MDS is an entire class of analysis.