The Analysis of Dynamical Diseases by Optimal Transportation Distances

by Michael Muskulus and Sjoerd Verduyn-Lunel

Diseases influence the dynamics of normal physiological processes, and by analysing measurements of these processes it is possible to detect and diagnose diseases accurately and automatically. Diseases with similar symptoms are sometimes incorrectly diagnosed and treated, and methods such as ours can help prevent this. The method is based on calculating abstract distances between time series of measurements.

Optimal transportation problems arise in a variety of ways in everyday life. The problem was originally formulated by Kantorovich as that of optimally transporting manufactured goods from their suppliers to some markets. It can also be studied in its measure-theoretic version, where it corresponds to the problem of optimally moving a distribution of sand and rocks such that the surface flattens out. When the distributions considered are instead given by the time averages of two dynamical systems, the optimal cost is an abstract measure of the distance between their long-term dynamical behaviour. These time averages are computed from time series by way of a delay vector reconstruction, as is common in nonlinear time series analysis.

When computing the optimal cost, which is called the Wasserstein distance in the mathematical literature, the cost per unit of (probability) mass moved is proportional to the distance travelled. Numerically, the continuous problem is approximated by a discrete transportation problem, for which general-purpose polynomial-time algorithms exist. One direction for future research is the search for more efficient algorithms that make use of the special structure of the problem, i.e. the fact that the costs fulfil the triangle inequality. At the moment, the problem is further approximated by bootstrapping smaller subproblems from it, as the computations are otherwise too time-consuming.
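The discrete approximation described above can be illustrated with a small sketch. It is not the authors' R package: the function names (`delay_embed`, `assignment_cost`, `wasserstein_distance`) and the parameters `dim` and `lag` are hypothetical, and the Hungarian method is used here simply as one general-purpose polynomial-time solver. The sketch assumes two time series of equal length, so that both reconstructions contain the same number of delay vectors, each carrying the same probability mass. Under that assumption the discrete transportation problem has an optimal solution that is a one-to-one matching, and the distance is the average Euclidean distance over the best matching.

```python
import math


def delay_embed(series, dim=2, lag=1):
    """Delay vectors (x[t], x[t+lag], ..., x[t+(dim-1)*lag]) of a series."""
    n = len(series) - (dim - 1) * lag
    return [tuple(series[t + k * lag] for k in range(dim)) for t in range(n)]


def assignment_cost(cost):
    """Minimum total cost of a perfect matching for a square cost matrix.

    Hungarian method with potentials, O(n^3); indices are 1-based internally.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (n + 1)      # column potentials
    match = [0] * (n + 1)    # match[j] = row assigned to column j (0 = none)
    way = [0] * (n + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta, j1 = inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the assignments along the augmenting path that was found.
        while j0 != 0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    return sum(cost[match[j] - 1][j - 1] for j in range(1, n + 1))


def wasserstein_distance(series_a, series_b, dim=2, lag=1):
    """Transportation distance between two delay-vector reconstructions.

    Assumes equal-length series and uniform mass 1/n on each delay vector;
    the cost of moving mass is the Euclidean distance travelled.
    """
    xs = delay_embed(series_a, dim, lag)
    ys = delay_embed(series_b, dim, lag)
    if len(xs) != len(ys):
        raise ValueError("this sketch requires series of equal length")
    cost = [[math.dist(x, y) for y in ys] for x in xs]
    return assignment_cost(cost) / len(xs)


if __name__ == "__main__":
    a = [math.sin(0.3 * t) for t in range(30)]
    b = [x + 0.5 for x in a]           # same dynamics, shifted baseline
    print(wasserstein_distance(a, b))  # a pure shift costs 0.5 * sqrt(dim)
```

In the example, the second series differs from the first only by a constant offset, so every delay vector is translated by the same amount and the distance equals the length of that translation. The cubic running time of such a general-purpose solver also illustrates why bootstrapping smaller subproblems is attractive for long time series.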
Residual Wasserstein Distances

We defined the residual Wasserstein distances as optimal transportation distances for which an initial translation and relative scaling of the two time averages incur no cost. The resulting distances are scale-invariant and can be computed efficiently by an iterative majorization method, combined with the auction algorithm due to Bertsekas. The latter starts from already computed solutions and relaxes them to optimal solutions of similar transportation problems. A software package for the statistical computing environment R has been developed which implements this method, and is available from the author's web page.

Dynamical Diseases

Application to Breathing Data

Figure 1 shows some of the original time series and two-dimensional representations of the computed distances. The distances shown were obtained by residual Wasserstein distances (panel E) and by two alternative methods: distances based on the means of the two parameters (panel D) and standard Euclidean distances between the time series (panel C).

Figure 1: Time series of four randomly chosen patients with either asthma (A) or COPD (B), where the upper curve in each panel shows resistance at 8 Hz and the lower curve elasticity at 8 Hz. Time is given in samples. The distances between the time series of all 25 patients are shown in a two-dimensional representation: the upper panel (C) shows Euclidean distances, the middle panel (D) shows distances in mean, and the lower panel (E) shows residual Wasserstein distances.

Conclusions