III: Small: Partitioning Big Data for High Performance Computation of Persistent Homology

posted on 31.01.2020 by Philip Wilsey
Persistent Homology (PH) is computationally expensive and cannot be applied directly to more than a few thousand data points. This project aims to develop mechanisms that allow PH to be computed on large, high-dimensional data sets. The proposed method will significantly reduce the run-time and memory requirements of the PH computation without significantly compromising the accuracy of the results.

This project explores techniques to map a large point cloud P to another point cloud P' with fewer total points such that the topological spaces characterized by P and P' are nearly equivalent. The mapping from P to P' can hide some of the smaller topological features when PH is computed on P'. Accurate PH results are restored by (a) upscaling the data for the identified large topological features, and (b) partitioning the data to run concurrent PH computations that locate the smaller topological features.
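
The reduction and partitioning steps can be illustrated with a minimal sketch. The abstract does not say how P is mapped to P', so the sketch assumes a plain k-means reduction in which the cluster centroids serve as P' and the cluster memberships serve as the partitions of P. The function names `reduce_point_cloud` and `partition_by_label`, the parameter values, and the noisy-circle test data are all hypothetical and chosen only for illustration. The PH computation itself is not shown.

```python
import numpy as np


def reduce_point_cloud(points, k, n_iter=20, seed=0):
    """Map P (n x d) to P' (k x d) with Lloyd's k-means.

    Assumption: k-means stands in for the mapping P -> P'; the source
    does not name a specific technique.
    Returns (centroids, labels), where labels[i] is the index of the
    centroid nearest to points[i].
    """
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct points drawn from P.
    idx = rng.choice(len(points), size=k, replace=False)
    centroids = points[idx].astype(float)

    def assign(cents):
        # Squared distance from every point to every centroid: shape (n, k).
        d2 = ((points[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)

    # Final assignment, consistent with the centroids that are returned.
    return centroids, assign(centroids)


def partition_by_label(points, labels, k):
    """Split P into k partitions, one per centroid of P'."""
    return [points[labels == j] for j in range(k)]


# Hypothetical test data: 2000 noisy samples of the unit circle.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, 2000)
P = np.column_stack([np.cos(theta), np.sin(theta)])
P = P + rng.normal(scale=0.02, size=P.shape)

P_reduced, labels = reduce_point_cloud(P, k=50)
partitions = partition_by_label(P, labels, k=50)
```

Under these assumptions, `P_reduced` is small enough to hand to a PH library to find the large features (here, the single loop of the circle), while each entry of `partitions` could be processed independently, and therefore concurrently, to look for smaller features. The upscaling step in (a) is not sketched because the abstract gives no detail on how it is performed.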