Topological and Geometric Modeling and Computation of Structures and Functions in Single-Cell Omics Data

PI: Dr. Zixuan Cang (Assistant Professor of Mathematics, NCSU)

Support: National Science Foundation (NSF)

Period of Performance: September 1, 2022 – August 31, 2025

Budget: $374,000

Summary: Many single-cell data analysis tools rely on reduced-dimension structural representations, and their results can be sensitive to the particular low-dimensional representation used. A systematic exploration of structural representations is therefore needed to control the reliability and interpretability of downstream analyses. This project will develop methods based on applied topology and geometry that extract low-dimensional structural characteristics from high-dimensional single-cell omics data by scanning a wide range of scales and parameters. These methods will be tailored to single-cell omics data analysis; examples include local topological fingerprints and topology-guided optimal transport. An atlas of structural representations of a single-cell dataset, equipped with well-defined metrics that quantify the differences between structures, will be assembled to provide a systematic way of representing the structure of single-cell omics data. A generally applicable pipeline for applying downstream analysis tools to this structure atlas will be introduced and evaluated in a variety of application cases.

The systematic structural analysis will be combined with machine learning to address two further questions: (1) establishing structure-function relationships in single-cell datasets, such as identifying transition cells from their local structure in the dataset, and (2) integrating single-cell multi-omics datasets based on topological and geometric characterizations, especially datasets without shared features. Efficient, stable, and accurate numerical methods and algorithms will be developed for the mathematical questions raised by these biological applications. The resulting tools will be implemented so that both computational and experimental scientists can use them easily.
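To make the idea of scanning a range of scales concrete, the following is a minimal sketch and not the project's method. It counts the connected components of a neighborhood graph at several distance thresholds, which is the simplest topological summary of how apparent structure depends on scale. The function names (`count_components`, `scan_scales`) and the toy two-dimensional "cells" are assumptions made for illustration only; real single-cell data would be high-dimensional and would call for dedicated persistent homology software.

```python
import math


def count_components(points, scale):
    """Count connected components of the graph that links every pair of
    points whose Euclidean distance is at most `scale` (union-find)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = len(points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= scale:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j  # merge the two components
                    components -= 1
    return components


def scan_scales(points, scales):
    """Return (scale, number of components) for each scale in `scales`."""
    return [(s, count_components(points, s)) for s in scales]


# Hypothetical toy data: two well-separated groups of "cells" in 2D.
cells = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
profile = scan_scales(cells, [0.05, 0.2, 10.0])
print(profile)
```

In this toy case the number of components drops from five isolated points to two groups and finally to a single component as the scale grows, which illustrates why a conclusion drawn at one fixed scale may not hold at another.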