Projects

A diagram illustrating a process where phages are mixed with bacterial cultures and controls in a microplate, followed by incubation at 37°C for 24 hours. The microplate wells are then analyzed after 10 passages to produce trained phages. The process involves dilutions, filtration, pooled lysate, and the use of control groups, with effect visualization through color changes in the wells.

Causal Mediation Analysis of Carbohydrate Intake and Insulin on Postprandial Glycemia

Mobile health (mHealth), which uses digital devices to collect real-time data and deliver health care interventions, has emerged as an accessible tool for supporting various health behaviors, including continuous glucose monitoring, physical activity tracking, and mental health management. Careful analysis of mHealth data has the potential to provide improved and individualized care for patients. However, the high frequency, longitudinal scope, and multi-modal nature of mHealth data introduce unique methodological challenges that complicate analysis and the extraction of actionable insights. This research aims to integrate statistical theory with deep learning techniques to address these challenges and ultimately provide practitioners with precise and interpretable results.

Scatter plot titled 'Gene Detection: Joint vs Univariate, α=0.05'. It shows gene names like Vgf, Cx3cr1, Fermt2, Rhoc, Ngrm, Grin3a, Bin1, Tnf, and Cxcl2 among data points. Data points are colored based on significance: yellow for joint-only, gray for none, red for T1+Joint, and teal for T2+Joint. The axes are labeled 't-stat 1' (x-axis) and 't-stat 2' (y-axis).

Machine Learning for Multimodal Modeling and Early Detection of Alzheimer’s Disease

This research aims to develop interpretable and generalizable machine learning models that integrate neuroimaging, biomarker, and cognitive data to enhance the early detection and understanding of Alzheimer’s disease (AD). The first component focuses on modeling redundancy in diffusion-weighted imaging (DWI) metrics by utilizing statistical and unsupervised learning techniques to identify an optimized set of diffusion metrics. A second project expands early detection efforts by constructing a temporal event-based model to identify the order of decline across multiple cognitive domains during AD progression. Finally, a third project aims to predict regional tau PET burden using more accessible data sources such as structural MRI, DWI, and plasma biomarkers.

Overcoming Antibiotic Resistance through Data-Driven Phage Cocktail Design for Evolution-Proof Therapies

This research addresses the antimicrobial resistance crisis through the design of advanced therapeutic phage cocktails. We combine statistical modeling with directed evolution to design and assess phage cocktails with broad host ranges and a minimized likelihood of bacterial resistance. By integrating genetic changes in bacteria and phages with phenotypic outcomes like fitness, host range, and resistance suppression into a framework called CAPE (Cocktail Analysis and Predicted Evolution), this research aims to transform phage cocktail design from trial-and-error into a rigorous, evolution-informed process. Ultimately, the goal is to bring phage therapy into the 21st century and use nature’s perfect predator to tackle the antibiotic resistance crisis head-on.

Diagram of a person wearing an insulin pump and continuous glucose monitor on their abdomen. The graph shows blood sugar levels throughout the day, with short-acting insulin (Bolus) peaks at mealtime and long-acting insulin (Basal) levels. Below, a timeline depicts frequent physiological data collection every minute, CGM data every five minutes, and occasional insulin boluses indicated by syringe icons, illustrating insulin management for blood glucose control.

Joint False Discovery Rate Control for Correlated High-Dimensional Genomic Data

This research focuses on developing and applying statistical methods to improve inference from high-dimensional biological data. The goal is to enable more powerful multiple testing procedures in genomics, with an overarching aim of creating robust methods that account for dependencies across data sources. In large-scale genomics studies, thousands of hypotheses are typically tested simultaneously, often across related conditions. Standard false discovery rate (FDR) procedures treat each condition independently, which can reduce power and obscure true signals. This research develops a joint FDR framework that explicitly models correlation between test statistics across conditions. Simulations and applications to mouse astrocyte gene expression data show that the joint approach improves sensitivity while maintaining FDR control, uncovering significant genes missed by univariate methods.
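For readers unfamiliar with the univariate baseline, the following is a minimal sketch of the standard Benjamini-Hochberg step-up procedure applied to each condition separately. It illustrates only the per-condition approach that the joint framework is compared against, not the joint method itself. The function name `benjamini_hochberg` and the toy p-values are illustrative assumptions, not part of the project.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean rejection mask using the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices that sort p ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # i/m * alpha for rank i
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Step-up: find the largest rank that passes, reject everything up to it.
        k = np.nonzero(passed)[0].max()
        reject[order[: k + 1]] = True
    return reject


# Hypothetical p-values for the same six genes under two related conditions.
p_cond1 = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
p_cond2 = [0.012, 0.030, 0.002, 0.60, 0.045, 0.90]

# Each condition is tested on its own; correlation between them is ignored.
reject1 = benjamini_hochberg(p_cond1)
reject2 = benjamini_hochberg(p_cond2)
print(reject1)
print(reject2)
```

In this toy example the two conditions are corrected independently, so a gene with moderate evidence in both conditions gains nothing from that agreement. That lost information is what a joint procedure would aim to recover.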

Diagram showing PET brain imaging process with MRI scanner and test tube, illustrating analysis of brain scans for Tau protein levels, indicating neocortical Tau positive, MTL Tau positive, or Tau negative results.