Mathematical Modeling of Epigenetic Regulation: Correlating DNA Methylation and Gene Expression in Breast Cancer Analysis

Abstract:

This study explores the computational integration of multi-omics data to analyze the regulatory mechanisms of gene expression in cancer. Specifically, it focuses on the correlation between DNA methylation in promoter regions and transcriptomic expression levels. A mathematical model is developed that represents each patient as a data vector of paired high-dimensional methylation and expression values. Within this model, a formal aggregation function is defined to reduce noise from individual CpG probes, yielding a metric of regulatory strength. The model is validated in a case study of the BRCA1 gene in the TCGA-BRCA cohort (n = 498). The analysis reveals a statistically significant negative correlation (r = −0.32, p < 0.001), supporting the hypothesis of epigenetic silencing. This framework provides a standardized approach for bioinformatics pipelines that aim to identify epigenetically regulated tumor suppressor genes.
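
As a rough illustration of the aggregation and correlation steps described in the abstract, the following Python sketch may help. It rests on assumptions not stated above: the aggregation function is taken to be the per-patient mean of promoter CpG beta values, the correlation is taken to be Pearson's r, and the data are synthetic rather than drawn from TCGA-BRCA. The function names are hypothetical and chosen only for this example.

```python
import numpy as np


def aggregate_promoter_methylation(beta):
    """Collapse a (patients x probes) matrix of beta values to one value per patient.

    Assumption: the aggregation is a plain mean over promoter CpG probes,
    skipping missing (NaN) probe values. The study's actual function may differ.
    """
    beta = np.asarray(beta, dtype=float)
    return np.nanmean(beta, axis=1)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic cohort (NOT real TCGA data): a latent methylation level per patient,
# observed through several noisy probes, with expression decreasing as it rises.
rng = np.random.default_rng(0)
n_patients, n_probes = 200, 8
latent = rng.uniform(0.1, 0.9, size=n_patients)
probe_noise = rng.normal(0.0, 0.1, size=(n_patients, n_probes))
beta = np.clip(latent[:, None] + probe_noise, 0.0, 1.0)
expression = 10.0 - 4.0 * latent + rng.normal(0.0, 1.0, size=n_patients)

methylation_score = aggregate_promoter_methylation(beta)
r = pearson_r(methylation_score, expression)
print(f"Pearson r between aggregated methylation and expression: {r:.2f}")
```

Averaging across probes is only one plausible choice; a median or a weighted mean would fit the same interface. A significance test (for example `scipy.stats.pearsonr`) could replace `pearson_r` if a p-value is needed.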