AI-based selection of superior proteomic biomarkers for classifying Alzheimer’s disease
Abstract
Background
Proteomic changes are a hallmark of Alzheimer’s disease (AD) and related disorders. Thousands of different proteins can be measured in clinical patients, but a succinct, universally available clinical AD protein diagnostic panel is still needed. The objective of the present study was to use artificial intelligence (AI) to identify the subset of proteins that best differentiates healthy controls, asymptomatic AD (AsymAD) cases, and AD cases.
Method
An AI-based protein selection approach was constructed using published clinical proteomic datasets containing more than 3300 proteins measured in brain tissue from the dorsolateral prefrontal cortex. Four published datasets (n = 419 patients) were used to construct and train the model, while two published datasets (n = 201 patients) were used to independently validate the model results. Machine learning classification with support vector machines and logistic regression was combined with recursive feature elimination to identify the most discriminative proteins for distinguishing among AD, Control, and AsymAD cases. The final protein subset was a small fraction of all the proteins measured in the data.
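The selection step described above can be illustrated with a minimal sketch. It assumes scikit-learn, a linear SVM kernel, default regularization, a 20-feature target, and a simplified binary (AD vs. Control) setting, none of which are specified in this abstract. The data are synthetic stand-ins generated by `make_classification`, not the study's proteomic measurements, and the way the two classifiers are combined here (intersecting their selections) is likewise an illustrative assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a (samples x proteins) abundance matrix.
# Assumption: 200 "proteins", 10 of them informative; NOT the study data.
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=10,
    n_redundant=0, n_classes=2, random_state=0,
)

# Hold out a test split to mimic independent validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize using training statistics only, to avoid leakage.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)


def select_features(estimator, n_keep=20, step=0.1):
    """Recursive feature elimination: repeatedly fit the estimator and
    drop the lowest-weighted 10% of the original features per round
    until n_keep features remain."""
    selector = RFE(estimator, n_features_to_select=n_keep, step=step)
    selector.fit(X_train_s, y_train)
    return selector


# Both estimators expose coef_, which RFE uses to rank features.
svm_selector = select_features(SVC(kernel="linear", C=1.0))
lr_selector = select_features(LogisticRegression(max_iter=2000))

# Column indices kept by each model, and those kept by both.
svm_idx = set(np.flatnonzero(svm_selector.support_))
lr_idx = set(np.flatnonzero(lr_selector.support_))
shared_idx = sorted(svm_idx & lr_idx)

# Held-out AUC of each final model on its reduced feature set.
svm_auc = roc_auc_score(y_test, svm_selector.decision_function(X_test_s))
lr_auc = roc_auc_score(y_test, lr_selector.predict_proba(X_test_s)[:, 1])

print(f"SVM kept {len(svm_idx)} features, held-out AUC = {svm_auc:.2f}")
print(f"LR  kept {len(lr_idx)} features, held-out AUC = {lr_auc:.2f}")
print(f"Features selected by both models: {len(shared_idx)}")
```

In a three-class setting (AD, Control, AsymAD), the same pattern could be applied with per-class one-vs-rest AUCs; the number of retained features would then be treated as a tuning parameter rather than fixed in advance.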
Result
The algorithm identified a primary subset of 29 important proteins, out of more than 3300, that can differentiate AD cases from control cases with high accuracy (AUC = 0.93). However, 88 of the proteins are necessary to differentiate AD from asymptomatic AD with high accuracy (AUC = 0.96 for AD; AUC = 0.89 for Control; AUC = 0.85 for AsymAD). Selected proteins were significantly enriched for metabolic function, namely glucose metabolism. Notably, amyloid precursor protein (APP), VGF, and RABEP1 are strong differentiators of AD compared to Control or AsymAD. For differentiating AsymAD from AD, however, superior predictive proteins include ALDH1A1, BDH2, C4A, FABP7, GABBR2, GNAI3, PBXIP1, and PRKAR1B.
Conclusion
Clinical cases can be successfully differentiated based on their measured protein biomarker levels. AI platforms can be leveraged to identify a subset of the most discriminative proteins for use in future AD clinical diagnostics and disease staging.