Volume 17, Issue S5 e058741
BIOMARKERS

AI-based selection of superior proteomic biomarkers for classifying Alzheimer’s disease

Raghav Tandon

Georgia Institute of Technology, Atlanta, GA, USA

Correspondence

Raghav Tandon, Georgia Institute of Technology, Atlanta, GA, USA.

Email: [email protected]

Nicholas T. Seyfried

Emory University School of Medicine, Atlanta, GA, USA

Cassie S Mitchell

Georgia Institute of Technology, Atlanta, GA, USA

Emory University School of Medicine, Atlanta, GA, USA

First published: 31 December 2021

Abstract

Background

Proteomic changes are a hallmark of Alzheimer’s disease (AD) and related disorders. Thousands of different proteins can be measured in clinical patients, but a succinct, universally available clinical protein panel for diagnosing AD is still needed. The objective of the present study was to use artificial intelligence (AI) to determine the subset of proteins that best differentiates healthy controls, asymptomatic AD (AsymAD) cases, and AD cases.

Method

An AI-based protein selection approach was constructed using published clinical proteomic datasets, which contained 3300+ measured proteins from brain tissue of the dorsolateral prefrontal cortex. Four published datasets (n = 419 patients) were used to construct and train the model, while two published datasets (n = 201 patients) were used to independently validate the model results. Machine learning classification with support vector machines and logistic regression was combined with recursive feature elimination to identify the most discriminative proteins, those that distinguish between AD, Control, and AsymAD cases. The final protein subset was a small fraction of all the proteins measured in the data.
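The general idea of recursive feature elimination can be illustrated with a minimal, dependency-free Python sketch. It is not the study's pipeline: the toy data, the gradient-descent logistic regression, the hyperparameters, and the choice to rank features by absolute weight are all assumptions made for illustration, and the helper names (`fit_logistic`, `recursive_feature_elimination`, `make_toy_data`) are hypothetical.

```python
import math
import random


def fit_logistic(X, y, epochs=200, lr=0.1):
    """Fit binary logistic regression by batch gradient descent (illustrative only)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def recursive_feature_elimination(X, y, n_keep):
    """Refit repeatedly, dropping the feature with the smallest |weight| each round."""
    active = list(range(len(X[0])))  # indices of features still in the model
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = fit_logistic(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        active.pop(weakest)
    return active


def make_toy_data(n_samples=120, n_features=8, informative=(0, 3), seed=0):
    """Synthetic stand-in for protein abundances: only `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 if label else -2.0  # shift informative features by class
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = recursive_feature_elimination(X, y, n_keep=2)
print(sorted(selected))
```

On real proteomic data one would more likely rely on an established library implementation and standardize the features first, since ranking by weight magnitude is sensitive to feature scale.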

Result

The algorithm identified a primary subset of 29 important proteins out of the 3300+ that can differentiate AD cases from Control cases with high accuracy (AUC = 0.93). However, 88 of the 3300+ proteins were necessary to differentiate AD from asymptomatic AD (AsymAD) with high accuracy (AUC = 0.96 for AD; AUC = 0.89 for Control; AUC = 0.85 for AsymAD). The selected proteins were significantly enriched for metabolic function, namely glucose metabolism. Notably, amyloid precursor protein (APP), VGF, and RABEP1 are strong differentiators of AD compared to Control or AsymAD. To differentiate AsymAD from AD, however, examples of superior predictive proteins include ALDH1A1, BDH2, C4A, FABP7, GABBR2, GNAI3, PBXIP1, and PRKAR1B.
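Per-class AUC values such as those reported above are commonly obtained by scoring each class against the other two. The abstract does not state how its AUCs were computed, so the following Python sketch simply assumes a one-vs-rest scheme; the scores are invented toy numbers, not study data, and the function names are hypothetical.

```python
def binary_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(class_scores, true_labels):
    """Compute one AUC per class by treating that class as positive and all others as negative."""
    return {
        cls: binary_auc(scores, [1 if t == cls else 0 for t in true_labels])
        for cls, scores in class_scores.items()
    }


# Toy example: six cases, with a classifier score for each class per case.
true_labels = ["AD", "AD", "Control", "Control", "AsymAD", "AsymAD"]
class_scores = {
    "AD":      [0.8, 0.6, 0.1, 0.2, 0.3, 0.7],
    "Control": [0.1, 0.2, 0.7, 0.6, 0.3, 0.1],
    "AsymAD":  [0.1, 0.2, 0.2, 0.2, 0.4, 0.2],
}
per_class_auc = one_vs_rest_auc(class_scores, true_labels)
print(per_class_auc)
```

The pairwise-comparison form is quadratic in the number of cases, which is fine for a few hundred patients; a rank-based formulation would scale better.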

Conclusion

Clinical cases can be successfully differentiated based on their measured biomarker levels. AI platforms can be leveraged to identify a subset of the most discriminative proteins useful for future AD clinical diagnostics and disease staging.