Volume 20, Issue 7 pp. 4970-4984
RESEARCH ARTICLE
Open Access

In vivo validation of late-onset Alzheimer's disease genetic risk factors

Michael Sasner
The Jackson Laboratory, Bar Harbor, Maine, USA

Christoph Preuss
The Jackson Laboratory, Bar Harbor, Maine, USA

Ravi S. Pandey
The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA

Asli Uyar
The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA

Dylan Garceau
The Jackson Laboratory, Bar Harbor, Maine, USA

Kevin P. Kotredes
The Jackson Laboratory, Bar Harbor, Maine, USA

Harriet Williams
The Jackson Laboratory, Bar Harbor, Maine, USA

Adrian L. Oblak
Stark Neurosciences Research Institute, School of Medicine, Indiana University, Indianapolis, Indiana, USA

Peter Bor-Chian Lin
Stark Neurosciences Research Institute, School of Medicine, Indiana University, Indianapolis, Indiana, USA

Bridget Perkins
Stark Neurosciences Research Institute, School of Medicine, Indiana University, Indianapolis, Indiana, USA

Disha Soni
Stark Neurosciences Research Institute, School of Medicine, Indiana University, Indianapolis, Indiana, USA

Cindy Ingraham
Stark Neurosciences Research Institute, School of Medicine, Indiana University, Indianapolis, Indiana, USA

Audrey Lee-Gosselin
Stark Neurosciences Research Institute, School of Medicine, Indiana University, Indianapolis, Indiana, USA

Bruce T. Lamb
Stark Neurosciences Research Institute, School of Medicine, Indiana University, Indianapolis, Indiana, USA

Gareth R. Howell
The Jackson Laboratory, Bar Harbor, Maine, USA

Gregory W. Carter (Corresponding Author)
The Jackson Laboratory, Bar Harbor, Maine, USA
The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA

Correspondence
Gregory W. Carter, The Jackson Laboratory, 600 Main St, Bar Harbor, ME 04609, USA.
Email: [email protected]

First published: 30 April 2024

Abstract

INTRODUCTION

Genome-wide association studies have identified over 70 genetic loci associated with late-onset Alzheimer's disease (LOAD), but few candidate polymorphisms have been functionally assessed for disease relevance and mechanism of action.

METHODS

Candidate genetic risk variants were informatically prioritized and individually engineered into a LOAD-sensitized mouse model that carries the AD risk variants APOE ε4/ε4 and Trem2*R47H. The potential disease relevance of each model was assessed by comparing brain transcriptomes measured with the NanoString Mouse AD Panel at 4 and 12 months of age with human study cohorts.

RESULTS

We created new models for 11 coding and loss-of-function risk variants. Transcriptomic effects from multiple genetic variants recapitulated a variety of human gene expression patterns observed in LOAD study cohorts. Specific models matched to emerging molecular LOAD subtypes.

DISCUSSION

These results provide an initial functionalization of 11 candidate risk variants and identify potential preclinical models for testing targeted therapeutics.

Highlights

  • A novel approach to validate genetic risk factors for late-onset AD (LOAD) is presented.
  • LOAD risk variants were knocked in to conserved mouse loci.
  • Variant effects were assayed by transcriptional analysis.
  • Risk variants in Abca7, Mthfr, Plcg2, and Sorl1 loci modeled molecular signatures of clinical disease.
  • This approach should generate more translationally relevant animal models.

1 BACKGROUND

Alzheimer's disease (AD) is the most common cause of dementia, with a growing clinical, financial, and social impact. An increasing body of evidence highlights the importance of genetic risk in AD.1-3 While a small percentage of AD cases are linked to causative, familial mutations in the amyloid precursor protein (APP) processing pathway, the vast majority of cases are late-onset AD (LOAD), have heterogeneous symptoms and etiology, and are associated with polygenic risk from a combination of low-risk, relatively common variants.4-6 Genome-wide association studies (GWAS) have identified numerous LOAD risk variants, but few have been experimentally validated, and physiological mechanisms have not been elucidated, even for the single strongest risk variant, the ε4 allele of the APOE gene.4, 7 This is but one example8 of the general problem of how to progress from identifying genetic variants, to determining their functional impact, to understanding physiological disease mechanisms.9 Here we present a novel in vivo approach to assay the impact of individual polygenic risk factors.

While numerous potential therapeutics have shown promising results in transgenic mouse models of familial AD, few have advanced in clinical trials. This may have many causes, but one likely reason is the lack of translationally relevant animal models available for preclinical studies.10-12 Almost all existing rodent models are based on causative mutations in proteins in the APP processing pathway expressed in neurons. Most AD genetic risk resides in genes mainly expressed in microglia and other non-neuronal cell types, as recently reviewed,5, 13, 14 indicating that complex cellular interactions play a causative role in disease etiology. While in vitro systems have been shown to have value, more relevant in vivo models are necessary to understand these cell–cell interactions.15 In particular, animal models are required to study the early and progressive stages of pathology, which are not accessible in clinical studies but are critical to understand disease mechanisms so as to better target novel therapeutic approaches.

The Model Organism Development and Evaluation for Late-onset Alzheimer's Disease (MODEL-AD) Consortium was established to create and characterize translationally relevant mouse models of LOAD and to set up protocols for preclinical testing in these new models.16 In this study we provide an overview of novel mouse models expressing human risk variants. Variants were introduced using a knock-in approach to avoid known issues with transgenic models.11, 17-19 To potentially enhance disease-relevant outcomes, variants were created on a more LOAD-susceptible genetic background expressing humanized APOE with the ε4 variant and the R47H mutation in Trem2, two of the strongest genetic risk factors for LOAD.20 The effects of each variant were assessed by gene expression changes in aging male and female brains using a newly developed transcriptomics panel,21 representing key LOAD-associated changes in clinical AD samples.22 This allowed us to functionalize GWAS variants with small but significant increases in disease risk and avoided a reliance on amyloid deposition or cognitive assays, which have not proven to translate to clinical studies.

2 METHODS

2.1 Late-onset AD risk variant prioritization

Prioritization and construction of the APOE and TREM2 variants in the LOAD1 strain were previously discussed.20 Late-onset variants were selected based on human genetic association, predicted pathogenicity, conservation with mouse homolog, and allele frequency. We further prioritized based on diversity in predicted function to maximize our exploration of potential LOAD biology. Determining specific variants was primarily limited by the rarity of strong coding candidates (eg, non-synonymous, stop-gain) and strict mouse sequence homology that required the same single nucleotide polymorphism (SNP) be engineered into mice. This led to a mix of variants at high-confidence GWAS loci, functional candidates, and exploratory variants. Exome sequencing from the Alzheimer's Disease Sequencing Project (ADSP) was initially used to identify specific variants at loci,23 buttressed by summary data at the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) (https://www.niagads.org/genomics/app). All variants are annotated as “ADSP Variants” that passed NIAGADS quality control checks (https://www.niagads.org/genomics/app).

ABCA7*A1527G (rs3752246) is the most common of multiple predicted loss-of-function variants associated with increased LOAD risk at the ABCA7 locus.24, 25 The SORL1*A528T (rs2298813) variant is among candidates in the SORL1 gene and is likely involved in retromer function26; deficits in retromer-dependent endosomal recycling have been implicated as causal in AD.27-29 The SNX1*D465N (rs1802376) variant locus is associated with AD,24 and SNX1 is involved in retromer function relevant to LOAD.30 PLCG2*M28L (rs61749044) has been associated with LOAD (https://www.biorxiv.org/content/10.1101/2020.05.19.104216v1),24, 31 and Plcg2 is a key protein in microglial activation in response to AD pathology.32 The SHC2*V433M (rs61749990) variant was identified in ADSP exomes and has been associated with neurodegeneration and neuron loss.33, 34 SLC6A17*P61P (rs41281364) reduces gene expression in the brain (gtexportal.org/home/gene/SLC6A17), and its reduction is also associated with LOAD (agora.adknowledgeportal.org/genes/ENSG00000197106). Rare variants in SLC6A17 have been associated with neurological phenotypes.35, 36 The CLASP2*L163P (rs61738888) variant has been associated with neurodegeneration from meta-analysis.37 The MTMR4*V297G (rs2302189) variant has been linked to cognitive function.38, 39 Predicted CEACAM1 loss-of-function variants had a high disease burden in ADSP exome sequencing data (SKAT-O Bonferroni-adjusted p = 7.47 × 10⁻⁷), and the gene was associated with AD-related traits in a model of mouse genetic variability.40 The common MTHFR*677C > T (rs1801133) has been associated with increased risk for LOAD and other age-related disorders.41, 42 To explore a copy-number variant linked to vascular function, we used an existing MEOX2 knockout based on an association with AD43 that may be related to the gene's role in neurovascular health.44 This variant was assessed in a heterozygous state due to the non-viability of the homozygote.

2.2 Model development

All experiments were approved by the Animal Care and Use Committee at The Jackson Laboratory. Mice were bred in the mouse facility at The Jackson Laboratory and maintained on a 12/12-h light/dark cycle, with lights on from 7 am to 7 pm followed by 12 h of darkness. Room temperatures were maintained at 18°C to 24°C (65°F to 75°F) with 40% to 60% humidity. All mice were housed in positive, individually ventilated cages (PIV). Standard autoclaved 6% fat diet (Purina Lab Diet 5K52) was available to the mice ad libitum, as was water with acidity regulated from pH 2.5 to 3.0.

Novel mouse alleles were generated using direct delivery of CRISPR-Cas9 reagents to LOAD1 (JAX No. 28709)20 mouse zygotes. Analysis of genomic DNA sequence surrounding the target region, using the Benchling (www.benchling.com) guide RNA design tool, identified appropriate gRNA sequences with a suitable target endonuclease site.

Streptococcus pyogenes Cas9 (SpCas9) V3 protein and gRNA were purchased as part of the Alt-R CRISPR-Cas9 system using the crRNA:tracrRNA duplex format as the gRNA species (IDT, USA). Alt-R CRISPR-Cas9 crRNAs (Product 1072532, IDT, USA) were synthesized using the gRNA sequences specified in the DESIGN section and hybridized with the Alt-R tracrRNA (Product 1072534, IDT, USA) as per the manufacturer's instructions. Plasmid or oligonucleotide constructs were synthesized by Genscript. See Table S1 for CRISPR reagents.

RESEARCH IN CONTEXT

  1. Systematic review: The authors review the literature and associated public datasets to identify genetic risk factors for late-onset Alzheimer's disease (LOAD).

  2. Interpretation: Our findings support the use of mouse models to validate and prioritize disease risk variants identified by clinical studies and are an essential step toward the development of models of LOAD to be used in mechanistic studies and in therapeutic development efforts.

  3. Future directions: This manuscript establishes a process to validate and prioritize animal models expressing genetic risk factors for LOAD, which are currently lacking. Based on the relevance of transcriptomic signatures to those seen in clinical studies, we will combine alleles to create polygenic models that can serve as useful models of LOAD. Moving forward, we will do in-depth analysis of these novel models at extended ages using translationally relevant measures including: fluid biomarkers; transcriptomics, proteomics, and metabolomics; neuropathology; and in vivo imaging.

To prepare the gene editing reagent for electroporation, SpCas9:gRNA ribonucleoprotein (RNP) complexes were formed by incubating Alt-R SpCas9 V3 (Product 1081059, IDT, USA) and gRNA duplexes for 20 min at room temperature in embryo-tested TE buffer (pH 7.5). The SpCas9 protein and gRNA duplex were at 833 ng/µL and 389 ng/µL, respectively, during complex formation. After RNP formation, the purified plasmid was added and the mixture spun at 14,000 rpm in a microcentrifuge. The supernatant was transferred to a clean tube and stored on ice until use in the embryo electroporation procedure. The final concentrations of the gRNA, SpCas9, and plasmid components in the electroporation mixture were 600, 500, and 20 ng/µL, respectively.

Founders were selected that were positive by short-range polymerase chain reaction (PCR) assays, had appropriate sequence across the homology arm junctions, were negative for the plasmid backbone, and had correct sequences of the inserted construct.

Allele-specific genotyping protocols for all models are available on JAX mice data sheets for each model.

Other models were obtained from the JAX mouse repository (Table 1).

TABLE 1. Listing of gene loci, human risk variants, and corresponding mouse alleles, allele type, and JAX ID of mouse models created.

| Locus | Allele (Human) | Allele (Mouse) | SNP | Allele Type | JAX No. |
|---|---|---|---|---|---|
| Abca7 | A1527G | A1541G | rs3752246 | missense | 30283 |
| Ceacam1 | LOF variants | KO | — | KO | 30673 |
| Clasp2 | L163P | L163P | rs61738888 | missense | 31944 |
| Meox2 | LOF variants | HET KO | — | HET KO | 33770 |
| Mthfr | A222V (677C > T) | A262V | rs1801133 | missense | 30922 |
| Mtmr4 | V297G | V297G | rs2302189 | missense | 31950 |
| Plcg2 | M28L | M28L | rs61749044 | missense | 30674 |
| Shc2 | V577M | V433M | rs61749990 | missense | 31952 |
| Slc6a17 | P61P | P61P | rs41281364 | silent mutation | 31948 |
| Snx1 | D466N | D465N | rs1802376 | missense | 31942 |
| Sorl1 | A528T | A528T | rs2298813 | missense | 31940 |

Other models used

| Model | JAX No. |
|---|---|
| C57BL/6J | 664 |
| 5xFAD | 8730 |
| LOAD1 | 28709 |

Note: All models also contain a humanized APOE ε4 allele and a Trem2*R47H allele on the C57BL/6J background (“LOAD1”), which was used as a control.

2.3 Brain harvest at 4 months of age

Anesthetized and subsequently perfused animals were decapitated and heads submerged quickly in cold 1X PBS. The brain was carefully removed from the skull, weighed, and divided midsagittally into left and right hemispheres using a brain matrix. The right hemisphere was quickly homogenized on ice and equally aliquoted into cryotubes for proteomic and transcriptomic analysis. Cryotubes were immediately snap-frozen on dry ice and stored long term at −80°C.

2.4 RNA sample extraction

Total RNA was extracted from snap-frozen right brain hemispheres using Trizol (Invitrogen, Carlsbad, CA). mRNA was purified from total RNA using biotin-tagged poly dT oligonucleotides and streptavidin-coated magnetic beads, and quality was assessed using an Agilent Technologies 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA).

RNA-Sequencing assay library preparation

Sequencing libraries were constructed using TruSeq DNA V2 (Illumina, San Diego, CA, USA) sample prep kits and quantified using qPCR (Kapa Biosystems, Wilmington, MA). The mRNA was fragmented, and double-stranded cDNA was generated by random priming. The ends of the fragmented DNA were converted into phosphorylated blunt ends. An “A” base was added to the 3′ ends. Illumina-specific adapters were ligated to the DNA fragments. Using magnetic bead technology, the ligated fragments were size-selected, and then a final PCR was performed to enrich the adapter-modified DNA fragments, since only the DNA fragments with adapters at both ends will amplify.

2.5 RNA-Sequencing

Libraries were pooled and sequenced by the Genome Technologies core facility at The Jackson Laboratory. All samples were sequenced on Illumina HiSeq 4000 using HiSeq 3000/4000 SBS Kit reagents (Illumina), targeting 30 million read pairs per sample. Samples were split across multiple lanes when being run on the Illumina HiSeq; once the data were received, the samples were concatenated to have a single file for paired-end analysis.

2.6 RNA-Sequencing data processing

Sequence quality of reads was assessed using FastQC (version 0.11.3, Babraham). Low-quality bases were trimmed from sequencing reads using Trimmomatic (version 0.33).45 After trimming, reads of length longer than 36 bases were retained. The average quality score was greater than 30 at each base position, and sequencing depth was in a range of 60 to 80 million reads. RNA-Seq reads from all samples were mapped to the mouse genome (version GRCm38.p6) using the ultrafast RNA-Seq aligner STAR (version 2.5.3).46 To measure human APOE gene expression, we created a chimeric mouse genome by adding the human APOE gene sequence (human chromosome 19:44905754-44909393) to the mouse genome (GRCm38.p6) as a separate chromosome (referred to as chromosome 21 in the chimeric mouse genome). Subsequently, we added the gene annotation of the human APOE gene to the mouse gene annotation file. We also introduced an annotation for a novel Trem2 isoform into the mouse gene annotation (GTF) file; this isoform is identical to the primary transcript except that exon 2 is truncated by 119 bp from its start position.20 A STAR index was then built for this chimeric mouse genome sequence, reads from each sample were aligned to the chimeric genome using this index, and STAR output coordinate-sorted BAM files for each sample. Gene expression was quantified in two ways to enable multiple analytical methods: transcripts per million (TPM) using RSEM (version 1.2.31)47 and raw read counts using HTSeq-count (version 0.8.0).48

Engineered variants were validated by inspecting RNA-Seq reads (Figure S1A), and targeted transcript expression was verified by RNA abundance (Figure S1B).

2.7 NanoString transcriptomic analysis

The NanoString Mouse AD gene expression panel21 was used for gene expression profiling on the nCounter platform (NanoString, Seattle, WA, USA). Mouse NanoString gene expression data were collected from brain hemisphere homogenates at 4, 8, and 12 months of age for both sexes, from approximately six animals per group. The nSolver software was used for generating NanoString gene expression counts. Normalization was done by dividing counts within a lane by geometric mean of the designated housekeeping genes from the same lane. Next, normalized count values were log-transformed and corrected for potential batch effects using ComBat.49
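The normalization step described above can be illustrated with a minimal Python sketch. This is not the nSolver implementation; the gene names, counts, and housekeeping labels are hypothetical, and the use of log base 2 is an assumption, since the text only says the values were log-transformed. Batch correction with ComBat is not shown.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def normalize_lane(counts, housekeeping):
    """Scale one lane's counts by the geometric mean of its housekeeping
    genes, then log-transform (log2 is an assumption of this sketch)."""
    scale = geometric_mean([counts[g] for g in housekeeping])
    return {gene: math.log2(c / scale) for gene, c in counts.items()}

# Hypothetical lane: names and counts are illustrative only.
lane = {"Trem2": 400.0, "Apoe": 3200.0, "Hk_a": 100.0, "Hk_b": 400.0}
normalized = normalize_lane(lane, housekeeping=["Hk_a", "Hk_b"])
```

Because every lane is divided by its own housekeeping geometric mean, lane-to-lane differences in total signal are removed before samples are compared. Zero counts would need separate handling, which this sketch omits.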

Next, we determined the effects of each factor (sex and genetic variants) by fitting a multiple regression model using the lm function in R as50
$$\begin{equation*}\log(\mathrm{expr}) = \beta_0 + \sum\nolimits_i \beta_i + \varepsilon .\end{equation*}$$

The sum is over sex (male) and all genetic variants (5xFAD, LOAD1, Abca7*A1527G, Ceacam1 KO, Mthfr*677C > T, Shc2*V433M, Slc6a17*P61P, Clasp2*L163P, Sorl1*A528T, Meox2 KO [HET], Snx1*D465N, Plcg2*M28L, Mtmr4*V297G) used in this study. The log(expr) represents the log-transformed normalized count from the NanoString gene expression panel.21 In this formulation, B6J was used as the control for the 5xFAD and LOAD1 mouse models, whereas LOAD1 served as the control for GWAS-based models in order to estimate the effects of individual variants. Separate models were run for each age cohort.
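As an illustration of how such a model separates the background effect from the variant effect, the following Python sketch fits an ordinary least squares model with 0/1 indicator columns. It is an analog of the R lm call, not the authors' code. The indicator coding (variant animals also carry the LOAD1 indicator, so the variant coefficient is estimated relative to LOAD1), the coefficient values, and the noise-free data are all assumptions made for the example.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical design columns: intercept, sex (male = 1), LOAD1, variant.
true_beta = [5.0, 0.2, 0.5, -0.3]
design = [
    [1, 0, 0, 0],  # B6J female
    [1, 1, 0, 0],  # B6J male
    [1, 0, 1, 0],  # LOAD1 female
    [1, 1, 1, 0],  # LOAD1 male
    [1, 0, 1, 1],  # variant on LOAD1 background, female
    [1, 1, 1, 1],  # variant on LOAD1 background, male
]
# Noise-free synthetic log expression for one gene, built from true_beta.
log_expr = [sum(b * x for b, x in zip(true_beta, row)) for row in design]
beta_hat = ols(design, log_expr)
```

With this coding, the LOAD1 coefficient is the shift from B6J to LOAD1, and the variant coefficient is the additional shift from LOAD1 to the variant strain, matching the choice of controls described in the text.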

2.8 Human AMP-AD gene co-expression modules

Data for 30 human brain co-expression modules from the Accelerating Medicines Partnership for Alzheimer's Disease (AMP-AD) studies were obtained from the Synapse data repository (https://www.synapse.org/#!Synapse:syn11932957/tables/; SynapseID: syn11932957). Briefly, Wan et al. (2020)22 identified 30 human brain co-expression modules based on meta-analysis of differential gene expression from seven distinct brain regions in post mortem samples obtained from three independent LOAD cohorts.51-53 These 30 human AMP-AD modules were further classified into five distinct consensus clusters that describe the major functional alterations observed in human LOAD.21, 22 Module and consensus cluster annotations for each NanoString gene are listed in Table S2.

2.9 Human AD subtypes

Milind et al.54 integrated post mortem brain co-expression data from the frontal cortex, temporal cortex, and hippocampus brain regions and stratified patients into different molecular subtypes based on molecular profiles in three independent human LOAD cohorts (ROS/MAP, Mount Sinai Brain Bank, and Mayo Clinic).51-53 Two distinct LOAD subtypes were identified in the ROSMAP cohort, three LOAD subtypes were identified in the Mayo cohort, and two distinct LOAD subtypes were identified in the MSBB cohort. Similar subtype results were observed in each cohort, with LOAD subtypes found to primarily differ in their inflammatory response based on differential expression analysis.54 Data for LOAD subtypes were obtained through AD Knowledge Portal55 (https://www.synapse.org/#!Synapse:syn23660885).

2.10 Mouse–human expression comparison

To assess the human disease relevance of LOAD risk variants in mice, we determined the extent to which changes due to genetic perturbations in mice matched those observed in human AD cases versus controls. For each mouse perturbation, we tested each of the 30 AMP-AD modules using mouse-human gene homologs and limited to the genes both present in the module and the NanoString Mouse AD Panel, which was designed to optimize coverage of these modules.21 Pearson's correlations were computed for changes in gene expression (log-fold change) across all module genes for human AD cases versus controls22 against the effect of each mouse perturbation (β) as measured previously.21, 50 We used the cor.test function in R as follows:
$$\begin{equation*}\mathrm{cor.test}\left(\mathrm{Log}_2\mathrm{FC}\left(\mathrm{AD}/\mathrm{control}\right), \beta\right),\end{equation*}$$
from which we obtained the correlation coefficient and the significance level (p) of the correlation for each perturbation–module pair. Log2FC values for human transcripts were obtained through the AD Knowledge Portal55 (https://www.synapse.org/#!Synapse:syn14237651).
To determine the similarity of each mouse perturbation and the LOAD subtypes, we computed the Pearson's correlation between gene expression changes (log-fold change) in human AD subtype cases versus controls54 and the effect of each mouse perturbation (β) across genes on the NanoString panel21 using cor.test function in R as follows:
$$\begin{equation*}\mathrm{cor.test}\left(\mathrm{Log}_2\mathrm{FC}\left(\mathrm{LOADSubtype}/\mathrm{control}\right), \beta\right),\end{equation*}$$
from which we obtained both the correlation coefficient and the significance level (p) of the correlation. Here, Log2FC(LOAD Subtype/control) represented the log-fold change in gene expression in each subtype versus control, and the correlation spanned all homologous genes on the NanoString AD Mouse Panel.
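
The comparison described in this section can be sketched in Python as follows. The sketch restricts the data to genes shared by a module, the human table, and the mouse table, then computes Pearson's r and the t statistic on n − 2 degrees of freedom, which is the statistic R's cor.test uses for Pearson correlations. All gene names and values are hypothetical, and the p-value lookup from the t distribution is omitted.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r != 0 with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical inputs keyed by homologous gene symbol.
human_log2fc = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 0.5}
mouse_beta = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.5}
module_genes = {"A", "B", "C", "D", "Z"}

# Keep only genes present in the module and in both data sets.
shared = sorted(module_genes & human_log2fc.keys() & mouse_beta.keys())
x = [human_log2fc[g] for g in shared]
y = [mouse_beta[g] for g in shared]
r = pearson_r(x, y)
t = t_statistic(r, len(shared))
```

Repeating this for every perturbation and module (or subtype) pair yields the grid of correlation coefficients that the figures summarize.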

We plotted the correlation results using the ggplot2 package in R. Framed circles were used to denote significant (p < 0.05) positive (blue) and negative (red) Pearson's correlation coefficients. The color intensity and size of the circles were proportional to Pearson's correlation coefficient.

2.11 Functional enrichment analysis

Gene Set Enrichment Analysis (GSEA) was used based on the method proposed by Subramanian et al.56 as implemented in the R Bioconductor package clusterProfiler57 for the Reactome pathway library and Gene Ontology (GO) terms. NanoString Mouse AD Panel genes21 were ranked based on regression coefficients calculated for each factor, and GSEA was performed on this ranked dataset. The use of GSEA ensured that pathway effects were assessed relative to the genes on the panel, as the panel was enriched for AD-relevant genes. Enrichment scores for all Reactome pathways and GO terms were computed to compare relative expression on the pathway level between each factor estimate from the regression models. We also performed GO term enrichment analyses using the “enrichGO” function in the clusterProfiler package.57 The significance of pathways and GO terms was determined using false discovery rate (FDR) multiple testing correction (FDR-adjusted p < 0.05).
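
To make the ranking-based enrichment idea concrete, the following Python sketch computes a running-sum enrichment score in the style of Subramanian et al. for one gene set over genes ranked by regression coefficient. It is a simplified illustration, not the clusterProfiler implementation: the permutation-based significance and normalization steps are omitted, and the gene names, coefficients, and gene sets are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score.

    ranked: list of (gene, score) pairs sorted by score, descending.
    Steps up at each gene in the set (weighted by |score| ** weight)
    and down at each gene outside it; returns the running sum at its
    largest absolute deviation from zero.
    """
    in_set = [g in gene_set for g, _ in ranked]
    n_miss = in_set.count(False)
    total_hit = sum(abs(s) ** weight
                    for (_, s), hit in zip(ranked, in_set) if hit)
    if n_miss == 0 or total_hit == 0:
        raise ValueError("gene set must overlap the ranked list only in part")
    running, best = 0.0, 0.0
    for (_, s), hit in zip(ranked, in_set):
        if hit:
            running += abs(s) ** weight / total_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical regression coefficients for six panel genes.
beta = {"g1": 2.0, "g2": 1.5, "g3": 0.5, "g4": -0.5, "g5": -1.0, "g6": -2.0}
ranked = sorted(beta.items(), key=lambda kv: kv[1], reverse=True)
es_top = enrichment_score(ranked, {"g1", "g2"})
es_bottom = enrichment_score(ranked, {"g5", "g6"})
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one. Because the ranked list contains only panel genes, the score is relative to the panel rather than to the whole transcriptome.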

3 RESULTS

3.1 Validation of novel models

Sequence analysis demonstrated that the appropriate sequence variants had been established (Figure S1A). Quantification of transcript counts in homozygous LOAD models relative to littermate wild-type controls showed no significant differences in expression levels (Figure S1B).

3.2 LOAD-associated risk variants showed age-dependent concordance with distinct human co-expression modules

We assessed the relevance of each LOAD risk variant to the molecular changes observed in human disease51-53, 58 by correlating the effect of each mouse perturbation (sex and genetic variants) with 30 human AMP-AD brain gene co-expression modules22 using the NanoString Mouse AD Panel21 (Figure 1). We analyzed mouse NanoString data from brain hemispheres at different ages (4 and 12 months) to assess the correlation with human post mortem co-expression modules as animals aged.

FIGURE 1. Strategy to prioritize loci and LOAD risk variants. Summary of strategies for variant selection for (A) LOAD and (B) neurovascular risk factors. (C) Gene expression analysis comparing human and mouse gene expression data to identify human LOAD modules that are altered by genetically engineered variants in mice.

The amyloidogenic 5xFAD transgenic model exhibited significant positive correlations (p < 0.05) with several human co-expression modules in Consensus Cluster B enriched for immune-system-related pathways at both 4 and 12 months but showed significant positive correlations (p < 0.05) with neurodegeneration-associated human co-expression modules in Consensus Cluster C only at 12 months (Figure 2A,B). However, we did not observe significant positive correlations between the effect of 5xFAD and human co-expression modules in Consensus Clusters A, D, and E, validating that the 5xFAD strain is primarily a model of amyloidosis and does not fully recapitulate LOAD changes.

FIGURE 2. Correlation between LOAD-associated risk variants and 30 human AMP-AD brain co-expression modules using the NanoString Mouse AD panel. (A) Correlation between the effect of each mouse perturbation relative to the LOAD1 background in 4-month-old mice and 30 human co-expression modules,22 also including the early-onset transgenic model 5xFAD and the LOAD1 background relative to C57BL/6J. The 30 human co-expression modules were grouped into five consensus clusters with similar gene content across the multiple studies and brain regions.22 Framed circles correspond to significant (p < 0.05) positive (blue) and negative (red) Pearson's correlation coefficients, with size and color intensity proportional to the correlation. The effects of multiple LOAD risk variants in mice were positively correlated (p < 0.05) with cell cycle and myelination-associated modules in Consensus Cluster D and cellular stress-response-associated modules in Consensus Cluster E. (B) Correlation between effect of each mouse perturbation at 12 months and the 30 human co-expression modules. LOAD risk variants showed significant correlation with functionally distinct AMP-AD co-expression modules. The effects of Abca7*A1527G, Shc2*V433M, Ceacam1 KO, and Slc6a17*P61P in aged mice correlated with the immune modules in Consensus Cluster B, while the effects of Sorl1*A528T and Plcg2*M28L correlated with the neuronal modules in Consensus Cluster C.

At 4 months, among all LOAD risk variants, only Slc6a17*P61P showed significant positive correlations (p < 0.05) with the immune-related modules (Figure 2A). The Abca7*A1527G, Sorl1*A528T, and Mtmr4*V297G risk variants exhibited significant positive correlations (p < 0.05) with extracellular matrix organization-related modules in Consensus Cluster A (Figure 2A). The Ceacam1 KO, Plcg2*M28L, Meox2 KO(HET), and Mtmr4*V297G strains exhibited significant positive correlations (p < 0.05) with cell-cycle- and myelination-associated modules in Consensus Cluster D and cellular stress-response-associated modules in Consensus Cluster E (Figure 2A). Abca7*A1527G and Sorl1*A528T variants generated significant positive correlations (p < 0.05) with cellular stress-response-associated modules in Consensus Cluster E.

We observed more significant correlations between LOAD risk variants and human AMP-AD modules at 12 months for most strains. The Abca7*A1527G variant had the most pronounced correlations with LOAD expression changes, exhibiting significant positive correlations (p < 0.05) with immune-related modules in Consensus Cluster B, cell-cycle- and myelination-associated modules in Consensus Cluster D, and cellular stress-response-associated modules in Consensus Cluster E (Figure 2B). The Mthfr*677C > T variant exhibited significant positive correlations (p < 0.05) with cell-cycle- and myelination-associated modules in Consensus Cluster D and cellular stress-response-associated modules in Consensus Cluster E (Figure 2B). Sorl1*A528T led to significant positive correlations (p < 0.05) with several human co-expression modules in Consensus Cluster C enriched for neuron-related pathways (Figure 2B). The Plcg2*M28L variant had significant positive correlations (p < 0.05) with human co-expression modules in Consensus Cluster C enriched for neuron-related pathways and with cell-cycle- and myelination-associated modules in Consensus Cluster D (Figure 2B). Ceacam1 KO, Slc6a17*P61P, and Shc2*V433M exhibited significant positive correlations (p < 0.05) with human co-expression modules in Consensus Cluster B enriched for transcripts associated with immune-related pathways in multiple brain regions, while Clasp2*L163P and Sorl1*A528T led to significant positive correlations (p < 0.05) with the human co-expression module in Consensus Cluster B enriched for immune-related pathways in the cerebellum and frontal pole brain regions, respectively (Figure 2B). The Mtmr4*V297G variant exhibited significant positive correlations (p < 0.05) with cell-cycle- and myelination-associated modules in Consensus Cluster D and cellular stress-response-associated modules in Consensus Cluster E (Figure 2B). Snx1*D465N also exhibited significant positive correlation with cell-cycle- and myelination-associated modules in Consensus Cluster D (Figure 2B).

Overall, we observed that LOAD risk variants in mice showed concordance with distinct human co-expression modules, reflecting a different transcriptional response to each LOAD risk variant. The associations between LOAD risk variants and human gene co-expression modules increased with age. We note that models harboring LOAD risk variants exhibited significant positive correlations with human modules in Consensus Clusters A, D, and E, which were not captured by the 5xFAD strain, highlighting the importance of using LOAD risk variants to fully capture LOAD molecular pathologies.

We next assessed the similarities between variant effects in mice by comparing each model to all other models. To identify LOAD risk variants driving similar transcriptional responses in mice, we correlated the regression coefficients calculated for each genetic variant at 4 and 12 months. At 4 months, the effects of the LOAD1 construct (APOE4 and TREM2*R47H) were weakly and positively correlated with the effect of the 5xFAD transgene (p < 0.05), but this correlation diminished at 12 months (Figure 3A,B). The effects of LOAD1 were also significantly positively correlated (p < 0.05) with Sorl1*A528T and Mtmr4*V297G at 4 months, but this correlation diminished by 12 months (Figure 3A,B). The effects of the Abca7*A1527G and Ceacam1 KO variants were weakly correlated at 4 months (p < 0.05), and this correlation increased at 12 months (Figure 3A,B). The effects of the Shc2*V433M and Slc6a17*P161P variants were also significantly positively correlated at 4 months (p < 0.05), and this correlation became stronger with age (Figure 3A,B). Furthermore, the effects of the Snx1*D465N, Plcg2*M28L, and Mtmr4*V297G risk variants were significantly positively correlated (p < 0.05) at 12 months. Similarly, the effects of the Sorl1*A528T and Meox2 KO(HET) variants were significantly positively correlated (p < 0.05) at 12 months (Figure 3A,B). In summary, we observed that LOAD risk variants generally increased in similarity with age, supporting an age-dependent role for these genetic factors. However, not all strains converged on similar transcriptional responses, suggesting distinct mechanisms of influence on LOAD risk.

Correlation between effect of genetic variants and Gene Set Enrichment Analysis (GSEA). (A) Correlation between regression coefficients calculated for each genetic variant at 4 months. Color intensity and size of circles are proportional to the Pearson correlation coefficient, with insignificant correlations (p > 0.05) left blank. (B) Correlation between regression coefficients calculated for each genetic variant at 12 months. The effects of the Snx1*D465N, Plcg2*M28L, and Mtmr4*V297G risk variants in mice showed significant positive correlations (p < 0.05) at 12 months. (C) GSEA results of selected AD-associated pathways from the Reactome library in the presence of each LOAD risk variant in mice. Enriched pathways are grouped by their overlap with functional annotations of human AMP-AD Consensus Clusters. Immune-related pathways had increased expression in the presence of multiple risk variants such as Abca7*A1527G, Mthfr*677C > T, and Snx1*D465N, while neuron-associated pathways had reduced expression in the presence of risk variants such as Abca7*A1527G, Mthfr*677C > T, Sorl1*A528T, Plcg2*M28L, Ceacam1 KO, Shc2*V433M, and Slc6a17*P161P.

3.3 Pathway alterations varied by LOAD genetic perturbation

To further elucidate the functional role of these LOAD risk variants in aged mice, we performed GSEA56 for the Reactome pathway library for all 12-month samples. GSEA revealed upregulation of the immune system pathway in the presence of Abca7*A1527G (NES = 1.42, p = 0.01) and of the cytokine signaling in the immune system pathway in the presence of Abca7*A1527G (NES = 1.83, p = 0.006) and Sorl1*A528T (NES = 1.45, p = 0.04) (Figure 3C, Table S3), while neuron-associated pathways were downregulated in the presence of risk variants such as Ceacam1 KO (NES = −1.51, p = 0.04), Shc2*V433M (NES = −1.72, p = 0.004), and Slc6a17*P161P (NES = −1.74, p = 0.005) (Figure 3C, Table S3). The extracellular matrix organization pathway was downregulated in the presence of risk variants such as Sorl1*A528T (NES = −1.56, p = 0.01) and Ceacam1 KO (NES = −1.85, p = 0.007) but possibly upregulated in the presence of risk variants such as Abca7*A1527G (NES = 1.45, p = 0.07) and Mthfr*677C > T (NES = 1.25, p = 0.16) (Figure 3C, Table S3). The cell cycle pathway was downregulated in the presence of Shc2*V433M (NES = −1.65, p = 0.01) and Slc6a17*P161P (NES = −1.73, p = 0.004) but upregulated in the presence of other risk variants such as Abca7*A1527G (NES = 1.75, p = 0.01), Meox2 KO(HET) (NES = 1.85, p = 0.006), and Sorl1*A528T (NES = 2.61, p = 0.002) (Figure 3C, Table S3). The cellular response to heat stress pathway was upregulated in the presence of risk variants such as Abca7*A1527G (NES = 1.83, p = 0.01), Mthfr*677C > T (NES = 1.81, p = 0.009), and Ceacam1 KO (NES = 1.57, p = 0.04) (Figure 3C, Table S3). Overall, we observed that multiple AD-associated pathways were upregulated in the presence of some LOAD risk variants but downregulated in the presence of another set of risk variants. This suggests that distinct risk variants drive distinct molecular changes associated with LOAD in aging mice.
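For readers unfamiliar with how the sign of an enrichment score arises, the following is a minimal sketch of the weighted running-sum statistic that underlies GSEA. It assumes genes are ranked by a per-gene score such as a regression coefficient. The function name and inputs are hypothetical, and the permutation step that converts an enrichment score (ES) into the normalized score (NES) and p-value reported above is omitted.

```python
def enrichment_score(scores, gene_set, weight=1.0):
    """Return the signed maximum deviation of the GSEA running sum.

    `scores` maps gene -> ranking metric (for example, a regression
    coefficient); `gene_set` is the pathway membership. A positive value
    means the set is concentrated at the top of the ranking
    (upregulated), a negative value at the bottom (downregulated).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    # Total weight carried by the genes that belong to the set.
    norm = sum(abs(s) ** weight for (_, s), h in zip(ranked, hits) if h)
    if n_miss == 0 or norm == 0:
        raise ValueError("gene set must partially overlap the ranked list")

    running, best = 0.0, 0.0
    for (_, s), h in zip(ranked, hits):
        if h:
            running += abs(s) ** weight / norm  # step up on a set member
        else:
            running -= 1.0 / n_miss             # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best
```

In a full analysis the observed ES is compared with scores from permuted data to obtain the NES, so this sketch only shows why a pathway is called up- or downregulated, not how its significance is assessed.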

3.4 Age-dependent pathway effects driving AMP-AD module correlations in ABCA7, MTHFR, SORL1, and PLCG2 mouse models

In our mouse–human correlation analysis, the effects of multiple LOAD variants (Abca7*A1527G, Mthfr*677C > T, Sorl1*A528T, and Plcg2*M28L) correlated with human AMP-AD co-expression modules in age-dependent and pathway-specific manners. To further identify the AD-relevant biological processes associated with these selected LOAD risk variants, we adopted two approaches. First, we performed GSEA56 on the NanoString Mouse AD Panel genes ranked based on regression coefficients calculated for each factor at 12 months and identified significantly enriched GO terms (padj < .05). Next, we isolated the homologous genes exhibiting directional coherence between the effects of the genetic risk variants at 12 months and changes in human cases versus controls and performed GO enrichment analysis to find processes underlying the module-level correlations. These subsets represent the pathways that (1) are altered in each mouse model and (2) quantitatively underlie the mouse–human module associations. GO terms common to both enrichment tests were then annotated to the modules in which they appeared.
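The directional-coherence filter in the second approach can be sketched as follows, assuming that "directional coherence" means the mouse variant effect and the human case-versus-control change share the same sign for a homologous gene pair. The function name, the homolog mapping, and all numeric values in the usage example are hypothetical.

```python
def coherent_genes(mouse_effect, human_change, homologs):
    """Return mouse genes whose effect has the same sign as the human change.

    `mouse_effect` maps mouse gene -> regression coefficient,
    `human_change` maps human gene -> case-versus-control change, and
    `homologs` maps mouse gene -> human gene. Genes missing from either
    table, or with a zero value, are dropped.
    """
    keep = []
    for mouse_gene, human_gene in homologs.items():
        m = mouse_effect.get(mouse_gene)
        h = human_change.get(human_gene)
        if m is None or h is None:
            continue
        if m * h > 0:  # same direction in mouse and human
            keep.append(mouse_gene)
    return sorted(keep)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    mouse = {"Trem2": 0.8, "Mapt": -0.5, "App": 0.3}
    human = {"TREM2": 1.2, "MAPT": 0.4, "APP": 0.1}
    pairs = {"Trem2": "TREM2", "Mapt": "MAPT", "App": "APP"}
    print(coherent_genes(mouse, human, pairs))
```

The resulting gene list would then be passed to a GO enrichment test, and only GO terms that also appear in the GSEA results would be retained.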

The Abca7*A1527G variant showed significant negative correlations (p < 0.05) with immune-related modules in Consensus Cluster B, cell-cycle- and myelination-associated modules in Consensus Cluster D, and cellular stress-response-associated modules in Consensus Cluster E (Figure 4A) at 4 months. However, at 12 months these effects were reversed and the variant exhibited significant positive correlations (p < 0.05) with several immune-related modules in Consensus Cluster B, cell-cycle-associated and myelination-associated modules in Consensus Cluster D, and cellular stress-response-associated modules in Consensus Cluster E (Figure 4A). Biological processes such as “de novo” protein folding, “de novo” post-translational protein folding, granulocyte migration, cytokine-mediated signaling pathway, insulin receptor signaling pathway, and neutrophil migration increased expression in the presence of Abca7*A1527G (Figure 4A, Table S4). The correlation between the Abca7*A1527G variant and the immune-associated human co-expression modules (Consensus Cluster B) (Figure 4A, Table S5) was exhibited by genes enriched for granulocyte migration, cytokine-mediated signaling pathway, and neutrophil migration (including Pecam1, Cd74, Trem2, Trem1, Csf1, Il1rap, and Ceacam1) (Table S6). As key correlating genes between Abca7*A1527G and Consensus Cluster E modules (Figure 4A, Table S5), we found genes enriched in “de novo” protein folding and “de novo” post-translational protein folding (eg, Hspa2, Hspa1b, and Dnajb4) (Table S6). Insulin receptor signaling was enriched in genes (Foxo1, Prkcq, and Bcar3) (Table S6), driving the correlation between Abca7*A1527G and modules in Consensus Cluster D (Figure 4A, Table S5).

Identification of specific AD-associated processes in LOAD risk variants exhibiting transcriptomic changes similar to human LOAD in an age-dependent manner. For four new mouse strains the following are displayed: the six top enriched GO terms identified by GSEA and GO enrichment analysis of genes with common directional changes with human AD modules (top left); gene module networks with common directional changes with human AMP-AD modules, where node colors correspond to human AMP-AD Consensus Clusters A (orange), B (green), C (blue), D (turquoise), or E (pink) (top right); and the effects of each variant at multiple ages correlated across LOAD effects in 30 AMP-AD modules, following the legend of Figure 3. Results for (A) Abca7*A1527G model, (B) Mthfr*677C > T model, (C) Plcg2*M28L model, and (D) Sorl1*A528T model. All results are relative to the LOAD1 genetic background for all strains.

A similar reversal of effects with age was observed for MTHFR. The Mthfr*677C > T variant exhibited significant negative correlations (p < 0.05) with several cell-cycle- and myelination-associated modules in Consensus Cluster D and cellular stress-response-associated modules in Consensus Cluster E (Figure 4B) at 4 months. At 12 months, these correlations were positive (Figure 4B). GSEA of the Mthfr*677C > T variant identified significant enrichments of response to unfolded protein, positive regulation of cellular catabolic process, negative regulation of translation, positive regulation of GTPase activity, B-cell-mediated immunity, and purine ribonucleotide metabolic process (Figure 4B, Table S4). The B-cell-mediated immunity and negative regulation of translation biological processes were also enriched in genes (including C1qa, C1qb, Cd81, and Zfp36) (Table S6) with directional coherence for Mthfr*677C > T and LOAD effects in Consensus Cluster B (Figure 4B, Table S5). Correlations between the Mthfr*677C > T variant and Consensus Cluster D changes (Figure 4B, Table S5) were exhibited by genes enriched for positive regulation of cellular catabolic process and positive regulation of GTPase activity (including Bin1, Picalm, Dock10, and Psen1) (Table S6). Biological processes such as response to unfolded protein and purine ribonucleotide metabolic process were enriched in genes (eg, Hspa1b, Hsph1, Hsp90aa1, Snca, and Atpp5h) (Table S6) underlying the correlations between Mthfr*677C > T and Consensus Cluster E effects (Figure 4B, Table S5).

The Plcg2*M28L variant exhibited significant positive correlations (p < 0.05) with neuron-related modules in Consensus Cluster C and cell-cycle-associated modules in Consensus Cluster D at both 4 and 12 months (Figure 4C). Enriched biological processes included postsynapse organization, regulation of axonogenesis, cognition, locomotory behavior, glial cell development, and regulation of protein catabolic process (Figure 4C, Table S4). Biological processes such as postsynapse organization, cognition, and locomotory behavior were enriched in genes (Mapt, Gabrb3, App, Ppp3cb, and Slc8a2) (Table S6) with directional coherence for Plcg2*M28L and human AD changes in Consensus Cluster C (Figure 4C, Table S5). Biological processes such as regulation of axonogenesis, glial cell development, and regulation of protein catabolic process were enriched in genes (Snx1, Picalm, Psen1, Mag, Foxo1, and Kif13b) (Table S6) that drove the correlations between Plcg2*M28L and Consensus Cluster D effects (Figure 4C, Table S5).

Aged Sorl1*A528T mice (12 months) showed positive correlations (p < 0.05) with neuron-associated modules in Consensus Cluster C that were not apparent at 4 months of age (Figure 4D). Enriched processes included the downregulation of synapse organization, synapse assembly, regulation of synaptic plasticity, and regulation of epithelial cell proliferation and the increased expression of negative regulation of transporter activity and soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) complex assembly genes. These processes drove the correlation between the SORL1 variant and LOAD effects in Consensus Cluster C modules (Figure 4D, Table S5), where GSEA for genes with directional coherence identified synapse organization, synapse assembly, regulation of synaptic plasticity, negative regulation of transporter activity, and SNARE complex assembly (including the genes Mapt, App, Gabrb3, Calm3, Snca, Cdkl5, Vgf, and Ywhag) (Table S6).

Overall, we found that late-onset genetic factors in mice generally led to both more abundant changes with age and increasingly disease-relevant pathway changes with age.

3.5 Alignment of mouse models with AD subtypes

Post mortem transcriptomics from AMP-AD and similar studies have enabled the partitioning of AD cases into potential disease subtypes. These studies have often stratified AD subjects into inflammatory and non-inflammatory subtypes.54, 59, 60 To determine whether our mouse models preferentially resembled putative AD subtypes, we correlated the effect of each variant with inflammatory and non-inflammatory subtypes associated with LOAD54 in the ROSMAP, MSBB, and Mayo cohorts.51-53

We found that at 4 months of age, variants did stratify by human subtypes. The effects of Abca7*A1527G, Sorl1*A528T, and Plcg2*M28L were positively correlated (p < 0.05) with the inflammatory subtypes across all three cohorts, while Mtmr4*V297G was positively correlated (p < 0.05) with ROSMAP and MSBB inflammatory subtypes (Figure 5). In contrast, Shc2*V433M and Clasp2*L163P exhibited significant positive correlations (p < 0.05) with non-inflammatory subtypes across all three cohorts (Figure 5).

Correlation between effect of each mouse perturbation and molecular subtypes of LOAD. Two molecular LOAD subtypes were inferred in the ROSMAP cohort, three subtypes in the Mayo cohort, and two subtypes in the Mount Sinai Brain Bank (MSBB) cohort.54 Framed circles correspond to significant (p < 0.05) positive (blue) and negative (red) Pearson's correlation coefficients across all genes on the NanoString panel, with color intensity and circle size proportional to the correlation. (B) At 4 months, the Abca7*A1527G and Sorl1*A528T variants represent inflammatory subtypes of LOAD (Subtypes A) in each cohort, while the Shc2*V433M and Clasp2*L163P variants mimic the non-inflammatory subtypes of LOAD (Subtypes B). (C) At 12 months, the Abca7*A1527G and Ceacam1 KO variants recapitulate inflammatory subtypes of LOAD (Subtypes A), while the Snx1*D465N, Mtmr4*V297G, and LOAD1 variants model non-inflammatory subtypes of LOAD (Subtypes B).

At 12 months, the correlations between Abca7*A1527G effects and the inflammatory subtypes across all three cohorts increased (p < 0.05), and the Ceacam1 KO variant had become positively correlated (p < 0.05) with the inflammatory subtypes across all three cohorts (Figure 5). On the other hand, LOAD1, Meox2 KO (HET), and Snx1*D465N were positively correlated (p < 0.05) with non-inflammatory subtypes across all three cohorts (Figure 5). Three strains, Sorl1*A528T, Plcg2*M28L, and Mtmr4*V297G, which were positively correlated (p < 0.05) with inflammatory subtypes at 4 months, transitioned to correlation (p < 0.05) with non-inflammatory subtypes at 12 months (Figure 5). These results are in concordance with our findings that Abca7*A1527G was significantly correlated with immune-related human modules and was enriched for immune-associated biological processes (Figure 4A), while the Sorl1*A528T and Plcg2*M28L variants were significantly correlated with neuron-related human modules and enriched for neuron-associated biological processes (Figure 4C,D). Overall, these findings suggest that different mouse strains may provide better models for distinct AD subtypes and that risk for these subtypes may be influenced by distinct AD genetic factors.

4 DISCUSSION

In this study, we performed gene expression screening of new knock-in mouse models harboring candidate genetic variants for LOAD. Our ultimate goal is to provide the research community and therapeutic development programs with improved preclinical models of LOAD, suitable for preclinical testing of therapeutics that target molecular processes contributing to LOAD origins and progression. By basing these models on human genetics, we also provide a preliminary functional characterization of possible disease-relevant effects from the candidate genetic variants.

Notable results include the finding that many AD-related pathways, modules, and processes are affected by the introduction of late-onset variants. However, the changes were not consistent across strains, suggesting that different genetic loci contributed to distinct AD-related dysfunction (Figures 2 and 4). For example, we determined that the SORL1 risk factor impinged primarily on AD-relevant synaptic gene expression, while the ABCA7 variant broadly affected non-neuronal gene expression, including immune, protein folding, and metabolic pathways. Meanwhile, the PLCG2 variant primarily affected genes that were annotated to behavior, synapses, and glial cells and that changed similarly in human LOAD. We noted that a transgenic model harboring familial AD mutations in App and Psen1 exhibited different gene expression changes focused on an acute inflammatory response. This can be contrasted with the LOAD1 model, which exhibited reduced immune activity and more subtle overall results, augmented by our panel of late-onset GWAS variants. We observed effects on multiple non-immune modules resulting from these variants, providing utility to study pathways not well modeled by the amyloid strains. Finally, the limited effects of variants like Clasp2*L163P suggest that the specific variant is not disease-associated, its AD-related effects are not visible in the transcriptome, and/or it does not trigger changes until a later age. This diversity of effects across mouse strains provides specific models to study different aspects of AD biology and paves the way for precision preclinical testing of candidate therapeutics that target these pathways.

Preliminary analysis further suggested that the different loci contributed in an age-dependent manner (Figures 2 and 4) and modeled putative disease subtypes (Figure 5). However, validation of such partitioning of genetic risk is difficult in human studies due to post mortem tissue sampling and limited cohort size for multiomic data.54 We also found that the gene expression effects of LOAD variant knock-ins generally increased in terms of magnitude and disease relevance as mice aged from 4 to 12 months (Figures 2 and 4). This finding supports the notion that LOAD genetic factors become more relevant in an aging brain as they contribute to late-life disease risk.

We note that genetic variants from frequently associated loci tended to produce the most consistent AD-relevant phenotypes (eg, SORL1, ABCA7, PLCG2), although many of the more exploratory variants also generated AD-like expression signatures across multiple modules in aging mice (eg, CEACAM1, MTMR4) (Figure 2). Recent advances in variant inference and functional prediction, including many non-coding variants and major GWAS loci, will enable the next round of models to address additional GWAS loci without candidate coding variants, such as the EPHA1 locus.25 Furthermore, many AD-associated loci suffered from insufficient homology in mice (eg, MS4A4/MS4A6E, INPP5D, CR1), which will be addressed by ongoing efforts to humanize these relevant regions of the mouse genome (Benzow, et al., this issue).

This study had several caveats that need to be noted. Most importantly, aging is the strongest risk factor for LOAD,61 and it needs to be recognized that mice at 12 months of age are roughly equivalent to humans at 38 to 47 years of age. Therefore, our transcriptomic comparison to post mortem AMP-AD clinical samples, while practical, is unrealistic, and we are now aging the models that best approximated human transcriptional changes at 12 months to at least 24 months of age31, 62 (Oblak, et al., this issue). Likewise, recent studies (as well as our pilot data) have shown that proteomics is a more reliable means to correlate models to disease than transcriptomics63, 64 (Oblak, et al., this issue), so we will be using proteomic analysis on prioritized models.

The Trem2*R47H allele in the LOAD1 base model used here has been shown to cause an approximately twofold decrease in Trem2 expression.65 However, our analysis technique factors out allele effects individually, so we are confident in our results. We have since created a new model (JAX No. 33781) that we have shown has normal Trem2 transcript levels and that will replace the allele used here in future projects.

In this study, we have focused on introducing coding variants on a LOAD1 background,20 aged the mice to middle age (12 months), and characterized the animals using a gene expression panel developed for rapid comparison to recent human study results.21 In future work, we will extend our approach to model candidate non-coding variants at LOAD genetic loci without strong candidate coding SNPs, humanizing loci and regulatory regions when necessary (Benzow, et al., this issue). We will breed the most promising variants presented here – Abca7*A1527G, Sorl1*A528T, Mthfr*677C > T, and Plcg2*M28L – to a genetic background with humanized Aβ peptide (the LOAD2 strain) and age cohorts beyond 18 months to assess additional disease-related progression with advanced age. These select strains will be assessed in depth with multiple genome-scale omics measures (RNA-Seq, tandem mass tag proteomics, metabolomics), plasma biomarkers, in vivo imaging, neuropathology, and behavioral metrics. Each assay will be optimized for translational value. We will also introduce modifiable risk factors through unhealthy diets and exposure to common environmental toxicants. At the same time, all models are distributed without use restrictions to enable all researchers to obtain, study, and modify these models as desired.

ACKNOWLEDGMENTS

The authors gratefully acknowledge the contribution of Genome Technology core and Candice Baker and Kim Martens in the Genetic Engineering Technologies Service at The Jackson Laboratory for expert assistance with the work described in this publication. The results published here are based, in whole or in part, on data obtained from the AD Knowledge Portal (https://adknowledgeportal.org). Study data were provided by the Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago. Data collection was supported through funding by National Institute on Aging (NIA) grants P30AG10161 (ROS), R01AG15819 (ROSMAP; genomics and RNAseq), R01AG17917 (MAP), R01AG36836 (RNAseq), the Illinois Department of Public Health (ROSMAP), and the Translational Genomics Research Institute (genomic). Additional phenotypic data can be requested at www.radc.rush.edu. Mount Sinai Brain Bank data were generated from post mortem brain tissue collected through the Mount Sinai VA Medical Center Brain Bank and were provided by Dr. Eric Schadt from Mount Sinai School of Medicine. The Mayo RNAseq study data were led by Nilüfer Ertekin-Taner, Mayo Clinic, Jacksonville, Florida, as part of the multi-PI U01 AG046139 (MPIs Golde, Ertekin-Taner, Younkin, Price). Samples were provided from the following sources: The Mayo Clinic Brain Bank. Data collection was supported through funding by NIA grants P50 AG016574, R01 AG032990, U01 AG046139, R01 AG018023, U01 AG006576, U01 AG006786, R01 AG025711, R01 AG017216, R01 AG003949, National Institute of Neurological Disorders and Stroke (NINDS) grant R01 NS080820, CurePSP Foundation, and support from Mayo Foundation. Study data include samples collected through the Sun Health Research Institute Brain and Body Donation Program of Sun City, Arizona. 
The Brain and Body Donation Program was supported by the NINDS (U24 NS072026 National Brain and Tissue Resource for Parkinson's Disease and Related Disorders), the NIA (P30 AG19610 Arizona Alzheimer's Disease Core Center), the Arizona Department of Health Services (contract 211002, Arizona Alzheimer's Research Center), the Arizona Biomedical Research Commission (contracts 4001, 0011, 05-901 and 1001 to the Arizona Parkinson's Disease Consortium), and the Michael J. Fox Foundation for Parkinson's Research. The IU/JAX/PITT MODEL-AD Center was supported through funding by NIA grant U54AG054345.

CONFLICT OF INTEREST STATEMENT

The authors declare that this research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

CONSENT STATEMENT

No consent was required as all human subject data were reused under controlled access with an active AD Knowledge Portal Data Use Certificate (version 7.3). All data were anonymized by original sources with no possibility of deanonymization.

ETHICS STATEMENT

The animal study was reviewed and approved by the Jackson Laboratory Animal Use Committee.

DATA AVAILABILITY STATEMENT

The MODEL-AD datasets are available via the AD Knowledge Portal (https://adknowledgeportal.org). The AD Knowledge Portal is a platform for accessing data, analyses, and tools generated by the Accelerating Medicines Partnership (AMP-AD) Target Discovery Program and other National Institute on Aging (NIA)-supported programs to enable open-science practices and accelerate translational learning. The data, analyses, and tools are shared early in the research cycle without a publication embargo on secondary use. Data are available for general research use according to the following requirements for data access and data attribution (https://adknowledgeportal.org/DataAccess/Instructions). All mouse models are available from the Jackson Laboratory mouse repository.