The observed effect sizes of the three replicated individual SNPs are small [see (5) for discussion]. For EduYears, the strongest effect identified (rs9320913) explains 0.022% of phenotypic variance in the replication sample. This R² corresponds to a difference of ~1 month of schooling per allele. For college completion, the SNP with the strongest estimated effect (rs11584700) has an odds ratio of 0.912 in the replication sample, equivalent to a 1.8 percentage-point difference per allele in the frequency of completing college.
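The two conversions above (variance explained to a per-allele difference in schooling, and odds ratio to a percentage-point difference) can be illustrated with a minimal sketch. It assumes an additive model under Hardy-Weinberg equilibrium. The allele frequency, the standard deviation of years of schooling, and the baseline college-completion rate are hypothetical illustrative values, not figures reported here; only the 0.022% and 0.912 inputs come from the text.

```python
import math


def per_allele_effect_from_r2(r2, allele_freq, phenotype_sd):
    """Per-allele effect (in phenotype units) implied by variance explained.

    Under an additive model with Hardy-Weinberg equilibrium, the allele
    count (0, 1, or 2) has variance 2 * p * (1 - p), so
    R^2 = 2 * p * (1 - p) * beta^2 / phenotype_sd^2.
    """
    genotype_var = 2.0 * allele_freq * (1.0 - allele_freq)
    return phenotype_sd * math.sqrt(r2 / genotype_var)


def probability_shift_from_odds_ratio(odds_ratio, baseline_prob):
    """Change in outcome probability when the baseline odds are scaled by odds_ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    shifted_odds = odds_ratio * baseline_odds
    shifted_prob = shifted_odds / (1.0 + shifted_odds)
    return shifted_prob - baseline_prob


# Values taken from the text.
R2_EDUYEARS = 0.022 / 100      # 0.022% of phenotypic variance
ODDS_RATIO_COLLEGE = 0.912

# Hypothetical inputs, chosen only for illustration (not from the source).
ASSUMED_ALLELE_FREQ = 0.5      # assumed frequency of the effect allele
ASSUMED_SD_YEARS = 3.0         # assumed SD of years of schooling
ASSUMED_BASELINE_RATE = 0.27   # assumed share completing college

beta_months = 12.0 * per_allele_effect_from_r2(
    R2_EDUYEARS, ASSUMED_ALLELE_FREQ, ASSUMED_SD_YEARS
)
shift_pp = 100.0 * probability_shift_from_odds_ratio(
    ODDS_RATIO_COLLEGE, ASSUMED_BASELINE_RATE
)

print(f"Implied per-allele effect: {beta_months:.2f} months of schooling")
print(f"Implied per-allele shift: {shift_pp:.2f} percentage points")
```

Under these assumed inputs, the implied per-allele effect is on the order of one month of schooling and the implied shift is a little under two percentage points, which is the same order of magnitude as the figures quoted above. Different assumed frequencies or baseline rates would change the exact numbers.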
We subsequently conducted a “combined stage” meta-analysis, including both the discovery and replication samples. This analysis revealed additional genome-wide significant SNPs: four for EduYears and three for College. Three of these newly genome-wide significant SNPs (rs1487441, rs11584700, rs4851264) are in linkage disequilibrium with the replicated SNPs. The remaining four are located in different loci and warrant replication attempts in future research: rs7309, a 3′UTR variant in TANK; rs11687170, close to GBX2; rs1056667, a 3′UTR variant in BTN1A1; and rs13401104 in ASB18.
Using the results of the combined meta-analyses of discovery and replication cohorts, we conducted a series of complementary and exploratory supplemental analyses to aid in interpreting and contextualizing the results: gene-based association tests; eQTL analyses of brain and blood tissue data; pathway analysis; functional annotation searches; enrichment analysis for cell-type-specific overlap with H3K4me3 chromatin marks; and predictions of likely gene function using gene-expression data.
Table S20 summarizes promising candidate loci identified through follow-up analyses (5). Two regions in particular showed convergent evidence from functional annotation, blood cis-eQTL analyses, and gene-based tests: chromosome 1q32 (including LRRN2, MDM4, and PIK3C2B) and chromosome 6 near the Major Histocompatibility Complex (MHC). We also find evidence that in anterior caudate cells, there is enrichment of H3K4me3 chromatin marks (believed to be more common in active regulatory regions) in the genomic regions implicated by our analyses (fig. S20). Many of the implicated genes have previously been associated with health, central nervous system, or cognitive-process phenotypes in either human-GWAS or model-animal studies (table S22). Gene co-expression analysis revealed that several implicated genes (including BSN, GBX2, LRRN2, and PIK3C2B) are likely involved in pathways related to cognitive processes (such as learning and long-term memory) and neuronal development or function (table S21).