Interpretable deep learning reveals the role of an E-box motif in suppressing somatic hypermutation of AGCT motifs within human immunoglobulin variable regions.
E-box transcription factors
E2A
activation induced deaminase (AID)
deep learning
immunoglobulin heavy chain
integrated gradients
somatic hypermutation (SHM)
Journal
Frontiers in immunology
ISSN: 1664-3224
Abbreviated title: Front Immunol
Country: Switzerland
NLM ID: 101560960
Publication information
Publication date:
2024
History:
received: 26 03 2024
accepted: 08 05 2024
medline: 12 6 2024
pubmed: 12 6 2024
entrez: 12 6 2024
Status:
epublish
Abstract
Somatic hypermutation (SHM) of immunoglobulin variable (V) regions by activation-induced deaminase (AID) is essential for robust, long-term humoral immunity against pathogen and vaccine antigens. AID mutates cytosines preferentially within WRCH motifs (where W = A or T, R = A or G, and H = A, C or T). However, it has been consistently observed that the mutability of WRCH motifs varies substantially, with large variations in mutation frequency even between multiple occurrences of the same motif within a single V region. This has led to the notion that the immediate sequence context of WRCH motifs contributes to mutability. Recent studies have highlighted the potential role of local DNA sequence features in promoting mutagenesis of AGCT, a commonly mutated WRCH motif. Intriguingly, AGCT motifs closer to the 5' ends of V regions, within the framework 1 (FW1) sub-region, mutate less frequently, suggesting an SHM-suppressing sequence context. Here, we systematically examined the basis of AGCT positional biases in human SHM datasets with DeepSHM, a machine-learning model designed to predict SHM patterns. This was combined with integrated gradients, an interpretability method, to interrogate the basis of DeepSHM predictions. DeepSHM predicted the observed positional differences in mutation frequencies at AGCT motifs with high accuracy. For the conserved, lowly mutating AGCT motifs in FW1, integrated gradients predicted a large negative contribution of the 5'C and 3'G flanking residues, suggesting that a CAGCTG context in this location was suppressive for SHM. CAGCTG is the recognition motif for E-box transcription factors, including E2A, which has been implicated in SHM. Indeed, we found a strong, inverse relationship between E-box motif fidelity and mutation frequency. Moreover, E2A was found to associate with the V region locale in two human B cell lines. Finally, analysis of human SHM datasets revealed that naturally occurring mutations in the 3'G flanking residues, which effectively ablate the E-box motif, were associated with a significantly increased rate of AGCT mutation. Our results suggest an antagonistic relationship between mutation frequency and the binding of E-box factors like E2A at specific AGCT motif contexts and, therefore, highlight a new, suppressive mechanism regulating local SHM patterns in human V regions.
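The motif definitions in the abstract (WRCH, AGCT, and the CAGCTG E-box context) can be illustrated with a minimal Python sketch. This is not the authors' pipeline and has nothing to do with how DeepSHM works internally; the function names `find_wrch` and `agct_contexts` and the example sequence are hypothetical, chosen only to show how such sites could be located on one strand of a sequence.

```python
import re

# WRCH as defined in the abstract: W = A/T, R = A/G, C = cytosine, H = A/C/T.
# A lookahead is used so that overlapping sites are all reported.
WRCH = re.compile(r"(?=([AT][AG]C[ACT]))")


def find_wrch(seq):
    """Return (start, motif) for every WRCH site on the given strand (0-based)."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in WRCH.finditer(seq)]


def agct_contexts(seq):
    """Return (start, in_ebox) for every AGCT site.

    in_ebox is True when the site has a 5' C and a 3' G flank,
    i.e. it sits inside a perfect CAGCTG E-box motif.
    """
    seq = seq.upper()
    sites = []
    for m in re.finditer(r"(?=AGCT)", seq):
        i = m.start()
        five_prime = seq[i - 1] if i > 0 else ""
        three_prime = seq[i + 4] if i + 4 < len(seq) else ""
        sites.append((i, five_prime == "C" and three_prime == "G"))
    return sites


if __name__ == "__main__":
    example = "CAGCTGTTAGCTA"  # hypothetical sequence, not from the paper
    print(find_wrch(example))
    print(agct_contexts(example))
```

In this toy sequence the first AGCT lies within CAGCTG and the second does not, which mirrors the distinction the abstract draws between E-box and non-E-box AGCT contexts. A real analysis would also need to scan the reverse strand and map positions to V region sub-regions such as FW1.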
Identifiers
pubmed: 38863710
doi: 10.3389/fimmu.2024.1407470
pmc: PMC11165027
Chemical substances
Immunoglobulin Variable Region
0
AICDA (activation-induced cytidine deaminase)
EC 3.5.4.-
Cytidine Deaminase
EC 3.5.4.5
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1407470
Copyright information
Copyright © 2024 Tambe, MacCarthy and Pavri.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.