Please use this identifier to cite or link to this item: https://doi.org/10.1038/s41598-017-01699-z
Title: Predicting influenza antigenicity from Hemagglutintin sequence data based on a joint random forest method
Authors: Yao, Y
Li, X
Liao, B
Huang, L
He, P
Wang, F
Yang, J
Sun, H
Zhao, Y
Yang, J
Keywords: Influenza virus hemagglutinin
virus antigen
algorithm
amino acid sequence
amino acid substitution
chemistry
genetics
human
immunology
Influenza A virus (H3N2)
molecular evolution
mutation
Algorithms
Amino Acid Sequence
Amino Acid Substitution
Antigens, Viral
Evolution, Molecular
Hemagglutinin Glycoproteins, Influenza Virus
Humans
Influenza A Virus, H3N2 Subtype
Mutation
Issue Date: 2017
Publisher: Nature Publishing Group
Citation: Yao, Y, Li, X, Liao, B, Huang, L, He, P, Wang, F, Yang, J, Sun, H, Zhao, Y, Yang, J (2017). Predicting influenza antigenicity from Hemagglutintin sequence data based on a joint random forest method. Scientific Reports 7 (1) : 1545. ScholarBank@NUS Repository. https://doi.org/10.1038/s41598-017-01699-z
Rights: Attribution 4.0 International
Abstract: Timely identification of emerging antigenic variants is critical to influenza vaccine design. The accuracy of a sequence-based antigenic prediction method relies on the choice of amino acid substitution matrices. In this study, we first compared a comprehensive set of 95 substitution matrices, reflecting various amino acid properties, in predicting the antigenicity of influenza viruses with a random forest model. We then proposed a novel algorithm called joint random forest regression (JRFR) to jointly consider the top substitution matrices. We applied JRFR to human H3N2 seasonal influenza data from 1968 to 2003. A 10-fold cross-validation shows that JRFR outperforms other popular methods in predicting antigenic variants. In addition, our results suggest that structural features are most relevant to influenza antigenicity. By restricting the analysis to data involving two adjacent antigenic clusters, we inferred a few key amino acid mutations driving the 11 historical antigenic drift events, pointing to experimentally validated mutations. Finally, we constructed an antigenic cartography of all H3N2 viruses with hemagglutinin (the glycoprotein on the surface of the influenza virus responsible for its binding to host cells) sequences available from the NCBI flu database, and showed an overall correspondence and local inconsistency between genetic and antigenic evolution of H3N2 influenza viruses. © 2017 The Author(s).
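Illustrative note: the abstract describes scoring pairs of hemagglutinin sequences with several substitution matrices and evaluating a random forest by 10-fold cross-validation. The sketch below is not the authors' JRFR implementation. It only shows, under stated assumptions, how one pair of aligned sequences might be turned into a feature row using more than one matrix, and how 10 folds could be laid out. The toy matrices, their values, the default score of 1.0, the choice to join matrices by concatenation, and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy "substitution matrices": one score per unordered residue pair.
# Real matrices cover all 20 amino acids; these values are made up for illustration.
Matrix = Dict[FrozenSet[str], float]

TOY_HYDROPHOBICITY: Matrix = {frozenset("KE"): 2.0, frozenset("NS"): 0.5}
TOY_VOLUME: Matrix = {frozenset("KE"): 0.3, frozenset("NS"): 1.0}


def site_scores(seq_a: str, seq_b: str, matrix: Matrix) -> List[float]:
    """Per-site dissimilarity of two aligned sequences under one matrix."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    # Identical residues score 0.0; unlisted substitutions default to 1.0 (assumption).
    return [
        0.0 if a == b else matrix.get(frozenset((a, b)), 1.0)
        for a, b in zip(seq_a, seq_b)
    ]


def joint_features(seq_a: str, seq_b: str, matrices: List[Matrix]) -> List[float]:
    """Concatenate per-site scores from several matrices into one feature row.

    Concatenation is an assumption made here; the abstract does not say how
    JRFR combines the top matrices.
    """
    row: List[float] = []
    for matrix in matrices:
        row.extend(site_scores(seq_a, seq_b, matrix))
    return row


def k_fold_indices(n: int, k: int = 10) -> List[List[int]]:
    """Split range(n) into k interleaved test folds (no shuffling, for clarity)."""
    return [list(range(i, n, k)) for i in range(k)]


features = joint_features("KNA", "ESA", [TOY_HYDROPHOBICITY, TOY_VOLUME])
folds = k_fold_indices(20, k=10)
```

Rows built this way, paired with measured antigenic distances, could then be passed to any random forest regressor (for example scikit-learn's `RandomForestRegressor`), training on nine folds and testing on the held-out one.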
Source Title: Scientific Reports
URI: https://scholarbank.nus.edu.sg/handle/10635/178615
ISSN: 2045-2322
DOI: 10.1038/s41598-017-01699-z
Appears in Collections: Staff Publications
Files in This Item:
File: 10_1038_s41598-017-01699-z.pdf
Size: 1.81 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

This item is licensed under a Creative Commons License.