Multiple Sequence Alignment of Influenza Hemagglutinin Protein Sequences
A total of 11,981 full-length, non-redundant influenza hemagglutinin protein sequences were downloaded from the NCBI Influenza Virus Resource (August 2011). The homologs were aligned with MUSCLE and the alignment was saved in MSF format. Sequence names were formatted for use with the grep filtering feature of JProfileGrid v2.0. Reference for this dataset: Roca AI, et al. ProfileGrids Solve the Large Alignment Visualization Problem: Influenza Hemagglutinin Example. F1000Research 2:2 (2013).
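
For readers who want to inspect the alignment outside JProfileGrid, the following is a minimal Python sketch of reading an MSF file and selecting sequences by a regular expression on their names. It is not the tool's own implementation. The function names (read_msf, grep_names) and the sample names H1_example and H3_example are hypothetical and only illustrate the idea; the actual naming scheme used in this dataset is not reproduced here. The parser assumes the common GCG MSF layout: "Name:" lines in the header, a "//" divider line, then interleaved blocks of name-prefixed sequence chunks.

```python
import re

def read_msf(text):
    """Parse MSF-formatted text into a dict of {name: aligned sequence}.

    Assumes sequence names contain no whitespace and that '.' and '~'
    mark gaps; both are converted to '-'.
    """
    lines = text.splitlines()
    # The header ends at the line consisting only of '//'.
    try:
        divider = next(i for i, ln in enumerate(lines) if ln.strip() == "//")
    except StopIteration:
        raise ValueError("no '//' divider found; not an MSF file?")

    # Collect sequence names from the header, preserving their order.
    chunks = {}
    for ln in lines[:divider]:
        m = re.match(r"\s*Name:\s*(\S+)", ln)
        if m:
            chunks[m.group(1)] = []

    # Body lines start with a known name followed by sequence chunks.
    # Coordinate lines and blank lines are skipped because their first
    # token is not a known name.
    for ln in lines[divider + 1:]:
        parts = ln.split()
        if parts and parts[0] in chunks:
            chunks[parts[0]].extend(parts[1:])

    return {
        name: "".join(pieces).replace(".", "-").replace("~", "-")
        for name, pieces in chunks.items()
    }

def grep_names(alignment, pattern):
    """Keep only the sequences whose name matches the regular expression."""
    rx = re.compile(pattern)
    return {name: seq for name, seq in alignment.items() if rx.search(name)}

# Tiny made-up alignment, used only to exercise the functions above.
SAMPLE = """\
 MSF: 12 Type: P Check: 0 ..

 Name: H1_example Len: 12 Check: 0 Weight: 1.0
 Name: H3_example Len: 12 Check: 0 Weight: 1.0

//

H1_example  MKAIL. VVLLY
H3_example  MKTIIA LSYIF

H1_example  T
H3_example  C
"""

if __name__ == "__main__":
    aln = read_msf(SAMPLE)
    for name, seq in grep_names(aln, r"^H3_").items():
        print(name, seq)
```

For the real file, the text would be read from disk and passed to read_msf. With roughly twelve thousand sequences, holding the chunks in lists and joining once per sequence, as done here, avoids repeated string concatenation.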