SF3B1 mutations in different cancer types cause recognition of sterically hindered cryptic splice sites downstream of the branch point

<p>One of the biggest surprises to emerge from the growing catalog of somatic mutations in various cancer types is the recurrent mutation of genes encoding components of the RNA spliceosome. Recurrent mutations in the highly conserved HEAT 5-9 repeats of splicing factor 3B subunit 1 (<em>SF3B1</em>) have been reported in myelodysplastic syndrome (MDS), chronic lymphocytic leukemia (CLL), breast cancer, uveal melanoma (UM), and pancreatic cancer. Interestingly, <em>SF3B1</em> mutation is associated with poor prognosis in CLL but improved prognosis in MDS and UM. Prior studies have shown that <em>SF3B1</em>-mutated CLL samples use canonical 5’ splice sites but cryptic 3’ splice sites. However, it is unknown whether <em>SF3B1</em> mutation causes the same 3’ splicing defects in different cancers. The mechanism by which <em>SF3B1</em> mutations cause cryptic 3’ splicing, and the functional consequences thereof, also remain unresolved.</p> <p>Here we define the specific sequence requirements for cryptic 3’ splicing in tumors with mutated <em>SF3B1</em>. We examined splice junction usage in transcriptome data from <em>SF3B1</em> mutant and unmutated CLL, UM and breast cancer (BRCA) cases and found that the cryptic acceptors used by <em>SF3B1</em> mutants are AG dinucleotides ~13-17 bp downstream of the branch point, which are likely sterically hindered when <em>SF3B1</em> is unmutated. The cryptic acceptors are also located >10 bp upstream of nearby canonical acceptors and thus avoid competing with them for splicing. In our genome-wide analysis, only 617 AG dinucleotides met these specific sequence requirements and were used as cryptic acceptors.
The same cryptic 3’ splicing signature was observed in different cancers but only in samples with mutations in ~10 amino acid hotspots in the <em>SF3B1</em> HEAT 5-9 repeats.</p> <p>We assessed the functional impact of <em>SF3B1</em> mutation and found that the cryptic acceptors are typically used at low frequency in the <em>SF3B1</em> mutants (<10% relative to the canonical splice site) and are sometimes present in the <em>SF3B1</em> unmutated tumors but at an even lower frequency (<0.5% relative to the canonical splice site). Nonetheless, we identified three genes previously implicated in cancer (<em>TTI1</em>, <em>MAP3K7</em> and <em>FXYD5</em>) and four others (<em>YIF1A</em>, <em>ORAI2</em>, <em>ZNF91</em>, <em>RP11-1280I22.1</em>) with cryptic acceptors that were consistently preferred to the associated canonical acceptors in the CLL <em>SF3B1</em> mutant samples.</p> <p>Our study suggests that cryptic 3’ splicing in <em>SF3B1</em> mutants results from altered sterics of <em>SF3B1</em> and other proteins bound at the branch point, which allows the usage of acceptors that are normally hindered. It also provides a framework for understanding the effects of <em>SF3B1</em> mutations on the pathophysiology of various cancers.</p>
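<p>The positional requirements described above (an AG dinucleotide ~13-17 bp downstream of the branch point and >10 bp upstream of the canonical acceptor) can be illustrated with a minimal sketch. This is not the pipeline used in the study. The function name, the 0-based coordinate convention, and the choice to measure distances from the branch point adenosine to the A of each AG are assumptions made purely for illustration.</p>

```python
def candidate_cryptic_acceptors(seq, branch_point, canonical_ag,
                                min_bp_dist=13, max_bp_dist=17,
                                min_canonical_gap=10):
    """Return 0-based positions of AG dinucleotides meeting the distance criteria.

    Assumptions (hypothetical conventions, not taken from the study):
      - seq is the pre-mRNA sequence in 5'-to-3' orientation, written as DNA
      - branch_point is the index of the branch point adenosine
      - canonical_ag is the index of the A in the canonical acceptor AG
      - distances are measured to the A of each candidate AG
    """
    seq = seq.upper()
    hits = []
    # Scan only the window 13-17 bp downstream of the branch point.
    for dist in range(min_bp_dist, max_bp_dist + 1):
        pos = branch_point + dist
        if pos + 1 >= len(seq):
            break  # window runs past the end of the sequence
        if seq[pos:pos + 2] != "AG":
            continue
        # Keep the AG only if it lies more than 10 bp upstream of the canonical acceptor.
        if canonical_ag - pos > min_canonical_gap:
            hits.append(pos)
    return hits


# Toy sequence: branch point A at index 10, candidate AG at 25, canonical AG at 45.
example = "T" * 10 + "A" + "T" * 14 + "AG" + "T" * 18 + "AG" + "T" * 3
print(candidate_cryptic_acceptors(example, branch_point=10, canonical_ag=45))
```

<p>In the toy sequence, the AG at index 25 is 15 bp downstream of the branch point and 20 bp upstream of the canonical acceptor, so it passes both filters. Moving the canonical acceptor to within 10 bp of it would exclude it.</p>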