(A) Diagrammatic representation of conserved protein motifs and domains within the 1005 amino acid sequence of Psc1. N domain; RS domain, arginine/serine dipeptide repeat; Zn finger, C(X)C(X)C(X)H zinc finger motif; P, proline-rich region; PG, proline/glycine repeats; RRM, RNA binding motif; C domain, shared region of homology between ARRS proteins and homologues; RG, arginine/glycine repeats; acidic rich, C-terminal aspartate/glutamate-rich region. (i–iii) Homologies between the Psc1 N domain (i), Zn finger (ii) and RRM (iii) and proteins of known function or consensus sequences derived from the NCBI conserved domain database of known Zn finger or RNA binding domains. Identical residues are shown in bold. The RRM motifs P(X)N(X)HF(X)FG(X)N and A(X)A(X)S(X)NNRFI(X)W that are unique to ARRS proteins and homologues are boxed (iii). (B) Alignment of representative ARRS proteins and homologues with Psc1. Representative proteins are from human, mouse, amphibian, fruit fly, mosquito, nematode worm and slime-mould. Conservation extends to all ARRS proteins, including those from fish, chicken and rat (data not shown). Elements are drawn to scale and the positions of motifs described in (A) are indicated. Percentages indicate the degree of amino acid identity to the equivalent domain in Psc1. (C) Unrooted distance neighbour-joining tree showing a phylogeny of ARRS proteins. Sequences for predicted ARRS proteins from fish (accession codes SINFRUP00000133230; SINFRUP00000133187), chicken (accession codes ENSGALG00000007516; ENSGALG00000016910) and rat (accession code ENSRNOG00000009836) were taken from the ENSEMBL database (). Sequences for mouse (accession code BAC34721), human (accession codes KIAA1311 and AAH41655), amphibian (AAH43744), fruit fly (NP_609976), mosquito (XP_318628), nematode worm (NP_498234) and slime-mould (AAO51188) were taken from the NCBI database ().
Sequences were aligned using ‘CLUSTAL W’ () with the BLOSUM 62 scoring matrix and gap opening and gap extension penalties of 10.0 and 0.1, respectively, followed by minor manual corrections to conform to known structural features. The tree was constructed with PAUP* () using standard distances and mean character differences. High confidence was confirmed by congruent tree topology under Parsimony treatment (PAUP*, data not shown) and by the high resampling statistics indicated at nodes (1000 bootstrap replications, shown as percentage values; distance above and Parsimony below). Shaded ellipses define two protein clades, Psc1 and BAC34721, the result of a putative gene duplication (*) that occurred in the vertebrate lineage. The clades shown are composed of orthologous proteins from vertebrate and invertebrate organisms. The scale bar indicates a distance of 0.1 amino acid substitutions per position in the sequence. (D) Sequence comparison of RS domain, proline-rich and acidic rich elements in ARRS proteins and homologues. The asterisk in the sequence of AAO51188 represents an intervening 11 amino acids not shown. All other sequences are contiguous, with periods used to align areas of similarity. DNE, does not exist.
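For readers unfamiliar with the distance measure named for the tree, the mean character difference between two aligned sequences is the fraction of compared alignment positions at which they differ. The following is a minimal Python sketch of that calculation only. It is not the PAUP* implementation: the function names, the toy sequences and the choice to skip any column containing a gap are assumptions made for illustration.

```python
from itertools import combinations

# Characters treated as alignment gaps (assumption: '-' and '.').
GAP_CHARS = {"-", "."}


def mean_character_difference(seq_a, seq_b):
    """Fraction of compared alignment columns at which two sequences differ.

    Columns where either sequence has a gap are skipped. This gap handling
    is an assumption of the sketch, not necessarily what PAUP* does.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Symmetric pairwise distance table for a {name: aligned_sequence} dict."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = mean_character_difference(alignment[x], alignment[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


# Hypothetical toy alignment, not taken from the ARRS sequences.
toy_alignment = {
    "seq1": "MKRSRSPG-D",
    "seq2": "MKRSRTPGAD",
    "seq3": "MRRS-SPGAE",
}

if __name__ == "__main__":
    matrix = distance_matrix(toy_alignment)
    for name, row in matrix.items():
        print(name, {other: round(value, 3) for other, value in row.items()})
```

Under the same gap-handling assumption, a percentage identity corresponds to 100 × (1 − mean character difference) over the compared columns.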