Structure, syntax and “small-world” organization in the complex songs of California Thrashers (Toxostoma redivivum)

Abstract We describe songs of the California Thrasher (Toxostoma redivivum), a territorial, monogamous species whose complex songs are composed of extended sequences of phonetically diverse phrases. We take a network approach, so that network nodes represent specific phrases, and links or transitions between nodes describe a subgroup structure that reveals the syntax of phrases within the songs. We found that individual birds have large and largely distinct repertoires, with limited phrase sharing between neighbours and repertoire similarity decaying between individuals with distance apart, decaying also over time within individuals. During song sequences, only a limited number of phrases (ca. 15–20) were found to be actually “in play” at any given time; these phrases can be grouped into themes within which transitions are much more common than among them, a feature contributing to a small-world structure. It appears that such “small-world themes” arise abruptly, while old themes are abandoned more gradually during extended song sequences; most individual thrashers switch among 3–4 themes over the course of several successive songs, and some small-world themes appear to have specific roles in starting or ending thrasher songs.

most obvious features to examine (Kershenbaum et al. 2014). Only after understanding the rules governing such sequences can we reasonably expect to understand how those might convey meaning. Consequently, the songs of many species with complex songs have now been described in detail, including Cassin's Vireo (Hedley 2015), Spectacled Warbler (Palmero et al. 2012), Hermit Thrush (Roach et al. 2012) and Nightingale (Weiss et al. 2014).
The song of the California Thrasher (Toxostoma redivivum) is of special interest for syntax that might convey meaning. Its songs have complexity features similar to those of the abstract mathematical systems most capable of transmitting information (Wolfram 2002; Taylor and Cody 2015), being neither too simple nor chaotic. In this paper, we detail some structural properties of thrasher song that contribute to its complexity, including repertoire size and small-world themes. We also describe differences among individuals and across time. This documentation will provide a foundation for descriptions of additional work from our laboratory on more abstract measures of song complexity and on experimental investigations of how the birds themselves interpret song structure.
Of the common complexity measures, repertoire size is in some ways the simplest index of song complexity, but may not be especially diagnostic (Garamszegi et al. 2005; Botero et al. 2008). We show how repertoire size increases over time, even for single individuals, without apparent limit. A second feature, small-worldness (SW), is possibly a more general measure of complexity, since the statistical structure of symbol sequences, such as phrases in bird song, is often scale-free and can provide insight into the processes that generate them (Watts and Strogatz 1998; Cancho and Solé 2001). Sasahara et al. (2012), Weiss et al. (2014), and Deslandes et al. (2014) have documented SW in the songs of several bird species, suggesting it may be a common feature of complex bird songs, and inviting further investigation. In this study, we elaborate on our earlier finding of such sequence patterns in thrasher songs, and use the small-world feature as a measure of how birds employ one or a few themes at a time within songs, while over a long interval they progress through a series of distinct themes.

Study site and scope of recordings
Fieldwork was conducted in the Santa Monica Mountains of Los Angeles County, California, near Saddle Peak (elevation 845 m); the study area stretches approximately 1 km along a minor road, Topanga Lookout Road (TLR), leading NE from the saddle at 34°04′53″ N, 118°38′42″ W and ending at 34°05′28″ N, 118°38′11″ W (elevation 745 m). Chaparral extends upslope (SE) and downslope (NW) on both sides of TLR. Conspicuous in this area are co-dominant shrub species of Ceanothus, Rhamnus, Quercus, Adenostoma, Cercocarpus, Heteromeles and Arctostaphylos, providing a dense, shrubby habitat 3-4 m tall that supports around 20 breeding bird species (excluding far-ranging raptors and aerial foragers), at combined densities upwards of 10 pairs per hectare. Tall chaparral is the primary and preferred habitat of California Thrashers, in which they maintain year-round territories and reach densities of 0.4-1 pairs per hectare (Cody 1997).
Recordings for this study were made between dawn and dusk throughout the breeding season of 2012, on 17-20 January, 8-10 February, 27-29 March and 26-28 April. Singing activity was sporadic in January and again in April, strong in February and at a peak in March; this corresponds to an onset of breeding in a typical southern Californian year from late January into February, with eggs and incubation into March and young in the nest in April. After the February recording episode, thrasher territories were deemed stable and delineated around 14 focal thrasher pairs (see the map in Supplementary material, Section 1).
Songs were recorded on a Marantz PMD 670 digital recorder with a Sennheiser ME62 omnidirectional microphone mounted in a 50 cm Telinga parabolic reflector. Files were stored in uncompressed Pulse Code Modulation format at 44.1 kHz and retained in .wav files as sequentially numbered "tracks" obtained from specific individual thrashers. Downloaded files, identified by location, date and time, were duplicated and archived for later analysis in their original uncompressed format; all recorded California Thrasher song tracks and their
Figure 1. Sonograms of representative California Thrasher phrase types discussed in the text. In each case, the range on the ordinate is from 0 to 10 kHz, and the horizontal line through each sonogram is drawn at 3 kHz. Time scale is indicated by the length of the line segment at the bottom of the figure. Sonograms for all phrase types mentioned in this study are illustrated in the Supplementary materials.

Phrase identification and song analyses
All California Thrasher recordings of quality suitable for analysis, with clearly discernible sonograms, were examined spectrographically with Praat (Boersma and Weenink 2015). Spectrograms of some representative California Thrasher phrases are shown in Figure 1. Individual phrases were initially identified by assessing their phonetic characteristics in spectrogram form, and given three-letter ID codes; we accumulated a master chart of coded phrase sonograms against which all subsequent phrases were evaluated (see Supplementary material, Section 2).
The start and end times of each phrase were determined with approximately 0.01 s precision. In practice, the process of phrase delimitation and identification was aided by the fact that pauses between phrases averaged ≥50% longer than phrase duration. While many California Thrasher phrases are essentially monosyllabic, with a single extended burst of sound, many are not. Our analysis is inspired by the linguistics of human languages. Some acoustic differences in sounds are meaningful in English and other human languages while others are not. When such differences are meaningful they are termed "phonemes", and "allophones" when not. Analyses of syntax and semantics are typically conducted only with phonemes. In the case of birds, we have little knowledge about the meaning of acoustic differences, so we must base analysis on acoustic differences alone. As will be detailed below, this is complicated by the apparently unlimited number of identifiable clusters of acoustically similar vocalizations with time and space. When examined in more and more detail, even these apparent clusters sometimes separate with differences in time or individual (see Supplementary material, Section 3). Yet even cursory examination of the sound sequences, illustrated e.g. in Figure 2, shows that structure is present. Accordingly, we used two criteria to define a unit phrase: temporal continuity and consistent association. (i) If bi- or tri-syllabic phrase syllables were essentially continuous in time or strictly contiguous, they were regarded as a single phrase. (ii) If temporal continuity was ambiguous but the syllables were always found associated in identical sequence, they were interpreted to constitute a single phrase. The files for classifications (.TextGrid) paired with sound (.wav) files are downloadable from the database mentioned above, in the "Keycode_file" column.
Figure 2. An example of a California Thrasher song bout by CATH 9 recorded 19 January 2012, at 8:10 AM. The sequence consisted of 190 distinct phrases, of 11 phrase types, represented here as three-letter codes at the nodes of a directed graph. Transitions among the phrases, representing the sequence in which the phrases are sung, are shown as lines connecting the vertices, with thickness in proportion to the frequency with which that transition occurs. Rare phrases, those that occur just a few times in the 190-phrase song bout, are shown as dashed nodes and transitions. Three phrase groups show high rates of internal transitions, and are circumscribed as small-world themes.
Very detailed discrimination, e.g. with multidimensional scaling of acoustic features, revealed that further subdivision into smaller classes was sometimes possible. However, independent observers could normally agree on phrase types and previous analysis with machine learning had determined that thrasher phrases could be reliably and objectively distinguished in about 95% of the cases with the classification we employed (Sasahara et al. 2012).
The song files used here total 84 tracks from 14 different territories that represent nearly 400 min of recording time (see Supplementary material, Section 4). From these tracks 12,332 phrases were identified then annotated into phrase types. There were totals of >1000 phrases from each of five California Thrasher territories (#s 1, 2, 7, 8, 9). Track lengths were of concern, since only long sequences of phrases from extended singing bouts are amenable to pattern and transition analysis; here we deal primarily with 32 tracks of ≥125 phrases, 16 of which have ≥250 phrases, and three comprised more than 500 phrases in continuous time sequences.
We measured repertoire size and repertoire development (accumulation rate of different phrases along song sequences) as a function of string or sequence length, on several scales of resolution. When tracks were recorded sequentially, they were concatenated together; otherwise they were treated separately. Phrase repertoires were compared within and among individuals, relative to both temporal and spatial separations between song tracks, to examine neighbour, distance and elapsed time effects on repertoire similarities. For phrase types $1, \dots, N$ observed with counts $X = \{x_1, \dots, x_N\}$ in one individual or time, and $Y = \{y_1, \dots, y_N\}$ for another individual or time, we measured the similarity in overall phrase usage as $\mathrm{SPU} = X \cdot Y / (\|X\| \, \|Y\|)$, where $X \cdot Y = \sum_i x_i y_i$ and $\|Z\| = [\sum_i z_i^2]^{1/2}$ for $Z = X, Y$; this measure is equal to the cosine of the angle between the phrase frequency vectors of the two individuals, and is similar to a correlation coefficient. Further, we investigated whether there are systematic, discernible song differences from month to month throughout the season. Such comparisons among recorded tracks were aided by using the SPU and also the "standardized repertoire", equal to the repertoire size for that track divided by its length and multiplied by 100.
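As an illustration of the two measures just defined, the following is a minimal sketch, assuming each track is available as a plain list of three-letter phrase codes. The function names `spu` and `standardized_repertoire` are hypothetical and are not taken from the software used in this study.

```python
from collections import Counter
from math import sqrt


def spu(track_a, track_b):
    """Cosine similarity between the phrase-count vectors of two tracks.

    Assumption: each track is a list of phrase-type codes (strings).
    """
    x, y = Counter(track_a), Counter(track_b)
    # Only phrase types present in both tracks contribute to the dot product.
    dot = sum(x[p] * y[p] for p in x.keys() & y.keys())
    norm_x = sqrt(sum(c * c for c in x.values()))
    norm_y = sqrt(sum(c * c for c in y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an empty track shares nothing with any other track
    return dot / (norm_x * norm_y)


def standardized_repertoire(track):
    """Number of distinct phrase types per 100 phrases sung."""
    return 100 * len(set(track)) / len(track)
```

Two tracks with identical phrase usage give a value of 1, and two tracks with no phrase types in common give 0, so the percentages quoted for SPU can be read as this value multiplied by 100.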

Small-world statistics and other graphical measures
Songs from identified California Thrasher individuals were converted to phrase sequences, and the incidence of phrase transitions for each individual and track was computed. For each track, we created a directed graph as shown in Figure 2, with phrase types as vertices and transitions represented by directed edges between them. The frequencies of transitions among phrase types are represented by the thickness of connecting lines in the figure. The directed graphs were converted to undirected graphs for our analysis by removing arrows and self-transitions, then collapsing redundant edges. Small-world analysis was conducted on the undirected graphs using the PajaroLoco software package (Sánchez et al. 2015).
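The conversion from a phrase sequence to the undirected graph can be sketched as follows. This is a minimal illustration under the assumption that a track is a list of phrase codes; it is not the PajaroLoco implementation, and the function names are hypothetical.

```python
from collections import Counter


def transition_counts(sequence):
    """Count each directed transition (phrase, next phrase) in a track."""
    return Counter(zip(sequence, sequence[1:]))


def undirected_graph(sequence):
    """Adjacency sets after dropping direction and self-transitions.

    Using sets collapses redundant edges: a->b and b->a, however
    often they occur, become a single undirected edge.
    """
    adj = {phrase: set() for phrase in sequence}
    for a, b in transition_counts(sequence):
        if a != b:  # self-transitions are removed
            adj[a].add(b)
            adj[b].add(a)
    return adj
```

The counts returned by `transition_counts` correspond to the line thicknesses in a figure such as Figure 2, while `undirected_graph` keeps only the presence or absence of a link.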
The characteristic path length ($L$) of a graph is the average of the means of the shortest path lengths connecting each vertex to all other vertices. The average path length in Figure 2 is 1.62. The clustering coefficient ($c_v$) for a vertex $v$ is the extent to which vertices adjacent to (connected to) $v$ are also adjacent to each other. It is calculated as the ratio of paths of length two in the graph that are closed, to all paths of length two in the graph. The average value of $c_v$ over all graph vertices is termed $C$. This is sometimes called the cliquishness of the graph, because in social networks it measures the proportion of $v$'s friends that are also friends with each other. The value of $C$ for the full graph in Figure 2 is 0.74. Corresponding values of $L$ and $C$ exist for comparable random graphs with the same numbers of nodes and edges: $L_{random}$ and $C_{random}$; procedures for creating such random graphs are described in Watts (1999) and Durrett (2007). Finally, the test for small-worldness is the measure $SW = (C/C_{random})/(L/L_{random})$. If SW is greater than 1, then the graph is said to be small world (Watts and Strogatz 1998); in Figure 2, the value of SW is 1.7.
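The quantities defined above can be sketched in a few lines. This is an illustration only, assuming the undirected graph is connected, has at least two vertices, and is stored as a dictionary of adjacency sets. The clustering coefficient here is the per-vertex version averaged over vertices, which is one common reading of the definition, and the random-graph reference values are taken as inputs rather than generated. None of the function names come from PajaroLoco.

```python
from collections import deque


def characteristic_path_length(adj):
    """Average over vertices of the mean shortest-path length to all others."""
    per_vertex_means = []
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source vertex
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        others = [d for node, d in dist.items() if node != source]
        if others:
            per_vertex_means.append(sum(others) / len(others))
    return sum(per_vertex_means) / len(per_vertex_means)


def clustering_coefficient(adj):
    """Mean over vertices of the fraction of neighbour pairs that are linked."""
    values = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            values.append(0.0)  # convention assumed here for degree < 2
            continue
        nb = list(nbrs)
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nb[j] in adj[nb[i]])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)


def small_worldness(c, l, c_random, l_random):
    """SW = (C / C_random) / (L / L_random); values above 1 indicate small world."""
    return (c / c_random) / (l / l_random)
```

For a triangle with one extra vertex attached to a single corner, this gives L = 4/3 and C = 7/12, which can be checked by hand.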
We identify SW themes as groups of phrase types within which locally high internal transition frequencies $q_{ij}$ average, as a rule of thumb, greater than 0.75. This is similar, though not identical, to the criterion for defining "social communities" used in Radicchi et al. (2004), Girvan and Newman (2002), and the Mathematica software package (Wolfram Research Inc. 2013). Of the various options offered by PajaroLoco for grouping into themes, the modularity and spectral options gave somewhat different groupings, but gave the clearest separation. For the analysis here, the modularity option was chosen. The three SW themes indicated in Figure 2 have internal transition probabilities of 0.80, 0.82 and 0.76, respectively (left to right). In general, a small-world structure comes from having high neighbour-to-neighbour transition frequencies within local phrase groups and relatively few long-distance connections, such that average path length is modest. Small-world songs are decidedly non-random, as discussed in Sasahara et al. (2012).
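One way to check a candidate theme against the 0.75 rule of thumb is sketched below. It assumes that the internal transition frequency of a theme can be summarized as the fraction of transitions leaving a theme phrase that land on another phrase of the same theme (self-transitions included); the exact averaging used in the study may differ, and the function name is hypothetical.

```python
def internal_transition_fraction(sequence, theme):
    """Fraction of transitions starting inside `theme` that also end inside it.

    Assumptions: `sequence` is a list of phrase codes and `theme` is any
    collection of phrase codes proposed as a small-world theme.
    """
    theme = set(theme)
    leaving = [(a, b) for a, b in zip(sequence, sequence[1:]) if a in theme]
    if not leaving:
        return 0.0  # no transition in the track starts inside the theme
    internal = sum(1 for _, b in leaving if b in theme)
    return internal / len(leaving)
```

Under this reading, a group of phrases would qualify as a theme when the returned value exceeds 0.75.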
Each internal node in a directed song graph has incoming edges and outgoing ones. Sasahara et al. (2012) distinguished four types, or transition motifs, of such nodes: (a) one-way nodes, where only one type of edge led into the node and one other kind exited; (b) bottleneck nodes, where more than the average number of edges came in and fewer than the average exited; (c) branch nodes, where fewer than the average number of edges entered and more than the average exited, the complement to bottleneck nodes; and (d) hourglass nodes, where more than the average number of edges entered and more than the average exited. The number of nodes in the song graph that were of each motif was calculated for each track using the PajaroLoco software.
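The four motif definitions can be turned into a simple classifier, sketched here under several assumptions that are not stated in the text: edges are counted as distinct transition types rather than weighted by frequency, self-transitions are ignored, the average is taken over all nodes of the track, and any node fitting none of the four definitions is labelled "other". The function name is hypothetical.

```python
def classify_motifs(sequence):
    """Label each phrase type by its transition motif in the directed graph."""
    # Distinct directed edges, ignoring self-transitions (an assumption).
    edges = {(a, b) for a, b in zip(sequence, sequence[1:]) if a != b}
    nodes = set(sequence)
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for a, b in edges:
        outdeg[a] += 1
        indeg[b] += 1
    mean_in = sum(indeg.values()) / len(nodes)
    mean_out = sum(outdeg.values()) / len(nodes)

    labels = {}
    for n in nodes:
        i, o = indeg[n], outdeg[n]
        if i == 1 and o == 1:
            labels[n] = "one-way"
        elif i > mean_in and o < mean_out:
            labels[n] = "bottleneck"
        elif i < mean_in and o > mean_out:
            labels[n] = "branch"
        elif i > mean_in and o > mean_out:
            labels[n] = "hourglass"
        else:
            labels[n] = "other"  # e.g. first or last phrase of the track
    return labels
```

Counting how many nodes receive each label gives per-track motif totals of the kind reported alongside repertoire size.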

Overall song structure
California Thrasher songs are delivered as sequences of various different and discrete phrases. During a typical singing bout, a thrasher sings more or less continuously for several (1, 2-10 or more) minutes, without changing its perch position. Within the singing bout, phrase strings are grouped into "songs," within which there are only barely discernible pauses between phrases, while separate songs are identified by marked pauses on the order of the same time length as the songs themselves (minimum 2 s). This nomenclature is intended to be consistent with that proposed by Catchpole and Slater (2008).
Typical California Thrasher song structure is illustrated in Figure 3, a segment from a longer recording of two concatenated tracks of CATH 8 recorded on 29 March, at 7:45 AM. In this figure, the sonograms of three successive songs are shown, delivered between approximately 44 and 103 s along the track; pauses between the songs were 7 and 13 s, respectively. Broad differences in structure among the three songs are apparent. These three songs use 6, 12 and 13 different phrases, respectively, with many repeats both within and between songs. Overall for this recording, there were a total of 1312 phrases, grouped into 74 songs and consisting of 37 different phrase types. Songs average 18.5 phrases in length, with an overall delivery rate of about 3 phrases/s. The mean time elapsed between successive phrases within songs was 0.23 ± 0.24SD s, while inter-song pauses averaged 7.5 ± 5.7SD s. These numbers are typical for thrashers at the site in February and March; in January and April, songs tended to be somewhat shorter with longer pauses between them.

Repertoire and phrase diversity
Repertoire sizes were in all cases observed to increase with the lengths of the recordings, non-linearly and with no asymptote apparent. The number of different phrase types observed in the repertoire of any individual thrasher depended on the number of accumulated phrases in the recording. In all, 680 different phrase types were identified in the 12,322 phrase database for 2012. Figure 4 shows the growth in number of phrase types used by four different thrashers for whom we had the most extensive recordings. The repertoire size grew rapidly at first, then more slowly, without ever appearing to level off completely. It appears that the number of phrase types observed would eventually increase without limit.
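A repertoire accumulation curve of the kind plotted in Figure 4 can be computed as in the following minimal sketch, which assumes a track (or several concatenated tracks) is given as a list of phrase codes; the function name is hypothetical.

```python
def accumulation_curve(sequence):
    """Cumulative number of distinct phrase types after each phrase sung."""
    seen = set()
    curve = []
    for phrase in sequence:
        seen.add(phrase)
        curve.append(len(seen))
    return curve
```

Plotting the returned values against position in the sequence shows whether the curve flattens or keeps rising as more phrases are recorded.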
The repertoire size per unit of time did not vary greatly among birds. We performed statistical analysis on files where 50 or more phrases were observed. On this local time scale, the standardized repertoire (number of phrase types per 100 phrases sung) averaged 11.05, and did not differ significantly among individuals, among times of day or among the months (January-April) of our observations; see the Supplementary materials Section 5 for details.
There was some sharing of repertoire, but very little. On adjacent territories, thrasher SPU values averaged around 3%. Song similarity in terms of phrase composition fell slightly with increasing distance separating singers, to 0.5% beyond 300 m, which we speculate may be the maximum distance at which birds can hear other singers. The SPU between all phrases annotated here and the recordings reported by Sasahara et al. (2012), from Amador County, 600 km to the north, was 0.59%. This small similarity in repertoire for distant locations has been suggested to underlie isolation by distance (Sgariglia and Burns 2003) and may represent the minimum from shared mimicry of other species. Details of the decay of similarity with distance may be found in the Supplementary material Section 6.
Studies of human linguistics and song repertoire in other species have typically found a Zipf-like distribution of phrase usage, such that a few phrases are disproportionately common, with most being rare (McCowan et al. 1999; Suzuki et al. 2005; Deslandes et al. 2014). The same is observed here. Within each individual and within a short time period, plots of log[frequency of phrase] against phrase rank in all cases show a sharply negative slope with a slightly convex shape, as suggested by Mandelbrot (1961). Examples are shown in the Supplementary material Section 7. All examples show a non-uniform usage of phrases, indicating structure of some sort in their sequences, but just what this might mean is unclear.
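The rank-frequency points behind such a plot can be obtained as sketched below. The sketch assumes a track is a list of phrase codes, uses relative frequency with a base-10 logarithm (the text does not specify the base), and uses a hypothetical function name.

```python
from collections import Counter
from math import log10


def rank_frequency(sequence):
    """Return (rank, log10 relative frequency) pairs, most common phrase first."""
    counts = sorted(Counter(sequence).values(), reverse=True)
    total = len(sequence)
    return [(rank, log10(count / total))
            for rank, count in enumerate(counts, start=1)]
```

A uniform usage of phrases would give a flat line, so any consistent downward slope in these points reflects the non-uniform usage described above.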

Phrase sequencing and transition patterns
For the study of transition patterns, we included only those tracks for which 50 or more phrases were identified. In this group, the number of phrases per individual ranged from 239 to 2618. Table 1 shows these numbers, with repertoire size per track, small-worldness for the graph of that track, and the numbers of the various motifs that were observed for each bird in our sample. The average small-worldness was 1.69, indicating a strong small-world structure.
There are a variety of ways that such structure might arise, but one obvious way is for the repertoire and transition probabilities to change over time. Figure 5 shows the manner in which new phrase types are added to the longest continuous recording, the concatenated tracks of CATH 8 illustrated in Figure 3. It contained 1312 phrases of 37 different types, broken into 67 songs ≥6 phrases long. In the top graph, the ordinate shows phrase types as Phrase ID#, arranged in the rank order in which the phrases were first recorded in the tracks, and the abscissa shows progression through the song sequences. New phrases accumulate with increasing sequence length and are added in batches, generally after several hundred reiterations of the same dozen or so phrase types.
The song sequences unfold with time as various phrases repeat, or transition to new phrases or to phrases previously used. Many of the new phrases that are added to the sequence are accreted in groups that subsequently constitute the core of specific song units, what we term "small-world themes" of several phrases that are closely-knit subgraphs with high internal transition frequencies (see Figure 2). In the bottom graph of Figure 5, the phrase types are ordered according to their small-world themes as identified by PajaroLoco. In both graphs, the phrases that constitute new small-world themes are seen to be introduced as relatively cohesive groups in the phrase sequence on the abscissa; at any one time, this bird has some 3-6 themes in play. The SW of this recording was 2.24, well above the threshold value of 1.0. One view of this network broken into small-world themes is shown in Figure 6. The pattern of new phrases being introduced and "riffed" seen in Figure 5 was also representative of all other recordings. The Supplementary materials Section 8 shows comparable graphs for the long song sequences in other thrasher individuals. These show similar organization around SW themes for those individuals having sufficiently long recordings (CATH 2: 1217 phrases; CATH 4: 526 phrases; CATH 9: 449 phrases; CATH 1: 396 phrases).
This pattern of phrase composition is mirrored in a negative regression of the SPU between recordings at different times on the length of time separating them. While there is a high correlation between recordings that are adjacent in time, that similarity diminishes over consecutive hours or days. The regression was SPU = 0.507 − 0.008 × days, and the correlation between the SPU and days of separation was −0.57. This is shown graphically in the Supplementary materials Section 9.
Table 1 also shows the motif composition of the song network graphs. The rank order of composition in all cases but one was bottlenecks ≤ branches < hourglass < one-way. To the extent that this is a useful and scale-free way to characterize graphs, it agrees with the other measures used here, and with the finding of Weiss et al. (2014) for nightingales, that the structure of the songs is similar across males.

Discussion
The picture that emerges is one of an effectively infinite number of possible phrase types available for California Thrasher songs. Their use is structured: at the top level, a bout consists of songs, which are sequences of 15-20 phrases separated by several seconds; contiguous songs typically contain 15-20 different phrase types grouped into a few recurrent small-world themes, with most transitions among phrases being within-theme. New themes episodically enter into use as older ones gradually disappear. This turnover in phrase usage means that with sufficient time the male is using almost entirely different phrase types. Few phrase types are shared among different males. The phrase types in play over short terms (e.g. within a year) are largely distinct and can be distinguished from one another, but over a sufficiently long time the number of phrase types might be so large as to be practically indistinguishable.
Beyond bunching into more-or-less distinct themes, it appears that there are preferred transitions among themes, and possibly some phrase types are more commonly used to introduce songs or to end them (see the Supplementary material Section 10). These higher, more elaborate groupings are difficult to study because they are largely statistical, and because they involve so many phrase types it will be difficult to infer syntactic rules without much longer phrase sequences and larger sample sizes, probably requiring better classification for automated analysis.
The grouping of phrase usage into sequentially appearing themes, as shown in Figure 5, suggests that one model for assembling phrase sequences (their syntactic rules) might be a Markov process with changing transition probabilities (i.e. an inhomogeneous one). Another explanation might be that the rules governing syntax in the species are not strictly a first-order Markov process, even locally. Stabler et al. (in preparation) have identified some syntactic regularities, and they are indeed more complex than can be described by Markov processes. In view of the interest that linguistic theory has for bird song (e.g. Berwick et al. 2011), more work is clearly warranted.
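To make the idea of changing transition probabilities concrete, the sketch below estimates first-order transition probabilities separately in successive, non-overlapping windows of a phrase sequence. It is an exploratory illustration only, not a method used in this study: the window length is an arbitrary choice, transitions that straddle a window boundary are dropped, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """First-order estimates P(next phrase | current phrase) from one sequence."""
    counts = defaultdict(Counter)
    for a, b in zip(sequence, sequence[1:]):
        counts[a][b] += 1
    return {a: {b: n / sum(row.values()) for b, n in row.items()}
            for a, row in counts.items()}


def windowed_transition_probabilities(sequence, window):
    """Estimate transition probabilities separately in consecutive windows."""
    return [transition_probabilities(sequence[i:i + window])
            for i in range(0, len(sequence), window)]
```

If the estimates for the same phrase differ markedly from one window to the next, that is consistent with an inhomogeneous process, although it does not by itself rule out higher-order structure.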
The analysis in this paper is predicated on the assumption that identification of rules for symbolic sequencing will enable description of the song structure. We believe it entirely possible, even likely, that other rules of assembly, involving pitch, rhythm or phrase shapes may also prove to determine song structures and what meanings are conveyed, though it appears that symbolic rules will prove important as well.
The ultimate judge of whether we have identified the correct structural/syntactic rules must come from the birds themselves. Experiments like those of Weiss et al. (2014) with playbacks of songs constructed from different rules are much desired. Such experiments with California Thrashers are currently being conducted. We predict that the birds will be much more attentive and interested when playbacks employ the correct rules for syntax. This study is an attempt to make a first pass at identifying those rules.