Developmental cell fate choice in neural tube progenitors employs two distinct cis-regulatory strategies

Summary In many developing tissues, the patterns of gene expression that assign cell fate are organized by graded secreted signals. Cis-regulatory elements (CREs) interpret these signals to control gene expression, but how this is accomplished remains poorly understood. In the neural tube, a gradient of the morphogen sonic hedgehog (Shh) patterns neural progenitors. We identify two distinct ways in which CREs translate graded Shh into differential gene expression in mouse neural progenitors. In most progenitors, a common set of CREs control gene activity by integrating cell-type-specific inputs. By contrast, the most ventral progenitors use a unique set of CREs, established by the pioneer factor FOXA2. This parallels the role of FOXA2 in endoderm, where FOXA2 binds some of the same sites. Together, the data identify distinct cis-regulatory strategies for the interpretation of morphogen signaling and raise the possibility of an evolutionarily conserved role for FOXA2 across tissues.


INTRODUCTION
During development signaling cues direct cell fate decisions. Precise spatiotemporal gene expression is essential for this process. Cis-regulatory elements (CREs) integrate inputs from tissue and cell-type-specific transcription factors (TFs), as well as signaling effectors, to direct gene expression. 1 This suggests a straightforward mechanism in which distinct signals produce different cell types by activating different transcriptional effectors that bind to different CREs. However, in many tissues, a single signal directs multiple, alternative cell fates. 2 Two strategies can be envisioned for how CREs interpret a single signal to define multiple cell fates. Different CREs could function in different cell types. In this strategy-differential accessibility-the availability of an element is the principal determinant of cell identity. In an alternative strategy-differential binding-the same CREs could be used in all cell types in the tissue and the configuration of proteins bound to the elements determines cell fate. An implication of these two strategies is that choosing between two fates will require chromatin remodeling in the case of the differential availability of CREs but can happen without remodeling in the case of different protein configurations at the CREs.
A well-characterized example of a signal controlling multiple cell types is the morphogen sonic hedgehog (Shh). Shh controls the pattern of neuronal subtype generation in ventral regions of the developing neural tube. 3 Shh, initially secreted from the notochord and later from the floor plate (FP), 4,5 forms a ventral to dorsal gradient in the neural tube and establishes a set of molecularly distinct domains that occupy characteristic positions along the dorsal-ventral axis. Progenitors in each domain differentiate into distinct classes of post-mitotic neurons. This organization is necessary for the subsequent assembly of the locomotor and sensory circuits. 6,7 The progenitor domains are identifiable by characteristic gene expression programs (Figure 1A). NKX2.2 is expressed in p3 progenitors, the most ventral domain; 8 OLIG2 is expressed in adjacent motor neuron progenitors (pMNs); 9 PAX6 is expressed in more dorsal progenitor domains, including p0, p1, and p2 domains, whereas lower expression is detected in pMN; 10 NKX6.1 is expressed in, p3, pM, and p2 domains. 11 The pattern of progenitor gene expression is determined by a gene regulatory network (GRN). Broadly expressed tissuewide activators, such as SOX2, promote the transcriptional programs of multiple progenitor domains. [12][13][14] Concurrently, a set of TFs encoding Groucho/TLE-dependent transcriptional repressors are differentially expressed in distinct progenitor domains. 15 Pairs of these TFs, expressed in adjacent domains, cross-repress each other and form a network of transcriptional repression that selects a single progenitor identity by repressing all other cell fates. [16][17][18] The transcriptional effectors of Shh signaling, the GLI (Gli1, Gli2, and Gli3) proteins provide the spatially polarized input that initiates the patterning process. [19][20][21] The combination of the positive and negative inputs generated by this network establish and position the discrete boundaries of gene expression domains in response to graded Shh signaling. 11,22,23 Although the genetic architecture of the GRN is well established, how cell fate choice is implemented through CREs is less clear. Enhancer elements that drive expression of domainspecific TFs have been identified. 13,14 The presence of these cell-type-specific enhancer elements could support the differential accessibility strategy in which different CREs are available in different cell types. However, many of these CREs are bound by SOX2, the Shh effector GLI1 and the repressive TFs expressed in alternative cell types, including NKX2.2, OLIG2, and NKX6.1. 13,16,17 This suggests that the same CREs are employed in different cell types and the composition of TFs at these CREs determines activity: the differential binding strategy.
To distinguish between these two regulatory control strategies, we made use of neural progenitors (NPs) differentiated from embryonic stem (ES) cells. 24, 25 We reasoned chromatin accessibility was the best approach to capture the global landscape of functional regulatory regions. We therefore needed a cell-type-specific accessibility assay, in a system without cell surface markers. Here, we developed an ATAC-seq workflow-crosslinked and TF-sorted ATAC-seq (CaTS-ATAC)based on intracellular flow sorting that allowed the assay of chromatin accessibility of NP cell types. We found that most NPs shared a common set of CREs. These elements are bound by the known repressive TFs, indicating that these cell fate decisions are made by changing the composition of proteins bound to the elements. By contrast, the most ventral NP cell type, p3, has a distinct regulatory program. We identify the role of FOXA2 in driving this chromatin remodeling and show that it is required for p3 specification. This is reminiscent of the pioneering role of FOXA2 in endoderm lineages, raising the intriguing possibility that it represents a remnant of a GRN co-opted from this germlayer or a shared evolutionary origin for these cell types. (D) Representative immunohistochemistry of ES cells differentiated for 6 days show expression PAX6 when exposed to 0 nM SAG and NKX2.2 if exposed to 500 nM. At 100 nM SAG, both OLIG2 or NKX2.2 are detected compared with little or no signal at 10 nM SAG. Scale bars, 50 mm.
(E) Proportion of NPs at each SAG concentration shows generation of higher proportions of more ventral cell types at increasing SAG concentrations. Dots are individual samples. n = 4 biological replicates for each SAG concentration. Line represents the average. Shaded areas, SEM. SAG, smoothened agonist. See also Figure S1.

A stem cell model of ventral neural tube progenitors
To generate NPs of different dorsoventral identities, we made use of an established protocol for the directed differentiation of mouse ES cells. 24,25 To mimic graded Shh signaling, we used different concentrations, ranging from 0 to 500 nM of the Shh agonist SAG (smoothened agonist) ( Figures 1A and 1B). We observed a dose response as measured by Shh target genes Gli1 and Ptch1 ( Figure S1A). Using markers for specific progenitor domains, 26 we assayed cell identity with single-cell resolution ( Figures 1C and 1D). Each SAG condition generated a mixture of two or three NP types. Using the combinatorial expression of TFs, we quantified the proportions of different cell types for each concentration by intracellular staining followed by flow cytometry ( Figure 1C). This showed enrichment of the expected cell types. The highest concentration (500 nM SAG) produced the highest percentage of the most ventral domain, p3, characterized by the expression of NKX2.2; pMN, which express the TF OLIG2, were produced at 500 and 100 nM SAG; p2 progenitors, which express NKX6.1 and PAX6, were produced at 100 and 10 nM. PAX6 expression in the absence of NKX6.1 identifies p0 and p1 progenitors, and these were produced at 0 and 10 nM SAG. The expected combi-natorial expression of TFs was also observed: All p3 and pMN cells also expressed NKX6.1 ( Figure 1E). Moreover, we observed a lower level of PAX6 in pMN cells compared with p3 cells, consistent with in vivo data ( Figure S1B). The different SAG conditions also generated the expected neuronal subtypes after the addition of Notch inhibitor dibenzazepine (DBZ), which induces neuronal differentiation (Figure S1C). Overall, the data indicate that the ES-derived model of neural tube patterning ventral of progenitors faithfully mimics the in vivo response of NPs to Shh.
CaTS-ATAC identifies cell-type-specific regulatory program To distinguish whether the gradient of Shh signaling is interpreted via differential accessibility or differential binding, we reasoned we needed global chromatin accessibility information with cell-type specificity. ATAC-seq is conventionally performed in live, permeabilized cells. However, the cell-type-specific markers that distinguish NPs are intracellular and therefore cannot be used on live cells. Based on previous work, 27 we devised a strategy-CaTS-ATAC-to perform ATAC-seq on formaldehyde fixed cells. This involved intracellularly staining and flow cytometry sorting of cells followed by bulk ATAC library preparation (Figure 2A; STAR Methods).   (B) Heatmap showing ATAC-seq coverage for elements differentially accessible between any two cell types or time points with the same dynamics, grouped in clusters (see STAR Methods) shows decommissioning of the NMP program (cluster 1), a shared pan-neuronal cluster (cluster 4) and two behaviors in cell typespecific accessibility, a shared for p0-1, p2, and pMN samples (clusters 6-8), and a unique program for p3 (clusters 4 and 5). Elements ordered by average accessibility.
(legend continued on next page) ll OPEN ACCESS

Article
We assessed the overall quality of the method by comparing it to ATAC-seq from live cells using the Omni-ATAC protocol. 28 To this end, we sorted Olig2-expressing cells using a previously developed reporter cell line 25 and compared it with CaTS-ATAC pMN samples, both from 100nM SAG at day 5 of differentiation. CaTS-ATAC data were of high quality as assessed by the insert size distribution, proportion of mitochondria DNA, and FRiP (Fraction of Reads in Peaks) score ( Figures S2A-S2C). There was some underrepresentation of the smallest size fragments, which we assume is due to the reverse crosslinking steps, as noted by others 29 ( Figure S2A). The correlation between samples from the same method was high (R 2 > 0.94) ( Figures S2D and S2E) and a bit lower between methods (R 2 > 0.72 for any pair wise comparison) (Figure S2F). We note that the samples will be slightly different as the reporter-sorted cells will include newly differentiated MNs and some OLIG2+ NKX2.2+ double positive cells, which are excluded from out CaTS-ATAC pMN. Overall, we were satisfied with the quality assessments and used standard ATAC QC metrics for all further samples.
We performed CaTS-ATAC-seq over a 4-day time course following the addition of different concentrations of SAG. For each condition, we sorted the two or three predominant cell types. This allows us to compare the regulatory landscape of the same cell type originating at different SAG concentrations as well as different cell types within the same dish ( Figure 2B).
To explore this dataset, we first examined the correlation between samples based on accessibility at transcriptional start sites (TSSs) or distal elements (not overlapping with TSS). As previously reported in other systems, 30 accessibility of distal elements exhibited greater cell-type specificity and higher dynamic range than promoter elements ( Figures S2G and S2H). This confirms our previous observation that ATAC-seq can capture the regulatory landscape of NPs, 31 and it can do so with cell type specificity.
Cell-type-specific accessibility configuration is independent of Shh agonist concentration We first investigated the effect of SAG concentration on cell type-specific chromatin accessibility. To do this, we compared global accessibility changes between different NP types generated under the same SAG concentration and the same NP type generated under different SAG concentrations.
We found that the accessibility profiles between the same NP type isolated from different SAG concentrations were indistinguishable ( Figures 2C-2E). Differential accessibility analysis using DESeq2 revealed no significant differentially accessible peaks between pMN derived from different SAG concentrations ( Figures 2D and 2E) nor between p3 derived from different SAG concentrations ( Figures 2D and 2E). By contrast, over 2,000 differentially accessible peaks were identified between pMN and p3 generated under the same SAG concentration (thresholds: fold change over 2, FDR < 0.01 and base mean > 100 normalized counts) ( Figure 2D). Similarly, p0-1 or p2 arising from different concentrations had few if any differentially accessible elements ( Figure S2I, arrowhead), whereas almost 500 elements were highly differentially accessible between these two NP types generated from the same SAG concentration ( Figure S2I).
Thus, each cell type, identified by marker gene expression, acquired a specific chromatin landscape, irrespective of the agonist concentration to which they were exposed. This indicates the GRN mechanism of morphogen interpretation, that converts graded Shh input into distinct cell identities, establishes not only the discrete transcriptional identities of NPs but also their chromatin landscapes. Thus the differential chromatin accessibility induced by different levels of Shh signaling reflect cell-type-specific identity and are a product of the GRN that patterns progenitors.
A shared chromatin landscape for p0-1, p2, and pMN To examine the global accessibility dynamics, we performed principal component analysis (top 30,000 elements, Figure 3A) and clustered differentially expressed elements across all conditions using k-means ( Figure 3B; STAR Methods). The main change across all data is the global decommissioning of the neuromesodermal progenitor (NMP) program and establishment of the NP program, which occurs between day 3 and day 4 of differentiation ( Figures 3A and 3B). This involves closing a large proportion of elements open in D3 NMPs ( Figure 3B, cluster 1 ''NMP''), as well as opening elements in NPs ( Figure 3B, cluster 3 ''day 4,'' cluster 2 ''days 4 and 5,'' cluster 9 ''pan-neural''). This is expected as cells adopt neural identity. Consistent with this, SOX2 ChIP-seq from NMPs 32 is enriched in cluster 1 NMP elements, and SOX2 ChIPseq from neural cells 13 is enriched in neural clusters (9, 6, 7, 8, 4, and 5) ( Figure 3C).
From day 4 onward, as cells adopt one of the ventral NP fates, the analysis revealed that pMN, p2 and p0-1 were remarkably similar ( Figures 3A and 3B). This is surprising as these progenitors are molecularly distinct and give rise to functional different neurons: pMNs, characterized by the expression of OLIG2 generate motor neurons, whereas the p2 domain expresses IRX3 and generates V2 interneurons. RNA expression of marker genes from paired RNA-seq samples confirmed the expected identity of these cell types ( Figure S3A). NP p3 were, however, distinct in their chromatin accessibility profile ( Figures 3A and 3B).
The markers that define NP domains are repressive TFs and are directly involved in cell-type specification. OLIG2 is expressed in the pMN and represses p3 and p2 fate. NKX6.1 is expressed in p2, pMN and p3 and represses p0-p1. Binding data (C) Heatmap of ChIP-seq coverage for the same elements for NKX6.1, OLIG2, and NKX2.2 16 performed in neural embryoid bodies treated with SAG shows binding to both pan-neuronal and cell-type-specific elements, and SOX2 from either NMP stage 32 or SAG-treated neural embryoid bodies 13 correlates with accessibility in either NMP or neural progenitors, respectively. (D) Footprinting scores (using TOBIAS, see STAR Methods) for motifs with highly variable scores between p0-1, p2, and pMN at days 5 and 6, and normalized RNA counts for the most correlated TF within the motif archetype. The motifs for cell-type-specific TFs show expected footprinting differences. (E) Example loci that are accessible across p0-pMN but were not accessible at day 3 and do not became accessible in p3 NPs. ChIP-seq for NKX6.1 and OLIG2 16 shown for the same elements. Normalized expression for nearby genes Neurog2 and Prdm12 shown. Points are individual samples; bars represent average expression. See also Figure S3. To explore gene regulation differences between p0-p1, p2, and pMN NPs, we turned to footprinting. This computational approach uses transposase insertion sites to identify motifs that are under-transposed within open chromatin and are thus likely to be protected by a bound protein. We used TOBIAS 33 with motifs from three databases, [34][35][36] grouping motifs into published archetypes based on position weight matrix clustering of motifs. 37 The analysis revealed homeodomain (HD) and PAX motifs among those most differentially scoring between pMN, p2, and p0-1 at day 5 and day 6 ( Figure 3D). From all the TFs associated with these motifs, the genes Nkx6-1 and Pax6 had RNA expression levels most consistent with the footprint score for HD and PAX, respectively ( Figure 3D). In addition, we found the Ebox motif to be among the most variable footprints between these NPs ( Figure S3D). The high footprint score in pMN is expected as these NP express Olig2. However, this is not the only Ebox binding TF expressed in these cell types that could explain this footprint. Neurog2 and Neurod1 both bind the same motif and are expressed in p2 and pMN, potentially explaining the Ebox footprint score in p2 NPs. This highlights the challenges in assigning footprints to specific TFs. Overall, the differential footprints found between p0-1, p2, and pMN supports the differential binding strategy, in which a common set of open chromatin regions is shared across cell types and different protein composition of the CREs determine gene activity ( Figure 3E).
FOXA2 establishes the p3 regulatory landscape By contrast to the other NPs, p3 cells had a distinct global accessibility profile, consistent with the differential accessibility strategy ( Figures 3A and 3B). Accessibility was apparent at a unique set of open chromatin regions not shared with any other NP (''p3specific'' cluster 5), and at a set of elements highly enriched in this cell type (''p3-enriched'' cluster 4). These differences are evident in the second principal component of the PCA where p3 cells are clearly distinct from all other ventral progenitors ( Figure 3A).
Analysis of ChIP-seq for the p3-specific TF NKX2.2 revealed NKX2.2 was enriched in the p3-enriched (cluster 4) and p3-specific clusters (cluster 5) ( Figures 3C and S3B). However, NKX2.2 was not specific, and it also bound the elements shared across all cell types (pan-neural cluster 9) and, to a lesser extent, the p0-pMN-enriched clusters (clusters 6 and 7) ( Figures 3C, S3B, and S3C). Thus, the expression of NKX2.2 did not appear to explain the distinct chromatin landscape of p3. What then drives this distinct chromatin remodeling?
To explore which TFs made p3 NP different from other NPs, we compared footprinting scores across all conditions. The strongest footprint distinguishing p3 from other NPs was predicted to correspond to FOXA2 (Figures 4A, 4B, and S4C). This was intriguing as FOXA2 is known to be expressed during FP development. [38][39][40] FOXA2 responds to Shh, 41 and it induces Shh expression, 42 thus generating a new signaling center in the ventral neural tube.
To exclude the possibility that the cells isolated as p3 NPs (SOX2+ NKX2.2+ OLIG2À) contained FP cells, we devised culture conditions that promoted expression of mature FP markers (Arx, Shh) in which Foxa2 is expressed at high levels. We compared our spinal cord (SC) differentiation with FP differentiation and observed little expression of FP markers in SC conditions even at the last time point sorted, D6 ( Figure S4A). Moreover, the expression of FP markers even at later time points was very inefficient in SC conditions, suggesting that the FOXA2 footprint is not the result of FP induction ( Figures S4A  and S4B).
We next analyzed published FOXA2 ChIP-seq generated from neuralized embryoid bodies treated with SAG. 13 This revealed FOXA2 binding only to p3-specific peaks ( Figure 4D). Consequently, we observed open chromatin in p3 NP samples in the regions bound by FOXA2 ( Figure 4C). This contrasts with ChIP-seq of other NP TFs, which showed accessibility across all NP cell types ( Figure S3C). Together, the data prompt the hypothesis that FOXA2 is driving the unique chromatin accessibility profile we observe in p3 cells.
In vivo p3 cells have a history of FOXA2 expression If FOXA2 plays a role in generating the p3 chromatin accessibility profile, it should be expressed in p3 cells. In vivo NKX2.2 and FOXA2 are co-expressed at early stages of neural tube development, 43,44 but these FOXA2 expressing cells have been thought to mark the future FP. 45 To determine whether cells that express FOXA2 also contribute to the p3 domain, we took advantage of (C) Average ATAC-seq accessibility at FOXA2 ChIP-seq peaks 13 in the indicated samples shows these regions are highly accessible in p3 NPs. (D) Normalized FOXA2 ChIP-seq coverage showing accessibility in p3-specific elements from the groups of ATAC-seq elements in the indicated clusters from Figure 3B. (E) Genetic lineage tracing indicates that cells that expressed Foxa2 at E8.5 (tamoxifen administration) have contributed to the p3 progenitor and V3 neuronal cell types by E11.5 (red arrows). (F) Quantifications of p3 and V3 cells expressing tdTomato in embryos collected after tamoxifen administration at the indicated times. Biological replicates: n = 2 for E7.5, n = 4 for E8.5, n = 3 for E9.5. (G) Foxa2À/À ES cells fail to generate p3 NPs when exposed to 500 nM SAG.  Article genetic lineage tracing in mouse embryos. Foxa2-driven CreERT2 expression 46 induced recombination of a fluorescent reporter when tamoxifen was administered to pregnant mice at E8.5 or E9.5. Embryos analyzed at E11.5 showed expression of the reporter in both p3 (as identified by SOX2 and NKX2.2 expression) and V3 neurons (identified as laterally located cells expressing NKX2.2 alone) ( Figures 4E and 4F). When tamoxifen was administered at E7.5, zero and one p3 cells were detected in two embryos, consistent with de novo Foxa2 expression induced by ventral Shh signaling ( Figure 4F). This demonstrates that p3 cells and their neuronal progeny express Foxa2 during their history and supports the hypothesis that FOXA2 expression establishes a unique chromatin landscape in p3.
FOXA2 is required for p3 cell fate specification Our results are consistent with the hypothesis that FOXA2 drives extensive remodeling of the chromatin landscape in p3 cells, which in turn is essential for p3 identity. To test this hypothesis, we used genome engineering to create a FOXA2 knockout mouse ES cell line. Foxa2À/À cells showed a marked reduction in the proportion of p3 cells (25% in wild type to 2% in the mutants) under 500 nM SAG (Figures 4G-4I). Cells co-expressing NKX2.2 and OLIG2 were still present in Foxa2À/À ( Figure 4H). This indicates FOXA2 is not directly required for the expression of NKX2.2. Instead, we hypothesize that FOXA2-driven remodeling is required in order for NKX2.2 to exert its repressive activity on OLIG2 and establish the p3 domain. Consistent with this, we observed binding of FOXA2 at the same locations as NKX2.2 ( Figure S4D).
FOXA2 expression substitutes for Shh signaling early in p3 specification FOXA2 is expressed in NPs closest to the source of Shh, and it is only co-expressed with NKX2.2 at early stages of neural tube development. To ask whether early exposure to Shh signaling is necessary to establish a p3 identity, we took advantage of our in vitro NP differentiation protocol to investigate the timing requirements for FOXA2 expression and Shh signaling in p3 specification. To this end, we exposed cells to 0 nM SAG for 24 h before changing regime to 500 nM SAG. Compared with a constant exposure to 500 nM SAG, cells exposed to delayed SAG are greatly impaired in their generation of p3 NPs (Figures S5A and  S5B). This was not due to reduced Shh transduction ( Figure S5B). The reduction in p3 specification was reminiscent of the outcome of exposing Foxa2À/À NPs to 500 nM SAG (Figures 4G and 4H) as we also saw NPs co-expressing NKX2.2 and OLIG2 ( Figure S5B). When cells were exposed to SAG from day 3, a proportion of them show low but detectable levels of FOXA2 after 24 h. In the delayed SAG addition conditions, however, cells did not upregulate FOXA2 in the equivalent time frame ( Figure S5D).
This regime provided the opportunity to test whether FOXA2 expression within the initial 24 h of NP differentiation was sufficient to rescue p3 generation. We used doxycycline inducible lines that expressed either mCherry or a Foxa2-mCherry fusion under the control of the doxycycline-responsive promoter tetON. Consistent with prior experiments, overexpression of mCherry alone in this ''delayed SAG'' regime showed a reduction in p3 generation at D6 ( Figure 5A). By contrast, enforced expression of Foxa2 for 12 h during the initial 24 h of neural differentiation rescued p3 generation to levels comparable with those observed with constant 500 nM SAG ( Figure 5A).
These data indicate that high Shh at the onset of SC progenitor specification is required to induce Foxa2. This expression of FOXA2 is sufficient to confer competence for Shh signaling to specify p3 identity.
FOXA2 is sufficient to remodel the chromatin landscape To test whether FOXA2 is responsible for the early p3-specific chromatin remodeling, we took advantage of the tetON cell lines. We performed bulk ATAC-seq at D4, which is 24 h after the induction of neural identity and 12 h after doxycycline exposure ( Figure 5B). Both control and FOXA2-overexpressing conditions showed accessibility across the previously defined NMP and pan-neuronal clusters ( Figure S5F). Although neither cell line had been exposed to SAG, FOXA2-expressing cells displayed opened chromatin across the p3-specific cluster ( Figure S5F).
Consistent with its known role as a pioneer factor, 47 FOXA2 overexpression resulted in chromatin opening ( Figure 5B). Over 30% of the regions in the p3-specific cluster overlapped with regions more accessible in FOXA2-overexpressing cells ( Figure 5C). Increased accessibility in cluster 5 (p3-specific) regions in response to induced FOXA2 is observed together with binding of p3-specific TF NKX2-2, with areas of highest binding also sharing FOXA2 binding ( Figure 5D). An example of this behavior is shown for a Gli2 intron ( Figure 5E), as well as three additional loci ( Figure S5G). These results support a model in which FOXA2 opens chromatin regions required for the p3 fate. Thus, Shh induction of Foxa2 at early neural developmental stages is required to configure the p3-specific response of the cells to Shh. Cell fate choice between p0-1, p2, and pMN employ a differential binding strategy, whereas p3 rely on differential accessibility mediated by pioneer factor FOXA2 ( Figure 5F). (E) Example element in a Gli2 intron with p3-specific chromatin accessibility, FOXA2 and NKX2.2 binding that gains accessibility upon Foxa2 overexpression but not control. (F) Diagram describing the two regulatory landscapes that underlie the molecular identity of ventral neural progenitors, a differential binding strategy is used to distinguish p0-1, p2, pMN, whereas p3 employ a differential accessibility strategy. See also Figure S5.
A common lineage-pioneering role for FOXA2 across germ layers FOXA TFs are required for the development of endoderm-derived lineages, pancreas, liver, and lung, where they play partially redundant roles. [48][49][50][51][52][53] Although many TFs play roles in multiple tissues, they are generally thought to act via different enhancers in a combinatorial fashion with other tissue-specific TFs. 54,55 Pioneer TFs are proposed to act by opening up compacted chromatin, thus allowing other TFs to bind. 56 Hence, a given pioneer TF could have access to the same binding sites independent of the cellular context. However, the binding of pioneer TFs has been shown to depend on epigenetic features. 57 Indeed, cell-type-specific cofactors direct cell-type-specific binding of FOXA2 in endogenously expressing cell lines and in FOXA2 overexpressing cell lines. 58 We therefore asked how much of FOXA2's pioneering activity is tissue specific between endoderm and neural p3 progenitors. We examined the binding of FOXA2 in cellular differentiation models of endoderm at day 3 and day 5 50 and asked whether endoderm FOXA2 was bound to any of the SC differentiation clusters. To our surprise, we found FOXA2 also bound to the p3-specific cluster of elements in endoderm (cluster 5) (Figure 6A). This included genes such as Prox1, expressed in both lineages; 59,60 Lmx1b, required for specification of hindbrain p3derived serotonergic neurons 61,62 and for in vitro pancreatic islet generation; 63 and Rfx3, also required for pancreatic endocrine differentiation 64 and expressed in the ventral neural tube 65 (Figure 6B). This argues for a functional role in both tissues for at least some of these targets. Moreover, many of these sites were either opened in NPs upon brief overexpression of Foxa2 ( Figures 6B and S5F) or occupied by overexpressed FOXA2 in endoderm differentiation ( Figure 6B).
Although endoderm and NPs also have tissue-specific FOXA2 binding, these data raise the possibility that the two tissues share a core set of targets. Mechanistically, this could be mediated by the classical pioneer role of FOXA2 in this subset of targets or by the presence of common co-factors in both tissues. Intriguingly, it could offer evidence of an evolutionarily conserved GRN.

DISCUSSION
Here, we identify two cis-regulatory strategies that direct cell fate choice in developing NPs. One strategy-differential bindingrelies on a common regulatory landscape. The different composition of TFs at these CREs dictates differential gene expression and cell fate decisions. The second strategy-differential accessibility-relies on cell-type-specific chromatin remodeling (Figure 5F). This is the case for p3 NPs, which have a unique set of accessible elements that distinguish them from all other NPs. We show that FOXA2, expressed early in response to Shh, is responsible for the remodeling that opens p3-specific elements. Many of these elements then bind the p3-specific TF, NKX2.2. Intriguingly, a subset of the elements bound and opened by FOXA2 in NPs are also bound by FOXA2 during endoderm differentiation, where it plays a parallel role opening elements later required for specific endoderm lineages. 51 The identification of these two regulatory strategies establishes a framework for classifying the genomic implementation of cell fate-decision mechanisms.
To assay cell-type-specific chromatin accessibility, we combined intracellular flow cytometry with methods developed to measure protein levels at single-cell resolution 27 that produced high quality chromatin accessibility data for specific cell types. 29 The method is applicable to other tissues, in vitro and in vivo, and will be of particular use in cases where fluorescent reporters are not readily available or when the identification of the desired cell types depends on the intersectional expression of multiple markers. Moreover, the approach could be extended to isolate cells based on other intracellular characteristics, such as posttranslationally modified signaling effectors. A common set of pan-neural elements, not accessible prior to neural induction, was apparent in all NPs. These elements were enriched for binding of TFs involved in the neural tube GRN, including SOX2, NKX6.1, NKX2.2, and OLIG2. The existence of this set of elements is reminiscent of the idea of morphogenetic fields-discrete, modular units of embryonic development 66 that display a coordinated response to patterning signals. 66 Comparison of chromatin accessibility in the same NP cell type generated using different levels of Shh signaling revealed that cell identity, rather than morphogen concentration, correlated with chromatin accessibility pattern. This indicates that the morphogen interpretation mechanism that converts graded Shh input into distinct cell identities establishes not only the discrete transcriptional identities of NPs but also their chromatin landscapes.
The shared chromatin landscape of the majority of ventral NPs suggests a differential binding strategy governs gene expression selection for these cells. This is consistent with the ''selection by exclusion'' mechanism that has been proposed previously to explain the specification of progenitor subtype identity by graded Shh signaling. 16,17 In this view, the positive and negative transcriptional inputs supplied by the GRN are integrated at CREs associated with target genes, which are accessible in NPs, regardless of whether the target gene is active. Thus, cell fate choice does not depend on major chromatin remodeling but instead is determined by the composition of transcriptional effectors bound. These findings challenge the validity of approaches typically used to infer gene regulation and cell identity from genomic data, which equate open chromatin with gene activity and cell fate. Our data demonstrate that a major regulatory strategy-''differential binding''-violates this assumption and indicate the necessity of developing more nuanced methods.
By contrast, the differential accessibility strategy appears to dictate p3 identity. These NPs are distinguished by accessibility at a distinct subset of CREs that depend on FOXA2. Despite the transience of FOXA2 expression in p3 NPs, 67 accessibility at the CREs opened by FOXA2 persists and are bound by the p3-specific TF NKX2.2. This is similar to the observation that the presence of FOXA is not required to maintain stable epigenetic states in liver cells. 68 The marked changes in chromatin landscape initiated by FOXA2 thus represents an example of epigenetic memory and demonstrate how the pioneering activity of a TF rewires a GRN to specify a specific cell type.
The different regulatory strategies have important implications for cellular reprogramming. Cell types sharing the same landscape (differential binding) would be relatively plastic, enabling transitions between the different fates by altering the TF configuration on already available CREs. Consequently, the barrier to reprogram cells between such fates would be lower. On the other hand, the differential accessibility strategy involves chromatin remodeling to facilitate the binding of TFs previously unable to bind specific CREs. The expression of the end point TFs alone might not be sufficient for cell fate acquisition if the elements these TFs need to bind are inaccessible. This might explain the inefficiency or incompleteness of some transgenedriven reprogramming approaches that use only the TFs expressed in the final state of the target cell type. 69 Understanding and recapitulating the epigenetic trajectory cells follow to reach the desired endpoint might improve such differentiation protocols.
Our results are consistent with prior genetic studies showing the p3 NPs are the only NPs that require activating GLI proteins for their specification. 19,70 The demonstration that FOXA2 expression can substitute for Shh signaling during the early steps of p3 specification indicates that the requirement for activating GLI proteins is for the expression of FOXA2. This requirement restricts p3 induction to the most ventral progenitors in the neural tube. The expression of FOXA2 and NKX2.2 in p3 NPs also appears to contribute to establishing p3 identity by modulating the cell intrinsic response to Shh. 43 NKX2.2 promotes Foxa2 via a mechanism involving Gli3 repression 43 and FOXA2 binds and opens an intronic region in the gene encoding Gli2 ( Figure 5D). Together, these data suggest that the interpretation of Shh signaling by p3 NPs is distinct from the adjacent p0-pMN region. This distinction involves remodeling the regulatory landscape of p3 NPs.
FOXA2 has been proposed to act as a pioneer factor in the endoderm, where it remodels chromatin to allow the binding of other TFs. 51,56 Although TFs often have roles in multiple tissues, they are usually thought to act in a combinatorial fashion with other TFs via different elements in different tissues. 54,55,58 The observation that FOXA2 binds to p3 NP-specific CREs in endodermal cells raises the possibility these two cell types share some aspects of their regulatory control. Consistent with this, pancreas specification, an endoderm derivative, involves many TFs that are also expressed in the ventral neural tube, specifically in the p3 domain. 71 The potential shared control by FOXA2 of functionally relevant genes in endoderm and neural tissue raises the question of how this might have arisen. The role of FOXA proteins in endoderm development is evolutionarily conserved 72 and predates the origin of bilaterians. One possibility therefore is the co-option of an endoderm FOXA2 regulatory program into NPs in an ancestor of vertebrates. Alternatively, it could be evidence of convergent evolution or of a common evolutionary origin for endoderm and neural cells. This latter possibility would be supported by recent work suggesting that parts of the digestive system in the cnidaria Nematostella are of ectodermal origin 73 and that a common progenitor generates neurons and secretory cells in this species. 74,75 Thus, the shared regulatory features seen in vertebrate endoderm and neural cells might be a vestige of a common evolutionary origin of these cell types. A deeper understanding of the GRNs in neural tissue and endoderm will be needed to explore this possibility. More generally, it will be important to determine whether other tissues use similar cis-regulatory strategies to specify distinct cell types.

Limitations of the study
In this study, we used Foxa2 knockout ES cell lines to show a requirement for Foxa2 in NPs. Whether there is a similar requirement for FOXA2 in vivo is not known because the removal of Foxa2 in vivo disrupts axis formation and is embryonic lethal at ll OPEN ACCESS Article an earlier time point. 76,77 Additionally, the previously published ChIP-seq data used in this study was generated from a different directed neural differentiation protocol, and there could be differences in binding of the ventral TFs as a result. Finally, although we demonstrate a good correlation between data produced by CaTS-ATAC and conventional ATAC protocols that use live sorted cells, there could be limitations introduced by the paraformaldehyde fixation necessary for CaTS-ATAC. Paraformaldehyde disrupts TF binding to mitotic chromatin, and this might result in a loss of any cell cycle phase specific signal.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:

ACKNOWLEDGMENTS
We are grateful to Andreas Sagner, Vicki Metzis, and Teresa Rayon and members of the Briscoe lab for advice and reagents. We thank Tiago Rito for nuclei segmentation of images. We thank Anne Grapin-Botton and Uli Technau for critical reading of the manuscript. We thank the Crick Science Technology Platforms, in particular, Vangelis Christodoulou from the Structural Biology STP for production of Tn5 enzyme, Flow Cytometry, Advanced Sequencing, Advanced Light Microscopy, BRF and Advanced Computing. This work was supported by the Francis Crick Institute, which receives its core funding from Cancer Research

RESOURCE AVAILABILITY
Lead contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, James Briscoe (james.briscoe@crick.ac.uk).

Materials availability
Murine ES cell lines generated in this study area available upon request.

Data and code availability
Sequencing data (CaTS-ATAC, live Omni-ATAC, RNA-seq and bulk ATAC-seq) have been deposited at GEO and are publicly available as of the date of publication. Accession numbers are listed in the key resources table. Accession number for the SuperSeries is GSE204921. Microscopy data reported in this paper will be shared by the lead contact upon request. All original code has been deposited at https://github.com/MJDelas/Neural_DV_ATAC and is publicly available as of the date of publication. Custom code for lineage tracing quantification are available at https://github.com/MJDelas/tdTom_LinTracing. Any additional information required to reanalyse the data reported in this paper is available from the lead contact upon request. Oligonucleotides qPCR primers used in this study, see Table S1 This study N/A Primers for cloning of Foxa2 guides into pX458, see Table S1 This study N/A Oligonucleotides for Tn5 assembly, see Table S1 Picelli et al. 80 N/A Oligonucleotides for CaTS-ATAC and ATAC-seq library, see Table S1 This study N/A  79 were backcrossed and maintained in a C57BL6 background. Induction of recombination was achieved by tamoxifen administration by oral gavage at 0.08 mg/ body weight. All animal procedures were performed in accordance with the Animal (Scientific Procedures) Act 1986 under the UK Home Office project licenses PP8527846 and PF59163DB. Animals were housed under a 12-h light, 12-h dark cycle at the Francis Crick Institute animal facility. Animals were housed in singly-ventilated cages. The relative humidity was kept at 45 to 65%. Mouse rooms and cages were kept at a temperature range of 20-24 C. Animals had 24 h access to RO water and autoclaved pelleted food. Caging, bedding, nesting materials and enrichments were autoclaved prior to use and cages are changed routinely. A maximum of 5 adult mice were housed per cage if they were over 20 g. Animals were monitored visually daily for health concerns and once a week a full health check was carried out during a discretionary cage change

Cell lines
WT experiments were performed with the mouse embryonic stem cell line HM1. 78 All cell lines were maintained at 37 C with 5% carbon dioxide (CO 2 ).

Generation of Foxa2 -/-ES cell lines
Generation of Foxa2 knockout cell lines was performed by electroporating two different guides cloned into the pX458 vector 81 with the AMAXA nucleofector kit (Lonza Cat no. VPH-1001). Cells were sorted for GFP as single cells one day after electroporation. Clonal cell lines were genotyped and two clones were used. Both carried the same deletion.

Generation of tetON-Foxa2 and control ES cell lines
PiggyBac (PB)-tetOn-destination-PGK-hygro and PB-CAG-rtTA3-PGK-puro vectors were used from Stuart et al. 82 Gateway cloning (Invitrogen) was used to insert the coding sequence for mCherry or FoxA2:mCherry fusion protein into the tetOn destination vector. Stable transgenic HM1 mouse ES cell lines were generated by transfection as follows: 1 mg PB-tetOn expression vector, 1 mg PB-CAG-rtTA3 vector, 1 mg non-integrating PBase transposase vector and 2 ml Lipofectamine-2000 (Invitrogen) were incubated for 20 min in 100 ml DMEMF12 (Gibco) at room temperature, then applied to 300,000 cells/6well in 1.6 ml medium for 18 h. Selection was applied to transfectants for at least 5 passages prior to use (150 mg/ml hygromycin-B (ThermoFisher) and 1 mg/ml puromycin [ThermoFisher]). Transfection and selection were performed in feeder-free 2iLIF culture conditions as described in Stuart et al., 82 then adapted back to feeders and ES cell medium regime as described below for at least 4 passages prior to experiments. No. 566660-5mg). The ''delayed SAG'' regime consisted of 100 nM RA and 0 nM SAG from Day 3 to Day 4, followed by a constant concentration of RA and increased SAG to 500 nM. Induction of the tetON system was achieved by supplementing the media with 1 mg/ml Doxycycline (Sigma-Aldrich, Cat. No. D9891) for the second 12h of Day 4. Floor plate conditions were achieved by addition of 500 nM SAG on Day 3 without any addition of RA.
Footprinting analysis BAM files for merged replicates were used as input for TOBIAS ATACorrect, 33 followed by TOBIAS Footprint. TOBIAS BINDetect was run on all samples combined using the following the JASPAR2018, HOCOMOCO and Taipale databases. [34][35][36] Motifs that were amongst the top 5% highest absolute fold change or 5% smallest pvalues in each pair-wise comparison were selected. Motifs were grouped in archetypes. 37 The most variable archetypes for the desired comparison (either p0-1, p2, pM at day 5 and 6, or all conditions) were selected and RNA expression for each TF that is associated with the archetype was compared to the motif score for that archetype. This correlation analysis between motif score and RNA expression was used to find candidate TFs driving the chromatin changes observed.
RNA-seq data processing Data was processed using the nf-core rnaseq pipeline with the following options: -star_index 'star_ercc_mm10_genome' -gtf 'mm10.refGene_wERCC.gtf' -fc_group_features_type gene_id -fc_extra_attributes gene_id -r 1.4.2. Gene counts were obtained from featureCounts with the option ignoreDup=TRUE to try and remove PCR duplicates. Samples were excluded from downstream analysis based on a number of QC criteria particularly applicable to this type of low input samples from PFA-fixed cells including: too small proportion of the library being of mouse origin (excess spike-in representation), large number of overrepresented sequences, and loss of dynamic range in spike-in quantification.

QUANTIFICATION AND STATISTICAL ANALYSIS
Details for bioinformatic data analysis and software used are detailed in each method section. No additional statistical analyses were conducted. All code has been deposited at https://github.com/MJDelas/Neural_DV_ATAC, including parameters and versions for all software used. Figure legends contain number of replicates and details of data plotted (mean, s.e.m).