Regional recombinant novelty, related and unrelated technologies: a patent-level approach

ABSTRACT This paper investigates the impact of regional technological relatedness on the emergence of recombinant novelty (i.e., new combinations of subclasses occurring for the first time) in French regions using patent data over the period 1990–2010. We find that relatedness favours incremental innovations that reuse already applied combinations, whereas increasing levels of relatedness reduce the likelihood of novelty. However, the impact is less negative when combined technologies are new, unrelated or not locally specialized because it facilitates learning and technological recombination. We also find that universities and large incumbents are less dependent on relatedness than small and novel players to create novelty.


INTRODUCTION
It is a well-established fact that regional economic development builds on locally available capabilities in terms of inputs and knowledge (Hidalgo et al., 2007). When regions benefit from a diverse set of related activities or industries, they enhance their strengths in specific areas by orienting their investments and thus fostering comparative advantages. This development path builds on the principle of relatedness, which has been shown to be one of the core mechanisms on which regions base their future growth (Balland et al., 2019). It is based on the structural role of diversity and similarity: new activities enter a region more easily when they require knowledge and inputs similar to those already present (European Commission, Directorate-General for Research and Innovation et al., 2017). Boschma et al. (2015) and Kogler et al. (2013) have demonstrated that this principle of relatedness also applies to technologies. They provide empirical evidence that the entry (and exit) of technologies in cities over time is shaped by the existing (or declining) technological knowledge base of that city. A new technology is more likely to enter a city when it is cognitively related to technological domains in which the region is already specialized.
This result is of particular importance when considering regional innovations and their capacity to be more creative. In the Schumpeterian tradition, innovation is the result of a process of recombination which 'refers to the way old ideas can be reconfigured in new ways to make new ideas' (Weitzman, 1998, p. 333). The result of this process can lead both to incremental and radical innovations given the underlying search process. Because exploration is costly and uncertain, most innovations are incremental and follow a well-known trajectory in which search occurs in close cognitive proximity. It builds on reusing and improving known technological combinations (Carnabuci & Operti, 2013). In a regional context, it means that innovations tend to rely on local specialization and expertise which are diffused and improved through localized knowledge spillovers and learning. In contrast, more novel combinations may lead to radical innovations and tend to follow a more uncertain and disruptive path. The invention can be considered as radical when it combines previously unconnected technologies or components (Fleming, 2007; Verhoeven et al., 2016). The underlying search process is more exploratory as it deviates from usual paths and combines technological domains that are cognitively more distant (Arts & Fleming, 2018; Schoenmakers & Duysters, 2010). Applied to a regional context, novel innovations can be defined as a combination of technologies that occur for the first time in a region (Montresor et al., 2020).
While the entry of new technologies has been shown to be crucial for innovation, as it feeds local technological diversity and enhances the opportunities to combine them, it is less clear whether the principle of relatedness also supports more radical types of innovation (Boschma, 2017). In fact, a number of scholars argue that, although relatedness explains most (incremental) innovations, it may not be sufficient to explain the occurrence of radical innovations, which combine rather unrelated technologies (Castaldi et al., 2015; Hesse & Fornahl, 2020).
The aim of this paper is to contribute to this literature by considering the relationship between novelty and relatedness and by focusing on the regional dynamics underlying the emergence of these combinations. More specifically, we study whether novelty combines technologies that build on the regional technological portfolio using the relatedness density indicator. However, our paper departs from this literature in its methodology and empirical strategy. First, unlike studies based on aggregate data, we explore the microeconomic process underlying patent novelty by considering individual inventions produced by inventors located in a given region. Focusing on patents and their underlying technological combinations, we control for patent-specific characteristics, which allows us to focus on the regional determinants of novelty. The approach is very similar to microeconomic approaches investigating the effect on individual innovations of spillovers arising from agglomeration economies such as urbanization or localization; in our case, the variable of interest is relatedness density. Second, previous studies show that the likelihood of a new technology entering a region increases when the technology is related to the local knowledge base. However, a technology rarely enters the region on its own: it is combined with other technologies within a patent invented in the region. Our empirical strategy enables us to control for the characteristics of these other technologies by considering whether they build upon existing trajectories or instead depart from them when introducing novelty. More precisely, patents may combine technologies (1) that enter the region for the first time, (2) that are already in the region but do not have a comparative advantage, or (3) that are unrelated to each other, which makes their entry less likely and more difficult. Our aim is to consider how relatedness interacts with these technological characteristics and facilitates or instead limits the occurrence of novelty.
We operationalize the relationship between novelty and relatedness by using the Organisation for Economic Co-operation and Development (OECD) REGPAT database (March 2018 Edition) and the European Patent Office (EPO) and Patent Cooperation Treaty (PCT) patent applications at the NUTS-2 level. We focus on a subsample of patents produced by French-resident inventors over the period 1990–2010 and consider, for each region and each technological subclass pair used in a patent, whether the pair enters the region for the first time and, if not, how many times it has been used previously. We also use the OECD REGPAT database to characterize the technological space of regions in terms of specialization patterns based on 18 countries, and we compute the relatedness density indicator, which reveals the extent to which a technological class is related to the existing space.
While it is accepted that the entry of new technologies within a region is driven by their degree of relatedness to its technological portfolio, our results indicate that this relationship does not apply to novel technological combinations. Our findings suggest that relatedness density tends to favour the production of inventions that reinforce local specialization and the reuse of already applied combinations, which translates into a negative impact on recombinant novelty. More specifically, relatedness density has an inverted U-shaped (quadratic) relationship with novelty: for small levels of relatedness density, it increases the likelihood of novel combinations, but for higher levels the impact is decreasing. By further investigating the determinants of novelty, we find that novel combinations occur when they combine technologies that are new to the region, unrelated or not based on local comparative advantage. However, the impact of relatedness on novelty is less negative for these technologies, because being related to the local capabilities facilitates learning and knowledge spillovers. Said differently, novel inventions combine technologies for which the region has an expertise with technologies that are more cognitively distant and unrelated to the local portfolio. Finally, we also consider the impact of different types of actors and explore whether they rely differently on relatedness. Our findings suggest that small and novel players, unlike large incumbents and universities, tend to build on the local technological capabilities to create novelty.
The rest of the paper is structured as follows. In the next section we explain the underlying mechanisms and the hypothesis we test based on previous literature. Section 3 unveils the data, our empirical strategy and the construction of variables. Section 4 presents descriptive statistics, section 5 our empirical results. The last section concludes.

LITERATURE REVIEW
Innovation processes are usually geographically localized and constrained by the local knowledge base available to organizations and their inventors. The regional knowledge portfolio offers opportunities for novel combinations by broadening the available knowledge base and geographical proximity facilitates learning and lowers the cost of searching for new technologies and moving into related ones (Boschma et al., 2015). The question raised in this paper is whether and under which conditions relatedness favours or inhibits the creation of genuinely novel innovations.

Recombinant search process and the sources of novelty
Innovations are not all alike and they differ by their degree of novelty which is best understood using the recombinant approach (Fleming, 2001; Weitzman, 1998; Verhoeven et al., 2016). The Schumpeterian tradition conceptualizes the innovation process as the outcome of a recombination of existing or new technologies, knowledge elements or components (Fleming, 2001; Nelson & Winter, 1982, p. 130; Weitzman, 1998). When successful, this recombinant process does not necessarily lead to genuinely novel inventions and in fact most inventions build on incremental improvements of formerly used combinations (Arts & Fleming, 2018). The capacity of organizations to innovate and find new technological combinations depends on their underlying search mechanisms, that is, whether they explore new and distant knowledge or instead exploit and deepen their prior knowledge (March, 1991).
The literature converges on the claim that genuine novelty relies primarily on distant search and exploration, as it offers greater opportunities for cross-fertilization and new combinations. First, the degree of novelty of an invention depends on the combination of previously unconnected technologies or components (Fleming, 2007; Verhoeven et al., 2016). The invention is considered radical when it is composed of a relatively high number of new combinations. This definition highlights the introduction of genuinely new combinations, which may also generate new areas of development, as compared with already existing combinations, which rather indicate a refinement or an incremental improvement. Second, novelty can also be conceived as the combination of technologies distributed over different technological domains (Keijl et al., 2016; Nemet & Johnson, 2012). For a given invention, the larger the distance between the technological domains, the larger the degree of novelty. Both definitions refer to technological brokering of ideas and technologies that may belong to similar or separate technological fields or industries.
Even though distant search and exploration offer higher opportunities for cross-fertilization, they are also more uncertain and increase the variability of outcomes, failures as well as successes (Fleming, 2001). For these reasons, most inventions reuse familiar technologies as they benefit from experience and learning and they rely on local search in closely related technologies within organizational and regional boundaries (Rosenkopf & Nerkar, 2001) as interactive learning is more effective and can be realized at lower cost.
In order to better formalize this distinction between local and distant search, and whether inventors reinforce regional specialization or instead explore new trajectories, we decompose and characterize the technologies combined in each invention. First, technologies can be related or unrelated to each other: they are related when they are commonly combined and appear together frequently within patent documents in a region; otherwise they are unrelated, which makes them more difficult and riskier to combine, and inventors may lack the needed competences (Caviggioli, 2016; Li et al., 2021). Second, individual technologies may enter the region for the first time; we know that they are more likely to enter the region if they are related to the local knowledge base, as in Boschma et al. (2015). Third, the combined technologies may or may not belong to the regional specialization. If they do, they tend to reinforce and deepen the current regional technological trajectory. Otherwise, the combination may open new technological developments.

Deepening technological trajectories and regional path dependence
Because actors have limited cognitive capabilities and try to reduce uncertainty and risk, they will primarily exploit their own knowledge and expand it through local search by exploring technological combinations that are related to their existing technology portfolio and regional network. As knowledge is tacit and sticky, sourcing knowledge within a region reduces the cost of learning and knowledge absorption. This is the main result of the relatedness literature, which demonstrates that the probability that a region enters (or exits) new technologies is a function of the number of related technologies that already exist in that region (Balland et al., 2019; Boschma et al., 2014, 2015; Hidalgo et al., 2007, 2018). As a consequence, the technological composition of a region affects the rate and direction of technical change, which ends up being a cumulative, path- and place-dependent process (Henning et al., 2013; Martin & Sunley, 2006; Rigby & Essletzbichler, 2008). Said differently, organizations are conditioned by the existing knowledge base and they search for new technologies by diversifying into closely related technologies because it lowers risk and uncertainty and facilitates learning processes (Arts & Fleming, 2018; Arts & Veugelers, 2015; Fleming, 2001; Rosenkopf & Nerkar, 2001). However, this path- and place-dependent process will predominantly favour incremental refinements of already existing combinations instead of genuine novelty (Castaldi et al., 2015; Li et al., 2021), as these inventions tend to reinforce and deepen the current specialization and the existing technological trajectory. Thus, we have the following expectations:
Hypothesis 1a: Relatedness and local specialization increase the likelihood of reusing existing technological combinations.
Hypothesis 1b: Relatedness and local specialization reduce the likelihood of producing new combinations.

Path-breaking trajectories and novelty
The previous section argued that relatedness reduces the likelihood of producing new combinations when the associated technologies reinforce and deepen the current specialization (Hypothesis 1b). However, when these combinations introduce technological novelty that departs from local trajectories, there is a greater need for stronger embeddedness in the local knowledge base because relatedness facilitates the underlying learning and recombinant search process. Said differently, the impact of relatedness on the likelihood of producing new combinations will depend on their characteristics in terms of novelty and path-breaking trajectory relative to the local knowledge base. Three situations may be distinguished: (1) the combination introduces a technology subclass that is new to the region; (2) the combination is rather unusual in the sense that the combined subclasses are usually unrelated to each other; (3) the combination associates technologies that have no comparative advantage within the region (RCA < 1).
We now discuss each of these three scenarios and explain their moderating role on the impact of relatedness density on novelty, that is, why novel combinations are more likely to be produced if they are related to the local knowledge base.
First, an invention may combine technologies that enter the region for the first time, as depicted in Boschma et al. (2014, 2015), with the difference that we consider a pair of technologies. If the novel technology is combined with subclasses that are related to the local knowledge base, it will be easier to deal with the learning challenge of exploring new technologies (Arts & Fleming, 2018; Balland et al., 2019; Boschma et al., 2014, 2015). Building on local knowledge through collaborations or networks provides competences and expertise on how technologies may interact with each other; it reduces the cost of learning new competences and increases the likelihood of success. Second, inventions may combine technologies in which the region has limited or no expertise, that is, for which the region has no revealed comparative advantage (RCA < 1). As in the previous case, the lack of prior knowledge may be compensated if the combined technologies have some proximity with the local knowledge base as it facilitates the learning and recombinant search process.
Third, the new invention may combine technologies that are unrelated to each other in the sense that the two subclasses are seldom found together on a patent document within a region. When an invention combines technologies that are unrelated to each other, it requires by definition completely different capabilities, and their combination can be difficult, especially if actors lack the absorptive capacity to integrate and combine distant bodies of knowledge effectively. However, unrelated technologies are more likely to be recombined if they are individually strongly present in the same region or related to the local knowledge base, as technological distance is easier to bridge within geographical boundaries (Janssen & Frenken, 2019; Li et al., 2021). When these combinations occur successfully, they may even help create new technological paths and generate competitive advantage, as these novel competences are more difficult to imitate than innovations stemming from related knowledge, as argued by Janssen and Frenken (2019). Finally, Castaldi et al. (2015) also find that innovations in general draw on technologies that are locally related, whereas more breakthrough inventions tend to rely on rather unrelated diversity.
Compared with the baseline effect (Hypothesis 1b), in which relatedness has a rather negative impact on the introduction of novelty, this impact is moderated when individual technologies do not belong to the local knowledge base. Consequently, we hypothesize that relatedness supports the emergence of technologies and technological combinations that depart from the local trajectory because it facilitates learning and the recombinant search process.
Hypothesis 2b. Relatedness increases the likelihood of novelty when the combined technologies have no comparative advantage (RCA < 1).
Hypothesis 2c. Relatedness increases the likelihood of novelty when the combined technologies are unrelated.

Relatedness and agents of technological novelty
In this last section we move one step forward and ask whether the effect of relatedness varies across agents. We investigate who are the actors that introduce technological novelty within regions and whether they build on relatedness to achieve this goal. The heart of the discussion is whether new activities are introduced by new players and whether they build on local resources to do so. In order to distinguish between agents, we rely on inventive activity and distinguish: (1) organizations with large inventive activity that have the characteristics of incumbent firms; (2) organizations that patent for the first time and have the characteristics of start-ups; (3) organizations that patent in the region for the first time and may potentially bring extra-regional knowledge; and (4) universities that are located in the region but may benefit from external linkages to renew their knowledge base.
Because large incumbents are strongly embedded in their region, it is expected that they develop inventions that reuse local capabilities and resources and thereby reinforce local specialization. In contrast, novelty is most likely to be introduced by smaller and novel players that are less constrained by regional specialization and structure. The question is then whether they build on local experience to develop innovations that are related to the local knowledge base (Klepper, 2007) or rather unrelated. Actors that enter the region for the first time have a higher likelihood of producing novelty but may not be able to benefit from the local market due to a lack of anchoring. This is probably the reason why Neffke et al. (2018) find that new firms introduce novel and unrelated activities when they come from outside the region. Finally, we consider the specific role of universities. When universities collaborate intensively with local actors, they probably reinforce local specialization; however, as they are embedded in extra-regional networks, they may have wider access to potentially unrelated technologies. In sum, through their degree of embeddedness and local networks, large incumbents and universities will tend to reinforce current specialization and decrease the degree of novelty. We propose to test the following hypotheses.
Hypothesis 3. Large incumbents and universities are less likely to rely on local relatedness to produce novelty.
Hypothesis 4. Small and novel players are more likely to rely on local relatedness to produce novelty.

Data and indicators
The empirical analysis uses the OECD REGPAT database (March 2018 Edition) and the EPO and PCT patent applications at the NUTS-2 level and refers to a sample of 22 French regions with priority year between 1990 and 2010. A patent is assigned to a region if at least one of the inventors listed on the patent document reports an address located in that region. When inventors of the same patent report addresses in different regions, the patent is assigned to each of those regions.
Our objective is to estimate whether a new technological combination occurring in a region is related to the technology portfolio of that region. Following the literature on new technological combinations (Arts & Fleming, 2018; Fleming, 2001; Verhoeven et al., 2016) and applying it at the regional level, we count the number of times a subclass pair combining two International Patent Classification (IPC) codes at the four-digit level has already been used in a given year and region. Hence, a combination is considered new if it appears for the first time in a given year and region, that is, if the count is equal to one. On this basis, we identify each new technological combination entering the region. The sample includes all patents with two or more IPC combinations and amounts to 50% of all patents (46,596 out of 92,000). Of these, 70% have inventors from only one region.
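The counting rule described above can be illustrated with a minimal sketch. It is not the procedure actually applied to REGPAT: the record layout (`id`, `year`, `regions`, `ipc4`), the region codes and the tie-breaking rule for patents filed in the same year (ordered by identifier) are assumptions made for illustration only.

```python
from itertools import combinations


def flag_new_combinations(patents):
    """Return one record per (patent, region, subclass pair).

    `patents` is a list of dicts with hypothetical keys 'id', 'year',
    'regions' (region codes of the inventors) and 'ipc4' (four-digit IPC codes).
    """
    seen = {}      # (region, pair) -> number of earlier uses in that region
    records = []
    # Assumption: patents are processed chronologically, ties broken by id.
    for p in sorted(patents, key=lambda p: (p["year"], p["id"])):
        pairs = list(combinations(sorted(set(p["ipc4"])), 2))
        for region in sorted(set(p["regions"])):
            for pair in pairs:
                prior = seen.get((region, pair), 0)
                records.append({
                    "patent": p["id"], "region": region, "pair": pair,
                    "reuse": prior,            # earlier uses of the pair in the region
                    "new": int(prior == 0),    # 1 if first occurrence in the region
                })
                seen[(region, pair)] = prior + 1
    return records


# Illustrative data only.
example = [
    {"id": "P1", "year": 1995, "regions": ["FR10"], "ipc4": ["A61K", "C07D", "A61P"]},
    {"id": "P2", "year": 1998, "regions": ["FR10", "FR71"], "ipc4": ["A61K", "C07D"]},
]
for rec in flag_new_combinations(example):
    print(rec)
```

In this toy example the pair (A61K, C07D) is new in the first region for P1, reused there by P2, and new again in the second region, which is the region-specific notion of novelty used in the text.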
We consider two dependent variables. The first is New technological combination; it takes the value of 1 if the subclass combination appears in the region for the first time in history, and 0 otherwise. In order to contrast this with the impact of relatedness on inventions that are not novel, we also compute the variable Reuse of technological combination, which counts the number of times a given combination contained in the patent has already been used in the region.
The main explanatory variable is the relatedness density of each IPC subclass. For each subclass pair, we use the mean of the relatedness density of both subclasses. Relatedness density is computed following the method proposed by Boschma et al. (2015). To capture the relatedness of technologies available in regions, we use the OECD REGPAT 2018 database. All individual patents are allocated to one of the 245 NUTS-2 regions based on inventors' location and to one of the 633 IPC codes at the four-digit level (IPC4) using fractional counts (Kogler et al., 2017). Thus, if a patent has been invented by inventors located in three NUTS-2 regions and is characterized by two IPC codes, it is allocated to each NUTS-2 region and IPC code with a weight of 1/3 × 1/2 = 1/6. The advantage is that this avoids double counting of patents.
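The fractional counting rule can be written down in a few lines. The sketch below is only an illustration of the weighting described above; the input format (a list of `(regions, ipc_codes)` tuples) and the region and IPC codes are assumptions.

```python
from collections import defaultdict
from fractions import Fraction


def fractional_counts(patents):
    """Allocate each patent over its (region, IPC4) cells with equal weights.

    `patents` is a list of (regions, ipc_codes) tuples (hypothetical layout).
    Each patent contributes a total weight of exactly 1.
    """
    counts = defaultdict(Fraction)
    for regions, ipc_codes in patents:
        regions, ipc_codes = set(regions), set(ipc_codes)
        weight = Fraction(1, len(regions) * len(ipc_codes))
        for region in regions:
            for code in ipc_codes:
                counts[(region, code)] += weight
    return dict(counts)


# One patent, three regions, two IPC codes: each cell receives 1/3 * 1/2 = 1/6.
cells = fractional_counts([(["FR10", "FR71", "FR82"], ["A61K", "C07D"])])
print(cells[("FR10", "A61K")])   # 1/6
print(sum(cells.values()))       # 1
```

Exact fractions are used here only to make the arithmetic transparent; with real data, floating-point weights would serve equally well.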
Based on fractional counts, we compute the RCA for each region and technology, which is the relative frequency of a technology in a region compared with the frequency of the same technology over all regions. If the value is > 1, the region is considered as having an RCA (0/1). More precisely, a region $r$ has an RCA in technological class $i$ if the ratio below is > 1:
$$\mathrm{RCA}_{r,i} = \frac{\mathrm{patents}_{r,i} \big/ \sum_{i} \mathrm{patents}_{r,i}}{\sum_{r} \mathrm{patents}_{r,i} \big/ \sum_{r}\sum_{i} \mathrm{patents}_{r,i}} \tag{1}$$
The next step is to compute the conditional probability of having a comparative advantage in two technologies $i$ and $j$. We compute the probability of having a comparative advantage in technology $i$ conditional on having a comparative advantage in technology $j$, and vice versa. To avoid the issue that arises when a single region is the only one specialized in a specific technological class, we keep the minimum of those two conditional probabilities. Hence, to measure the relatedness between each pair of technological classes $i$ and $j$, we build the variable $w_{i,j}$ following Hidalgo et al. (2007) and Boschma et al. (2015):
$$w_{i,j} = \min\{P(\mathrm{RCA}_i \mid \mathrm{RCA}_j),\; P(\mathrm{RCA}_j \mid \mathrm{RCA}_i)\} \tag{2}$$
where $w_{i,j}$, relatedness, is the minimum of these two pairwise conditional probabilities. $P(\mathrm{RCA}_i \mid \mathrm{RCA}_j)$ is the number of regions with an RCA in both $i$ and $j$ divided by the number of regions with an RCA in $j$, and $P(\mathrm{RCA}_j \mid \mathrm{RCA}_i)$ is the number of regions with an RCA in both $j$ and $i$ divided by the number of regions with an RCA in $i$. We obtain a relatedness matrix whose entries are conditional probabilities in [0,1]. Finally, we consider only the highest conditional probabilities. Thus, we apply a 5% threshold, meaning that only the top 5% of all technology pairs with the highest relatedness are considered as related, while the remaining 95% are considered as unrelated. After calculating the relatedness $w_{i,j}$, we follow Boschma et al. (2015) and calculate the relatedness density for French regions, at NUTS-2, specific to each class $i$ and region $r$. This is calculated as a weighted sum of the $w_{i,j}$. More formally:
$$\text{Relatedness density}_{i,r} = \frac{\sum_{j \in r,\, j \neq i} w_{i,j}}{\sum_{j \neq i} w_{i,j}} \tag{3}$$
The relatedness density variable captures how each technological class $i$ is connected to the technological portfolio of the region $r$. This is a combination of two factors: how connected the technological class $i$ is to the rest of the technological space, and the portfolio of technological specialization of each specific region $r$. Formally, in equation (3) the numerator measures the sum of relatedness from technology class $i$ to all technological classes $j$ that are part of the region's portfolio ($\mathrm{RCA}_{j,r} > 1$), and the denominator is the sum of the relatedness to all technological classes present in the technological space. All in all, a high level of our main explanatory variable of interest, relatedness density, reveals a strong connection to the regional technological portfolio, while a low level of relatedness density shows that the technological class $i$ is sparsely embedded in the knowledge portfolio of region $r$. Because relatedness density may have a higher impact at lower values and a lower impact at higher values, as we expect that high specialization may rather limit novelty (Hypothesis 1), we also estimate the quadratic form of relatedness density.
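The three steps described above (RCA dummies, pairwise relatedness as the minimum conditional probability, and relatedness density) can be sketched as follows. This is a simplified illustration under stated assumptions: the count matrix is a plain list of lists indexed by region and class, the relatedness values are kept continuous (the 5% threshold is not applied), and no rescaling of the density is performed.

```python
def rca_matrix(counts):
    """counts[r][i]: fractional patent count of region r in class i -> 0/1 RCA dummies."""
    n_r, n_i = len(counts), len(counts[0])
    row = [sum(counts[r]) for r in range(n_r)]
    col = [sum(counts[r][i] for r in range(n_r)) for i in range(n_i)]
    total = sum(row)
    return [[int(row[r] > 0 and col[i] > 0
                 and (counts[r][i] / row[r]) / (col[i] / total) > 1)
             for i in range(n_i)] for r in range(n_r)]


def relatedness(rca):
    """w[i][j]: minimum of the two conditional probabilities of co-specialization."""
    n_r, n_i = len(rca), len(rca[0])
    n_with = [sum(rca[r][i] for r in range(n_r)) for i in range(n_i)]
    w = [[0.0] * n_i for _ in range(n_i)]
    for i in range(n_i):
        for j in range(n_i):
            if i == j or n_with[i] == 0 or n_with[j] == 0:
                continue
            both = sum(rca[r][i] * rca[r][j] for r in range(n_r))
            w[i][j] = min(both / n_with[i], both / n_with[j])
    return w


def relatedness_density(w, rca, r, i):
    """Share of class i's total relatedness that falls on classes with RCA in region r."""
    others = [j for j in range(len(w)) if j != i]
    den = sum(w[i][j] for j in others)
    num = sum(w[i][j] for j in others if rca[r][j])
    return num / den if den else 0.0


# Illustrative RCA pattern: 4 regions (rows) x 3 classes (columns).
rca = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [0, 0, 1]]
w = relatedness(rca)
print(relatedness_density(w, rca, r=3, i=0))
```

In the toy pattern, class 0 is related to class 2 with weight 1/3 and to class 1 with weight 2/3; region 3 is specialized only in class 2, so the density of class 0 in that region is one third.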
In addition, we characterize each technological subclass pair used in a patent in order to identify whether it reinforces or rather departs from the pre-existing set of technological capabilities in the region. New subclass takes a value of 1 if the combination includes at least one technology that is new to the region. No subclass has RCA > 1 takes a value of 1 if none of the combined subclasses has a comparative advantage within the region, and One subclass has RCA > 1 takes a value of 1 if exactly one of the combined technologies has a comparative advantage, and 0 otherwise; both variables are included in the regression and the reference category is when both subclasses have RCA > 1. Unrelated technologies takes a value of 1 if the two technologies are usually not related. It is based on the conditional probability, $w_{i,j}$, computed in the previous section and is independent of the specific regional context.
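As an illustration of how these pair-level indicators could be derived, the following sketch builds the four dummies for one subclass pair in one region. The function names, the dictionary keys and the way ties are handled at the threshold are assumptions; the `share` argument stands for the 5% cut-off mentioned earlier.

```python
def top_share_pairs(w, share=0.05):
    """Return the pairs whose relatedness is in the top `share` of all pairs.

    `w` maps frozenset({i, j}) -> relatedness. Ties at the cut-off are broken
    by sort order, which is an arbitrary choice made for this sketch.
    """
    ranked = sorted(w.items(), key=lambda kv: kv[1], reverse=True)
    k = max(1, int(share * len(ranked)))
    return {pair for pair, _ in ranked[:k]}


def pair_indicators(i, j, present_before, rca_classes, related_pairs):
    """Dummies for the pair (i, j) in one region.

    present_before: classes already patented in the region before the patent
    rca_classes:    classes in which the region has RCA > 1
    related_pairs:  set of frozensets considered related (region independent)
    """
    n_rca = (i in rca_classes) + (j in rca_classes)
    return {
        "new_subclass": int(i not in present_before or j not in present_before),
        "no_subclass_rca": int(n_rca == 0),
        "one_subclass_rca": int(n_rca == 1),   # reference category: n_rca == 2
        "unrelated": int(frozenset((i, j)) not in related_pairs),
    }


# Illustrative values only; a 50% share is used because the example has 4 pairs.
w = {
    frozenset(("A61K", "C07D")): 0.90,
    frozenset(("A61K", "G06F")): 0.50,
    frozenset(("C07D", "G06F")): 0.10,
    frozenset(("A61K", "H04L")): 0.05,
}
related = top_share_pairs(w, share=0.5)
print(pair_indicators("C07D", "G06F", {"A61K", "C07D"}, {"A61K"}, related))
```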
In order to investigate the impact of relatedness density by type of agent, we distinguish five categories of applicants. Entry takes a value of 1 if the patent is invented by an applicant that appears for the first time in the OECD HAN database, March 2018. Entry in the region takes a value of 1 if the applicant applies for the first time for a patent with inventors located in the region. In this case, it may be a large incumbent or a subsidiary that has already patented with an inventor team in other regions. University takes the value of 1 if the applicant is a French university. Local applicant takes a value of 1 if all of the applicant's inventors over the past five years are located in a single region; it serves as an indicator of where the applicant's inventive activity is located. Experience (# patents) is the applicant's number of patents accumulated over the past five years as provided by the OECD HAN database.
We also include regional variables to control for the potential determinants of novel combinations at the technological and economic level. Number of patents in the subclass is the average number of patents applied for in the region in each of the combined subclasses during the five previous years. Number of associated subclasses is the average number of subclasses with which the subclasses of our focal pair are combined in the region during the five previous years. Technological specialization of the region is measured by the average location quotient weighted by the number of patents computed at the region and subclass level (Boschma et al., 2015). Technological generality is the average generality index of all patents in the region (Squicciarini et al., 2013; Trajtenberg et al., 1997) as provided by the OECD Patent Quality Indicators database, March 2018. The generality index measures the extent to which citations come from distinct technological domains and thus indicates a higher potential for cross-technological recombination within the region. Inventors with external collaborations is the average number of inventors within the region with extra-regional collaborations, computed for each subclass combined in the patent.
Finally, we include a number of time-varying regional characteristics (Boschma et al., 2015), such as the number of employees in the region (Employment), the number of inhabitants/m² (Population density) and the economic wealth of the region (Income per employee), all taken from INSEE.

Empirical strategy: estimated model, estimator choice and endogeneity issues
Our empirical strategy is to decompose each patent into a number of subclass pairs and apply a patent fixed effects estimation procedure. Through these fixed effects, we capture all effects that are common to a patent; as all patent-specific variables are perfectly collinear with the patent fixed effects, they are not included, and we focus our estimations on the impact of technology-region variables on novelty. Furthermore, year fixed effects are not needed as a patent is invariant in time. We include subclass fixed effects (two for each combination): they capture all factors specific to each technology subclass that are common to all patents based on these codes. Finally, note that region-specific effects can only be estimated on the sample of patents with inventors from more than one region. When including all patents, even those reported in two or more regions, we also include region fixed effects. We end up with 46,596 patents and 264,948 observations. The estimated model is:
$$\text{Reuse of technological combination}_{r,i,j,t} = a_p + b_1\,\text{Relatedness Density}_{r,i,j,t} + Z'_{r,i,j,t}\, b_2 + X'_{r,t}\, b_3 + f_r + c_i + c_j + e_{p,r,i,j,t}$$
with the same specification for New technological combination. As we keep only patents with at least three pairs of combinations (three IPC subclasses), the patent fixed effects $a_p$ capture all features common to a single patent $p$. Both our dependent variables are measured at the region $r$, subclass pair combination $i, j$ and time $t$ level. Our explanatory variable of interest, Relatedness Density, is calculated at this same level and measured as explained above. The explanatory variables $X'$ are listed above, such as population density: they are variables at the regional level that can be estimated because many patents in our sample have inventors located in different regions. The conditioning set $Z$ refers to variables at the same region-technology level as our variable of interest: the number of patents in the subclass and the number of associated subclasses.
At last, we include region fixed effects f and technological (subclass) fixed effects c. Our standard errors are clustered at the patent level. 5 We estimate the above models using a traditional linear panel data fixed effect estimator, also known as least square dummy variable (LSDV). To our knowledge, to use a linear probability model (LPM, or applying an ordinary least squares -OLS) in a panel data setting with a binary dependent variable is the most appropriate empirical method (Boschma et al., 2015). Our identification strategy relies on the variation of combinations of technological classes within a patent: that is, we estimate the effect of relatedness density on novelty controlling for patent fixed effects. In contrast, including a fixed effect in a nonlinear model, such as logit or probit, leads to an incidental parameter problem. Nonetheless, Table A2 in the supplemental data online estimates our benchmark specification (column 6 of Table 3) using a logit and a probit, without patent fixed effects, although keeping region and technological class fixed effects. Our main results are robust to these alternative estimation methods.
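The two mechanical steps of this strategy, decomposing a patent into subclass pairs and absorbing patent fixed effects, can be illustrated with a minimal sketch. The function names `subclass_pairs` and `within_patent_ols` are hypothetical. The sketch assumes that demeaning within patents (the within transformation) stands in for the patent dummies of the LSDV estimator, and it leaves out the subclass and region fixed effects and the clustered standard errors used in the paper.

```python
from itertools import combinations

import numpy as np


def subclass_pairs(ipc_subclasses):
    """Return all unordered pairs of distinct IPC subclasses listed on one patent."""
    unique = sorted(set(ipc_subclasses))
    return list(combinations(unique, 2))


def within_patent_ols(y, X, patent_ids):
    """OLS slopes after demeaning y and X within each patent.

    Demeaning removes anything constant within a patent, which yields the
    same slope estimates as including one dummy per patent (LSDV).
    Illustrative only: no subclass or region fixed effects, no clustering.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(patent_ids)
    y_dm = y.copy()
    X_dm = X.copy()
    for pid in np.unique(ids):
        mask = ids == pid
        y_dm[mask] -= y[mask].mean()
        X_dm[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta
```

A patent listing n distinct subclasses yields n(n - 1)/2 observations, so three subclasses give three pairs and six subclasses give 15. Because identification comes only from variation across pairs within the same patent, a patent with a single pair would contribute nothing after demeaning.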
The endogeneity of our independent variable of interest (Relatedness Density) could arise from two sources: reverse causality and omitted variables. We explain in detail below how we tackle these issues.
Regarding reverse causality, the scope for endogeneity is very limited. First, Relatedness Density is primarily computed using data from 245 NUTS-2 regions of 18 countries. This is the step explained in equation (2), in which we compute the relatedness for each subclass pair. It is therefore very unlikely that a single French region influences this calculation. Next, the numerator of Relatedness Density (equation 3) is calculated as the sum over the technological subclasses j present in region r. Straightforward reverse causality for the technological class i can then be excluded, as the variable that measures a new combination of a pair of IPC classes does not directly enter the calculation of relatedness density. Nonetheless, as our regressions use the average relatedness density of both IPCs, the relatedness of IPC i is used in the calculation of the relatedness density of IPC class j. Note, however, that there are 633 IPC codes at this level of disaggregation, so any reverse causality effect is again very small. Finally, the RCA dummy in equation (3) (RCA_{j,r} > 1) incorporates information on the technological classes of patents in the region. An issue would arise if a specific patent drove the number of other technological classes j having an RCA in the region. The concern is that patents with many IPC codes may generate technological spillovers in the region: having many IPC codes in a patent would then produce a higher count in equation (1) and hence more observations with RCA > 1. To address this issue, we re-estimate our model after dropping patents with a high number of IPC classes. We discuss these results in section 5 below.
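To make the argument about the numerator and the RCA dummy concrete, the following is a minimal sketch of how such a density can be computed. Equations (1) to (3) are not reproduced in this section, so the sketch assumes the standard formulation from the relatedness literature: RCA as a region's share in a subclass relative to the overall share, and density as the share of total relatedness to subclass i that is held by subclasses with RCA > 1 in region r. The function names and the toy inputs are hypothetical.

```python
import numpy as np


def rca_matrix(patent_counts):
    """RCA for each (region, subclass) from a regions x subclasses count matrix.

    Assumes every region and every subclass has at least one patent.
    """
    counts = np.asarray(patent_counts, dtype=float)
    region_share = counts / counts.sum(axis=1, keepdims=True)
    global_share = counts.sum(axis=0) / counts.sum()
    return region_share / global_share


def relatedness_density(rca, relatedness):
    """Density of subclass i in region r.

    Numerator: relatedness between i and every other subclass j in which
    region r has RCA > 1. Denominator: relatedness between i and every
    other subclass j. The subclass i itself is excluded (diagonal set to 0).
    """
    has_rca = (np.asarray(rca) > 1).astype(float)      # regions x subclasses
    phi = np.asarray(relatedness, dtype=float).copy()  # subclasses x subclasses
    np.fill_diagonal(phi, 0.0)
    return (has_rca @ phi) / phi.sum(axis=0)


def pair_density(density, region, i, j):
    """Average density of the two subclasses in a pair, as used in the regressions."""
    return 0.5 * (density[region, i] + density[region, j])
```

In this formulation the RCA status of subclass i never enters its own density, which is the sense in which direct reverse causality is excluded; it does enter the density of its partner j, which is the indirect channel discussed above.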
Regarding the omitted variable bias, the relationship between relatedness density and recombinant novelty could pick up unobserved determinants of novelty. However, our empirical strategy includes patent, technology class and region fixed effects. These capture, respectively, features common to all combinations of IPCs in a patent, features of every single IPC in the combination that are common to all patents and regions, and characteristics of patents and IPCs located in a single region. For example, the patent fixed effects capture variables such as patent scope, patent quality and backward or forward citations. The technology class fixed effects capture intrinsic and invariant characteristics of each technology. Finally, the regional fixed effects capture the state of technological development of a region, such as being a leader or a laggard. Taken together, this set of variables controls for a large share of the determinants of recombinant novelty. However, note that our variable is at the technology-region level: it is specific to each technology in each region for a given year, for example, the density of links to other technologies for IPC subclass H04N in region Bretagne in a given year. We therefore control for two variables (our Z variables above) that can be specific to a technology in a given region: the average number of patents applied for in the region in each of the combined subclasses during the five previous years (number of patents in subclass), and the average number of subclasses with which our focal subclass pair is combined in the region during the five previous years (number of associated subclasses). These variables control for knowledge and spillovers related to the subclasses in the same region. In the end, our estimated model covers the different levels potentially determining recombinant novelty.

DESCRIPTIVE STATISTICS
Descriptive statistics and correlations for the variables used in the regressions are provided in Table A1 in the supplemental data online. To better understand the regional process of novelty generation, this section provides descriptive statistics on the percentage of patent novelty, new combinations, relatedness density and specialization averaged at the region level. As relatedness density relies on regional specialization, we also explore whether the technologies that are combined within a patent produced in a region rely on local specialization (i.e., RCA > 1). More specifically, we consider for each patent combination whether it includes none, one or two subclasses with an RCA > 1 and compute the mean RCA. We also consider whether a given technology pair combines subclasses from different fields or different sectors based on the WIPO classification (Schmoch, 2008). Table 1 describes the spatial characteristics of patent novelty and the related new combinations for each region. The percentage of novel patents (respectively new combinations) varies across regions, ranging from 17.3% (respectively 7.7%) in Île-de-France, a leading region with many patent applications and a large technological portfolio, to 67.7% (respectively 62.5%) in Limousin, which has the opposite characteristics. Relatedness density ranges from 0.23 for Limousin to 0.54 for Languedoc-Roussillon, whereas Île-de-France has a rather high value of 0.37, as does Rhône-Alpes (0.42), which characterizes portfolios in which technologies are diversified and rather related. Regarding regional specialization, most regions combine technologies for which only one of the two combined technologies is based on regional specialization (mean RCA < 1), and only seven regions, including Île-de-France, combine technologies that both build on local specialization (mean RCA > 1).
Table 1 also shows that new combinations associate technologies from different fields or different sectors, confirming previous findings from the literature (Nemet & Johnson, 2012; Keijl et al., 2016): across regions, 65.9-76.5% of all combinations are cross-field and 25-40% are cross-sector. Finally, the last column indicates the number of subclasses (out of 633) for which a region has an RCA > 1. Rhône-Alpes and Île-de-France are the regions with the largest number of specializations.

Note: a % Patent novelty is the percentage of patents that include at least one technological combination new to the region. b % New combinations is the percentage of combinations that are new to the region. c Relatedness density is the average relatedness density of the combined technologies. d For each patent combination, we consider whether it combines none, one or two subclasses belonging to the regional specialization, averaged at the regional level. e Total number of subclasses that represent a revealed comparative advantage (RCA).

Note: a Number of times a combination has been used in the region. b Each invention is decomposed into subclass pairs (i.e., combination IPC1-IPC2); the table characterizes these combinations.

Table 2 explores the relationship between novel combinations and the characteristics of combined subclasses. It shows descriptive statistics on the characteristics of novel versus already used technological combinations and t-tests for the comparison of means. Table 2 summarizes the 264,948 technological combinations (50,960 new to the region versus 213,988 already used combinations) characterizing the 46,596 patents in the final sample. Regarding already used technologies, the table indicates that on average a combination is used 153.54 times in a region and associates individual subclasses that are used on average between 356.25 and 539.36 times. In comparison, new combinations rely on subclasses used eight times less often.⁶ Thus, it is not surprising that new combinations build on technologies in which the region is not specialized: 39% of new combinations are based on technologies in which the region is not specialized (no subclass with an RCA > 1), 44% rely on a combination for which only one subclass has an RCA > 1 and only 17% associate subclasses that both have an RCA > 1. The differences are statistically significant when compared with reused combinations. An interesting point to note here is that 44% of new combinations combine RCA with non-RCA technologies, suggesting that local organizations explore new combinations by relying on local expertise combined with more exploratory technologies, or even technologies that are new to the region.
Table 2 also indicates that the mean relatedness density is significantly lower for new combinations, but the difference is small. Finally, 84% of new combinations combine technologies from different fields (compared with 69% for reused combinations) and 52% from different sectors (compared with 26%). Moreover, 15% (respectively 12%) of new combinations occur in the Île-de-France (respectively Rhône-Alpes) region.
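The two novelty shares reported in this section are computed at different levels, which explains why they can diverge within a region. The sketch below illustrates the distinction under the definitions given in the table notes: patent novelty counts patents with at least one new pair, whereas new combinations counts the pairs themselves. The function name and the input layout (one `(patent_id, is_new)` tuple per subclass pair) are assumptions made for illustration.

```python
from collections import defaultdict


def novelty_shares(pairs):
    """Return (% patent novelty, % new combinations).

    pairs: iterable of (patent_id, is_new) tuples, one per subclass pair.
    A patent is novel if at least one of its pairs is new to the region.
    """
    pairs = list(pairs)
    flags_by_patent = defaultdict(list)
    for patent_id, is_new in pairs:
        flags_by_patent[patent_id].append(bool(is_new))
    n_new_pairs = sum(1 for _, is_new in pairs if is_new)
    n_novel_patents = sum(1 for flags in flags_by_patent.values() if any(flags))
    pct_new_combinations = 100.0 * n_new_pairs / len(pairs)
    pct_patent_novelty = 100.0 * n_novel_patents / len(flags_by_patent)
    return pct_patent_novelty, pct_new_combinations
```

Because a single new pair is enough to make a patent novel, the patent-level share is never below the pair-level share within a given set of patents. For the full sample described above, the pair-level share is 50,960 / 264,948, or about 19.2%.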

Main results
In this section, we estimate the conditions under which technological combinations occurring in a region are related to the technology portfolio of that region. To do so, we contrast the impact of relatedness density on the number of reused combinations with its impact on the likelihood of introducing a new combination, in order to explore whether the impact changes with the degree of novelty as measured by the number of times a combination is used in the region. Table 3 presents our main results. The first three columns test the impact of relatedness density on the Reuse of a technological combination, that is, the number of times a given combination has been used in the region. The variable is introduced in a linear and in a quadratic form, and the impact is clearly linear and positive, meaning that relatedness density favours the combination of technologies that are already used and related to the regional technological portfolio. This result is depicted in Figure 1a. Results hold in column 3 when we introduce control variables concerning the patent subclass recombination characteristics. When neither or only one of the combined subclasses has a comparative advantage (RCA > 1), the number of reused combinations is lower than when both are specialized, as the coefficients associated with these variables are negative and significant. In sum, these results confirm Hypothesis 1a that relatedness and local specialization increase the likelihood of reusing existing technological combinations. Regarding our control variables, the level of activity in the region, such as the income per employee and the number of patents or inventors with external collaborations in the subclass, has a positive impact on the number of recombinant reuses. New subclass also has a positive impact on the number of already used combinations.⁷ The remainder of Table 3 tests the impact of relatedness density on the occurrence of a new recombination.
In column 4 relatedness density is introduced linearly and the impact is negative and strongly significant. Relatedness density appears to reduce the likelihood of introducing a new combination in a region. In columns 5 and 6, we test the quadratic form, which is significant, and the curve has an inverted 'U'-shape. The average marginal effect shows that the slope of Relatedness density is positive only for small values of relatedness up to 0.30 and then becomes negative and decreases sharply as Relatedness density increases. The maximum probability of novelty corresponds to a relatedness density of around 0.3 (Figure 1b).⁸ This result holds throughout the paper and the interpretation is the following:⁹ as relatedness to the regional portfolio becomes stronger, the probability of introducing a new combination of subclasses diminishes, supporting Hypothesis 1b. Relatedness and local specialization reduce the likelihood of producing new combinations. Columns 7-10 explore the impact of relatedness density given the characteristics of the combined technologies in three situations through interactions. The results are easier to interpret based on graphical representations. Figure 1c illustrates the interaction between Relatedness density and new subclass (column 7), showing that the introduction of a new subclass in the region increases the likelihood of having a novel combination as the degree of relatedness density increases. This confirms Hypothesis 2a as well as the results found by Boschma et al. (2015) that relatedness density eases the introduction of new technologies in a region. If the combinations rely on technologies that are already present, the impact decreases as the degree of relatedness density increases. Figure 1d illustrates the impact when the combined technologies are unrelated to each other.
The inverted 'U'-shaped curve moves upwards, indicating that, for a given level of relatedness density, the likelihood of introducing a new combination is significantly higher when subclasses are unrelated. Relatedness thus facilitates the introduction of unrelated and uncommon combinations, confirming Hypothesis 2b. However, as before, the impact falls as relatedness density increases. Finally, Figure 1e illustrates the interaction between relatedness density and whether or not the combined subclasses belong to the local specialization (RCA > 1). Results support Hypothesis 2c and show that novelty is favoured when subclasses are not or only partly specialized, as compared with the case where both subclasses have a revealed technological advantage, and that the relationship with relatedness density has an inverted 'U'-shape. Compared with Figure 1b, the decreasing part is less sharp for a given level of relatedness density. These results support Hypotheses 2a-c that relatedness increases the likelihood of novelty when the combined technologies are new to the region, have no comparative advantage or are unrelated. Table 4 tests the impact of relatedness density on the likelihood of a new technological combination given the type of agents. Again, the strategy is based on interactions and the results are the following. Table 4 provides partial support for Hypothesis 3 as universities do not seem to rely on relatedness density to develop new technological combinations (column 1): the interaction is negative and not significant. Large incumbents, approximated by their level of patenting over the past five years, likewise do not rely on the local technological portfolio to produce novelty (column 2); in their case the impact is clearly negative and significant. Figure 2a shows that relatedness density benefits small and medium agents marginally more, which provides support for Hypothesis 4. The agents that benefit most from relatedness density are applicants that are located in a single region (column 3). They clearly rely on this local environment to produce novelty. Finally, columns 4 and 5 test the impact of agents' entry on novelty and its relationship with relatedness density, and results are remarkably similar to those obtained by Neffke et al. (2018). The actors that benefit from local specialization are not start-ups that patent for the first time but rather agents that enter the region for the first time and already have a patenting record in other regions. This result, illustrated by Figure 2c, provides partial support for Hypothesis 4 that novel players are more constrained by their local environment and more likely to rely on local relatedness to produce novelty. The next section is devoted to robustness checks.

Table 3. Impact of relatedness density on the number of reused combinations and the likelihood of a new combination (patent level with patent fixed effects).

Robustness checks
Table 5 presents some robustness checks using our benchmark specification (column 6 of Table 3), which uses new technological combination as the dependent variable. The objective is to verify whether our main result, the quadratic relationship between relatedness density and novelty, is robust to different samples that could potentially explain its effect.
The first four columns focus on inventor location. In the first column, we estimate our relationship using only patents that have inventors located in a single region. Our results remain similar. In column 2 we use only patents for which inventors are located in at least two different NUTS-2 regions: relatedness density remains negatively and significantly associated with novel recombination. Then, we distinguish estimates for inventors located in the largest French innovative regions (namely, Île-de-France and Rhône-Alpes in column 3) and outside those regions (column 4). Results are similar in both specifications. The following columns focus on the technological characteristics of combinations. Columns 5 and 6 test the effect when the combined technologies are cross-field or from similar fields using the WIPO classification (Schmoch, 2008). Our main result remains unchanged. Next, we estimate our relationship using a sample of patents that have at least one new recombination among all combined pairs: relatedness density still has a quadratic impact associated with patent novelty. Finally, we verify whether the effect of relatedness density is robust to excluding extreme values of this variable, namely the top and bottom 1% of the distribution. Results remain unchanged.
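The quadratic relationship tested in these checks implies a turning point at which the marginal effect of relatedness density changes sign. As a minimal sketch, assuming a specification of the form b1·RD + b2·RD², the turning point is −b1 / (2·b2). The coefficient values used in the usage note below are hypothetical and chosen only to reproduce a maximum near 0.30; they are not the estimates reported in Table 3.

```python
def marginal_effect(beta_linear, beta_squared, x):
    """Derivative of beta_linear * x + beta_squared * x**2 with respect to x."""
    return beta_linear + 2.0 * beta_squared * x


def turning_point(beta_linear, beta_squared):
    """Value of x at which the marginal effect is zero.

    With beta_linear > 0 and beta_squared < 0 this is the maximum of an
    inverted 'U'.
    """
    if beta_squared == 0:
        raise ValueError("no turning point without a quadratic term")
    return -beta_linear / (2.0 * beta_squared)
```

With the hypothetical values b1 = 0.6 and b2 = −1.0, `turning_point(0.6, -1.0)` is 0.3: the marginal effect is positive at a density of 0.2 and negative at 0.5, which matches the shape described for the benchmark specification.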
To further test the robustness of our results, we run our benchmark specification on the relative density measure instead of the relatedness density. Table 1 shows that regions have different technological portfolios with an average relatedness density ranging from 0.23 to 0.56. These absolute values are not comparable from one region to another if technologies are not in the same interval throughout all regions. More importantly, if regions with a high density on average are also those that do not combine new subclass pairs, our results could be driven only by these observations. Note that although our estimations already include patent and region fixed effects, we choose to be conservative and test the robustness of our results using a relative density measure following Pinheiro et al. (2018). Relative density makes it possible to compare the density of technologies not present in the region with the density of a region's option set (OS). The option set of a region comprises all the technologies that are not yet in that region. Relative density is computed as:

$$\text{Relative density}_{i,r} = \frac{RD_{i,r} - \operatorname{mean}(RD_{r,OS})}{\operatorname{sd}(RD_{r,OS})}$$

where $RD_{i,r}$ is the relatedness density of a technology in a region as in equation (3), $\operatorname{mean}(RD_{r,OS})$ is the average density of all technological subclasses in the option set, that is, technologies in which the region has no RCA, and the denominator is the standard deviation of the density of the technologies in the same group OS (Pinheiro et al., 2018). This measure centres around zero and negative values indicate unrelated technologies. Thus, we test the impact of a variable 'Unrelatedness' which takes a value of 1 when relative density is negative, and 0 otherwise. Results in column 9 confirm that new technological combinations rely on technologies that are mainly unrelated to the local portfolio, as the associated coefficient is positive and significant.

Figure 2. Impact of relatedness density on new recombination given the type of agents.

Table 6 tests the robustness of our results for other potential biases. In the first five columns we verify how the dependent variable evolves through the period of study. As novelty is any combination that has never occurred before, regions with a larger knowledge base at the end of the period will find it more difficult to generate new combinations.¹⁰ We tackle these potential biases in two ways. First, in columns 1-3, we split the sample period into three: 1990-96, 1997-2003 and 2004-10.
Our main result is robust for these separate periods. Second, in columns 4 and 5 we split the sample in two, above and below the median, and the criterion we use is the share of new combinations in the total number of combinations of a region for the year in the middle of our period (2000). Results are robust for both samples. In the last two columns, we check for potential endogeneity from reverse causality as explained in section 3.2. This could be an issue if a patent with many IPC codes drives the number of other technological classes in the same region, generating technological spillovers that could raise the potential number of RCA and thus relatedness density. To curb such a possible indirect effect of reverse causality from IPC codes to relatedness density, we estimate our benchmark specification using only patents with few IPC codes. First, in column 6 we present results from an estimation using only patents with no more than six IPC codes (15 combinations), which account for 86% of our observations. In column 7 we show estimation results for patents with no more than eight IPC codes (28 combinations), which account for 95% of our observations. Our main result remains unchanged.

Table 5. Continued. Note: (1) In the single region specification, regional variables cannot be computed. IDF, Île-de-France.
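The relative density measure and the 'Unrelatedness' dummy used in this robustness check can be sketched as a within-region standardization. The sketch assumes that the option set consists of the subclasses in which the region has no RCA (RCA ≤ 1), as described in the text, and that the standard deviation is the population one (`ddof=0`); the exact convention in Pinheiro et al. (2018) may differ. The function names are hypothetical.

```python
import numpy as np


def relative_density(density_row, rca_row):
    """Standardize one region's densities against its option set.

    density_row: relatedness density of every subclass in the region.
    rca_row: RCA of every subclass in the region.
    The option set is taken to be the subclasses with RCA <= 1 (assumption).
    """
    density = np.asarray(density_row, dtype=float)
    in_option_set = np.asarray(rca_row, dtype=float) <= 1
    option_values = density[in_option_set]
    return (density - option_values.mean()) / option_values.std()


def unrelatedness(relative):
    """Dummy equal to 1 when relative density is negative, 0 otherwise."""
    return (np.asarray(relative) < 0).astype(int)
```

Because each region is standardized against its own option set, a value of zero means the same thing in a low-density region such as Limousin and in a high-density one, which is what makes the measure comparable across regions.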

CONCLUSIONS
In this paper, we contribute to explaining the emergence of technological novelty for individual innovations in a given region, following the recombinant literature (Fleming, 2007; Verhoeven et al., 2016). Using patent data for French regions over the period 1990-2010, we contrast events in which technological subclass pairs are combined for the first time with events in which they have already been reused many times. We use a representative sample of patents combining at least three subclass pairs in order to exploit variations of technological characteristics at the regional level while controlling for patent fixed effects amongst other covariates. Our indicator of recombinant novelty is regressed on the average relatedness density of the combined subclasses, indicating whether they are related to the technological portfolio of the region. To measure our explanatory variable of interest, we follow the relatedness literature (Hidalgo et al., 2007), which shows that the emergence of new activities in a region is driven by the number of related activities present in that location. Yet, as highlighted by Pinheiro et al. (2018, p. 2), 'what is true on average is not true for every instance. While countries and regions are more likely to enter related economic activities, sometimes they deviate from this behavior and enter unrelated activities … '.
Our findings shed light on the aforementioned instances: relatedness density is associated with an increase in the number of times a combination has already been used in a region. More importantly, relatedness density is mostly negatively associated with recombinant novelty. To be precise, relatedness density has an inverted 'U'-shaped relationship with novelty: for small levels of relatedness density, it increases the likelihood that a new pair of subclass technologies is introduced in a region. This positive effect tapers off quickly and, with an increasing level of relatedness density, novelty emerges less often. This negative effect of relatedness density is weaker when the subclass pairs are not both part of the set of technologies in which the region has an RCA, or when they are unrelated to one another. We have also checked whether different actors react similarly to relatedness density when producing novelty in their innovations. Our results show that while large incumbents are less dependent on relatedness, small and novel players build on the local technological portfolio to create novelty, and relatedness density does not help universities to engage in recombinant novelty.
These main results are robust to different covariates and a series of fixed effects. Moreover, our results are also robust to different samples of regions, to subclasses within and across fields, to samples of patents with at least one novelty in their combinations, to the exclusion of extreme values for relatedness density, as well as to the introduction of the relative density measure. We also check that our results are robust to the evolution of our dependent variable, endogeneity, other estimation methods and the level of clustering of our standard errors.

Note: Robust standard errors clustered at the patent level in brackets. + 0.10, ** 0.05, *** 0.01.
Our results have a few important policy implications. First, relatedness plays an important role in fostering regional economic growth and development (Balland et al., 2019). We contribute to this literature by showing that the novelty of these activities does not follow the same pattern. In order to achieve novel innovations, regions also need to explore more distant and less mastered pieces of knowledge, breaking away from path and place dependence. This implies that policy recommendations for exploring distant and unknown knowledge should be adapted to the local environment. Second, these policy recommendations should also consider the type of agents present in the region when trying to foster novelty. Our empirical exploration suggests that the local technological environment constrains the production of novelty heterogeneously, depending on the type of actor introducing novelty. Some actors are not affected, some are more constrained, and others rely more on relatedness density. Policy design therefore needs to incorporate the novelty profile of agents in a given region when developing tools to promote novelty. As in the case of structural change in Neffke et al. (2018), we find that local agents are more embedded in the local technological space and that novelty is sparked by agents already producing in other regions. In sum, the interaction between local relatedness and the profile of agents is crucial to regional policy design.
Nonetheless, our results and methodology have some limitations which pave the way for future improvements. First, further microeconomic analysis could empirically test specific underlying mechanisms that spark the emergence of new combinations. Second, the field could also benefit from deepening the knowledge associated with the agency dimension: are the new combinations driven by multinationals or other types of companies, and what is their previous experience in introducing novelty elsewhere? Third, an important path is to explore the types of technologies involved: are those new combinations intrinsically more complex or not (in the sense of Hidalgo, 2021), and do they pave the way to robust and durable new trajectories? Finally, focusing on specific improvements, we could, for example, find a new way to aggregate the relatedness density of the two subclasses in a pair, such as a weighted average, rather than the simple average of both single measures. We could also interact our variable of interest with other potential determinants of recombinant novelty, especially those that allow regions to bring in knowledge and technologies from outside their territory, such as migrant inventors.
These two variables differ from our main dependent variable which considers a new combination occurring for the first time within a French region.