Learning About New Products: An Empirical Study of Physicians' Behavior

We develop and estimate a model of market demand for a new pharmaceutical, whose quality is learned through prescriptions by forward‐looking physicians. We use a panel of antiulcer prescriptions from Italian physicians between 1990 and 1992 and focus on a new molecule available since 1990. We solve the model by calculating physicians' optimal decision rules as functions of their beliefs about the new pharmaceutical. According to our counterfactuals, physicians' initial pessimism and uncertainty can have large, negative effects on their propensity to prescribe the new drug and on expected health outcomes. In contrast, subsidizing the new good can mitigate informational losses.


I. INTRODUCTION
An important aspect of economic innovation is the development of new products. Thus, firms often devote large amounts of resources to research and development. The pharmaceutical industry, in particular, is characterized by a high degree of product innovation and significant investments in research and development. 1 Critical to the adoption of a new pharmaceutical, however, is the willingness of physicians to prescribe it. Although clinical trials and advertising convey information about the quality of a new pharmaceutical, considerable uncertainty remains not only about its general quality but also about its effect on individual patients. 2 Because physicians ultimately resolve this uncertainty by prescribing the new drug 3 to their patients (namely, by experimentation), the adoption of a new pharmaceutical is closely related to the experimentation process by which physicians gain information.
1. For instance, the U.S. Congressional Budget Office (2006) reports that in 2003, pharmaceutical companies devoted 37.6 billion dollars to research and development of new products.
In this paper, we focus on the entry of a new antiulcer drug, omeprazole, in the Italian pharmaceutical market in June of 1990. 4 Our data consist of monthly prescriptions of antiulcer medications by physicians in the Rome metropolitan area between June 1990 and December 1992. We develop and estimate a dynamic discrete choice model of physician Bayesian learning about the quality of a new drug. In our model, physicians have initial beliefs about the new drug's efficacy, and update them every time that they prescribe the new drug and observe the treatment outcome. When choosing which drug to prescribe to a particular patient, a forward-looking physician considers not only the expected outcome for that patient but also the opportunity to learn about the new drug's quality and use the information on future patients. Our parameter estimates allow us to disentangle how uncertainty about the new drug's quality, intertemporal preferences, and prices affect the adoption of the new drug and the corresponding health outcomes.
A recent literature has explored the role of learning about pharmaceuticals. 5 As noted by Manchanda et al. (2005) in their review of this literature, some important aspects of the learning mechanism are the channels through which agents learn (prescription, detailing, marketing meetings, and events), their preferences toward risk (risk-averse or risk-neutral), and whether they are short-sighted (myopic) or forward-looking. Myopic agents maximize their current expected utility, while forward-looking agents evaluate their current utility as well as the expected future value of the information they might gain through the current prescription. 6 The studies also differ in whether they use physician-level or aggregate data.
5. See, for instance, Ching (2008), Chintagunta, Jiang, and Jin (2009), Coscelli and Shum (2004), Crawford and Shum (2005), Currie and Park (2002), LeCates, Lucarelli, and Nicholson (2008), Mukherjee (2002), Narayanan, Manchanda, and Chintagunta (2005), and Narayanan and Manchanda (2009). Berndt, Pindyck, and Azoulay (2003) studied the diffusion of antiulcer drugs based on consumption externalities, although they did not specify a behavioral learning model.
6. In industrial organization and marketing, empirical models of learning in dynamic frameworks have also been applied to other products, such as yogurt (Ackerberg 2003), breakfast cereal (Eckstein, Horsky, and Raban 1998), laundry detergents (Erdem and Keane 1996), online grocer (Goettler and Clay 2009), and telephone consumption (Narayanan, Chintagunta, and Miravete 2007). Previously, Miller (1984) had estimated a dynamic Bayesian learning model of job-worker matching. Miller modeled a multiarmed bandit problem in which the return to every option was uncertain, whereas in our case only the return to the new drug is uncertain.
A central contribution of our paper is the prediction of market demand for the new drug through a parsimonious yet behaviorally rich model. Using the same physician-level data used here, Coscelli and Shum (2004) model the behavior of doctors who learn in a Bayesian fashion yet are myopic. We, in contrast, model the forward-looking behavior of physicians and reject the null hypothesis that physicians are myopic. According to our estimates, myopic behavior significantly delays the adoption of the new drug and generates large losses to patients' health. Crawford and Shum (2005) use related patient-level data and model how forward-looking patients learn about their curative and symptomatic matches with respect to alternative medications. In their model, a patient's learning about a medication is not driven by the fact that the medication is new; rather, it is driven by her need to learn which medication suits her best. Hence, she learns about her match to each medication, not just the new one. In our model, however, learning is driven by the market-level uncertainty associated with omeprazole as a new product. 7 While it would be of interest to study both doctors' and patients' learning, our doctor-level data motivate our focus on physician learning. We believe that focusing on the physician is useful because in practice he chooses which drug to prescribe. 8 In addition, as he gains knowledge through his experience with one patient, he can apply it to other patients. Finally, Narayanan and Manchanda (2009) have studied the role of prescription experience, detailing, meetings, and events using physician-level data. In their model, physicians learn from two sources (detailing and experimentation) yet are myopic. 9 While the frameworks provided by Coscelli and Shum (2004) and Narayanan and Manchanda (2009) allow for the prediction of market demand, ours is more parsimonious. In addition, physicians in our model are forward-looking.
7. Another difference between Crawford and Shum (2005) and our work is the modeling of beliefs. Crawford and Shum (2005) assume rational expectations, which means that while a doctor does not know a patient's match with a given medication, his beliefs about her match correspond to the actual distribution of matches in the population. We, in contrast, allow prior beliefs to differ from true quality. See footnote 21 for a related discussion. To accommodate the fact that omeprazole is a new drug, Crawford and Shum allow the population distribution of matches to omeprazole to change several times over the sample period. They do not model these changes as the outcome of a learning process, whereas we model all changes in beliefs as the outcome of learning.
8. Direct-to-consumer advertising of pharmaceuticals is only allowed in the United States and New Zealand.
9. We do not examine detailing in this paper because we have no data on it.
While the combination of physician-level data and a forward-looking setting lends great power to our framework, it also poses considerable challenges. In the model, we assume that when a physician sees a patient, he observes some aspects of her condition. This condition is one of the state variables of the problem, as are the mean and variance of the physician's beliefs about the true quality of the new drug, and the price difference between the new and the incumbent drug. The physician's optimal decision rule is to choose the drug with the largest expected discounted value given the state variables.
Researchers typically calculate value functions using iterative algorithms. Continuous state variables are discretized by using a grid with a finite number of points, with the number of grid points increasing with the number of variables and with the desired level of accuracy. Iterative algorithms working over grids are computationally demanding even for a small number of state variables. In our case, to estimate the model we must calculate the probability of the observed prescription choices (approximately half a million in our dataset) for each parameter point. This poses severe computational challenges.
Faced with these challenges, we exploit theoretical properties of our model and some features of our data to reduce the physician's dynamic discrete choice problem to the computation of a threshold value for the patient's condition, net of price difference disutility. The threshold is a function of only two state variables: the mean and variance of the physician's beliefs about the new drug's true quality. In particular, the threshold is not a function of continuation payoffs. A patient whose net condition is above the threshold is prescribed the new medication, and a patient whose net condition is below the threshold is prescribed the old medication. Exploiting properties of physicians' Bayesian learning, we have developed a recursive algorithm that finds threshold values for one grid point at a time. Calculating thresholds is thus straightforward and very fast, which leads to dramatic savings in computational time.
Our approach is similar to Rust (1987) because his assumptions of additive separability and conditional independence hold in our model. If we were to apply Rust's approach to our problem, these assumptions would enable us to eliminate one dimension of integration when calculating continuation payoffs by assuming a type I extreme value distribution for the patient's condition. Given our assumption that the physician faces only two choices, we are able to eliminate one dimension of integration for any distribution of the patient's condition, by exploiting the shape of the value function with respect to the patient's condition. Because the physician does not update his beliefs when he prescribes the old medication, we simplify further by avoiding the calculation of continuation payoffs.
Having a parsimonious, rich, and computationally tractable model facilitates the prediction of market demand for the new medication. Furthermore, a similar model may be applied to study demand for other experience goods. While the institutional setting in the Italian pharmaceutical market (see next section) has allowed us to abstract away from the supply side without biasing our estimates, in other settings this may not be possible, although computational considerations often preclude the joint study of dynamic demand and supply. 10 By building tractability on the demand side, our approach could facilitate more comprehensive studies of dynamic demand and supply and may be applicable to other dynamic discrete choice problems.
Using our parameter estimates, we conduct several counterfactuals. These indicate that uncertainty about the new drug's quality affects physician prescription behavior because a physician who does not know the drug's quality does not prescribe it to all the patients who need it. This leads to inferior expected health outcomes for patients. More damaging than uncertainty, however, is physicians' myopic behavior. Because a myopic physician places no value on the learning opportunity from prescribing the new drug, he prescribes it much less, which in turn delays his learning. Of the total health-related losses inflicted by a myopic physician, his failure to experiment accounts for a larger share than his uncertainty about the new drug's quality. The expected discounted payoff over 20 years for a representative, forward-looking physician is 9% lower than for a fully informed physician, and the payoff for a myopic physician is 41% lower than for a fully informed physician. Our simulations show, however, that informational losses can be mitigated by pricing policies that encourage the adoption of the new drug.
The remainder of this paper is organized as follows. Section II presents some descriptive statistics and institutional aspects of our data. Section III presents the model, and Section IV analyzes it. Section V outlines the estimation procedure, and Section VI describes the estimation results. Section VII presents the counterfactuals, and Section VIII concludes.
10. For instance, Ching (2008b) studies the dynamics of the demand for prescription drugs after patent expiration, and Ching (2008a) studies the dynamics of the supply of prescription drugs after patent expiration.

II. DATA AND DESCRIPTIVE STATISTICS
The barrier of mucus in the stomach and duodenum protects these organs from the powerful acids and enzymes produced by the body to digest food. A number of factors, such as infection with the Helicobacter pylori bacterium and long-term use of aspirin and other medicines, can damage the barrier of mucus. Damage to this barrier leads to gastric ulcers, which antiulcer drugs purport to cure. More generally, antiulcer drugs are usually prescribed for four different purposes: as treatment for pathological hypersecretory conditions, as attack therapy for gastroesophageal reflux disease (GERD) or peptic ulcer, as maintenance therapy for GERD or peptic ulcer, and as treatment for minor heartburn.
The antiulcer market is the largest therapeutic drug market worldwide (Coscelli and Shum 2004). Most of the antiulcer drugs used around the world before 1990 were based on H2-receptor antagonists. These block the action of histamine, a substance that stimulates acid secretion. 11 Yet the most powerful inhibitors of acid secretion currently available are proton-pump inhibitors, which completely block the production of stomach acid by stopping the final step in acid secretion (Freston 1994; De Giorgi et al. 2006). The first proton-pump inhibitor was omeprazole. During its first years in the market, omeprazole was the preferred choice for pathological hypersecretory conditions and as attack therapy for GERD or peptic ulcer (Buhl and Clearfield 1990; Freston 1994; Meijer, Jansen, and Lamers 1989). Omeprazole entered the Italian market in June 1990 under the brand names of Losec and Omeprazen. Coscelli and Shum (2004) document that in their sample, approximately 25% of the prescriptions were written out to patients who suffered from the conditions for which omeprazole is the preferred choice.
11. Brands of H2-receptor antagonists in the United States include Tagamet, Zantac, Pepcid, and Axid, among others. See Berndt, Pindyck, and Azoulay (2003) for a study of the diffusion of H2-receptor antagonists and inter-brand competition.
The data used in this study were collected by the Italian National Institute of Health, which recorded all prescriptions for antiulcer medications written by randomly sampled doctors in the Rome metropolitan area between June 1990 and December 1992. For each physician, we observe the monthly number of antiulcer prescriptions, and the monthly number of prescriptions written out for omeprazole. As for the prescriptions written out for other medications, we do not know which particular drugs were prescribed. Because most of them would have belonged to the same pharmaceutical class (H2-antagonists), we pool them into a single non-omeprazole alternative. We include physicians with at least 10 patients in each month, and we exclude physicians who in August of 1991 or 1992 saw 80% fewer patients than their average over July and September of the corresponding year. 12 The resulting dataset contains 256 physicians and 31 months, for a total of 463,199 prescription episodes organized into 7,936 doctor-month observations.
Moreover, in Italy, all patients have the same insurance status because their costs are covered by the National Health System. The Drug Commission sets drug prices and copayments. Pharmaceuticals based on the same molecule are usually assigned the same price, and the copayment for antiulcer drugs is 50% of the price (Crawford and Shum 2005). All patients face the same prices and copays. Over the sample period, the average real price difference between omeprazole and the incumbent for a day's dose was 1,106 liras, which was equivalent to slightly less than one June 1990 dollar. The price difference experienced little variation over the sample period.
Table 1 presents some summary statistics for our data. As the table shows, there is wide variation in the number of office visits received by physicians in the sample, as well as in the number and proportion of omeprazole prescriptions written over the period. Figure 1 depicts the evolution of the in-sample market share of omeprazole over the period. As the figure shows, omeprazole's share rose from 1.6% to 14% over the sample period. Furthermore, Coscelli and Shum (2004) document that by mid-1995, omeprazole's market share in Italy had risen to 25%. 13 As Figure 1 shows, over the sample period physicians gradually converged in their propensity to prescribe omeprazole. Moreover, by the 3rd month about 75% of the doctors had prescribed the new drug at least once, and all doctors had prescribed it at least once by the 9th month.
12. Physicians with very few patients may be less interested in learning about the new drug simply because they have fewer opportunities to write prescriptions. In addition, physicians with large seasonal fluctuations may not fit the assumption of a constant patient arrival process, which we invoke in the model (see next section).
13. Prilosec, based on omeprazole, was the leading drug in terms of retail expenditure in 2000 to 2001 in the United States. Proton-pump inhibitors accounted for more than 75% of all antiulcer retail expenditure (NIHCM 2002). From conversations with physicians we learned that it is quite common for a new pharmaceutical to follow this type of trajectory; that is, to enter the market focused on specific uses, and then broaden the set of uses with a concomitant increase in market share. In the case of omeprazole, as time went by new studies documented its safety for longer-term treatments and a broader patient base than initially considered.
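The sample dimensions reported in this section can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published counts; the variable names are ours, and the average per doctor-month is a derived figure rather than one reported in the text.

```python
# Sample dimensions as reported in the text.
physicians = 256
months = 31                      # June 1990 through December 1992
prescriptions = 463_199

# Balanced panel: every physician is observed in every month.
doctor_months = physicians * months

# Derived quantity (not reported in the text): mean prescriptions per doctor-month.
avg_per_doctor_month = prescriptions / doctor_months

print(doctor_months)
print(round(avg_per_doctor_month, 1))
```

The product of physicians and months matches the 7,936 doctor-month observations stated above, which is consistent with a balanced panel.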
Unfortunately, our data are limited in several ways, all of which constrain our model and estimation. The main limitations are the following. First, we do not have patient-level information. This means that we do not know whether any given patient receives one or multiple prescriptions over the sample period. In the sample used by Crawford and Shum (2005), slightly more than half of the patients receive more than one prescription, which suggests that a substantial number of prescriptions in our sample might have been written out to the same patients. In the absence of patient-level information, however, we assume that each prescription is written out to a different patient. 14 While this assumption may not be suited to all the prescriptions in the data, it is necessary to explain at least some of the evolution of the market share. This is because in the data, the total number of prescriptions remains roughly constant over time. Furthermore, during our sample period omeprazole was not indicated for long-term treatment (Coscelli and Shum 2004; Buhl and Clearfield 1990; Freston 1994; Meijer, Jansen, and Lamers 1989). Hence, a rising market share could not have been generated simply from multiple prescriptions of the new drug to the patients for which the drug had been effective; rather, it must have required that additional patients be prescribed omeprazole over time.
14. To our knowledge, no researcher has used both patient-level and doctor-level information. Researchers studying doctor-level learning handle prescription episodes as independent from one another. Narayanan and Manchanda (2009) use the first prescription for each patient. Coscelli and Shum (2005) observe the diagnosis treated with each prescription and allow for within-month correlation among the prescription signals for each diagnosis to capture spillovers across patients in different diagnosis groups.
The second important limitation in our data is the absence of doctor-level covariates such as age, training, or specialty to help us characterize physician prescription behavior. We do exploit, however, the substantial heterogeneity in the number of patients seen by the doctor, probably attributable to differences in doctors' patient bases. Third, we do not have data on other channels of learning such as advertising, detailing, professional interactions, and so on. Hence, we model a doctor's learning exclusively as a function of his experimentation with patients.

III. MODEL
Consider a physician who treats patients afflicted by ulcers. When he examines a patient, he observes aspects of her condition and determines whether to prescribe the new or the old medication. He knows the quality of the old medication but is uncertain about the quality of the new one. The patient returns at the end of her treatment (we assume that she complies with her treatment), and the doctor observes the treatment outcome. If she consumed the new medication, the doctor obtains new information on the new drug's quality and uses it to update his beliefs about the new drug. Thus, when he sees the next patient, his beliefs on the new drug will differ from those he had when he saw the current patient. If, on the other hand, the patient consumed the old medication, his beliefs about the new drug remain unchanged. The doctor is a Bayesian learner: when he sees the outcome of the new drug on a particular patient, he views it as a signal of its true quality. His updated beliefs are a weighted average of his prior beliefs and the signal; the greater weight is given to the more accurate of these two elements.
More formally, we model two drugs in the antiulcer market: drug 0, the "old" or "incumbent" drug (which, in our empirical application, encompasses all pre-existing drugs), and drug 1, the "new" drug, which enters the market at time $t = 0$. The quality of the old drug is known, but the quality of the new drug is unknown.
We assume that physicians only differ in their patients' arrival processes. 15 Consider doctor $i$, and let $t_k$ be the time of arrival of his $k$th patient. On the basis of the patient's observable condition, the doctor makes the prescription $d_{it_k}$, which is equal to 1 when the new drug is prescribed and is equal to 0 otherwise. We assume that the doctor always writes a prescription because our data are based on prescription counts.
15. While we do not model other sources of doctor heterogeneity, Coscelli and Shum (2005) include a doctor-specific shifter. They conclude that neither doctor heterogeneity nor time effects explain much of the observed market share. Furthermore, the variance of the shifter is not significantly different from zero in specifications that include time-specific fixed effects.
As is common in this literature, we assume that the doctor is benevolent, in the sense that there is no agency problem between him and the patient. 16 Thus, doctor $i$'s instantaneous utility is the same as patient $k$'s. We lack the data to identify the utility from the old drug; instead, we can only identify the difference in utilities between the new and old drugs. Hence, we normalize the instantaneous utility from the old drug to zero, which means that the utility from prescribing the new drug stands for the new drug's net utility, or its premium over the old drug (see Appendix A for further details). We model instantaneous utility as follows:
$$u_{it_k} = d_{it_k}\left(\eta_{it_k} + \delta + \xi_{it_k} - \varphi\, p_{t_k}\right). \tag{1}$$
In this expression, the outcome of the new drug equals $\eta_{it_k} + \delta + \xi_{it_k}$. The term $\eta_{it_k}$ corresponds to the patient's condition observed by the physician during the office visit. We often refer to this term as the patient's match parameter, because it indicates to the physician whether the new drug would be a good choice for the patient given her condition. For instance, consider a patient who comes to her doctor's office with a severe hypersecretory condition. Because omeprazole is the preferred treatment for pathological hypersecretory conditions, this means that she has a high $\eta$. Parameter $\delta$ is the quality of the new medication. We can also think of it as the true average effect of the new drug in the population of patients or, equivalently, the unconditional average outcome in the population of patients. For patients whose match parameter equals $\eta$, the average outcome of the new drug equals $\delta + \eta$. Hence, for patient $k$ with match parameter equal to $\eta$, the term $\xi_{it_k}$ reflects how her new drug's outcome differs from $\delta + \eta$. This term is realized after patient $k$ takes the new medication but before the arrival of patient $k + 1$. 17 Finally, the term $\varphi\, p_{t_k}$ captures the disutility due to the price difference between the new and old drugs, $p_{t_k}$, with $\varphi$ being a parameter that measures patients' price sensitivity. The variables $\eta_{it_k}$ and $\xi_{it_k}$ are random and independent and identically distributed (i.i.d.) across patients, and their realizations are independent. Furthermore, $\eta_{it_k}$ and $\xi_{it_k}$ are normally distributed as follows:
$$\eta_{it_k} \sim N\left(0, \sigma_\eta^2\right), \qquad \xi_{it_k} \sim N\left(0, \sigma_\xi^2\right), \tag{2}$$
where $\sigma_\eta^2$ and $\sigma_\xi^2$ are positive variances. The price difference $p_{t_k}$ is also i.i.d., 18 and distributed according to the cumulative distribution function (c.d.f.) $W(\cdot)$:
$$p_{t_k} \sim W(\cdot). \tag{3}$$
16. The institutional framework of health care in Italy mitigates agency problems between doctors and patients, as each enrollee of the National Health Service must list a general practitioner (Coscelli and Shum 2005), presumably developing a long-term relationship with her doctor. According to our estimates (see Section VI), a higher price for the new medication relative to the old one lowers the probability of prescribing the new medication. This would not necessarily be the case if physicians did not care at all about their patients.
17. We assume this timing for belief updating for simplicity. Moreover, omeprazole produces results as quickly as in 1 day (see, for instance, http://www.fda.gov/bbs/topics/news/2003/new00916.html).
18. Because an administrative entity sets prices in Italy, it is reasonable to assume that prices are i.i.d. from the point of view of physicians.
If the new drug is prescribed, then when patient $k$ returns after her treatment the doctor observes the full outcome of the new drug on this patient. 19 The doctor had already observed part of this outcome, $\eta_{it_k}$, and now observes the rest. However, the doctor cannot distinguish between $\delta$ and $\xi_{it_k}$ because he does not know the true quality $\delta$. Instead, he observes a prescription signal for the new drug's quality, equal to $\varepsilon_{it_k} \equiv \delta + \xi_{it_k}$. Thus, the doctor performs a signal-extraction exercise, as he extracts information about the new drug's quality from the noisy signal he observes.
19. In our model, $\sigma_\xi^2$ can be viewed as the true variance of the random portion of the outcome adjusted by the probability that the doctor sees the outcome. A high $\sigma_\xi^2$ could indicate either high volatility in the outcome or low probability that the doctor observes the outcome. These possibilities are empirically not distinguishable, although their implication is the same: other things equal, the doctor will update his beliefs more slowly than he would with a lower $\sigma_\xi^2$.
More specifically, we assume that at time 0 doctor $i$ does not know the new drug's quality. His prior beliefs about it can be described as follows:
$$\delta \sim N\left(\tilde{\delta}_0, \tilde{\sigma}_0^2\right), \tag{4}$$
where $\tilde{\delta}_0$ and $\tilde{\sigma}_0^2$ are the mean and variance of his initial beliefs. The lower the value of $\tilde{\delta}_0$, the more pessimistic he initially is; the greater the value of $\tilde{\sigma}_0^2$, the less accurate his initial beliefs are. Prescribing the new drug allows the doctor to update his beliefs. Moreover, we assume that experimentation (prescription) is the only source of physician learning. 20 Thus, when he sees patient $k$, the doctor's beliefs can be described as follows:
$$\delta \sim N\left(\tilde{\delta}_{it_k}, \tilde{\sigma}_{it_k}^2\right). \tag{5}$$
20. Modeling physicians who learn from each other would prove challenging, because each physician would then have to predict the evolution of other physicians' beliefs and, in equilibrium, these predictions should be consistent with the actual beliefs. The introduction of a simple function of time might be able to proxy for the learning based on word of mouth, professional conferences, journals and news, changes in labels and package inserts, and so on. Coscelli and Shum (2004) introduce a time trend that captures learning through these sources. Their counterfactuals suggest, however, that the trend explains little of the market share's evolution.
If he prescribes the old drug to patient $k$, the physician is certain that he will observe an outcome equal to zero. If he prescribes the new drug, he is uncertain about the prescription signal he will observe. However, his beliefs about the new drug lead him to perceive this signal as having the following distribution:
$$\varepsilon_{it_k} \sim N\left(\tilde{\delta}_{it_k}, \tilde{\sigma}_{it_k}^2 + \sigma_\xi^2\right), \tag{6}$$
which follows from Equations (2) and (5). Hence, his updating is based on the following perceived joint distribution:
$$\begin{pmatrix} \delta \\ \varepsilon_{it_k} \end{pmatrix} \sim N\left( \begin{pmatrix} \tilde{\delta}_{it_k} \\ \tilde{\delta}_{it_k} \end{pmatrix}, \begin{pmatrix} \tilde{\sigma}_{it_k}^2 & \tilde{\sigma}_{it_k}^2 \\ \tilde{\sigma}_{it_k}^2 & \tilde{\sigma}_{it_k}^2 + \sigma_\xi^2 \end{pmatrix} \right).$$
This means that the mean and variance of the doctor's posterior, updated beliefs are as follows:
$$\tilde{\delta}_{it_{k+1}} = \frac{\sigma_\xi^2\, \tilde{\delta}_{it_k} + \tilde{\sigma}_{it_k}^2\, \varepsilon_{it_k}}{\tilde{\sigma}_{it_k}^2 + \sigma_\xi^2}, \qquad \tilde{\sigma}_{it_{k+1}}^2 = \frac{\tilde{\sigma}_{it_k}^2\, \sigma_\xi^2}{\tilde{\sigma}_{it_k}^2 + \sigma_\xi^2}. \tag{7}$$
These will be the doctor's prior beliefs the next time he sees a patient. No updating takes place if the old drug is prescribed, or
$$\tilde{\delta}_{it_{k+1}} = \tilde{\delta}_{it_k}, \qquad \tilde{\sigma}_{it_{k+1}}^2 = \tilde{\sigma}_{it_k}^2. \tag{8}$$
We assume that doctors are forward-looking and risk-neutral. 21 Next, let $\kappa$ be the rate of time discounting, common across all physicians. Then, if patients arrive at times $t_1, t_2, \ldots, t_k, \ldots$, where the subindex stands for the arrival order of a patient, doctor $i$'s objective is to maximize the expectation of his discounted utility or
$$\max\; E_{t=0}\left\{ \sum_{k=1}^{\infty} e^{-\kappa t_k}\, u_{it_k} \right\}, \tag{9}$$
where $E_{t=0}\{\cdot\}$ stands for the doctor's expectation at time zero, given his initial beliefs (4) and future optimal choices $d_{it_k}$, based on the information available at time $t_k$ ($\tilde{\delta}_{it_k}$, $\tilde{\sigma}_{it_k}^2$, $p_{t_k}$, and $\eta_{it_k}$). At times we will refer to the doctor's (expected) discounted utility as the doctor's payoff or value.
21. It is not possible to identify risk aversion in this setting. The logic is quite similar to that of Coscelli and Shum's (2004) discussion of lack of identification of risk aversion in a myopic setting. Intuitively, a low initial market share could be rationalized either by risk-neutral physicians who are pessimistic about the new drug, or by risk-averse physicians who know the new drug's quality but are uncertain about it (i.e., physicians who have $\tilde{\delta}_0 = \delta$ but $\tilde{\sigma}_0^2 > 0$). We chose the first alternative because it seems reasonable to assume that physicians would not know the true quality of the new drug on the full population of patients just from clinical trials conducted before the drug's entry.
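The Bayesian updating step described in this section, in which updated beliefs are a precision-weighted average of prior beliefs and the prescription signal, can be illustrated with a minimal sketch. It assumes the standard normal-normal conjugate update; the function name `update_beliefs` and all numerical values are hypothetical illustrations, not the paper's code or estimates.

```python
def update_beliefs(prior_mean, prior_var, signal, signal_var):
    """Normal-normal update of beliefs about the new drug's quality.

    prior_mean, prior_var: mean and variance of current beliefs.
    signal: observed prescription signal (true quality plus noise).
    signal_var: variance of the noise in the signal.
    """
    # Weight on the signal rises with prior uncertainty and falls with noise.
    weight = prior_var / (prior_var + signal_var)
    post_mean = prior_mean + weight * (signal - prior_mean)
    post_var = prior_var * signal_var / (prior_var + signal_var)
    return post_mean, post_var


# Hypothetical numbers: a pessimistic, uncertain prior and one noisy signal.
post_mean, post_var = update_beliefs(prior_mean=-1.0, prior_var=1.0,
                                     signal=0.5, signal_var=4.0)
print(post_mean, post_var)
```

Because the signal is noisier than the prior in this example, the posterior mean moves only a fifth of the way toward the signal, while the posterior variance falls below the prior variance regardless of the signal's value.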
Because the instantaneous utility in Equation (1) may be linearly transformed without affecting optimal choices, we adopt a normalization for the sake of identification. In particular, we impose the following restriction:
$$\sigma_\eta^2 = 1. \tag{10}$$
Thus, we measure $\delta$, $\xi$, and $\varphi\, p$ in units of standard deviation of the match parameter $\eta$.
Finally, we assume that the time elapsed between two consecutive patients' arrivals, $\Delta t_k = t_k - t_{k-1}$, follows a Gamma distribution 22 and is independent of the time elapsed between any other consecutive patients and all other random variables 23:
$$\Delta t_k \sim T\left(\cdot\,; a_i, b_i\right). \tag{11}$$
In this doctor-specific distribution, $a_i$ is the "shape" parameter of the distribution, and parameter $b_i$ is associated with the intensity of the arrival process: the lower the $b_i$, the more intensive the arrival rate.
22. A Poisson arrival process generates an exponential distribution for the time between consecutive arrivals. An exponential distribution is a special case of a Gamma distribution.
23. We assume that the arrival process is exogenous from the point of view of the physician. It is not affected by treatment outcomes or by the entry of the new drug (the number of patients seen by each doctor does not change much over the sample period). For the arrival process, we have assumed that the mean time between arrivals is equal to $a_i b_i$.
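To make the role of the arrival process concrete, the sketch below computes the expected discount factor applied to the next patient when inter-arrival times are Gamma distributed with mean equal to shape times scale, as in footnote 23. It assumes exponential discounting of the form exp(-kappa * dt), which is our reading of "rate of time discounting" rather than something stated explicitly here; the function names and parameter values are hypothetical.

```python
import math
import random


def expected_discount(kappa, shape, scale):
    """Closed form of E[exp(-kappa * dt)] for dt ~ Gamma(shape, scale).

    This is the Laplace transform of the Gamma distribution.
    """
    return (1.0 + kappa * scale) ** (-shape)


def simulated_discount(kappa, shape, scale, draws=100_000, seed=0):
    """Monte Carlo check of the closed form above."""
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta) uses alpha as shape and beta as scale,
    # so the mean of each draw is alpha * beta.
    total = sum(math.exp(-kappa * rng.gammavariate(shape, scale))
                for _ in range(draws))
    return total / draws


# Hypothetical values: a lower scale means patients arrive more often,
# so the next patient's payoff is discounted less.
busy = expected_discount(kappa=0.05, shape=2.0, scale=0.1)
quiet = expected_discount(kappa=0.05, shape=2.0, scale=1.0)
print(busy, quiet)
```

Under these assumptions, a physician with a more intensive arrival process faces a higher effective discount factor between patients, which is one channel through which patient volume can affect the value of experimentation.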
Having stated the primitives of the model, in the next section we characterize the optimal behavior of a doctor. 24
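The belief-updating step among these primitives can be illustrated with a minimal sketch. It assumes the textbook normal-normal conjugate update: the prior on the new drug's quality is normal, and each prescription signal equals the true quality plus independent normal noise. The paper's own updating equations are not reproduced here, and the function names and numerical values are hypothetical.

```python
def update_beliefs(mean, var, signal, signal_var):
    """Posterior (mean, variance) after one new-drug prescription signal.

    Assumption: prior quality ~ N(mean, var) and
    signal = quality + noise, with noise ~ N(0, signal_var).
    """
    if var == 0.0:
        # A doctor with no uncertainty does not revise his beliefs.
        return mean, 0.0
    weight = var / (var + signal_var)          # weight placed on the new signal
    post_mean = mean + weight * (signal - mean)
    post_var = var * signal_var / (var + signal_var)
    return post_mean, post_var


def beliefs_after_old_drug(mean, var):
    """Prescribing the old drug yields no signal, so beliefs are unchanged."""
    return mean, var


if __name__ == "__main__":
    mean, var = -1.0, 1.0  # illustrative prior, not the paper's estimates
    for signal in (0.2, -0.5, 0.1):
        mean, var = update_beliefs(mean, var, signal, signal_var=4.0)
        print(round(mean, 4), round(var, 4))
```

Under these assumptions the posterior variance falls with every new-drug prescription regardless of the signal observed, which is the property the solution algorithm later relies on.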

IV. ANALYSIS OF THE MODEL
To simplify the exposition, in what follows we suppress the doctor index, i. We use $t$ and $t + \Delta t$ to denote the time of the current and next patient's arrival, respectively. Consider the problem of a physician who sees a patient at time t. He must make a decision based on four state variables: $\tilde{\delta}_t$, $\tilde{\sigma}_t^2$, $p_t$, and $\eta_t$. Let $V(\tilde{\delta}_t, \tilde{\sigma}_t^2, p_t, \eta_t)$ be the doctor's expected continuation payoff at time t when he behaves optimally. Given Equations (9), (7), and (8), the following Bellman Equation holds: (.|., .), and T (.; ., .) are given by Equations (10), (3), (6), and (11), respectively. In this equation, the max function has two arguments. The first is the expected payoff to prescribing the old drug. When he prescribes the old drug, the doctor gets zero instantaneous utility and does not update his beliefs; hence, his expected utility for the next patients depends on the same beliefs he has for the current patient. The second argument is the expected payoff to prescribing the new drug. This is equal to the subjective expectation of the instantaneous utility given by Equation (1), plus the expected discounted utility for the next patients.
24. One could think of the following alternative model: at the beginning of the month, the doctor chooses the share of patients who receive the new drug; at the end of the month, he updates his beliefs based on all the signals received throughout the month. We do not think this model is plausible, as the doctor would use information from the signals only at the end of the month, instead of using it as soon as it becomes available. In addition, choosing the share of patients who receive the new drug at the beginning of the month would introduce new computational challenges. For instance, if the physician were considering the option of prescribing the new drug to 5 out of 10 patients in the next month, then to calculate the continuation value from this choice he would need to integrate over five unobserved signals.
The latter depends on beliefs which will be updated when the doctor sees the outcome of the new drug on the current patient. At the time he writes the prescription, the doctor does not know what his posterior beliefs will be, although he knows their distribution.
The Bellman equation can be solved numerically by solving iteratively for the value function. Because the state space is four dimensional, the integrations required are computationally very costly for a reasonable degree of accuracy. Nonetheless, our problem has certain properties that enable us to solve for the doctor's optimal behavior in a computationally simpler way without sacrificing accuracy. We provide the details below.
Because $\Delta t$ is independent of η, p and ε, in Equation (12) we can factor out the integration over $\Delta t$ and rewrite the Bellman equation in the following, more familiar form: subject to Equation (13), where $\beta$ is the doctor-specific effective time discount factor between adjacent patients' arrivals. The fact that the effective discount factor differs across doctors implies that the value function differs across doctors even for the same $(\tilde{\delta}_t, \tilde{\sigma}_t^2, p_t, \eta_t)$ combination. It is now convenient to introduce the following notations for the integral terms in Equation (14): where $C_o(\tilde{\delta}_t, \tilde{\sigma}_t^2)$ and $C_n(\tilde{\delta}_t, \tilde{\sigma}_t^2)$ are the doctor's continuation payoffs to prescribing the old and new drug, respectively, given his system of beliefs $(\tilde{\delta}_t, \tilde{\sigma}_t^2)$. Now we can write the doctor's value function in the following parsimonious form: As Equation (17) suggests, there is a value of the patient's match parameter $\eta_t$ that makes the doctor indifferent between the old and the new drug given the observed price difference $p_t$. Thus, the doctor's optimal behavior consists of the following threshold rule: prescribe the new drug when the patient's match parameter adjusted for price, $\eta_t - \phi p_t$, is higher than the threshold value $\omega(\tilde{\delta}_t, \tilde{\sigma}_t^2)$, and prescribe the old drug otherwise. According to Equation (17), the threshold value is given by The threshold value allows us to express the value function as follows: 25 Figure 2 illustrates the value function with respect to the patient's match parameter $\eta_t$. The function is flat for all values of $\eta_t$ below the threshold and linearly increasing (with unit slope) otherwise. In other words, the threshold-rule property lends a very convenient shape to our value function.
25. The event $\eta_t - \phi p_t = \omega(\tilde{\delta}_t, \tilde{\sigma}_t^2)$ takes place with zero probability, so the doctor's choice in this case has no effect on the value function. We assume that he prescribes the new drug in this case.
We exploit this shape when we substitute Equation (19) into Equation (15) and integrate over $\eta_{t+\Delta t}$ to get where $\Phi[\cdot]$ and $f[\cdot]$ are the c.d.f. and probability density function (p.d.f.) of the standard normal distribution, respectively. Next, according to Equation (18), we can eliminate $C_n(\tilde{\delta}, \tilde{\sigma}^2)$ from the above equation as follows: This gives us the following one-to-one relationship between $C_o(\tilde{\delta}_t, \tilde{\sigma}_t^2)$ and $\omega(\tilde{\delta}_t, \tilde{\sigma}_t^2)$: From Equations (16), (15), and (20), it follows that s.t. (13).
Equations (13) and (6) imply that at time t the physician perceives the distribution of $\tilde{\delta}_{t+\Delta t}$ as follows: Hence, we can rewrite Equation (21) as where $dH(\tilde{\delta}_{t+\Delta t} \mid \tilde{\delta}_t, \tilde{\sigma}_t^2)$ represents the distribution given by Equation (22).
26. The function $F(x)$ has the following properties:
By substituting Equations (20) and (23) into Equation (18), we get the following expression for the threshold function $\omega(\cdot, \cdot)$: As pointed out before, in our data the price difference exhibits very little variation over time. This allows for further simplification of Equation (24) because the following approximation is accurate: where $\bar{p}$ is the mean of price difference. Hence, we use the following equation for the threshold function: 27 Thus, we have reduced the original Bellman equation (12), with four state variables ($\tilde{\delta}_t$, $\tilde{\sigma}_t^2$, $p_t$, and $\eta_t$) and four-dimensional integration (with respect to $\Delta t$, $\eta_{t+\Delta t}$, $p_{t+\Delta t}$, and $\tilde{\delta}_{t+\Delta t}$), to the threshold equation (25), with two state variables ($\tilde{\delta}_t$ and $\tilde{\sigma}_t^2$) and one-dimensional integration (with respect to $\tilde{\delta}_{t+\Delta t}$), by exploiting certain features of our model and data.
The additive separability (AS) and conditional independence (CI) assumptions invoked by Rust (1987) hold in our model. AS holds because $\eta_t$ and $p_t$ enter in an additively linear fashion in the instant utility function. CI holds because the distribution of the doctor's beliefs in the next period can be completely characterized by the doctor's current beliefs and his current decision, and because $\eta_t$ and $p_t$ are i.i.d. (in the absence of CI, the distribution of the doctor's beliefs in the next period would generally be a function of all the current period's state variables, and the current period's decision). CI allows us to write the continuation payoffs $C_o$ and $C_n$ as functions of only two state variables ($\tilde{\delta}_t$ and $\tilde{\sigma}_t^2$) instead of the original four. AS, in conjunction with CI and the assumption that the doctor only faces two choices (old or new drug), gives us a convenient shape of the value function with respect to $\eta_t$ (see Equation [19] and Figure 2), a shape whose usefulness will be explained below.
To reduce the dimension of integration from four to one, we rely on: (a) the assumption that the patient arrival process is invariant to other variables, which allows us to integrate $\Delta t$ out; (b) the very low variation of p over time in our data, so that we can evaluate all functions of p at $\bar{p}$ and avoid integration with respect to p; (c) the shape of the value function with respect to $\eta_t$. Rust (1987) assumes that i.i.d. shocks follow a type I extreme value distribution. If we assumed this distribution for η, we would obtain closed-form solutions for the integrals with respect to η in the continuation payoffs (see Equations [15] and [16]). However, we are able to avoid numerical integration with respect to η without assuming a type I extreme value distribution for η. This is because the convenient shape of the value function with respect to η yields analytical solutions for the integrals with respect to η in the continuation payoffs. Although we have assumed that η has a standard normal distribution, it would be possible to find an analytical solution for these integrals for any distribution of η, because our ability to find this solution depends entirely on the shape of the value function with respect to η.
27. We exploit the fact that price difference exhibits little variation to avoid one dimension of integration for the threshold calculations. In the estimation, however, we use the actual price difference, which varies over time, to identify φ.
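To see why a value function that is flat below a threshold and rises with unit slope above it can be integrated over η without numerical quadrature, the sketch below uses a standard identity for a standard normal η: the expected excess over a cutoff c, E[max(η − c, 0)], equals f(c) − c(1 − Φ(c)). This identity is offered as an illustration only; the paper's own function F is not recoverable from the text, and the function names here are hypothetical.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_excess(c):
    """Closed form for E[max(eta - c, 0)] when eta ~ N(0, 1)."""
    return norm_pdf(c) - c * (1.0 - norm_cdf(c))


def expected_excess_numeric(c, upper=10.0, steps=20_000):
    """Midpoint-rule check of the same expectation on [c, upper]."""
    h = (upper - c) / steps
    total = 0.0
    for k in range(steps):
        x = c + (k + 0.5) * h
        total += (x - c) * norm_pdf(x)
    return total * h


if __name__ == "__main__":
    for cutoff in (-1.0, 0.0, 0.69):
        print(cutoff, expected_excess(cutoff), expected_excess_numeric(cutoff))
```

The closed form replaces one dimension of numerical integration with two evaluations of standard functions, which is the kind of saving the text attributes to the shape of the value function.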
Because the doctor's beliefs do not change when he prescribes the old drug, we can collapse the equations for $C_o$ and $C_n$ to one equation in one unknown, the threshold, involving just a one-dimensional integral over $\tilde{\delta}_{t+\Delta t}$ (see Equation [25]). In particular, the threshold is not a function of the continuation payoffs. In order to solve this (nonlinear) equation for a given $(\tilde{\delta}_t, \tilde{\sigma}_t^2)$ combination, we have developed a fast algorithm that operates in a recursive fashion. The algorithm rests on the Bayesian updating feature by which the variance of a physician's beliefs falls each time he prescribes the new drug (see Equation [13]). This means that after a sufficiently large number of prescriptions of the new drug the variance of beliefs is almost equal to zero, and the mean of beliefs no longer changes. For a sufficiently small $\tilde{\sigma}_t^2$, $C_o(\tilde{\delta}_t, \tilde{\sigma}_t^2) = C_n(\tilde{\delta}_t, \tilde{\sigma}_t^2)$, which means that $\omega(\tilde{\delta}_t, \tilde{\sigma}_t^2) = -\tilde{\delta}_t$. We can then apply backward induction to find the threshold for increasingly higher values of $\tilde{\sigma}_t^2$. Appendix B contains further details on the algorithm.
To summarize, we have reduced the physician's dynamic discrete problem to the calculation of threshold values that depend only on two state variables and are not functions of continuation payoffs. The computational burden of our approach is the same as that which would result from assuming a type I extreme value distribution for η (as in Rust 1987). However, the computational burden of our approach would be larger if the physician faced more than two choices because in this case it would be harder to characterize the doctor's decision rule through the use of threshold functions.
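The doctor-specific effective discount factor used in this section can be sketched under two assumptions that are not spelled out in the surviving text: that utility received after a delay Δt is discounted by exp(−κΔt), and that the Gamma distribution is parameterized by shape a and scale b (consistent with the stated mean arrival time a·b). Under these assumptions the expectation of exp(−κΔt) has the closed form (1 + κb)^(−a). The function names are hypothetical, and measuring time in months is an assumption chosen because it reproduces the annual discount factor of 0.997 reported for κ = 0.00025 in the estimation section.

```python
import math


def effective_discount(shape, scale, kappa):
    """E[exp(-kappa * dt)] for dt ~ Gamma(shape, scale).

    Assumption: scale parameterization, so the mean gap is shape * scale.
    """
    return (1.0 + kappa * scale) ** (-shape)


def annual_discount(kappa, periods_per_year=12):
    """Discount factor over one year when time is measured in months."""
    return math.exp(-kappa * periods_per_year)


if __name__ == "__main__":
    kappa = 0.00025
    # Hypothetical doctor: exponential gaps (shape 1), 50 patients per month.
    beta = effective_discount(shape=1.0, scale=1.0 / 50.0, kappa=kappa)
    print("effective discount factor between patients:", beta)
    print("implied annual discount factor:", annual_discount(kappa))
```

A doctor who sees patients more often has a shorter expected gap between them and therefore an effective discount factor closer to one, which is why the value function differs across doctors with identical beliefs.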

V. ESTIMATION
The model includes the following parameter vector: where $a_i$ and $b_i$ are the parameters of doctor i's patients' arrival process, I is the total number of doctors in the sample, κ is the rate of time discounting, δ is the true quality of the new drug, $\tilde{\delta}_0$ and $\tilde{\sigma}_0^2$ characterize the doctors' system of initial beliefs (prior beliefs), $\sigma_\xi^2$ is the variance of the prescription signal, φ measures price sensitivity, and $\bar{p}$ is the expected value of price difference. Below we describe our estimation procedure and the features of the data that identify the parameters of the model.
To describe the estimation, some additional notation is in order. Let M denote the total number of months in the sample. Let $n^i_m$ be the number of patients who visit doctor i in month m, and $r^i_m$ the total number of new drug prescriptions by doctor i in month m. The estimation procedure involves the following steps:
Step 1. (Calibration of κ): We set the value of κ equal to 0.00025, which corresponds to an annual discount factor of 0.997. 28
Step 2. (Estimation of $\bar{p}$): Our estimate of the expected value of price difference is the sample mean of price difference, equal to 1,106 liras per day.
Step 3. (Estimation of the arrival process parameters for each doctor): The patient arrival process for doctor i is characterized by two parameters, $a_i$ and $b_i$. We estimate these parameters using maximum likelihood. For each doctor, we match the empirical distribution of the number of patients per month $n^i_m$ to the distribution implied by Equation (11). The variation in the number of patients across months for a given doctor identifies the arrival process parameters.
Step 4. (Estimation of the remaining parameters): In principle, to estimate the remaining parameters $\tilde{\theta} = \{\delta, \tilde{\delta}_0, \tilde{\sigma}_0^2, \sigma_\xi^2, \phi\}$, one would apply maximum likelihood. However, the lack of some critical data leads us to apply simulated maximum likelihood instead, as we explain in the paragraphs that follow.
28. When we attempted to estimate κ, our point estimate was on the boundary at zero, which corresponds to an annual discount factor of one. Problems with the estimation of the discount factor are pervasive in the literature, as indicated by Erdem and Keane (1996), whose estimate of the annual discount factor is above one. Thus, we calibrate κ to 0.00025, which is very small yet still numerically manageable. Erdem and Keane (1996) and Crawford and Shum (2005) also calibrate the discount factor.
Consider the ideal situation, in which we observe for each doctor the sequence of his prescription decisions where $p_{jm}$ is the price difference for patient j at month m. Doctor i's contribution to the likelihood function (27) is In this ideal situation, to estimate the model one finds the value of $\tilde{\theta}$ that maximizes (27) given the values of the parameters estimated in the previous steps. However, our data are not ideal for two reasons. First, we do not observe the sequence of prescription decisions for each doctor; we only observe how many patients he receives per month, $n^i_m$, and to how many he prescribes the new drug, $r^i_m$. For instance, if the doctor sees two patients in a given month and prescribes the new drug to only one of them, the patient getting the new drug could be either the first or the second. In other words, for doctor i and month m, there are $\binom{n^i_m}{r^i_m}$ possible prescription sequences that could have generated the observed $r^i_m$ new drug prescriptions out of a total of $n^i_m$ prescriptions. Second, we do not observe the doctor's beliefs at the time he writes the prescriptions, or the prescription signals that he receives when prescribing the new drug.
Denote as $C_n^r$ the set of all sequences of prescription decisions where out of n prescriptions the new drug was prescribed r times. There are $\binom{n}{r}$ elements in this set. Then, conditional on $\{r^i_m, n^i_m\}$ and $\{p_{jm}\}$ the doctor's likelihood is (29) is approximately equal to the following average of simulated likelihood values: We found that 1,000 simulation sequences were enough to achieve the desired accuracy in our parameter estimates, which are accurate up to the third decimal place.
30. Because we observe price variation across rather than within months, we assume a constant price difference within each month.
Thus, in step 4 of the estimation we maximize $\tilde{L}(\tilde{\theta})$ with respect to five parameters ($\delta$, $\tilde{\delta}_0$, $\tilde{\sigma}_0^2$, $\sigma_\xi^2$, and $\phi$) to get an estimate of $\tilde{\theta}$. To evaluate $\tilde{L}(\tilde{\theta})$, we first solve for $\omega(\cdot, \cdot)$ and find threshold values for approximately 40,000,000 combinations of $\tilde{\delta}$, $\tilde{\sigma}^2$, and $\beta$ in approximately 15 sec on a 2.3 GHz Intel Centrino processor. Then we run through each of the S simulated sequences calculating the probability of prescribing the new drug to each patient and updating the corresponding beliefs every time that the new drug is prescribed. This takes about 15 sec. Thus, evaluating $\tilde{L}(\tilde{\theta})$ takes only half a minute, which is very little given the complexity of the problem and the total number of prescription episodes in our data. 31
We finish this section with an intuitive discussion of the identification of the parameters involved in step 4. For each doctor i and month m, we observe his monthly share of new drug prescriptions, $\rho^i_m = r^i_m / n^i_m$. The sensitivity of $\rho^i_m$ to fluctuations in the price difference identifies the price coefficient φ. The average share of the new drug across doctors for the first months identifies the mean of the prior beliefs, $\tilde{\delta}_0$. The ratio $\tilde{\sigma}_0^2 / \sigma_\xi^2$ is directly related to the speed of learning (see Equation [8]): the lower the ratio, the flatter the market share trajectory and the lower the speed of learning. The trajectory, however, is not strictly monotonic (see Figure 1), either for the market as a whole or for individual physicians. The variability of the new drug's share around the expected path identifies the variance of the prescription signal, $\sigma_\xi^2$. This, in turn, allows us to separately identify $\tilde{\sigma}_0^2$ and $\sigma_\xi^2$. Finally, given $\phi \bar{p}$, market share in the last months of the sample identifies the true quality of the new drug, δ.
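The counting argument behind the unobserved prescription sequences can be checked with a short sketch. It enumerates every ordering in which r of n patients receive the new drug and confirms that the count equals the binomial coefficient. The function name is hypothetical, and the month with 50 patients and 10 new-drug prescriptions is an illustrative assumption, not a figure from the data.

```python
from itertools import combinations
from math import comb


def prescription_sequences(n, r):
    """All 0/1 sequences of length n with exactly r ones (1 = new drug)."""
    sequences = []
    for positions in combinations(range(n), r):
        chosen = set(positions)
        sequences.append(tuple(1 if j in chosen else 0 for j in range(n)))
    return sequences


if __name__ == "__main__":
    # Two patients, one new-drug prescription: either patient may have received it.
    print(prescription_sequences(2, 1))
    # The number of candidate sequences grows very quickly with monthly volume.
    print(comb(50, 10))
```

Because each ordering implies a different path of beliefs, the likelihood of the observed monthly counts involves all of these sequences, which the estimation handles by averaging over simulated sequences.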

VI. ESTIMATION RESULTS
In this section, we describe our estimation results. Column 1 of Table 2 shows the parameter estimates for our model (recall that our unit of measure is the standard deviation of the patient's match parameter, η). All the parameters are precisely estimated. According to our estimates, the new medication has lower quality than the old one, as the estimate for δ is negative. This is probably due to the fact that, as explained in Section II, during our sample period omeprazole is the preferred treatment only for a subset of ulcer pathologies. The fact that the estimate of the mean of initial beliefs, $\tilde{\delta}_0$, is lower than the estimate of the true quality δ indicates that physicians are initially pessimistic about the new drug. However, these beliefs are not held with complete certainty, as the estimated variance of initial beliefs, $\tilde{\sigma}_0^2$, is positive. If it were zero, physicians would not experiment or learn at all. The fact that it is positive means that physicians have incentives to learn by experimentation. The estimated variance of the new drug's signal, $\sigma_\xi^2$, is larger than the estimated variance of initial beliefs. Finally, the estimated price coefficient is positive.
31. We use the outer product of the gradient approximation to obtain the variance-covariance matrix for the estimates of the parameters pertaining to step 4. Because estimating the whole model involves sequential steps, we correct this matrix following Newey (1984) to obtain correct asymptotic standard errors. The resulting matrix is the asymptotic approximation of the variance-covariance matrix (Gourieroux and Monfort 1996).
To examine the fit of our model, we simulate 1,000 patient sequences for each physician in our sample; for each patient we draw a match parameter η and a signal ε. Figure 3 displays predicted and observed market share, and 95% confidence bounds for predicted share. 32 As the figure shows, the model fits the data reasonably well, and the observed market share falls within the 95% confidence interval for each month. Furthermore, as physicians learn and converge in their beliefs, their prescription behavior converges as well. In other words, learning by experimentation provides an empirically relevant explanation for omeprazole's usage pattern depicted in Figure 1. The model predicts that in the long run, physicians learn the true quality of the new drug and their beliefs are accurate. According to Equation (18), the threshold value for the patient's match parameter adjusted for price, $\eta - 0.195p$, is equal to $\omega(-0.69, 0) = 0.69$. If the price difference were the same as the average price difference during our sample period, this would predict a long-run market share of 18% for omeprazole. If the price difference were zero, the predicted market share would be equal to 25%. These numbers square quite well with the fact (see Section II) that omeprazole is the preferred choice for patients whose condition accounts for approximately 25% of all prescriptions.
32. When drawing the confidence bounds, we take into account the fact that the parameter estimates are also random variables. The number of patients in the simulated sequences for a given doctor and month is taken from the data for that doctor and month.
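The long-run market shares quoted above follow from the threshold rule and can be reproduced in a few lines. The sketch assumes, as in Section IV, that η is standard normal, and it takes the price difference in thousands of liras per day (1.106), which is the scaling suggested by the paper's own use of 1.106φ; the function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def long_run_share(omega, phi, price_diff):
    """P(eta - phi * price_diff > omega) for eta ~ N(0, 1)."""
    return 1.0 - norm_cdf(omega + phi * price_diff)


if __name__ == "__main__":
    omega, phi = 0.69, 0.195          # values reported in the text
    share_at_mean_price = long_run_share(omega, phi, 1.106)
    share_at_zero_price = long_run_share(omega, phi, 0.0)
    print(round(share_at_mean_price, 2))  # 0.18
    print(round(share_at_zero_price, 2))  # 0.25
```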
Column 2 of Table 2 displays the estimates from the myopic model, in which physicians make prescription choices only to maximize instantaneous utility (i.e., κ → ∞, and the annual discount factor equals zero for each physician). The fit of the data, as measured by the log-likelihood value, is better for the forward-looking than the myopic model. Using these log-likelihood values, we test the null hypothesis that physicians are myopic, and the data reject it. 33 Hence, in the counterfactual simulations that follow we use estimates from the forward-looking model. Overall, we are encouraged by the fit of our parsimonious model.

VII. COUNTERFACTUALS
We can now use our structural estimates to gauge the cost of uncertainty and myopia, and the value of learning. To do so, we consider a representative physician who sees 50 patients per month for 10 years, and whose patients arrive at a uniform rate within each month. We assume that the treatment lasts 14 days, and that the patient pays a 50% copay. We simulate 100,000 prescription (patient) sequences. The results we present are averages over the simulated sequences. We assume that the price difference is constant and equal to the observed sample mean. 34 We compare three cases. In the first ("forward-looking"), the physician behaves as in our model, that is, he is uncertain about the new drug's quality, and is forward-looking. In the second ("myopic"), the physician is also uncertain about the new drug's quality but only seeks to maximize current utility. In the third ("full information"), the physician knows the true quality of the new drug, that is, $\tilde{\delta}_0 = \delta$ and $\tilde{\sigma}_0^2 = 0$; however, the outcome of the new drug continues to be random. 35
We begin by comparing prescription behavior in the three cases. Figure 4 displays the probability that a patient arriving at a particular time during the first 10 years following the new drug's entry receives a prescription for the new drug. 36 For a fully informed physician, this probability is constant at about 0.18, as we saw in the previous section.
33. The p-value of the test is 0.00002.
34. On treatment length, see http://www.cdc.gov/ulcer/files/hpfacts.pdf. The averages depicted in Figures 4, 5, and 6 are kernel-smoothed to reduce simulation noise.
35. For the myopic case, we use the parameter estimates from the forward-looking case but set the effective discount factor β equal to zero (equivalently, κ → +∞). Although we have rejected the null hypothesis that doctors are myopic, it is still of interest to gauge the gains associated with the forward-looking rather than myopic behavior of doctors. The full-information counterfactual is of interest because the Italian policy maker, who controls prices, might want to choose a price path that induces physicians to make similar choices to those under full information.
In contrast, the forward-looking physician is less likely to prescribe omeprazole at any time during the period considered, although he becomes more likely to do so as he learns about it. Learning is faster at the beginning, when he is more uncertain about the true quality. The myopic physician also learns over time, but at a much lower rate. While the forward-looking physician weighs the interest of the current patient against the continuation value of prescribing the new drug, the myopic physician considers only the former. Thus, he prescribes the new drug less, which means that he has fewer opportunities to learn about it. The difference in behavior between forward-looking and myopic physicians is substantial: forward-looking doctors are about 20 times more likely to prescribe the new drug at the beginning of the period, and 1.5 times more likely at the end. Such a difference provides evidence that prescribing the new drug has indeed a high learning value for forward-looking doctors.
36. For instance, for patient number 201, who comes at the beginning of the 5th month, this probability is the fraction of simulation sequences in which the patient obtains the new drug, relative to the 100,000 sequences.
The question, however, is whether these differences in prescription behavior translate into differences in health outcomes. For instance, a physician who believes that omeprazole's quality is lower than it actually is will not prescribe it to patients who should receive it. Thus, in Figure 5 we seek to answer the following question: for a patient who comes at a particular time, what is her expected treatment outcome? Expected outcomes differ over time because beliefs, which condition prescription choices, differ over time. 37
The figure shows that if a patient visits a fully informed physician, she can expect her health outcome to be more than 3.5 times better at the beginning of the period than if visiting a forward-looking physician, and 0.11 times better at the end of the 10th year. However, it is worth noting that if the new drug were not available at all, then health outcomes in Figure 5 would coincide with the horizontal axis (recall that instant utility is normalized to zero for the old drug). In other words, although uncertainty is costly, it is even more costly not to have the new drug at all. Figure 6 depicts similar patterns for expected instantaneous utility, which adjusts health outcomes for the disutility of paying for the new drug. 38 Expected health outcomes are worse for myopic than forward-looking doctors. Myopic doctors, who do not learn fast enough due to the lack of experimentation, write the "wrong" prescription at a higher rate; namely, they prescribe the old drug to some of the patients who would receive the new drug from a fully informed physician. Although expected health outcomes for myopic and forward-looking doctors would converge in the long run, when all doctors learn the new drug's true quality, in the short and medium run myopic physicians contribute to relatively poor health outcomes. Even after 10 years, the expected outcomes for myopic doctors are 40% lower than for forward-looking doctors.
37. We express the expected health outcome in thousands of liras. For a patient with match parameter equal to η, we calculate the expected health outcome as (14/2)(δ + η)/φ when the new drug is prescribed, and zero otherwise. We multiply by 14 days to adjust for treatment length and divide by 2 because patients only pay half of a drug's actual price.
38. We express the expected instantaneous utility in thousands of liras. For a patient with match parameter equal to η, we calculate the expected instantaneous utility as (14/2)(δ + η − 1.106φ) when the new drug is prescribed, and zero otherwise.
When writing a prescription, a physician could make two types of errors. First, he could prescribe the new drug to a patient who does not need it (in the sense that her match parameter η is below the full-information threshold). Second, he could fail to prescribe the new drug to a patient who does need it (in the sense that her match parameter is above the full-information threshold value). In our model, as long as doctors are initially pessimistic about the new drug's quality, the first type of error will not happen because the full-information threshold is lower than the forward-looking threshold; only the second type of error may happen, affecting the patients whose match parameter is between the full-information and the forward-looking thresholds. In contrast, only the first type of error would happen if doctors were initially optimistic about the new drug's quality.
Because the forward-looking doctor maximizes expected discounted utility, his prescription for a particular patient may not maximize her expected instantaneous utility, conditional on the doctor's beliefs. In contrast, the myopic doctor does maximize the patient's expected instantaneous utility, conditional on his beliefs. Thus, it might seem that the forward-looking doctor exploits the current patient on behalf of future patients, whereas the myopic doctor does not. However, this is not the case, because the myopic doctor updates his beliefs more slowly. In the case of initial pessimism, his threshold is higher than the forward-looking doctor's and hence he prescribes the new drug less. This means that patients who need the new drug (in the sense explained above) are less likely to get it from him than from the forward-looking doctor. This, in turn, entails lower expected health outcomes and utility for any given patient, as illustrated in Figures 5 and 6.
We have also evaluated a scenario in which the physician never learns about the new medication, namely, a situation in which the physician never updates his beliefs. In this scenario, patients who would benefit from the knowledge gained by the physician through previous patients do not reap such benefits (we have not depicted the corresponding results because they are visually indistinguishable from the horizontal axis, although they are positive). Results from this scenario tell us that the gains from the existence of the new drug almost vanish when pessimistic doctors do not learn.
To provide a summary measure of the distortions induced by myopia and incomplete information, Table 3 shows the present value of the expected discounted utility from prescription choices over 20 years following the introduction of the new drug. 39 The table compares payoffs to physicians who are fully informed, forward-looking, myopic, or who do not learn. These payoffs differ substantially: almost 34,000 dollars for the fully informed physician; 31,000 for the forward-looking physician; 20,000 for the myopic physician; and 950 for the physician who does not learn (recall that the payoff equals zero when the new drug does not exist). That is, uncertainty causes 9% losses for the forward-looking relative to the fully informed physician. The myopic physician, in turn, suffers 41% losses relative to the fully informed. The myopic physician's losses encompass both the effect of uncertainty and short-sightedness, and most of his losses are accounted for by his short-sightedness. In other words, his failure to experiment in order to reduce the uncertainty has more severe consequences than the uncertainty itself. The worst scenario, however, corresponds to the physician who never learns, whose payoff is many times lower than the myopic physician's.
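The loss percentages above can be recovered from the rounded payoffs quoted in the text. The sketch below uses those rounded dollar figures (Table 3 itself is not reproduced here, so the results are approximate) and also splits the myopic physician's loss into the part due to uncertainty and the part due to short-sightedness; the function and variable names are hypothetical.

```python
# Rounded payoffs (dollars) as quoted in the text, not the exact Table 3 entries.
PAYOFFS = {
    "full information": 34_000,
    "forward-looking": 31_000,
    "myopic": 20_000,
    "no learning": 950,
}


def loss_relative_to(benchmark, value):
    """Fractional loss of `value` relative to `benchmark`."""
    return 1.0 - value / benchmark


def myopia_share_of_loss(full, forward, myopic):
    """Share of the myopic physician's total loss not explained by uncertainty alone."""
    return (forward - myopic) / (full - myopic)


if __name__ == "__main__":
    full = PAYOFFS["full information"]
    for case in ("forward-looking", "myopic", "no learning"):
        print(case, round(100 * loss_relative_to(full, PAYOFFS[case]), 1), "% loss")
    share = myopia_share_of_loss(full, PAYOFFS["forward-looking"], PAYOFFS["myopic"])
    print("share of myopic loss due to short-sightedness:", round(share, 2))
```

With these rounded inputs, short-sightedness accounts for roughly 11,000 of the myopic physician's 14,000-dollar shortfall, which is consistent with the statement that most of his losses stem from the failure to experiment.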
Although we do not examine the production and supply of the new drug in this paper, it is clear that a delayed adoption of the new drug causes revenue losses for the new drug's manufacturer. For instance, some simple calculations show that the present value of the manufacturer's revenues over the first 10 years following the new drug's entry is approximately 23% lower when physicians are forward-looking than when they are fully informed. The manufacturer, then, may want to reduce initial uncertainty, perhaps through advertising and detailing.
39. To calculate the present discounted value, we use an annual discount factor of 0.997, for consistency with our estimation.
Because uncertainty causes inferior health outcomes, it is plausible that the Italian policy maker, who controls prices, might want to choose the price path that maximizes social welfare by inducing the optimal amount of experimentation on the part of physicians. In other words, a sufficiently low price may stimulate the use of the new drug, otherwise limited by uncertainty. 40 Our model allows us to find the price that maximizes the expected social welfare. We consider, again, a representative physician who sees 50 patients per month during his infinite lifetime. Importantly, we assume that the new and old drug have the same, constant marginal cost of production. Under this assumption, maximizing social welfare is equivalent to maximizing net expected social surplus, defined as the difference between the new and old drug's expected welfare: 41
The price affects the decision $d_k$ and the threshold value for the new drug. Hence, in Equation (32), price also affects the set of values of η for which the new drug is prescribed. We calculate Equation (32) by simulating 100,000 sequences of patients and averaging over them. We restrict our search to price paths in which the price is constant and deterministic. 42
40. Even if the policy maker cannot implement these policies at t = 0, he can still accomplish a large welfare improvement by implementing them early on. For instance, following the new drug's entry he could have waited for 31 months (the length of our sample period) to gather data and estimate the new drug's true quality. Armed with this estimate, he could have designed his pricing policy. Considering that this policy would have raised welfare from that moment onwards, the informational losses incurred during the first 31 months would probably have been relatively small, particularly given the large value of the discount factor.
41. Social welfare for the representative doctor is the discounted sum of social surpluses from all patients. For a specific patient k (k is the patient's order of arrival), the instantaneous social surplus is the difference between the monetary value of the treatment outcome and the production cost: where ů is the monetary value of the patient's utility from the old drug treatment and the other terms are familiar (ů corresponds to the second term in the last equality in Appendix A), and c is the production cost. Thus, the expected discounted social surplus S is where $E_{t=0}$ stands for the expectation taken at time zero. Because a price policy does not affect c or ů, maximization of S is equivalent to maximization of the net discounted social surplus given by Equation (32).
42. Our model describes doctors' behavior under i.i.d. price difference paths. It can be shown that no i.i.d. path yields a higher expected social welfare than the maximal social welfare under constant price difference paths. Thus, searching over constant paths is less restrictive than it might seem.
According to our calculations, the socially optimal price difference per day of treatment is −15 real liras, which means that the new drug's price should be 15 liras below the old drug's. In contrast, in the sample the new drug costs, on average, 1,106 liras more than the old drug. For the representative doctor, the social surplus associated with the optimal price is 1,043,953 thousands of real liras ($844,622), which is about 5% higher than the net social surplus from the observed mean price. 43 In other words, a slight subsidy for the new drug induces the experimentation rate that is optimal from a social perspective. Our calculations also show that even if it were not feasible to implement the socially optimal price, it would still be possible to raise social welfare up to the level reached by fully informed physicians under the observed price difference. In particular, lowering the price difference from 1,106 to 1,031 liras per day would accomplish this goal.
43. The percent difference between the optimal social welfare and the social welfare under the observed price may seem small. This is because the high discount factor gives great weight to the distant future, when the truth is (almost) known. The percent difference would be larger if the time horizon were shorter (e.g., 10 or 20 years).

VIII. CONCLUSIONS
In this paper, we have developed and estimated a model to predict the demand for experience goods. In particular, we have investigated how physicians learn about the quality of a new pharmaceutical. We have studied the adoption of omeprazole, an antiulcer molecule that entered the Italian market in June of 1990. Using a panel dataset of prescriptions written by physicians in Rome for almost 3 years following the new drug's entry, we estimated a dynamic discrete choice model of prescription choice and learning.
Exploiting theoretical properties of our model, we avoided value function and continuation payoff calculations and reduced the dynamic discrete choice problem to the straightforward calculation of threshold functions. Our parsimonious model fits the data well and provides evidence that physicians indeed learn by experimentation. In particular, prescribing the new drug has a high learning value for forward-looking physicians.
Our counterfactuals show that uncertainty has large effects on the propensity to prescribe the new drug, which in turn leads to large negative effects on health outcomes. While uncertainty has negative effects, those of myopia are even worse, as myopia substantially reduces a doctor's propensity to prescribe the new drug and thus limits opportunities for learning. The negative effects of uncertainty, however, might be mitigated by the positive effect of a price discount on the new medication.
We believe that our modeling and computational approach is helpful for predicting demand in markets for experience goods. More broadly, the approach may be applicable to other dynamic discrete choice problems with two choices (for instance, whether to search for a new job or not).

APPENDIX A: INSTANTANEOUS UTILITY AND NORMALIZATION
Here, we show that the optimal behavior of the doctor depends only on the difference between the instantaneous utility from the new and old drug. We begin by assuming a general instantaneous utility specification: where the variables have the same meaning as those in the model. Doctor i maximizes his expected discounted utility where in the last step we rearranged terms and replaced the differences with the following notations: and $p_{t_k} \equiv p_{1t_k} - p_{0t_k}$.
The expectation in the second term of the last equality does not depend on the physician's choices. Hence, the physician's optimal behavior is the same as the behavior of a physician for whom the utility of the old drug is normalized to zero.

APPENDIX B: NUMERICAL SOLUTION FOR THE THRESHOLD FUNCTION
We begin by noting that the variance of the doctor's belief 44 at time t, $\sigma^2_t$, is uniquely determined by the number of new drug prescriptions n that he has written up to that point. Hence, for simplicity, we refer to $\sigma^2_n$, given by the following expression:
$$\sigma^2_n = \frac{\sigma^2_\xi \, \sigma^2_0}{\sigma^2_0 \, n + \sigma^2_\xi}.$$
44. We omit the doctor's index, i, for notational simplicity.
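The dependence of the belief variance on the number of new drug prescriptions can be illustrated with a short Python sketch. It assumes the standard normal-normal updating rule, in which a prior with variance σ²₀ is updated with n independent signals of noise variance σ²_ξ; the function name and the numerical values are hypothetical and are not estimates from the paper.

```python
def posterior_variance(n, prior_var, signal_var):
    """Belief variance after n new drug prescriptions.

    Assumes a normal prior with variance prior_var and n independent normal
    signals with noise variance signal_var (standard Bayesian updating).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return signal_var * prior_var / (prior_var * n + signal_var)


# Hypothetical values, chosen only to show how the variance shrinks with n.
prior_var = 2.0
signal_var = 3.0
for n in (0, 1, 5, 50):
    print(n, posterior_variance(n, prior_var, signal_var))
```

At n = 0 the function returns the prior variance, and the variance falls toward zero as n grows, which is why n alone can serve as the state variable for uncertainty.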
47. By experimenting with the grid size, we found that 400 points evenly spaced on the interval [−5, 5] are enough to reach the required accuracy of ω when we use quadratic interpolation between points.
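The grid described in this footnote can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: it places 400 evenly spaced points on [−5, 5] and interpolates with a local quadratic (the Lagrange polynomial through the three grid points around the evaluation point). The logistic function stands in for the unknown function being tabulated, and all function names are hypothetical.

```python
import math


def make_grid(lo=-5.0, hi=5.0, num_points=400):
    """Evenly spaced grid on [lo, hi], endpoints included."""
    step = (hi - lo) / (num_points - 1)
    return [lo + i * step for i in range(num_points)]


def quadratic_interpolate(grid, values, x):
    """Evaluate the quadratic through the three grid points around x."""
    step = grid[1] - grid[0]
    # Index of the nearest grid point, clamped so both neighbors exist.
    i = int(round((x - grid[0]) / step))
    i = max(1, min(len(grid) - 2, i))
    x0, x1, x2 = grid[i - 1], grid[i], grid[i + 1]
    y0, y1, y2 = values[i - 1], values[i], values[i + 1]
    # Lagrange basis polynomials for the three nodes.
    l0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
    l1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
    l2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
    return y0 * l0 + y1 * l1 + y2 * l2


def logistic(x):
    """Stand-in smooth function; an assumption made for illustration only."""
    return 1.0 / (1.0 + math.exp(-x))


grid = make_grid()
values = [logistic(x) for x in grid]
approx = quadratic_interpolate(grid, values, 0.1234)
print(abs(approx - logistic(0.1234)))
```

For a smooth function, the error of a local quadratic shrinks with the cube of the grid spacing, compared with the square for linear interpolation, which is the reason a coarser grid can reach a given accuracy.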
In this recursive procedure, we find each value $\omega(\delta_j, \sigma^2_n)$ by solving one equation in one unknown, $\omega(\delta_j, \sigma^2_n)$, at a time, as is clear from Equation (B2). We solve each equation using a Newton method for which the starting point is $\omega(\delta_j, \sigma^2_{n+1})$. A crucial aspect of solving each equation is evaluating the integral in Equation (B2). We must numerically approximate this integral, because we only know the "grid values" of $F(\cdot)$, namely the values of $F(\cdot)$ at the grid points $\delta_j$, $j = 1, \ldots, J$, equal to $F(\omega(\delta_j, \sigma^2_n) + \phi p)$, $j = 1, \ldots, J$. We construct a piecewise quadratic approximation to $F(\cdot)$ using the grid values. We favor a quadratic over a linear approximation because this allows us to use fewer grid points. From the theory of approximation by piecewise quadratic functions, we know that for any degree of accuracy $\varepsilon$ there is a sufficiently large number of grid points such that the approximating function differs from $F(\cdot)$ by at most $\varepsilon$ over the whole range of $\delta$ values. Hence, the integral in (B2) is approximated with an accuracy of $\varepsilon$. Moreover, the integral of the product of the approximation to $F(\cdot)$ and the p.d.f. of the known normal distribution $dH(\cdot)$ can be calculated analytically, which allows for a fast evaluation of the integral. The integral does not need to be recalculated during the Newton iterations, which further enhances numerical performance.
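The claim that a quadratic times a normal density integrates in closed form can be checked with a short Python sketch. It treats a single interval [a, b] on which the approximation is one quadratic; summing such terms over all grid intervals would give the full integral. The closed form uses the moments of a truncated standard normal. The function names, the coefficients, and the Simpson check are hypothetical illustrations and are not taken from the paper's code.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def quadratic_times_normal_integral(c0, c1, c2, a, b, mean, sd):
    """Closed form of the integral over [a, b] of (c0 + c1*x + c2*x^2)
    times the N(mean, sd^2) density."""
    alpha = (a - mean) / sd
    beta = (b - mean) / sd
    # Partial moments of the standard normal on [alpha, beta].
    m0 = std_normal_cdf(beta) - std_normal_cdf(alpha)
    m1 = std_normal_pdf(alpha) - std_normal_pdf(beta)
    m2 = m0 + alpha * std_normal_pdf(alpha) - beta * std_normal_pdf(beta)
    # Partial moments of x = mean + sd * z.
    x1 = mean * m0 + sd * m1
    x2 = mean * mean * m0 + 2.0 * mean * sd * m1 + sd * sd * m2
    return c0 * m0 + c1 * x1 + c2 * x2


def simpson(f, a, b, n=2000):
    """Composite Simpson rule (n even), used here only as a numerical check."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


# Hypothetical coefficients and distribution parameters.
c0, c1, c2 = 0.3, -0.1, 0.05
a, b, mean, sd = -0.5, 1.0, 0.2, 0.7
analytic = quadratic_times_normal_integral(c0, c1, c2, a, b, mean, sd)
numeric = simpson(
    lambda x: (c0 + c1 * x + c2 * x * x) * std_normal_pdf((x - mean) / sd) / sd,
    a, b,
)
print(analytic, numeric)
```

Because the closed form needs only a few evaluations of the normal p.d.f. and c.d.f. per interval, it avoids a quadrature loop each time the integral is needed.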