background-image: url("../pic/slide-front-page.jpg")
class: center,middle
count: false

# Advanced Econometrics III
## (高级计量经济学III 全英文)

<!---    chakra: libs/remark-latest.min.js --->

### Hu Huaping (胡华平)

### NWAFU (西北农林科技大学)

### School of Economics and Management (经济管理学院)

### huhuaping01 at hotmail.com

### 2023-04-07
---
count: false
class: center, middle, duke-orange,hide_logo

# Part 2: Simultaneous Equation Models (SEM)

.large[
Chapter 17. Endogeneity and Instrumental Variables

Chapter 18. Why Should We Concern SEM ?

Chapter 19. What is the Identification Problem ?

.red[Chapter 20. How to Estimate SEM ?]
]

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: chpt20

## Chapter 20. How to Estimate SEM ?

[20.1 Approaches to Estimation](#approaches)

[20.2 Least squares approach (LS)](#LS)

[20.3 Indirect least squares (ILS)](#ILS)

[20.4 Two-stage least squares method (2SLS)](#TSLS)

[20.5 Truffle case](#truffle)

[20.6 Cod case](#cod)

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: approaches

## 20.1 Approaches to Estimation

---
layout: true

<div class="my-header-h2"></div>
<div class="watermark1"></div>
<div class="watermark2"></div>
<div class="watermark3"></div>
<div class="my-footer"><span>huhuaping@ <a href="#chpt20"> Chapter 20. How to Estimate SEM ? </a>               <a href="#approaches"> 20.1 Approaches to Estimation </a> </span></div>

---
### Approaches to Estimation

To estimate a structural SEM, two approaches can be adopted:

- **Single-equation methods**, also known as **limited information methods**.

> Estimate each equation in the SEM one by one, considering only the constraints in that equation.

- **System methods**, also known as **full information methods**.

> Estimate all the equations in the model simultaneously, taking into account all the constraints in the SEM.

???
To estimate the structural equations, two approaches can be adopted:

- Single-equation methods, also known as limited information methods.
    - We estimate each equation in the (simultaneous) system one by one, considering only the restrictions on that equation (such as the exclusion of certain variables) and ignoring the restrictions on the other equations.
- System methods, also known as full information methods.
    - We estimate all equations in the model simultaneously, taking due account of all the restrictions that the exclusion of certain variables places on the system.

---
### Approaches to Estimation

Instrumental variables are often used to estimate simultaneous equation models. The three main IV techniques among the **system methods** are:

- **Three-stage least squares** (**3SLS**): applicable in a few cases.

- **Generalized method of moments** (**GMM**): commonly used for dynamic models.

- **Full information maximum likelihood** (**FIML**): of considerable theoretical value, but it brings no advantage over 3SLS and is much more complicated to compute.

???

Instrumental variables are often used to estimate simultaneous equation models; there are three main classes of methods:

---
### Approaches to Estimation

Consider the following SEM:

`$$\begin{cases} \begin{alignedat}{9} & Y_{t1} &-\gamma_{21}Y_{t2}&-\gamma_{31}Y_{t3} & &-\beta_{01}&-\beta_{11}X_{t1} & & &= &u_{t1} \\ & & Y_{t2} &-\gamma_{32} Y_{t3} & & -\beta_{02}&-\beta_{12}X_{t1} &- \beta_{22}X_{t2} & &= &u_{t2}\\ &-\gamma_{13}Y_{t1} & &+ Y_{t3} & & -\beta_{03}&-\beta_{13}X_{t1} &-\beta_{23}X_{t2} & &= &u_{t3} \\ &-\gamma_{14}Y_{t1}&-\gamma_{24}Y_{t2} & &+Y_{t4} &-\beta_{04} & & &-\beta_{34}X_{t3} &= &u_{t4} \end{alignedat} \end{cases}$$`

- If we focus only on estimating the third equation, we can use a **single-equation method**, which considers only the restrictions on that equation, namely that the variables `\(Y_2, Y_4, X_3\)` are excluded from it.

- If we want to estimate all four equations **simultaneously**, we should use a **system method**, which takes into account all the constraints on the multiple equations in the system.

???

- If we focus only on estimating the third equation, the **single-equation method** considers this equation alone, that is, it only notes that the variables `\(Y_2,Y_4,X_3\)` are excluded from it.

- If we want to estimate all four equations **simultaneously**, the **system method** takes into account all the constraints on the multiple equations in the system.

---
### Approaches to Estimation

In order to use all the information in the SEM, it is most desirable to apply a **system method**, such as **full information maximum likelihood** (FIML).
In practice, however, **system methods** are not commonly used, for the following main reasons:

1. The computational burden is too great.

2. System methods such as FIML often lead to solutions that are highly nonlinear in the parameters, which are difficult to determine and calculate.

3. If there is a **specification error** in one or more equations of the SEM (e.g. an incorrect functional form or omitted variables), the error is transmitted to the remaining equations. As a result, system methods are very sensitive to specification errors.

???

To preserve the spirit of the simultaneous equation model, the ideal is to use a **system method**, such as **full information maximum likelihood** (FIML).

In practice, however, **system methods** are not commonly used, mainly because:

1. The computational burden is too great.

2. System methods such as FIML often lead to solutions that are highly nonlinear in the parameters and therefore hard to determine.

3. If one or more equations in the system contain a **specification error** (say, a wrong functional form or omitted relevant variables), the error is transmitted to the remaining equations. As a result, system methods are very sensitive to specification errors.

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: LS

## 20.2 Least squares approach (LS)

---
layout: true

<div class="my-header-h2"></div>
<div class="watermark1"></div>
<div class="watermark2"></div>
<div class="watermark3"></div>
<div class="my-footer"><span>huhuaping@ <a href="#chpt20"> Chapter 20. How to Estimate SEM ? </a>               <a href="#LS"> 20.2 Least squares approach (LS) </a> </span></div>

---
### OLS approach with recursive model

**Recursive model**: also known as the **triangular model** or **causal model**.

> The contemporaneous disturbance terms in different equations are uncorrelated, and each equation exhibits a one-way causal dependence.

Consider the following structural SEM:

`$$\begin{cases} \begin{alignat}{6} Y_{t1} = & & & & +\beta_{01}& + \beta_{11}X_{t1} & + \beta_{21}X_{t2} & & +u_{t1} \\ Y_{t2} = & + \gamma_{12}Y_{t1} & & &+\beta_{02} & + \beta_{12}X_{t1} & + \beta_{22}X_{t2} & & + u_{t2}\\ Y_{t3} = & + \gamma_{13}Y_{t1}& + \gamma_{23}Y_{t2}& &+\beta_{03} & + \beta_{13}X_{t1} & + \beta_{23}X_{t2} & & + u_{t3} \\ \end{alignat} \end{cases}$$`

???
Recursive model: also known as the triangular model or causal model.

The contemporaneous disturbances in different equations are uncorrelated, and each equation exhibits a one-way causal dependence.

Consider the following simultaneous equation model:

---
### OLS approach with recursive model

`$$\begin{cases} \begin{alignat}{6} Y_{t1} = & & & & +\beta_{01}& + \beta_{11}X_{t1} & + \beta_{21}X_{t2} & & +u_{t1} \\ Y_{t2} = & + \gamma_{12}Y_{t1} & & &+\beta_{02} & + \beta_{12}X_{t1} & + \beta_{22}X_{t2} & & + u_{t2}\\ Y_{t3} = & + \gamma_{13}Y_{t1}& + \gamma_{23}Y_{t2}& &+\beta_{03} & + \beta_{13}X_{t1} & + \beta_{23}X_{t2} & & + u_{t3} \\ \end{alignat} \end{cases}$$`

In a recursive model, the contemporaneous disturbance terms in different equations are uncorrelated (namely **zero contemporaneous correlation**):

`$$cov(u_{t1},u_{t2})=cov(u_{t1},u_{t3})=cov(u_{t2},u_{t3})=0$$`

- First equation: its right-hand side contains only exogenous variables, which are uncorrelated with the disturbance term, so the equation satisfies the CLRM assumptions and OLS can be applied to it directly.

- Second equation: because `\(cov(u_{t1},u_{t2})=0\)`, we have `\(cov(Y_{t1},u_{t2})=0\)`. Thus OLS can be applied to it directly.

- Third equation: because `\(cov(u_{t1},u_{t3})=0\)`, we have `\(cov(Y_{t1},u_{t3})=0\)`; and because `\(cov(u_{t2},u_{t3})=0\)` as well, `\(cov(Y_{t2},u_{t3})=0\)`. Thus OLS can be applied to it directly.

???
Zero contemporaneous correlation: the contemporaneous disturbances in different equations are uncorrelated.

For the first equation: its right-hand side contains only exogenous variables, which are assumed to be uncorrelated with the disturbances, so the equation satisfies the classical OLS assumption that the regressors are uncorrelated with the disturbance. OLS can therefore be applied to it directly.

---
### OLS approach with recursive model

We can also visualize it graphically:

<div class="figure" style="text-align: center">
<img src="pic/chpt20-recursive-graph.png" alt="SEM in recursive form" width="711" />
<p class="caption">SEM in recursive form</p>
</div>

---
### OLS approach with recursive model

Let's look at the **wage-price model**:

`$$\begin{cases} \begin{align} P_t &= \beta_0+\beta_1UN_t+\beta_2R_t+\beta_3M_t+u_{t2} &&\text{(price equation)}\\ W_t &= \alpha_0+\alpha_1UN_t+\alpha_2P_t+u_{t1} &&\text{(wage equation)} \end{align} \end{cases}$$`

Where:

- W, rate of change of money wages;
- UN, unemployment rate, %;
- P, rate of change of prices;
- R, rate of change of the cost of capital;
- M, rate of change of the price of imported raw materials.

???

- The price equation assumes that the current rate of price change is a function of the rates of change of capital and raw material prices, the rate of change of labor productivity, and the previous period's rate of wage change.

- The wage equation states that the current rate of wage change depends on the current rate of price change and the unemployment rate.

- W, rate of change of money wages
- UN, unemployment rate, %
- P, rate of change of prices
- R, rate of change of the cost of capital
- M, rate of change of the price of imported raw materials

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: ILS

## 20.3 Indirect least squares (ILS)

---
layout: true

<div class="my-header-h2"></div>
<div class="watermark1"></div>
<div class="watermark2"></div>
<div class="watermark3"></div>
<div class="my-footer"><span>huhuaping@ <a href="#chpt20"> Chapter 20. How to Estimate SEM ? </a>               <a href="#ILS"> 20.3 Indirect least squares (ILS) </a> </span></div>

---
### ILS approach with a just-identified model

For a just (exactly) identified structural equation, the method of obtaining estimates of the structural coefficients from the OLS estimates of the reduced-form coefficients is known as **Indirect Least Squares** (ILS), and the estimates thus obtained are known as the **indirect least squares estimates**.

**ILS** involves the following three steps:

- Step 1. We first obtain the reduced-form SEM.

- Step 2. We apply **OLS** to each reduced-form equation individually.

- Step 3.
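We obtain estimates of the original structural coefficients from the estimated reduced-form coefficients obtained in Step 2.

> If an equation is **exactly identified**, there is a one-to-one mapping between the structural and reduced-form coefficients.

???

> The operation in Step 2 is permissible since the explanatory variables in the reduced-form equations are predetermined and hence uncorrelated with the stochastic disturbances. The estimates thus obtained are consistent.

For an exactly identified structural equation, the method of obtaining estimates of the **structural coefficients** from the OLS estimates of the **reduced-form coefficients** is called the method of indirect least squares (ILS), and the estimates so obtained are called the **indirect least squares estimates**.

ILS involves the following three steps:

- Step 1: obtain the reduced-form equations. Solve the structural system so that the dependent variable of each equation is the only endogenous variable and is a function only of predetermined variables (exogenous or lagged endogenous) and the stochastic error term.

- Step 2: apply OLS to each reduced-form equation. This is appropriate because the explanatory variables in these equations are predetermined and hence uncorrelated with the stochastic disturbances; the resulting estimates are consistent.

- Step 3: obtain estimates of the original structural coefficients from the estimated reduced-form coefficients. If the equation is exactly identified, there is a one-to-one correspondence between the structural and the reduced-form coefficients, so unique estimates of the former can be derived from the latter.

---
### Case demo: US crop supply and demand

The variables in the US crop supply and demand case are illustrated below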
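
---
### ILS approach: a simulated sketch

The three ILS steps can be tried out in R before turning to real data. The block below is only a minimal sketch under stated assumptions: the data are simulated, and the "true" coefficients (`a0`, `a1`, `a2`, `b0`, `b1`), the sample size and the distribution of `X` are all hypothetical choices, not values from the US crop case.

```r
# ILS on a simulated, just-identified supply equation (base R only).
# All numbers below are illustrative assumptions.
set.seed(123)
n  <- 5000
a0 <- 100; a1 <- -2; a2 <- 0.5   # demand: Q = a0 + a1*P + a2*X + u1
b0 <- 10;  b1 <- 3               # supply: Q = b0 + b1*P + u2
X  <- runif(n, 50, 70)           # exogenous demand shifter
u1 <- rnorm(n)
u2 <- rnorm(n)

# Step 1: reduced form. Equating demand and supply and solving for P, then Q.
P <- (b0 - a0 - a2 * X + u2 - u1) / (a1 - b1)
Q <- b0 + b1 * P + u2

# Step 2: OLS on each reduced-form equation.
rf_p <- lm(P ~ X)
rf_q <- lm(Q ~ X)
pi11 <- coef(rf_p)[["(Intercept)"]]; pi21 <- coef(rf_p)[["X"]]
pi12 <- coef(rf_q)[["(Intercept)"]]; pi22 <- coef(rf_q)[["X"]]

# Step 3: recover the structural supply coefficients.
b1_ils <- pi22 / pi21
b0_ils <- pi12 - b1_ils * pi11

# Direct OLS on the supply equation, for comparison (inconsistent,
# because P is correlated with u2).
b1_ols <- coef(lm(Q ~ P))[["P"]]

round(c(true = b1, ILS = b1_ils, OLS = b1_ols), 3)
```

In this design the ILS slope should land near the assumed `b1`, while direct OLS is pulled away from it because `P` and `u2` are correlated. Note that the intercept is recovered as `pi12 - b1_ils * pi11`, with a minus sign.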
---
### Case demo: data set

The data for the US crop supply and demand case are shown here:
---
### Case demo: structural SEM

So we can construct the following structural SEM:

`$$\begin{cases} \begin{align} Q_t &= \alpha_0+\alpha_1P_t+\alpha_2X_t+u_{t1} &(\alpha_1<0,\alpha_2>0) &&\text{(demand function)}\\ Q_t &= \beta_0+\beta_1P_t+u_{t2} &(\beta_1>0) &&\text{(supply function)} \end{align} \end{cases}$$`

>where:
- `\(Q=\)`Crop yield index;
- `\(P=\)`Agricultural products purchasing prices index;
- `\(X=\)`Per capita personal consumption expenditure.

---
### Case demo: reduced SEM

Thus we can obtain the reduced-form SEM:

`$$\begin{cases} \begin{align} P_t &= \pi_{11}+ \pi_{21}X_t+w_t &&\text{(eq1)}\\ Q_t &= \pi_{12}+\pi_{22}X_t+v_t &&\text{(eq2)} \\ \end{align} \end{cases}$$`

and the relationship between the structural and the reduced-form coefficients is:

.pull-left[
`$$\begin{cases} \begin{align} \pi_{11} &= \frac{\beta_0-\alpha_0}{\alpha_1-\beta_1} \\ \pi_{21} &= - \frac{\alpha_2}{\alpha_1-\beta_1}\\ w_t &= \frac{u_{t2}-u_{t1}}{\alpha_1-\beta_1} \\ \end{align} \end{cases}$$`
]

.pull-right[
`$$\begin{cases} \begin{align} \pi_{12} &= \frac{\alpha_1\beta_0-\alpha_0\beta_1}{\alpha_1-\beta_1} \\ \pi_{22} &= - \frac{\alpha_2\beta_1}{\alpha_1-\beta_1} \\ v_t &= \frac{\alpha_1u_{t2}-\beta_1u_{t1}}{\alpha_1-\beta_1} \end{align} \end{cases}$$`
]

---
### Case demo: reduced coefficients

For the above reduced-form SEM, we can use OLS to obtain the estimated coefficients (lowercase letters denote deviations from sample means):

`$$\begin{cases} \begin{align} \widehat{\pi}_{21} &= \frac{\sum{p_t x_t}}{\sum{x^2_t}} &&\text{(slope of the reduced price eq)} \\ \widehat{\pi}_{11} &= \overline{P} - \widehat{\pi}_{21} \cdot \overline{X} &&\text{(intercept of the reduced price eq)} \\ \widehat{\pi}_{22} &= \frac{\sum{q_t x_t}}{\sum{x^2_t}} &&\text{(slope of the reduced quantity eq)} \\ \widehat{\pi}_{12} &= \overline{Q} - \widehat{\pi}_{22} \cdot \overline{X} &&\text{(intercept of the reduced quantity eq)} \end{align} \end{cases}$$`

---
### Case demo: structural coefficients

Because we already know that **the supply equation** in the structural SEM is **just
identified** (please review the order and rank conditions), the structural coefficients of the supply equation can be calculated uniquely from the reduced-form coefficients. Substituting the reduced-form equation for `\(P_t\)` into the supply function and matching coefficients with the reduced-form equation for `\(Q_t\)` gives:

`$$\begin{cases} \begin{align} \beta_0 &= \pi_{12}- \beta_1 \pi_{11} \\ \beta_1 &= \frac{\pi_{22}}{\pi_{21}} \\ \end{align} \end{cases}$$`

which is estimated by:

`$$\begin{cases} \begin{align} \hat{\beta_0} &= \widehat{\pi}_{12}- \hat{\beta_1} \widehat{\pi}_{11} \\ \hat{\beta_1} &= \frac{\widehat{\pi}_{22}}{\widehat{\pi}_{21}} \\ \end{align} \end{cases}$$`

???

Because we already know (please review the order and rank conditions) that the **supply equation** is **exactly identified**, its structural coefficients can be estimated uniquely from the reduced-form coefficients:

---
### Case demo: OLS estimates for reduced equation

Next, we carry out the OLS regressions for the reduced-form equations.

`$$\begin{cases} \begin{align} P_t &= \pi_{11}+ \pi_{21}X_t+w_t &&\text{(reduced eq1)}\\ Q_t &= \pi_{12}+\pi_{22}X_t+v_t &&\text{(reduced eq2)} \\ \end{align} \end{cases}$$`

.pull-left[
The regression result of the reduced price equation is:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{P}=&&+90.96&&+0.00074X\\ &\text{(t)}&&(22.4499)&&(3.0060)\\&\text{(se)}&&(4.0517)&&(0.0002)\\&\text{(fitness)}&& R^2=0.2440;&& \bar{R^2}=0.2170\\& && F^{\ast}=9.04;&& p=0.0055 \end{alignedat} \end{equation}$$`
]

.pull-right[
The regression result of the reduced quantity equation is:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Q}=&&+59.76&&+0.00197X\\ &\text{(t)}&&(38.3080)&&(20.9273)\\&\text{(se)}&&(1.5600)&&(0.0001)\\&\text{(fitness)}&& R^2=0.9399;&& \bar{R^2}=0.9378\\& && F^{\ast}=437.95;&& p=0.0000 \end{alignedat} \end{equation}$$`
]

---
exclude: true

### Rscript: calculate values

---
### Case demo: obtain structural coefficients

We can obtain the **reduced-form coefficients**:

.pull-left[
- `\(\widehat{\pi}_{21}=\)` 0.00074

- `\(\widehat{\pi}_{11}=\)` 90.96007
]

.pull-right[
- `\(\widehat{\pi}_{22}=\)` 0.00197

- `\(\widehat{\pi}_{12}=\)` 59.76183
]

Because the **supply equation** in the structural SEM is **just identified**, the structural coefficients of the **supply equation**
can be calculated by using the estimated reduced-form coefficients.

`\(\hat{\beta_1} = \frac{\widehat{\pi}_{22}}{\widehat{\pi}_{21}}\)` = 0.00197 / 0.00074 = 2.68052 (the reported ratio presumably uses the unrounded coefficients)

`\(\hat{\beta_0} = \widehat{\pi}_{12}- \hat{\beta_1} \widehat{\pi}_{11}\)` = 59.76183 - 2.68052 `\(\cdot\)` 90.96007 = -184.05874

Therefore, the ILS estimate of the **supply equation** is:

`\(\hat{Q_t}=\)` -184.05874 + 2.68052 `\(P_t\)`

---
### Case demo: result comparison

For comparison, we also show a **"biased"** estimate, obtained by applying OLS directly to the structural supply equation (regressing quantity on price).

.pull-left[
- Estimation of the **supply equation** based on the ILS approach:

`\(\hat{Q_t}=\)` -184.05874 + 2.68052 `\(P_t\)`
]

.pull-right[
- Estimation of the **supply equation** based on the **biased** OLS approach:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Q}=&&+20.89&&+0.67P\\ &\text{(t)}&&(0.9067)&&(2.9940)\\&\text{(se)}&&(23.0396)&&(0.2246)\\&\text{(fitness)}&& R^2=0.2425;&& \bar{R^2}=0.2154\\& && F^{\ast}=8.96;&& p=0.0057 \end{alignedat} \end{equation}$$`
]

---
layout: false
class: center, middle, duke-softblue,hide_logo
name: TSLS

## 20.4 Two-stage least squares method (2SLS)

---
layout: true

<div class="my-header-h2"></div>
<div class="watermark1"></div>
<div class="watermark2"></div>
<div class="watermark3"></div>
<div class="my-footer"><span>huhuaping@ <a href="#chpt20"> Chapter 20. How to Estimate SEM ?
</a>               <a href="#TSLS"> 20.4 Two-stage least squares method (2SLS) </a> </span></div>

---
### Overidentification: structural SEM

Consider the following structural SEM:

`$$\begin{cases} \begin{alignat}{6} Y_{t1} = & &+ \gamma_{21}Y_{t2} & & +\beta_{01} & + \beta_{11}X_{t1} & + \beta_{21}X_{t2} & +u_{t1} & \text{ (income eq)} \\ Y_{t2} = & + \gamma_{12}Y_{t1} & & & +\beta_{02} & & & + u_{t2} & \text{ (monetary supply eq)} \end{alignat} \end{cases}$$`

where: `\(Y_1=\)`Income; `\(Y_2=\)`Monetary stock; `\(X_1=\)`Investment expenditure; `\(X_2=\)`Government spending on goods and services

Using the **order condition** and the **rank condition** (review), we know:

- The **income equation** is **underidentified**.
  - If you don't change the model specification, then god can't help you!

- The **monetary supply equation** is **overidentified**.
  - It is easy to show that the ILS approach yields two different estimates of `\(\gamma_{12}\)`, so ILS cannot deliver a unique value.

???

Using the **order condition** and the **rank condition** (review), we know:

- The **income equation** is **unidentified**.
  - If you don't change the model specification, then "even god can't help you"!

- The **money supply equation** is **overidentified**.
  - It is easy to show that ILS gives two estimates of `\(\gamma_{12}\)`, so it cannot be estimated this way either!

---
### Overidentification: Instrument variables

We turn to the **instrumental variables approach** to crack the **overidentification** problem:

- In practice, people might want to use OLS to estimate the monetary supply equation, but the estimators will be biased, because `\(Y_1\)` is correlated with `\(u_2\)`.

> **Instrumental variable**: a proxy variable that is highly correlated with `\(Y_1\)` but uncorrelated with `\(u_2\)`.

- If we can find such an **instrumental variable**, we can use it in place of `\(Y_1\)` and estimate the structural monetary supply equation by OLS.

But how does one obtain such an instrumental variable?

One answer is provided by **two-stage least squares** (2SLS), developed independently by Henri Theil and Robert Basmann.

???
Two-stage least squares (2SLS), developed independently by Henri Theil and Robert Basmann, has become an important way of finding a suitable **instrumental variable**.

---
### Overidentification: stage 1 of 2SLS

The **2SLS method** involves two successive applications of OLS. The process is as follows:

**Stage 1**. To get rid of the likely correlation between `\(Y_1\)` and `\(u_2\)`, regress `\(Y_1\)` on all the predetermined variables in the whole system, not just those in that equation.

`$$\begin{align} Y_{t1} &= \widehat{\pi}_{01} + \widehat{\pi}_{11} X_{t1} + \widehat{\pi}_{21} X_{t2} + \hat{v}_{t1} \\ &= \hat{Y}_{t1} + \hat{v}_{t1} \\ \hat{Y}_{t1}&= \widehat{\pi}_{01} + \widehat{\pi}_{11} X_{t1} + \widehat{\pi}_{21} X_{t2} \end{align}$$`

This shows that the stochastic `\(Y_1\)` is composed of two parts:

- `\(\hat{Y}_{t1}\)`, a linear combination of the nonstochastic `\(X\)`'s

- a random component `\(\hat{v}_{t1}\)`

> According to OLS theory, `\(\hat {Y}_{t1}\)` is uncorrelated with `\(\hat{v}_{t1}\)` (Why?).

???

**Stage 1:** to get rid of the likely correlation between `\(Y_1\)` and `\(u_2\)`, first regress `\(Y_1\)` on all the **predetermined variables** in the whole system (not just those in the equation under consideration).

This shows that the stochastic `\(Y_1\)` is composed of two parts:

- `\(\hat{Y}_{t1}\)`, a linear combination of the nonstochastic X's

- a random component `\(\hat{v}_{t1}\)`

- According to OLS theory, `\(\hat{Y}_{t1}\)` and `\(\hat{v}_{t1}\)` are uncorrelated (Why? Ask the class).

---
### Overidentification: stage 2 of 2SLS

**Stage 2**. Now rewrite the **overidentified** money supply equation as follows:

`$$\begin{align} Y_{t2} & = \beta_{02} + \gamma_{12} Y_{t1} + u_{t2} \\ & = \beta_{02} + \gamma_{12} (\hat{Y}_{t1}+\hat{v}_{t1}) + u_{t2} \\ & = \beta_{02} + \gamma_{12} \hat{Y}_{t1}+(\gamma_{12} \hat{v}_{t1} + u_{t2} ) \\ & = \beta_{02} + \gamma_{12} \hat{Y}_{t1}+ u_{t2}^{\ast} \end{align}$$`

We can show that:

- The variable `\(Y_{t1}\)` is likely to be correlated with the disturbance term `\(u_{t2}\)`, which invalidates the OLS approach.

- Meanwhile, `\(\hat{Y}_{t1}\)` is uncorrelated with `\(u_{t2}^{\ast}\)` asymptotically, that is, in large samples (or more accurately, as the sample size increases indefinitely).
> As a result, OLS can be applied to the rewritten money supply equation, which gives consistent estimates of the parameters of the money supply function.

---
### 2SLS approach: Features

Note the following features of 2SLS:

- It can be applied to an individual equation in the system without directly taking into account any other equation(s) in the system. Hence, for solving econometric models involving a large number of equations, 2SLS offers an economical method.

- Unlike ILS, which provides multiple estimates of parameters in overidentified equations, 2SLS provides only one estimate per parameter.

- It is easy to apply because all one needs to know is the total number of exogenous or predetermined variables in the system, without knowing any other variables in the system.

- Although specially designed to handle overidentified equations, the method can also be applied to exactly identified equations. In that case ILS and 2SLS give identical estimates. (Why?)

???

Features of the two-stage least squares (2SLS) method:

- It can be applied to a single equation in the system without taking the other equations into account. Hence, for econometric models involving a large number of equations, 2SLS offers an economical method. For this reason it is widely used in practice.

- Whereas ILS provides multiple estimates of the parameters of an overidentified equation, 2SLS provides only one estimate per parameter.

- It only requires knowing how many exogenous or predetermined variables there are in the system, without knowing any other variables in the system, so it is easy to apply.

---
### 2SLS approach: Features

Note the following features of 2SLS (continued):

- If the `\(R^2\)` values in the reduced-form regressions (that is, the Stage 1 regressions) are very high, say, in excess of 0.8, the classical OLS estimates and the 2SLS estimates will be very close.

> This result should not be surprising: if the `\(R^2\)` value in the first stage is very high, the estimated values of the endogenous variables are very close to their actual values.

> Hence the latter are less likely to be correlated with the stochastic disturbances in the original structural SEM. (Why?)

???

.large[
Features of the two-stage least squares (2SLS) method (continued):

- Although specially designed for overidentified equations, the method also applies to exactly identified equations. In that case ILS and 2SLS give identical estimates. (Why?)
- If the `\(R^2\)` of the reduced-form regressions (that is, the Stage 1 regressions) is very high, say above 0.8, the classical OLS estimates and the 2SLS estimates will be nearly the same.

- In reporting the ILS regression we did not give the **standard errors** of the estimated coefficients, but we can give these **standard errors** for the 2SLS estimates.
]

---
### 2SLS approach: Features

Note the following features of 2SLS (continued):

- Notice that in reporting the ILS regression we did not state the **standard errors** of the estimated coefficients. But we can do this for the 2SLS estimates, because the structural coefficients are directly estimated from the second-stage (OLS) regressions.

> The estimated standard errors in the second-stage regressions need to be modified because the error term `\(u^{\ast}_{t2}\)` is, in fact, equal to `\(u_{t2}+\gamma_{12}\hat{v}_{t1}\)`.

> Hence, the variance of `\(u^{\ast}_{t2}\)` is not exactly equal to the variance of the original `\(u_{t2}\)`.

---
### 2SLS approach: Features

Note the following features of 2SLS (continued):

- Remarks from `Henri Theil`:

> - The statistical justification of 2SLS is of the large-sample type.

> - When the equation system contains lagged endogenous variables, the consistency and large-sample normality of the 2SLS coefficient estimators require an additional condition.

> - Take care when lagged endogenous variables are not really predetermined.

---
### Standard error correction: why

In the regression report of the ILS method we did not give the **standard errors** of the estimated coefficients, but we can give these **standard errors** for the 2SLS estimator.

- Recall that `\(u^{\ast}_{t2}=u_{t2}+\gamma_{12}\hat{v}_{t1}\)`.

- This implies `\(u^{\ast}_{t2} \neq u_{t2}\)`, so we need to calculate the "correct" standard errors for the purpose of inference.

> For the specific method of error correction, please refer to Appendix 20A.2 of the textbook (Damodar Gujarati).

- In the following case illustrations, we will show the 2SLS estimates **without error correction** and the 2SLS estimates **with error correction** respectively.

???
In reporting the ILS regression we did not give the **standard errors** of the estimated coefficients, but we can give these **standard errors** for the 2SLS estimates.

- Note that `\(u^{\ast}_{t2}=u_{t2}+\gamma_{12}\hat{v}_{t1}\)`.

- This means `\(u^{\ast}_{t2} \neq u_{t2}\)`, so an **error correction** is needed.

- For the specific correction method, see Appendix 20A.2 of the textbook.

- In the case study below, we show the 2SLS estimates **without error correction** and **with error correction** respectively.

---
### Standard error correction: focus stage 2

The process of error correction is shown below.

- **Stage 2:** the regression form of the money supply equation is

`$$\begin{align} Y_{t2} & = \beta_{02} + \gamma_{12} Y_{t1} + u_{t2} \\ & = \beta_{02} + \gamma_{12} (\hat{Y}_{t1}+\hat{v}_{t1}) + u_{t2} \\ & = \beta_{02} + \gamma_{12} \hat{Y}_{t1}+(\gamma_{12} \hat{v}_{t1} + u_{t2} ) \\ & = \beta_{02} + \gamma_{12} \hat{Y}_{t1}+ u_{t2}^{\ast} \end{align}$$`

> Where:

> `\(u^{\ast}_{t2}=u_{t2}+\gamma_{12}\hat{v}_{t1}\)`

---
### Standard error correction: focus stage 2

- **Stage 2:** the estimate of the parameter `\(\gamma_{12}\)` is `\(\hat{\gamma}_{12}\)`, and its (uncorrected) standard error `\(s.e_{\hat{\gamma}_{12}}\)` is calculated as below, where `\(\hat{y}_{t1}\)` denotes the deviation of `\(\hat{Y}_{t1}\)` from its sample mean.

`$$\begin{align} Y_{t2} & = \beta_{02} + \gamma_{12} \hat{Y}_{t1}+ u_{t2}^{\ast} \end{align}$$`

`$$\begin{align} s.e_{\hat{\gamma}_{12}} & = \sqrt{ \frac{\hat{\sigma}^2_{u^{\ast}_{t2}}}{\sum{\hat{y}^2_{t1}}} }\\ \hat{\sigma}^2_{u^{\ast}_{t2}} & =\frac{ \sum{\left ( \hat{u}^{\ast}_{t2} \right )^2} }{n-2} = \frac{ \sum{\left (Y_{t2}-\hat{\beta}_{02} -\hat{\gamma}_{12}\hat{Y}_{t1} \right )^2} }{n-2} \end{align}$$`

---
### Standard error correction: results

- In fact, we know `\(u^{\ast}_{t2} \neq u_{t2}\)`, which means `\(\hat{\sigma}_{u^{\ast}_{t2}} \neq \hat{\sigma}_{u_{t2}}\)`.

- We therefore need `\(\hat{\sigma}_{u_{t2}}\)`, computed from residuals that use the actual `\(Y_{t1}\)` rather than `\(\hat{Y}_{t1}\)`:
`$$\begin{align} \hat{u}_{t2} &= Y_{t2} - \hat{\beta}_{02} - \hat{\gamma}_{12}Y_{t1} \\ \hat{\sigma}^2_{u_{t2}} & =\frac{ \sum{\left ( \hat{u}_{t2} \right )^2} }{n-2} = \frac{ \sum{\left (Y_{t2}-\hat{\beta}_{02} -\hat{\gamma}_{12}Y_{t1} \right )^2} }{n-2} \end{align}$$`

---
### Standard error correction: for coefficients

Therefore, in order to correct the standard errors of the coefficients estimated by the **stage 2** regression, it is necessary to multiply the standard error of each coefficient by the following **error correction factor**. Because a standard error is proportional to the estimated standard deviation of the disturbance, the factor is a ratio of standard deviations:

`$$\begin{align} \eta &= \frac{ \hat{\sigma}_{u_{t2}} } {\hat{\sigma}_{u^{\ast}_{t2}} } \end{align}$$`

`$$\begin{align} s.e^{\ast}_{\hat{\gamma}_{12}} & = s.e_{\hat{\gamma}_{12}} \cdot \eta = s.e_{\hat{\gamma}_{12}} \cdot \frac{ \hat{\sigma}_{u_{t2}} } {\hat{\sigma}_{u^{\ast}_{t2}} } \\ s.e^{\ast}_{\hat{\beta}_{02}} & = s.e_{\hat{\beta}_{02}} \cdot \eta = s.e_{\hat{\beta}_{02}} \cdot \frac{ \hat{\sigma}_{u_{t2}} } {\hat{\sigma}_{u^{\ast}_{t2}} } \end{align}$$`

???

- **Error correction factor**: to correct the standard errors of the coefficients estimated by the **stage 2** regression, multiply the standard error of each coefficient by the **error correction factor** above.

---
class: inverse, middle, center

## Case study and application for 2SLS approach

---
### variable description
--- ### data set
---
## Modeling scenario 1

Only the money supply equation is overidentified

---
### Structural SEM and identification problems

We can construct the following structural SEM:

`$$\begin{cases} \begin{alignat}{6} Y_{t1} = \beta_{01} & &+ \gamma_{21}Y_{t2} & & + \beta_{11}X_{t1} & + \beta_{21}X_{t2} & & +u_{t1} && \text{ (income eq)} \\ Y_{t2} = \beta_{02} & + \gamma_{12}Y_{t1} & & & & & & + u_{t2} && \text{ (money supply eq)} \end{alignat} \end{cases}$$`

Where:

- `\(Y_1=GDP\)` (gross domestic product GDP);
- `\(Y_2=M2\)` (money supply);
- `\(X_1=GDPI\)` (Private domestic investment);
- `\(X_2=FEDEXP\)` (Federal expenditure)

---
### 2SLS approach 1: without error correction

`$$\begin{cases} \begin{alignat}{6} Y_{t1} &&= \beta_{01} && &&+ \gamma_{21}Y_{t2} && + \beta_{11}X_{t1} && + \beta_{21}X_{t2} && +u_{t1} && \text{ (income eq)} \\ Y_{t2} &&= \beta_{02} && + \gamma_{12}Y_{t1} && && && && + u_{t2} && \text{ (money supply eq)} \end{alignat} \end{cases}$$`

**Stage 1:** Regress `\(Y_1\)` on all **predetermined variables** in the structural SEM (not only those in the equation under consideration), and obtain `\(\hat{Y}_{t1}; \hat{v}_{t1}\)`.

.pull-left[
That is:

`$$\begin{align} Y_{t1} &= \widehat{\pi}_{01} + \widehat{\pi}_{11} X_{t1} + \widehat{\pi}_{21} X_{t2} + \hat{v}_{t1} \\ &= \hat{Y}_{t1} + \hat{v}_{t1} \end{align}$$`
]

.pull-right[
Regression results of **stage 1**:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Y1}=&&+2689.85&&+1.87X1&&+2.03X2\\ &\text{(t)}&&(39.5639)&&(10.8938)&&(18.9295)\\&\text{(se)}&&(67.9874)&&(0.1717)&&(0.1075)\\&\text{(fitness)}&& R^2=0.9964;&& \bar{R^2}=0.9962\\& && F^{\ast}=4534.36;&& p=0.0000 \end{alignedat} \end{equation}$$`
]

???

**Stage 1:** first regress `\(Y_1\)` on all the **predetermined variables** in the whole system (not only those in the equation under consideration), and obtain `\(\hat{Y}_{t1} ;\hat{v}_{t1}\)`. That is:

---
### 2SLS approach 1: without error correction

At the same time, we can obtain `\(\hat{Y }_{t1} ;\hat{v}_{t1}\)`:
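
A minimal sketch of how these two series come out of a stage 1 regression in R. It runs on simulated stand-in data, not on the US money data of the case: the data frame `sim_money`, the coefficient values and the sample size are all hypothetical, and only the column names `Y1`, `Y2`, `X1`, `X2`, `Y1.hat` follow the deck.

```r
# Stage 1 of 2SLS on simulated stand-in data (base R only).
# NOT the US data of the case; all coefficient values are assumptions.
set.seed(42)
n  <- 500
X1 <- rnorm(n, mean = 10, sd = 2)   # plays the role of investment
X2 <- rnorm(n, mean = 5,  sd = 1)   # plays the role of government expenditure
u1 <- rnorm(n)
u2 <- rnorm(n)

# Assumed structural form:
#   Y1 = 1 + g21*Y2 + 1.0*X1 + 0.8*X2 + u1   (income eq)
#   Y2 = 2 + g12*Y1 + u2                     (money supply eq)
g21 <- 0.5
g12 <- 0.4
Y1 <- (1 + g21 * 2 + 1.0 * X1 + 0.8 * X2 + u1 + g21 * u2) / (1 - g21 * g12)
Y2 <- 2 + g12 * Y1 + u2
sim_money <- data.frame(Y1, Y2, X1, X2)

# Stage 1: regress Y1 on all predetermined variables of the system.
stage1 <- lm(Y1 ~ X1 + X2, data = sim_money)
sim_money$Y1.hat <- fitted(stage1)   # fitted values
sim_money$v1.hat <- resid(stage1)    # stage 1 residuals

head(sim_money[, c("Y1", "Y1.hat", "v1.hat")])

# By construction Y1 = Y1.hat + v1.hat, and OLS makes the two parts
# uncorrelated in the sample.
cor(sim_money$Y1.hat, sim_money$v1.hat)
```

The last line also answers the "Why?" from the stage 1 slide numerically: OLS residuals are orthogonal to every regressor, hence to any linear combination of them such as `Y1.hat`.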
---
### 2SLS approach 1: without error correction

`$$\begin{cases} \begin{alignat}{6} Y_{t1} &&= \beta_{01} && &&+ \gamma_{21}Y_{t2} && + \beta_{11}X_{t1} && + \beta_{21}X_{t2} && +u_{t1} && \text{ (income eq)} \\ Y_{t2} &&= \beta_{02} && + \gamma_{12}Y_{t1} && && && && + u_{t2} && \text{ (money supply eq)} \end{alignat} \end{cases}$$`

**Stage 2:** Now rewrite the **overidentified** money supply equation as follows:

`$$\begin{align} Y_{t2} & = \beta_{02} + \gamma_{12} \hat{Y}_{t1}+ u_{t2}^{\ast} \end{align}$$`

>Using the new variable from the **stage 1** results, apply OLS to obtain:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Y2}=&&-2440.20&&+0.79Y1.hat\\ &\text{(t)}&&(-19.1578)&&(44.5241)\\&\text{(se)}&&(127.3738)&&(0.0178)\\&\text{(fitness)}&& R^2=0.9831;&& \bar{R^2}=0.9826\\& && F^{\ast}=1982.40;&& p=0.0000 \end{alignedat} \end{equation}$$`

---
### Comparison 1: OLS approach with biased estimation

For comparison, we give a **"biased"** estimate, obtained by applying OLS directly to the money supply equation:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Y2}=&&-2430.34&&+0.79Y1\\ &\text{(t)}&&(-19.1042)&&(44.5059)\\&\text{(se)}&&(127.2148)&&(0.0178)\\&\text{(fitness)}&& R^2=0.9831;&& \bar{R^2}=0.9826\\& && F^{\ast}=1980.77;&& p=0.0000 \end{alignedat} \end{equation}$$`

???

For comparison, we give a **"biased"** estimate, obtained by regressing `\(Y_2\)` on `\(Y_1\)` directly with OLS, which yields the following result:

---
### Comparison 1: OLS approach with biased estimation

The raw R report for the biased regression is:

```
Call:
lm(formula = models_money$mod.ols, data = us_money_new)

Residuals:
   Min     1Q Median     3Q    Max
-418.3 -151.6   40.2  143.5  380.8

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.43e+03   1.27e+02   -19.1   <2e-16 ***
Y1           7.90e-01   1.78e-02    44.5   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1

Residual standard error: 229 on 34 degrees of freedom
Multiple R-squared:  0.983,	Adjusted R-squared:  0.983
F-statistic: 1.98e+03 on 1 and 34 DF,  p-value: <2e-16
```

---
## Modeling scenario 2

Both the income equation and the money supply equation are overidentified

---
### Improved structural SEM and identification problems

In contrast to the former structural SEM, we can construct an improved one:

`$$\begin{cases} \begin{alignat}{6} Y_{t1} = \beta_{01} & &+ \gamma_{21}Y_{t2} & & + \beta_{11}X_{t1} & + \beta_{21}X_{t2} & & & +u_{t1} & \text{ (income eq)} \\ Y_{t2} = \beta_{02} & + \gamma_{12}Y_{t1} & & & & &+ \beta_{12}Y_{t-1,1} &+ \beta_{22}Y_{t-1,2} & + u_{t2} & \text{ (money supply eq)} \end{alignat} \end{cases}$$`

Next, we check identification according to **order condition rule 2**:

- The number of predetermined variables in the structural SEM (counting the intercept) is `\(K=5\)`.

- The first equation: the number of predetermined variables is `\(k=3\)`, so `\((K-k)=2\)`. The number of endogenous variables is `\(m=2\)`, so `\((m-1)=1\)`. Since `\((K-k)>(m-1)\)`, the first equation is **overidentified**.

- The second equation: the number of predetermined variables is `\(k=3\)`, so `\((K-k)=2\)`. The number of endogenous variables is `\(m=2\)`, so `\((m-1)=1\)`. Since `\((K-k)>(m-1)\)`, it is also **overidentified**.

???

- Conclusion: both the income equation and the money supply equation are **overidentified**.

---
### 2SLS approach 2: without error correction

Now, we will use two-stage least squares (2SLS) to get **consistent estimates** for both the **income equation** and the **money supply equation**.

**Stage 1**:

- Regress `\(Y_1\)` on all **predetermined variables** in the structural SEM (not only those in the equation under consideration), and obtain `\(\hat{Y}_{t1} ;\hat{v}_{t1}\)`.
- .red[Meanwhile], regress `\(Y_2\)` on all **predetermined variables** in the structural SEM (not only those in the equation under consideration), and obtain `\(\hat{Y}_{t2} ;\hat{v}_{t2}\)`:

`$$\begin{align} Y_{t1} &= \widehat{\pi}_{01} + \widehat{\pi}_{11} X_{t1} + \widehat{\pi}_{21} X_{t2} + \widehat{\pi}_{31} Y_{t-1,1} + \widehat{\pi}_{41} Y_{t-1,2}+ \hat{v}_{t1} \\ &= \hat{Y}_{t1} +\hat{v}_{t1} \\ Y_{t2} &= \widehat{\pi}_{02} + \widehat{\pi}_{12} X_{t1} + \widehat{\pi}_{22} X_{t2} + \widehat{\pi}_{32} Y_{t-1,1} + \widehat{\pi}_{42} Y_{t-1,2}+ \hat{v}_{t2} \\ &= \hat{Y}_{t2} +\hat{v}_{t2} \end{align}$$`

---
### 2SLS approach 2: without error correction

- OLS regression results of the **new income equation** at **stage 1**:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Y1}=&&+1098.90&&+0.98X1&&+0.77X2&&+0.59Y1.l1&&-0.01Y2.l1\\ &\text{(t)}&&(5.9222)&&(7.4954)&&(4.1895)&&(8.8729)&&(-0.1024)\\&\text{(se)}&&(185.5566)&&(0.1308)&&(0.1831)&&(0.0667)&&(0.0721)\\&\text{(fitness)}&& R^2=0.9990;&& \bar{R^2}=0.9989\\& && F^{\ast}=7857.58;&& p=0.0000 \end{alignedat} \end{equation}$$`

---
### 2SLS approach 2: without error correction

- OLS regression results of the **new money supply equation** at **stage 1**:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Y2}=&&-207.14&&+0.20X1&&-0.35X2&&+0.06Y1.l1&&+1.06Y2.l1\\ &\text{(t)}&&(-1.1202)&&(1.5298)&&(-1.9455)&&(0.9352)&&(14.7779)\\&\text{(se)}&&(184.9121)&&(0.1303)&&(0.1824)&&(0.0665)&&(0.0718)\\&\text{(fitness)}&& R^2=0.9985;&& \bar{R^2}=0.9983\\& && F^{\ast}=5050.57;&& p=0.0000 \end{alignedat} \end{equation}$$`

---
### 2SLS approach 2: without error correction

Hence, we can obtain new variables from the two stage 1 regressions respectively: `\(\hat{Y}_{t1} ;\hat{v}_{t1}\)`, and `\(\hat{Y}_{t2} ;\hat{v}_{t2}\)`.
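
The two stage 1 regressions, and the stage 2 regressions that reuse their fitted values, can be sketched in R as follows. This is a hedged illustration on simulated data, not the US series of the case: the coefficient values, the sample size and the data frame `d` are hypothetical, and only the column names (`Y1.l1`, `Y2.l1`, `Y1.hat`, `Y2.hat`) mirror the deck.

```r
# Manual 2SLS for a two-equation system with lagged endogenous variables
# (base R only; simulated data, all coefficient values are assumptions).
set.seed(2023)
n   <- 3000
g21 <- 0.3; b01 <- 1; b11 <- 1.0; b21 <- 0.8   # income eq
g12 <- 0.2; b02 <- 2; b12 <- 0.1; b22 <- 0.5   # money supply eq
X1  <- rnorm(n, 10, 2)
X2  <- rnorm(n, 5, 1)
u1  <- rnorm(n)
u2  <- rnorm(n)

Y1 <- numeric(n)
Y2 <- numeric(n)
y1_lag <- 0
y2_lag <- 0
for (t in seq_len(n)) {
  a  <- b01 + b11 * X1[t] + b21 * X2[t] + u1[t]       # income eq without Y2
  c0 <- b02 + b12 * y1_lag + b22 * y2_lag + u2[t]     # money eq without Y1
  Y1[t] <- (a + g21 * c0) / (1 - g21 * g12)           # solve the two equations
  Y2[t] <- c0 + g12 * Y1[t]
  y1_lag <- Y1[t]
  y2_lag <- Y2[t]
}

d <- data.frame(Y1, Y2, X1, X2,
                Y1.l1 = c(NA, head(Y1, -1)),
                Y2.l1 = c(NA, head(Y2, -1)))
d <- d[-1, ]   # drop the first row, whose lags are missing

# Stage 1: each endogenous variable on ALL predetermined variables.
s1_y1 <- lm(Y1 ~ X1 + X2 + Y1.l1 + Y2.l1, data = d)
s1_y2 <- lm(Y2 ~ X1 + X2 + Y1.l1 + Y2.l1, data = d)
d$Y1.hat <- fitted(s1_y1)
d$Y2.hat <- fitted(s1_y2)

# Stage 2: replace the right-hand-side endogenous variable by its fitted value.
s2_income <- lm(Y1 ~ Y2.hat + X1 + X2, data = d)
s2_money  <- lm(Y2 ~ Y1.hat + Y1.l1 + Y2.l1, data = d)

round(coef(s2_income), 3)
round(coef(s2_money), 3)
```

The stage 2 coefficients are the 2SLS point estimates; the standard errors printed by `summary()` on these two fits are the uncorrected ones.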
---

### 2SLS approach 2: without error correction

**Stage 2:**

- The transformed income equation and money supply equation are:

`$$\begin{align} Y_{t1} &= \beta_{01} + \gamma_{21} \hat{Y}_{t2} + \beta_{11}X_{t1} + \beta_{21}X_{t2} +u^{\ast}_{t1} \\ Y_{t2} &= \beta_{02} + \gamma_{12} \hat{Y}_{t1} + \beta_{12}Y_{t-1,1} + \beta_{22}Y_{t-1,2} + u^{\ast}_{t2} \end{align}$$`

- We then estimate these equations by OLS, using the fitted values obtained in stage 1.

---

### 2SLS approach 2: without error correction

- OLS estimation results of the **new** income equation in **stage 2**:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Y1}=&&+2723.68&&+0.22Y2.hat&&+1.71X1&&+1.57X2\\ &\text{(t)}&&(40.3310)&&(1.8961)&&(9.2748)&&(5.9811)\\&\text{(se)}&&(67.5331)&&(0.1156)&&(0.1848)&&(0.2623)\\&\text{(fitness)}&& R^2=0.9966;&& \bar{R^2}=0.9963\\& && F^{\ast}=3073.97;&& p=0.0000 \end{alignedat} \end{equation}$$`

---

### 2SLS approach 2: without error correction

- OLS estimation results of the **new** money supply equation in **stage 2**:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Y2}=&&-228.13&&+0.11Y1.hat&&-0.03Y1.l1&&+0.93Y2.l1\\ &\text{(t)}&&(-1.4843)&&(0.7685)&&(-0.1691)&&(15.0961)\\&\text{(se)}&&(153.6925)&&(0.1431)&&(0.1481)&&(0.0618)\\&\text{(fitness)}&& R^2=0.9981;&& \bar{R^2}=0.9979\\& && F^{\ast}=5461.60;&& p=0.0000 \end{alignedat} \end{equation}$$`

---

### 2SLS approach 2: without error correction

- The raw R report of the OLS estimation for the new income equation in **stage 2**:

.scroll-box-18[

```

Call:
lm(formula = models_money2$mod2.stage2.1, data = us_money_new2)

Residuals:
   Min     1Q Median     3Q    Max
-360.4  -66.8   35.7   81.2  186.2

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 2723.681     67.533   40.33  < 2e-16 ***
Y2.hat         0.219      0.116    1.90    0.067 .
X1             1.714      0.185    9.27  1.9e-10 ***
X2             1.569      0.262    5.98  1.3e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1

Residual standard error: 130 on 31 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.997,  Adjusted R-squared:  0.996
F-statistic: 3.07e+03 on 3 and 31 DF,  p-value: <2e-16
```

]

---

### 2SLS approach 2: without error correction

- The raw R report of the OLS estimation for the new money supply equation in **stage 2**:

.scroll-box-18[

```

Call:
lm(formula = models_money2$mod2.stage2.2, data = us_money_new2)

Residuals:
    Min      1Q  Median      3Q     Max
-192.58  -42.96    7.05   42.93  218.68

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -228.1320   153.6925   -1.48     0.15
Y1.hat         0.1100     0.1431    0.77     0.45
Y1.l1         -0.0250     0.1481   -0.17     0.87
Y2.l1          0.9330     0.0618   15.10  7.8e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 78 on 31 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.998,  Adjusted R-squared:  0.998
F-statistic: 5.46e+03 on 3 and 31 DF,  p-value: <2e-16
```

]

---

### 2SLS approach 2: with error correction

Using the R `systemfit` package, we can apply the two-stage least squares method with **"error correction"**. The report is summarized as follows:
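
For intuition, the "error correction" affects the standard errors only. The point estimates equal those of the manual stage-2 regression, but the residual variance is computed with the actual endogenous regressor instead of its stage-1 fitted value. A minimal base-R sketch on simulated data follows; the data-generating process, the instruments `Z1`, `Z2` and all object names are assumptions for illustration, not the US money data.

```r
# Simulated system (illustrative values only)
set.seed(42)
n  <- 200
X1 <- rnorm(n); X2 <- rnorm(n); Z1 <- rnorm(n); Z2 <- rnorm(n)
u1 <- rnorm(n)
u2 <- 0.5 * u1 + rnorm(n)
# Assumed structure: Y1 = 1 + 0.5*Y2 + X1 + X2 + u1
#                    Y2 = 2 + 0.3*Y1 + Z1 + Z2 + u2
a  <- 1 + X1 + X2 + u1
b  <- 2 + Z1 + Z2 + u2
Y1 <- (a + 0.5 * b) / 0.85
Y2 <- (b + 0.3 * a) / 0.85

# Manual 2SLS for the first equation (no correction)
Y2.hat   <- fitted(lm(Y2 ~ X1 + X2 + Z1 + Z2))
stage2   <- lm(Y1 ~ Y2.hat + X1 + X2)
se.naive <- summary(stage2)$coefficients[, "Std. Error"]

# Corrected standard errors: residuals use the actual Y2, not Y2.hat
b.2sls  <- coef(stage2)
X.act   <- cbind(1, Y2, X1, X2)
X.hat   <- cbind(1, Y2.hat, X1, X2)
res     <- Y1 - drop(X.act %*% b.2sls)
sigma2  <- sum(res^2) / (n - length(b.2sls))
se.corr <- sqrt(diag(sigma2 * solve(crossprod(X.hat))))

round(cbind(estimate = b.2sls, se.naive, se.corr), 4)
```

Within one equation, every corrected standard error differs from its naive counterpart by the same factor, the ratio of the two residual standard errors. The reports in this section show the same pattern for the income equation (for example 69.102 against 67.533 for the intercept).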
---

### 2SLS approach 2: with error correction

Using the R `systemfit` package, we can apply the two-stage least squares method with **"error correction"**. The detailed report is shown below:

.scroll-box-16[

```

systemfit results
method: 2SLS

        N DF    SSR  detRCov OLS-R2 McElroy-R2
system 70 62 749260 1.07e+08  0.997      0.998

     N DF    SSR   MSE  RMSE    R2 Adj R2
eq1 35 31 549669 17731 133.2 0.996  0.996
eq2 35 31 199592  6438  80.2 0.998  0.998

The covariance matrix of the residuals
      eq1   eq2
eq1 17731 -2604
eq2 -2604  6438

The correlations of the residuals
       eq1    eq2
eq1  1.000 -0.244
eq2 -0.244  1.000


2SLS estimates for 'eq1' (equation 1)
Model Formula: Y1 ~ Y2 + X1 + X2
Instruments: ~X1 + X2 + Y1.l1 + Y2.l1

            Estimate Std. Error t value Pr(>|t|)
(Intercept) 2723.681     69.102   39.42  < 2e-16 ***
Y2             0.219      0.118    1.85    0.073 .
X1             1.714      0.189    9.06  3.2e-10 ***
X2             1.569      0.268    5.85  1.9e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 133.159 on 31 degrees of freedom
Number of observations: 35 Degrees of Freedom: 31
SSR: 549668.549 MSE: 17731.244 Root MSE: 133.159
Multiple R-Squared: 0.996 Adjusted R-Squared: 0.996


2SLS estimates for 'eq2' (equation 2)
Model Formula: Y2 ~ Y1 + Y1.l1 + Y2.l1
Instruments: ~X1 + X2 + Y1.l1 + Y2.l1

             Estimate Std. Error t value Pr(>|t|)
(Intercept) -228.1320   157.9455   -1.44     0.16
Y1             0.1100     0.1471    0.75     0.46
Y1.l1         -0.0250     0.1522   -0.16     0.87
Y2.l1          0.9330     0.0635   14.69  1.8e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1 Residual standard error: 80.24 on 31 degrees of freedom Number of observations: 35 Degrees of Freedom: 31 SSR: 199591.775 MSE: 6438.444 Root MSE: 80.24 Multiple R-Squared: 0.998 Adjusted R-Squared: 0.998 ``` ] --- ### Comparison 2: OLS approach with biased estimation - **"biased"** OLS estimation results of the income equation: `$$\begin{equation} \begin{alignedat}{999} &\widehat{Y1}=&&+2706.39&&+0.17Y2&&+1.75X1&&+1.68X2\\ &\text{(t)}&&(40.0265)&&(1.5062)&&(9.3862)&&(6.6008)\\&\text{(se)}&&(67.6150)&&(0.1115)&&(0.1864)&&(0.2552)\\&\text{(fitness)}&& R^2=0.9966;&& \bar{R^2}=0.9963\\& && F^{\ast}=3139.86;&& p=0.0000 \end{alignedat} \end{equation}$$` - **"biased"** OLS estimates of the money supply equation: `$$\begin{equation} \begin{alignedat}{999} &\widehat{Y2}=&&-206.53&&-0.02Y1&&+0.10Y1.l1&&+0.94Y2.l1\\ &\text{(t)}&&(-1.3373)&&(-0.1377)&&(0.7870)&&(15.1363)\\&\text{(se)}&&(154.4428)&&(0.1178)&&(0.1252)&&(0.0622)\\&\text{(fitness)}&& R^2=0.9981;&& \bar{R^2}=0.9979\\& && F^{\ast}=5362.58;&& p=0.0000 \end{alignedat} \end{equation}$$` --- layout: false class: center, middle, duke-softblue,hide_logo name: truffle ## 20.5 Truffle supply and demand --- layout: true <div class="my-header-h2"></div> <div class="watermark1"></div> <div class="watermark2"></div> <div class="watermark3"></div> <div class="my-footer"><span>huhuaping@ <a href="#chpt20"> Chapter 20. How to Estimate SEM ? </a>               <a href="#truffle"> 20.5 Truffle supply and demand </a> </span></div> --- ### Case of Truffle supply and demand .pull-left[ <img src="../pic/Truffle-pig.jpg" width="367" style="display: block; margin: auto;" /> <img src="../pic/Truffle-pig2.jpg" width="367" style="display: block; margin: auto;" /> ] .pull-right[ <img src="../pic/Truffle-pig4.jpg" width="342" style="display: block; margin: auto;" /> ] --- ### Variables description
???

Case source: Hill, R. C., W. E. Griffiths and G. C. Lim. Principles of Econometrics, 4th Edition [M], Wiley, 2011, Chapter 11.

Reference: [PoE with R](https://bookdown.org/ccolonescu/RPoE4/simultaneous-equations-models.html)

---

### Data set
---

### Scatter

The scatter plot of truffle quantity Q against truffle market price P is given below:

<img src="SEM-slide-eng-part3-estimation_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

---

### The structural SEM

Given the structural SEM:

`$$\begin{cases} \begin{align} Q_i &= \alpha_0+\alpha_1P_i+\alpha_2PS_i+\alpha_3DI_i+u_{i1} &&\text{(demand function)}\\ Q_i &= \beta_0+\beta_1P_i+\beta_2PF_i+u_{i2} &&\text{(supply function)} \end{align} \end{cases}$$`

---
class: page-font-20

### The reduced SEM

Equating demand and supply and solving for `\(P_i\)` and `\(Q_i\)` gives the reduced SEM:

`$$\begin{cases} \begin{align} P_i &= \pi_{01}+ \pi_{11}PS_i+\pi_{21}DI_i+\pi_{31}PF_i+v_{i1}\\ Q_i &= \pi_{02}+\pi_{12}PS_i+\pi_{22}DI_i+\pi_{32}PF_i+v_{i2} \end{align} \end{cases}$$`

We also obtain the relationship between the structural and reduced coefficients:

.smaller[

.pull-left[

`$$\begin{cases} \begin{align} & \pi_{01} = \frac{\beta_0-\alpha_0}{\alpha_1-\beta_1} \\ & \pi_{11} = - \frac{\alpha_2}{\alpha_1-\beta_1} \\ & \pi_{21} = - \frac{\alpha_3}{\alpha_1-\beta_1} \\ & \pi_{31} = \frac{\beta_2}{\alpha_1-\beta_1} \\ & v_{i1} = \frac{u_{i2}-u_{i1}}{\alpha_1-\beta_1} \end{align} \end{cases}$$`

]

.pull-right[

`$$\begin{cases} \begin{align} & \pi_{02} = \frac{\alpha_1\beta_0-\alpha_0\beta_1}{\alpha_1-\beta_1} \\ & \pi_{12} = - \frac{\alpha_2\beta_1}{\alpha_1-\beta_1} \\ & \pi_{22} = - \frac{\alpha_3\beta_1}{\alpha_1-\beta_1} \\ & \pi_{32} = \frac{\alpha_1\beta_2}{\alpha_1-\beta_1} \\ & v_{i2} = \frac{\alpha_1u_{i2}-\beta_1u_{i1}}{\alpha_1-\beta_1} \end{align} \end{cases}$$`

]

]

---
exclude: true

### Rscript: simple OLS estimate

---

### Simple OLS solution: results

We can apply OLS directly to each structural equation, but the resulting estimates will be biased.
- Tidy results of the **biased** OLS estimation for the demand equation:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Q}=&&+1.09&&+0.02P&&+0.71PS&&+0.08DI\\ &\text{(t)}&&(0.2940)&&(0.3032)&&(3.3129)&&(0.0642)\\&\text{(se)}&&(3.7116)&&(0.0768)&&(0.2143)&&(1.1909)\\&\text{(fitness)}&& R^2=0.4957;&& \bar{R^2}=0.4375\\& && F^{\ast}=8.52;&& p=0.0004 \end{alignedat} \end{equation}$$`

- Tidy results of the **biased** OLS estimation for the supply equation:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Q}=&&+20.03&&+0.34P&&-1.00PF\\ &\text{(t)}&&(16.3938)&&(15.5436)&&(-13.1028)\\&\text{(se)}&&(1.2220)&&(0.0217)&&(0.0764)\\&\text{(fitness)}&& R^2=0.9019;&& \bar{R^2}=0.8946\\& && F^{\ast}=124.08;&& p=0.0000 \end{alignedat} \end{equation}$$`

---

### Simple OLS solution: R code (`lm`)

```r
# set equation systems
eq.D <- Q ~ P + PS + DI
eq.S <- Q ~ P + PF
# fit each equation directly by OLS
ols.D <- lm(formula = eq.D, data = truffles)
ols.S <- lm(formula = eq.S, data = truffles)
# report
smry.olsD <- summary(ols.D)
smry.olsS <- summary(ols.S)
```

---

### Simple OLS solution: R report (`lm`)

.pull-left[

.scroll-box-18[

```

Call:
lm(formula = eq.D, data = truffles)

Residuals:
   Min     1Q Median     3Q    Max
-7.155 -1.936 -0.374  2.396  6.335

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0910     3.7116    0.29   0.7711
P             0.0233     0.0768    0.30   0.7642
PS            0.7100     0.2143    3.31   0.0027 **
DI            0.0764     1.1909    0.06   0.9493
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.5 on 26 degrees of freedom
Multiple R-squared:  0.496,  Adjusted R-squared:  0.438
F-statistic: 8.52 on 3 and 26 DF,  p-value: 0.000416
```

]

]

.pull-right[

.scroll-box-18[

```

Call:
lm(formula = eq.S, data = truffles)

Residuals:
   Min     1Q Median     3Q    Max
-3.783 -0.853  0.227  0.758  3.347

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  20.0328     1.2220    16.4  1.5e-15 ***
P             0.3380     0.0217    15.5  5.4e-15 ***
PF           -1.0009     0.0764   -13.1  3.2e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1

Residual standard error: 1.5 on 27 degrees of freedom
Multiple R-squared:  0.902,  Adjusted R-squared:  0.895
F-statistic: 124 on 2 and 27 DF,  p-value: 2.45e-14
```

]

]

---

### Simple OLS solution: R code (`systemfit`)

---

### Simple OLS solution: R report (`systemfit`)

.scroll-box-18[

```

systemfit results
method: OLS

        N DF SSR detRCov OLS-R2 McElroy-R2
system 60 53 372    23.6  0.699      0.809

     N DF   SSR   MSE RMSE    R2 Adj R2
eq1 30 26 311.2 11.97 3.46 0.496  0.438
eq2 30 27  60.6  2.24 1.50 0.902  0.895

The covariance matrix of the residuals
      eq1  eq2
eq1 11.97 1.81
eq2  1.81 2.24

The correlations of the residuals
      eq1   eq2
eq1 1.000 0.349
eq2 0.349 1.000


OLS estimates for 'eq1' (equation 1)
Model Formula: Q ~ P + PS + DI

            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0910     3.7116    0.29   0.7711
P             0.0233     0.0768    0.30   0.7642
PS            0.7100     0.2143    3.31   0.0027 **
DI            0.0764     1.1909    0.06   0.9493
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.46 on 26 degrees of freedom
Number of observations: 30 Degrees of Freedom: 26
SSR: 311.21 MSE: 11.97 Root MSE: 3.46
Multiple R-Squared: 0.496 Adjusted R-Squared: 0.438


OLS estimates for 'eq2' (equation 2)
Model Formula: Q ~ P + PF

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  20.0328     1.2220    16.4  1.3e-15 ***
P             0.3380     0.0217    15.5  5.3e-15 ***
PF           -1.0009     0.0764   -13.1  3.2e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.498 on 27 degrees of freedom
Number of observations: 30 Degrees of Freedom: 27
SSR: 60.555 MSE: 2.243 Root MSE: 1.498
Multiple R-Squared: 0.902 Adjusted R-Squared: 0.895
```

]

---
exclude: true

### Rscript: IV 2SLS estimate

---

### IV-2SLS Solution: results
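
As a cross-check on the 2SLS estimates, the structural price slopes can be backed out by indirect least squares from the relations between structural and reduced coefficients derived earlier: `\(\alpha_1=\pi_{32}/\pi_{31}\)`, and `\(\beta_1=\pi_{12}/\pi_{11}=\pi_{22}/\pi_{21}\)`. A rough sketch, using the rounded reduced-form estimates reported on the "results of the reduced SEM" slide of this section (the object names are illustrative assumptions):

```r
# Rounded reduced-form estimates as reported in this section
pi.P <- c(PS = 1.71, DI = 7.60, PF = 1.35)    # reduced price equation
pi.Q <- c(PS = 0.66, DI = 2.17, PF = -0.51)   # reduced quantity equation

# Demand slope on P: exactly identified, so ILS gives a single value
alpha1 <- pi.Q[["PF"]] / pi.P[["PF"]]

# Supply slope on P: overidentified, so ILS gives two competing values
beta1.PS <- pi.Q[["PS"]] / pi.P[["PS"]]
beta1.DI <- pi.Q[["DI"]] / pi.P[["DI"]]

round(c(alpha1 = alpha1, beta1.PS = beta1.PS, beta1.DI = beta1.DI), 3)
```

The demand value is close to the 2SLS estimate of -0.374 in the `systemfit` report (the gap comes from rounding the inputs). The two supply values disagree and lie on either side of the 2SLS estimate of 0.338, which is how overidentification shows up under ILS.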
---

### IV-2SLS Solution: R code (`systemfit`)

.scroll-box-18[

```r
# load pkg
library(systemfit)
# set equation systems
eq.D <- Q ~ P + PS + DI
eq.S <- Q ~ P + PF
eq.sys <- list(eq.D, eq.S)
# set instruments (all exogenous variables in the system)
instr <- ~ PS + DI + PF
# system fit using `2SLS` method
system.iv <- systemfit(
  formula = eq.sys, inst = instr,
  method = "2SLS", data = truffles)
# report
smry.iv <- summary(system.iv)
```

]

---

### IV-2SLS Solution: R report (`systemfit`)

.scroll-box-18[

```

systemfit results
method: 2SLS

        N DF SSR detRCov OLS-R2 McElroy-R2
system 60 53 692    49.8  0.439      0.807

     N DF   SSR   MSE RMSE     R2 Adj R2
eq1 30 26 631.9 24.30 4.93 -0.024 -0.142
eq2 30 27  60.6  2.24 1.50  0.902  0.895

The covariance matrix of the residuals
      eq1  eq2
eq1 24.30 2.17
eq2  2.17 2.24

The correlations of the residuals
      eq1   eq2
eq1 1.000 0.294
eq2 0.294 1.000


2SLS estimates for 'eq1' (equation 1)
Model Formula: Q ~ P + PS + DI
Instruments: ~PS + DI + PF

            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -4.279      5.544   -0.77   0.4471
P             -0.374      0.165   -2.27   0.0315 *
PS             1.296      0.355    3.65   0.0012 **
DI             5.014      2.284    2.20   0.0372 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.93 on 26 degrees of freedom
Number of observations: 30 Degrees of Freedom: 26
SSR: 631.917 MSE: 24.305 Root MSE: 4.93
Multiple R-Squared: -0.024 Adjusted R-Squared: -0.142


2SLS estimates for 'eq2' (equation 2)
Model Formula: Q ~ P + PF
Instruments: ~PS + DI + PF

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  20.0328     1.2231    16.4  1.6e-15 ***
P             0.3380     0.0249    13.6  1.4e-13 ***
PF           -1.0009     0.0825   -12.1  1.9e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1

Residual standard error: 1.498 on 27 degrees of freedom
Number of observations: 30 Degrees of Freedom: 27
SSR: 60.555 MSE: 2.243 Root MSE: 1.498
Multiple R-Squared: 0.902 Adjusted R-Squared: 0.895
```

]

---

### IV-2SLS Solution: results of the reduced SEM

The OLS regression results of the **reduced price equation**:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{P}=&&-32.51&&+1.71PS&&+7.60DI&&+1.35PF\\ &\text{(t)}&&(-4.0721)&&(4.8682)&&(4.4089)&&(4.5356)\\&\text{(se)}&&(7.9842)&&(0.3509)&&(1.7243)&&(0.2985)\\&\text{(fitness)}&& R^2=0.8887;&& \bar{R^2}=0.8758\\& && F^{\ast}=69.19;&& p=0.0000 \end{alignedat} \end{equation}$$`

The OLS regression results of the **reduced quantity equation**:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{Q}=&&+7.90&&+0.66PS&&+2.17DI&&-0.51PF\\ &\text{(t)}&&(2.4342)&&(4.6051)&&(3.0938)&&(-4.1809)\\&\text{(se)}&&(3.2434)&&(0.1425)&&(0.7005)&&(0.1213)\\&\text{(fitness)}&& R^2=0.6974;&& \bar{R^2}=0.6625\\& && F^{\ast}=19.97;&& p=0.0000 \end{alignedat} \end{equation}$$`

---
class: page-font-20

### Comparison: the biased OLS estimation

We can apply OLS directly to each structural equation, but the resulting estimates will be biased.
- tidy results of **bias** OLS estimation for the demand equation: `$$\begin{equation} \begin{alignedat}{999} &\widehat{Q}=&&+1.09&&+0.02P&&+0.71PS&&+0.08DI\\ &\text{(t)}&&(0.2940)&&(0.3032)&&(3.3129)&&(0.0642)\\&\text{(se)}&&(3.7116)&&(0.0768)&&(0.2143)&&(1.1909)\\&\text{(fitness)}&& R^2=0.4957;&& \bar{R^2}=0.4375\\& && F^{\ast}=8.52;&& p=0.0004 \end{alignedat} \end{equation}$$` - tidy results of **bias** OLS estimation for the supply equation: `$$\begin{equation} \begin{alignedat}{999} &\widehat{Q}=&&+20.03&&+0.34P&&-1.00PF\\ &\text{(t)}&&(16.3938)&&(15.5436)&&(-13.1028)\\&\text{(se)}&&(1.2220)&&(0.0217)&&(0.0764)\\&\text{(fitness)}&& R^2=0.9019;&& \bar{R^2}=0.8946\\& && F^{\ast}=124.08;&& p=0.0000 \end{alignedat} \end{equation}$$` --- layout: false class: center, middle, duke-softblue,hide_logo name: cod ## 20.6 Cod supply and demand --- layout: true <div class="my-header-h2"></div> <div class="watermark1"></div> <div class="watermark2"></div> <div class="watermark3"></div> <div class="my-footer"><span>huhuaping@ <a href="#chpt20"> Chapter 20. How to Estimate SEM ? </a>               <a href="#cod"> 20.6 Cod supply and demand </a> </span></div> --- class: center, inverse ### Cod supply and demand .pull-left[ <img src="../pic/whiting-fishing.jpg" width="345" style="display: block; margin: auto;" /> <img src="../pic/whiting-fishing2.jpg" width="345" style="display: block; margin: auto;" /> ] .pull-right[ <img src="../pic/whiting-fishing4.jpg" width="259" style="display: block; margin: auto;" /> ] --- ### Variables description
.footer-note[.tiny[
Source: Hill, R. C., W. E. Griffiths and G. C. Lim. Principles of Econometrics, 4th Edition [M], Wiley, 2011, Chapter 11. Reference: [PoE with R](https://bookdown.org/ccolonescu/RPoE4/simultaneous-equations-models.html)
]
]

---

### Sample data set
---

### Scatter

<img src="SEM-slide-eng-part3-estimation_files/figure-html/unnamed-chunk-25-1.png" style="display: block; margin: auto;" />

---

### The structural and reduced SEM

Given the structural SEM:

`$$\begin{cases} \begin{align} lquan_t &= \alpha_0+\alpha_1lprice_t+\alpha_2mon_t+\alpha_3tue_t+\alpha_4wed_t+\alpha_5thu_t+u_{1t} &&\text{(demand eq)}\\ lquan_t &= \beta_0+\beta_1lprice_t+\beta_2stormy_t+u_{2t} &&\text{(supply eq)} \end{align} \end{cases}$$`

We can obtain the reduced SEM, in which each equation has its own coefficients and error term:

`$$\begin{cases} \begin{align} lquan_t &= \pi_{01} + \pi_{11}mon_t+\pi_{21}tue_t+\pi_{31}wed_t+\pi_{41}thu_t +\pi_{51}stormy_t+v_{t1} &&\text{(reduced eq1)}\\ lprice_t &= \pi_{02} + \pi_{12}mon_t+\pi_{22}tue_t+\pi_{32}wed_t+\pi_{42}thu_t +\pi_{52}stormy_t+v_{t2} &&\text{(reduced eq2)} \end{align} \end{cases}$$`

???

display

---

### OLS regression results of the reduced SEM

The regression results of the reduced **quantity** equation are as follows:

`$$\begin{alignedat}{999} &\widehat{lquan}=&&+8.81&&+0.10mon&&-0.48tue&&-0.55wed&&+0.05thu&&-0.39stormy\\ &\text{(t)}&&(59.9225)&&(0.4891)&&(-2.4097)&&(-2.6875)&&(0.2671)&&(-2.6979)\\ &\text{(se)}&&(0.1470)&&(0.2065)&&(0.2011)&&(0.2058)&&(0.2010)&&(0.1437)\\ &\text{(fitness)}&& n=111;&& R^2=0.1934;&& \bar{R^2}=0.1550\\ & && F^{\ast}=5.03;&& p=0.0004\\ \end{alignedat}$$`

The regression results of the reduced **price** equation are as follows:

`$$\begin{alignedat}{999} &\widehat{lprice}=&&-0.27&&-0.11mon&&-0.04tue&&-0.01wed&&+0.05thu&&+0.35stormy\\ &\text{(t)}&&(-3.5569)&&(-1.0525)&&(-0.3937)&&(-0.1106)&&(0.4753)&&(4.6387)\\ &\text{(se)}&&(0.0764)&&(0.1073)&&(0.1045)&&(0.1069)&&(0.1045)&&(0.0747)\\ &\text{(fitness)}&& n=111;&& R^2=0.1789;&& \bar{R^2}=0.1398\\ & && F^{\ast}=4.58;&& p=0.0008 \end{alignedat}$$`

???

why

---

### Two-stage least squares (2SLS) regression results
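
A report of the kind shown on the next slide can be produced by a `systemfit` call like the one sketched here. The two formulas and the instrument set are taken from that report. How the data are loaded is not shown in the slides, so this sketch runs on a simulated stand-in data frame `fish_sim` (an assumption for illustration, with made-up parameter values) rather than on the real `fultonfish` sample.

```r
library(systemfit)

# Simulated stand-in for the cod data (NOT the real fultonfish sample)
set.seed(111)
n   <- 111
day <- sample(c("mon", "tue", "wed", "thu", "fri"), n, replace = TRUE)
fish_sim <- data.frame(
  mon = as.numeric(day == "mon"), tue = as.numeric(day == "tue"),
  wed = as.numeric(day == "wed"), thu = as.numeric(day == "thu"),
  stormy = rbinom(n, 1, 0.3)
)
u.d <- rnorm(n, sd = 0.5)   # demand shock
u.s <- rnorm(n, sd = 0.5)   # supply shock
# Assumed structure (illustrative values):
#   demand: lquan = 8.5 - 1.0*lprice - 0.5*tue - 0.5*wed + u.d
#   supply: lquan = 8.0 + 1.0*lprice - 0.5*stormy        + u.s
# Equating the two and solving gives the equilibrium price:
fish_sim$lprice <- with(fish_sim,
  (0.5 - 0.5 * tue - 0.5 * wed + 0.5 * stormy + u.d - u.s) / 2)
fish_sim$lquan <- with(fish_sim, 8.0 + 1.0 * lprice - 0.5 * stormy + u.s)

# Equations and instruments as printed in the report
fish.D   <- lquan ~ lprice + mon + tue + wed + thu
fish.S   <- lquan ~ lprice + stormy
fish.sys <- list(fish.D, fish.S)
fish.iv  <- ~ mon + tue + wed + thu + stormy

fish.2sls <- systemfit(formula = fish.sys, method = "2SLS",
                       inst = fish.iv, data = fish_sim)
summary(fish.2sls)
```

With the real data, `data = fish_sim` would be replaced by the `fultonfish` data frame named in the `lm` reports of this section.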
---

### Two-stage least squares (2SLS) regression results

.scroll-box-20[

```

systemfit results
method: 2SLS

         N  DF SSR detRCov OLS-R2 McElroy-R2
system 222 213 110   0.107  0.094     -0.598

      N  DF  SSR   MSE  RMSE    R2 Adj R2
eq1 111 105 52.1 0.496 0.704 0.139  0.098
eq2 111 108 57.5 0.533 0.730 0.049  0.032

The covariance matrix of the residuals
      eq1   eq2
eq1 0.496 0.396
eq2 0.396 0.533

The correlations of the residuals
      eq1   eq2
eq1 1.000 0.771
eq2 0.771 1.000


2SLS estimates for 'eq1' (equation 1)
Model Formula: lquan ~ lprice + mon + tue + wed + thu
Instruments: ~mon + tue + wed + thu + stormy

            Estimate Std. Error t value Pr(>|t|)
(Intercept)   8.5059     0.1662   51.19   <2e-16 ***
lprice       -1.1194     0.4286   -2.61    0.010 *
mon          -0.0254     0.2148   -0.12    0.906
tue          -0.5308     0.2080   -2.55    0.012 *
wed          -0.5664     0.2128   -2.66    0.009 **
thu           0.1093     0.2088    0.52    0.602
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.704 on 105 degrees of freedom
Number of observations: 111 Degrees of Freedom: 105
SSR: 52.09 MSE: 0.496 Root MSE: 0.704
Multiple R-Squared: 0.139 Adjusted R-Squared: 0.098


2SLS estimates for 'eq2' (equation 2)
Model Formula: lquan ~ lprice + stormy
Instruments: ~mon + tue + wed + thu + stormy

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  8.62835    0.38897   22.18   <2e-16 ***
lprice       0.00106    1.30955    0.00     1.00
stormy      -0.36325    0.46491   -0.78     0.44
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1

Residual standard error: 0.73 on 108 degrees of freedom
Number of observations: 111 Degrees of Freedom: 108
SSR: 57.522 MSE: 0.533 Root MSE: 0.73
Multiple R-Squared: 0.049 Adjusted R-Squared: 0.032
```

]

---

### Comparison: the biased OLS estimation

- Tidy results of the **biased** OLS estimation for the demand equation:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{lquan}=&&+8.61&&-0.56lprice&&+0.01mon&&-0.52tue&&-0.56wed&&+0.08thu\\ &\text{(t)}&&(60.1698)&&(-3.3443)&&(0.0706)&&(-2.6114)&&(-2.7450)&&(0.4126)\\&\text{(se)}&&(0.1430)&&(0.1682)&&(0.2026)&&(0.1977)&&(0.2023)&&(0.1978)\\&\text{(fitness)}&& R^2=0.2205;&& \bar{R^2}=0.1834\\& && F^{\ast}=5.94;&& p=0.0001 \end{alignedat} \end{equation}$$`

- Tidy results of the **biased** OLS estimation for the supply equation:

`$$\begin{equation} \begin{alignedat}{999} &\widehat{lquan}=&&+8.50&&-0.44lprice&&-0.22stormy\\ &\text{(t)}&&(86.6914)&&(-2.2560)&&(-1.3253)\\&\text{(se)}&&(0.0981)&&(0.1942)&&(0.1630)\\&\text{(fitness)}&& R^2=0.0923;&& \bar{R^2}=0.0755\\& && F^{\ast}=5.49;&& p=0.0053 \end{alignedat} \end{equation}$$`

---

### Comparison: the biased OLS estimation

- Raw R summary of the **biased** OLS estimation for the demand equation:

.scroll-box-18[

```

Call:
lm(formula = fish.D, data = fultonfish)

Residuals:
    Min      1Q  Median      3Q     Max
-2.2384 -0.3674  0.0883  0.4230  1.2487

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   8.6069     0.1430   60.17   <2e-16 ***
lprice       -0.5625     0.1682   -3.34   0.0011 **
mon           0.0143     0.2026    0.07   0.9438
tue          -0.5162     0.1977   -2.61   0.0103 *
wed          -0.5554     0.2023   -2.75   0.0071 **
thu           0.0816     0.1978    0.41   0.6807
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.'
0.1 ' ' 1

Residual standard error: 0.67 on 105 degrees of freedom
Multiple R-squared:  0.22,  Adjusted R-squared:  0.183
F-statistic: 5.94 on 5 and 105 DF,  p-value: 7.08e-05
```

]

---

### Comparison: the biased OLS estimation

- Raw R summary of the **biased** OLS estimation for the supply equation:

```

Call:
lm(formula = fish.S, data = fultonfish)

Residuals:
    Min      1Q  Median      3Q     Max
-2.4042 -0.3754  0.0734  0.5197  1.2267

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   8.5009     0.0981   86.69   <2e-16 ***
lprice       -0.4381     0.1942   -2.26    0.026 *
stormy       -0.2160     0.1630   -1.33    0.188
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.71 on 108 degrees of freedom
Multiple R-squared:  0.0923,  Adjusted R-squared:  0.0755
F-statistic: 5.49 on 2 and 108 DF,  p-value: 0.00534
```

---
layout:false
background-image: url("../pic/thank-you-gif-funny-little-yellow.gif")
class: inverse,center

# End of this chapter!