Python Economics Functions: Is There a Tutorial on Econometrics in Python?

Published: 2024/9/27

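Multiple linear regression

Model assumption:

Suppose that, across China's regions in 2013, per-capita cash consumption expenditure is related to wage income and other income as follows:

$Y=\beta_0+\beta_1X_1+\beta_2X_2+\mu$

where $Y$ is cash consumption expenditure, $X_1$ is wage income, $X_2$ is other income, and $\mu$ is the random error term.

The regression is estimated with Python's statsmodels library: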

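import statsmodels.api as sm
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import model_selection

# Configure a CJK-capable font before plotting so the Chinese column names render
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False

# Load the data and regress Y on X1 and X2 by ordinary least squares
data = pd.read_excel(r'./计量经济学数据.xlsx', sheet_name='Sheet1')
fit = sm.formula.ols(formula='现金消费支出Y ~ 工资性收入X1 + 其他收入X2', data=data).fit()
print(fit.summary())

# Scatter plot of Y against X1 with a fitted line and no confidence band
sns.lmplot(x='工资性收入X1', y='现金消费支出Y', data=data, ci=None)
plt.show()

# Pairwise scatter plots of the three variables
sns.pairplot(data.loc[:, ['现金消费支出Y', '工资性收入X1', '其他收入X2']])
# Show the figure
plt.show()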

                            OLS Regression Results
==============================================================================
Dep. Variable:                现金消费支出Y   R-squared:                       0.922
Model:                            OLS   Adj. R-squared:                  0.917
Method:                 Least Squares   F-statistic:                     166.6
Date:                Sun, 26 May 2019   Prob (F-statistic):           2.84e-16
Time:                        13:43:41   Log-Likelihood:                -260.68
No. Observations:                  31   AIC:                             527.4
Df Residuals:                      28   BIC:                             531.7
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept   2599.1455    827.342      3.142      0.004     904.412    4293.879
工资性收入X1       0.4865      0.058      8.448      0.000       0.369       0.604
其他收入X2        0.6017      0.104      5.772      0.000       0.388       0.815
==============================================================================
Omnibus:                        1.082   Durbin-Watson:                   1.915
Prob(Omnibus):                  0.582   Jarque-Bera (JB):                0.556
Skew:                           0.327   Prob(JB):                        0.757
Kurtosis:                       3.064   Cond. No.                     8.50e+04
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 8.5e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
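The large condition number in warning [2] does not have to mean the regressors are strongly correlated; it can also come from regressors measured in the tens of thousands sitting next to a constant column of ones. The sketch below illustrates this on synthetic data (the numbers are made up and are not the article's data set): it compares the condition number of a design matrix in raw units with the same matrix after standardizing the regressors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 31

# Synthetic, independent regressors on an income-like scale (assumed values)
x1 = rng.uniform(5_000, 30_000, size=n)
x2 = rng.uniform(2_000, 15_000, size=n)

# Design matrix with an intercept column, as used by OLS
X_raw = np.column_stack([np.ones(n), x1, x2])

# Same regressors standardized to mean 0 and standard deviation 1
z1 = (x1 - x1.mean()) / x1.std()
z2 = (x2 - x2.mean()) / x2.std()
X_std = np.column_stack([np.ones(n), z1, z2])

# Ratio of the largest to the smallest singular value of each matrix
cond_raw = np.linalg.cond(X_raw)
cond_std = np.linalg.cond(X_std)

print(f"condition number, raw units:    {cond_raw:.3g}")
print(f"condition number, standardized: {cond_std:.3g}")
```

If rescaling removes the warning, the issue is numerical scale rather than collinearity. Whether that is the case for the article's data would have to be checked on the actual spreadsheet.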

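Model tests:

$H_0:\beta_j=0\ (j=1,2)$

$H_1:$ the $\beta_j$ are not all zero

Goodness of fit:

The regression fits well: the coefficient of determination is $R^{2}=0.922$, so the two income variables explain about 92.2% of the variation in consumption expenditure across regions.

F-test:

The F-statistic is 166.6. The tabulated critical value is $F_{\alpha}(k,n-k-1)=F_{0.05}(2,28)=3.34$, with $k=2$ and $n=31$. Since $F>F_{\alpha}(k,n-k-1)$, the linear relationship is significant at the 5% level and the null hypothesis is rejected.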
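The overall F-statistic and the adjusted $R^2$ are both functions of $R^2$, $n$ and $k$: $F=\frac{R^2/k}{(1-R^2)/(n-k-1)}$ and $\bar{R}^2=1-(1-R^2)\frac{n-1}{n-k-1}$. As a rough cross-check of the regression output, the sketch below evaluates both from the rounded $R^2=0.922$; the function names are illustrative only.

```python
def f_from_r2(r2, n, k):
    """Overall F-statistic implied by R^2 for n observations and k regressors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def adjusted_r2(r2, n, k):
    """Adjusted R^2, which penalizes R^2 for the number of regressors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


f_stat = f_from_r2(0.922, n=31, k=2)
adj_r2 = adjusted_r2(0.922, n=31, k=2)

print(f"F implied by R^2: {f_stat:.1f}")
print(f"adjusted R^2:     {adj_r2:.3f}")
```

Because the input $R^2$ is rounded to three decimals, the results only approximate the 166.6 and 0.917 shown in the regression output.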

t-test:

$\left|t_1\right|=8.448,\ \left|t_2\right|=5.772,\ t_{\alpha/2}(n-k-1)=t_{0.025}(28)=2.048$

Since both $\left|t_j\right|>t_{\alpha/2}(n-k-1)$, the null hypothesis $\beta_j=0$ is rejected for each coefficient at the 5% level.
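Instead of looking the critical values up in a table, they can be computed from the F and t distributions. The following is a minimal sketch using `scipy.stats`; the helper name `critical_values` is an assumption made for this example.

```python
from scipy import stats


def critical_values(n, k, alpha=0.05):
    """Return (F critical value, two-sided t critical value) for a regression
    with n observations and k regressors plus an intercept."""
    df_resid = n - k - 1
    f_crit = stats.f.ppf(1 - alpha, dfn=k, dfd=df_resid)
    t_crit = stats.t.ppf(1 - alpha / 2, df=df_resid)
    return f_crit, t_crit


f_crit, t_crit = critical_values(n=31, k=2)
print(f"F_0.05(2, 28) = {f_crit:.2f}")
print(f"t_0.025(28)   = {t_crit:.3f}")
```

For $n=31$, $k=2$ this reproduces the 3.34 and 2.048 used above. The same helper can be called with $n=39$, $k=2$ and $n=62$, $k=3$ to check the critical values quoted for the later models.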

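Taken together, the estimated relationship between per-capita cash consumption expenditure, wage income and other income across China's regions in 2013 is:

$Y=2599.1455+0.4865X_1+0.6017X_2$

$\beta_1<\beta_2$: the estimated coefficient on other income (0.6017) is larger than the one on wage income (0.4865).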

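Predicting the dependent variable

After a model has been built and tested, it is often still worth comparing actual and predicted values to judge how usable the model is. Here the data are split into a training set (80%) and a test set (20%); the model is fitted on the former and evaluated on the latter.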

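data4 = pd.read_excel(r'./计量经济学数据.xlsx', sheet_name='Sheet1')
# Hold out 20% of the regions as a test set
train, test = model_selection.train_test_split(data4, test_size=0.2, random_state=1234)
# Fit the model on the training set only
fit4 = sm.formula.ols(formula='现金消费支出Y ~ 工资性收入X1 + 其他收入X2', data=train).fit()
# Drop the dependent variable, then predict it for the held-out regions
test_X = test.drop(labels='现金消费支出Y', axis=1)
pred = fit4.predict(exog=test_X)
print('Predicted vs. actual values:\n', pd.DataFrame({'prediction': pred, 'real': test.现金消费支出Y}))

Predicted vs. actual values:
       prediction     real
7   13874.648201  14161.7
10  25068.272118  23257.2
4   16645.508042  19249.1
1   21539.239415  21711.9
29  15077.077324  15321.1
8   28477.482744  28155.0
3   15073.999588  13166.2

The comparison shows that a few predictions are well off (rows 10, 3 and 4 miss by roughly 1,800 to 2,600), but overall the predicted values are close to the actual ones, which supports the usability of the model to some extent.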
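"Close overall" can be made more concrete with summary error measures. The sketch below takes the seven predicted and actual values from the table above and computes the root mean squared error (RMSE) and the mean absolute percentage error (MAPE); these two metrics are a choice made for this example, not something the original analysis reports.

```python
import numpy as np

# Predicted and actual values copied from the comparison table
pred = np.array([13874.648201, 25068.272118, 16645.508042, 21539.239415,
                 15077.077324, 28477.482744, 15073.999588])
real = np.array([14161.7, 23257.2, 19249.1, 21711.9, 15321.1, 28155.0, 13166.2])

errors = pred - real
rmse = np.sqrt(np.mean(errors ** 2))          # root mean squared error
mape = np.mean(np.abs(errors) / real) * 100   # mean absolute percentage error

print(f"RMSE: {rmse:.1f}")
print(f"MAPE: {mape:.2f}%")
```

With only seven held-out observations, both numbers are rough indications and depend on the particular `random_state` used for the split.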

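A nonlinear model that can be linearized

Model assumption:

In the Cobb-Douglas production function $Y=AK^{\beta_1}L^{\beta_2}$, $A$ represents the given level of technology, and $\beta_1$ and $\beta_2$ are the output elasticities of capital and labor. When $\beta_1+\beta_2=1$, returns to scale are constant; when the sum is greater than or less than 1, returns to scale are increasing or decreasing, respectively. For ease of comparison, the model is linearized by taking logarithms of both sides, so the relationship between total output and factor inputs across China's manufacturing industries in 2010 is assumed to be:

$\log Y=\beta_0+\beta_1\log K+\beta_2\log L+\mu$

where $\beta_0=\log A$.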

data2 = pd.read_excel(r'./计量经济学数据.xlsx', sheet_name='Sheet2')
# Regress log output on log capital and log labor
fit2 = sm.formula.ols(formula='np.log(工业总产值) ~ np.log(资本投入) + np.log(年均从业人员)', data=data2).fit()
# Pairwise scatter plots of the untransformed variables
sns.pairplot(data2.loc[:, ['工业总产值', '资本投入', '年均从业人员']])
print(fit2.summary())
plt.show()

                            OLS Regression Results
==============================================================================
Dep. Variable:          np.log(工业总产值)   R-squared:                       0.941
Model:                            OLS   Adj. R-squared:                  0.938
Method:                 Least Squares   F-statistic:                     286.3
Date:                Sun, 26 May 2019   Prob (F-statistic):           7.86e-23
Time:                        13:43:42   Log-Likelihood:                -12.793
No. Observations:                  39   AIC:                             31.59
Df Residuals:                      36   BIC:                             36.58
Df Model:                           2
Covariance Type:            nonrobust
==================================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept          1.8003      0.401      4.493      0.000       0.988       2.613
np.log(资本投入)       0.6778      0.081      8.344      0.000       0.513       0.843
np.log(年均从业人员)     0.2911      0.086      3.395      0.002       0.117       0.465
==============================================================================
Omnibus:                       37.173   Durbin-Watson:                   1.263
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              165.957
Skew:                          -2.018   Prob(JB):                     9.18e-37
Kurtosis:                      12.264   Cond. No.                         75.3
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
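To see why regressing log output on log inputs recovers the elasticities, the sketch below generates synthetic data from a Cobb-Douglas function with known parameters and then estimates the log-linear model by least squares. All numbers here (the "true" values of `A`, `b1`, `b2`, the input ranges and the noise level) are hypothetical and unrelated to the article's data; plain NumPy least squares stands in for statsmodels to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
A, b1, b2 = 5.0, 0.7, 0.3          # hypothetical "true" parameters

# Synthetic capital and labor inputs with multiplicative noise on output
K = rng.uniform(10, 1000, size=n)
L = rng.uniform(10, 1000, size=n)
noise = rng.normal(0.0, 0.05, size=n)
Y = A * K**b1 * L**b2 * np.exp(noise)

# Taking logs gives log Y = log A + b1 log K + b2 log L + noise,
# which is linear in the parameters and can be fitted by OLS
X = np.column_stack([np.ones(n), np.log(K), np.log(L)])
coef, *_ = np.linalg.lstsq(X, np.log(Y), rcond=None)
b0_hat, b1_hat, b2_hat = coef

print(f"estimated A:  {np.exp(b0_hat):.3f}")
print(f"estimated b1: {b1_hat:.3f}")
print(f"estimated b2: {b2_hat:.3f}")
```

The intercept of the log-linear fit estimates $\log A$, so it has to be exponentiated to get back to $A$.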

Model tests:

$H_0:\beta_j=0\ (j=1,2)$

$H_1:$ the $\beta_j$ are not all zero

Goodness of fit:

The regression fits well: the coefficient of determination is $R^{2}=0.941$.

F-test:

The F-statistic is 286.3. The tabulated critical value is $F_{\alpha}(k,n-k-1)=F_{0.05}(2,36)=3.26$, with $k=2$ and $n=39$. Since $F>F_{\alpha}(k,n-k-1)$, the linear relationship is significant at the 5% level and the null hypothesis is rejected.

t-test:

$\left|t_1\right|=8.344,\ \left|t_2\right|=3.395,\ t_{\alpha/2}(n-k-1)=t_{0.025}(36)=2.028$

Since both $\left|t_j\right|>t_{\alpha/2}(n-k-1)$, the null hypothesis $\beta_j=0$ is rejected for each coefficient at the 5% level.

Taken together, the estimated relationship between total output and factor inputs across China's manufacturing industries in 2010 is:

$\log Y=1.8003+0.6778\log K+0.2911\log L$, with $0.6778+0.2911=0.9689$.

These results indicate that in 2010 the output elasticity of capital was 0.6778: holding other factors fixed, a 1% increase in capital raises total output by about 0.6778%. Likewise, holding other factors fixed, a 1% increase in labor input raises total output by about 0.2911%. Capital input therefore contributed more to the growth of industrial output than labor did. The two elasticities sum to 0.9689, slightly below 1, which is close to constant returns to scale.
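The sum of the elasticities determines what happens to output when both inputs are scaled together: multiplying $K$ and $L$ by $\lambda$ multiplies output by $\lambda^{\beta_1+\beta_2}$. The short sketch below works this out for the estimated values; the function name is illustrative.

```python
def output_multiplier(scale, beta_k, beta_l):
    """Factor by which Cobb-Douglas output changes when capital and labor
    are both multiplied by `scale`, with technology A held fixed."""
    return scale ** (beta_k + beta_l)


beta_k, beta_l = 0.6778, 0.2911   # estimated elasticities from the regression

doubling = output_multiplier(2.0, beta_k, beta_l)
one_percent = output_multiplier(1.01, beta_k, beta_l)

print(f"doubling both inputs multiplies output by {doubling:.4f}")
print(f"a 1% rise in both inputs raises output by {100 * (one_percent - 1):.4f}%")
```

A multiplier slightly below 2 for a doubling of inputs is what a sum of elasticities slightly below 1 implies. Whether the gap from 1 is statistically meaningful would need a formal test on the fitted model.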

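Dummy variables

Some data sets contain variables that cannot be measured numerically yet still have a sizeable effect on the model, so they cannot simply be dropped. "Dummy variables" (also called indicator variables) are introduced to encode them as 0/1 values. Below, a model of per-capita wage income, other income and living consumption expenditure is built for China's rural and urban households in 2013. Rural households serve as the baseline category: the regression output reports the dummy as [T.城镇居民], that is, urban households.

Assume the model is:

$Y=\alpha_0+\alpha_1X_1+\alpha_2X_2+\alpha_3D+\mu$

where $D=1$ for urban households and $D=0$ for rural households.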
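How a categorical column becomes a 0/1 variable can be sketched with pandas. The tiny data frame below is hypothetical (English labels and made-up incomes, not the article's spreadsheet). `pd.get_dummies` with `drop_first=True` drops the first category in sorted order, which then acts as the baseline; the formula interface's `C(...)` term applies the same kind of treatment coding.

```python
import pandas as pd

# Hypothetical mini data set; labels and numbers are illustrative only
df = pd.DataFrame({
    "group": ["rural", "urban", "rural", "urban"],
    "wage_income": [3000.0, 18000.0, 4200.0, 21000.0],
})

# drop_first=True drops the first category in sorted order ("rural"),
# making it the baseline; the remaining column is 1 for "urban" rows
dummies = pd.get_dummies(df["group"], drop_first=True).astype(int)
encoded = pd.concat([df, dummies], axis=1)

print(encoded)
```

Only one dummy column is kept for two categories. Keeping both alongside an intercept would make the design matrix perfectly collinear.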

data3 = pd.read_excel(r'./计量经济学数据.xlsx', sheet_name='Sheet3')
# C(...) marks the rural/urban column as categorical so it enters as a dummy
fit3 = sm.formula.ols(formula='生活消费 ~ 工资收入 + 其他收入 + C(农村or城镇)', data=data3).fit()
print(fit3.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                   生活消费   R-squared:                       0.975
Model:                            OLS   Adj. R-squared:                  0.974
Method:                 Least Squares   F-statistic:                     758.1
Date:                Sun, 26 May 2019   Prob (F-statistic):           1.81e-46
Time:                        13:43:42   Log-Likelihood:                -513.02
No. Observations:                  62   AIC:                             1034.
Df Residuals:                      58   BIC:                             1043.
Df Model:                           3
Covariance Type:            nonrobust
=====================================================================================
                        coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------------
Intercept          1783.7377    345.728      5.159      0.000    1091.687    2475.788
C(农村or城镇)[T.城镇居民]  140.8608    483.598      0.291      0.772    -827.166    1108.888
工资收入                  0.5477      0.039     13.978      0.000       0.469       0.626
其他收入                  0.5589      0.073      7.666      0.000       0.413       0.705
==============================================================================
Omnibus:                        0.360   Durbin-Watson:                   1.733
Prob(Omnibus):                  0.835   Jarque-Bera (JB):                0.086
Skew:                           0.082   Prob(JB):                        0.958
Kurtosis:                       3.079   Cond. No.                     6.19e+04
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 6.19e+04. This might indicate that there are
strong multicollinearity or other numerical problems.

Model tests:

$H_0:\alpha_j=0\ (j=1,2,3)$

$H_1:$ the $\alpha_j$ are not all zero

Goodness of fit:

The regression fits well: the coefficient of determination is $R^{2}=0.975$.

F-test:

The F-statistic is 758.1. The tabulated critical value is $F_{\alpha}(k,n-k-1)=F_{0.05}(3,58)=2.76$, with $k=3$ and $n=62$. Since $F>F_{\alpha}(k,n-k-1)$, the linear relationship is significant at the 5% level and the null hypothesis is rejected.

t-test:

$\left|t_1\right|=0.291,\ \left|t_2\right|=13.978,\ \left|t_3\right|=7.666,\ t_{\alpha/2}(n-k-1)=t_{0.025}(58)=2.002$

Here $t_1$ belongs to the urban dummy, $t_2$ to wage income and $t_3$ to other income, following the order of the regression output. Since $\left|t_2\right|$ and $\left|t_3\right|$ exceed $t_{\alpha/2}(n-k-1)$, the null hypothesis is rejected for wage income and other income. For the urban dummy, $\left|t_1\right|=0.291<2.002$, so the null hypothesis cannot be rejected at the 5% level.
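The p-values in the `P>|t|` column of the regression output follow directly from the t-statistics and the residual degrees of freedom. A minimal sketch with `scipy.stats` (the helper name `two_sided_p` is an assumption made for this example):

```python
from scipy import stats


def two_sided_p(t_value, df_resid):
    """Two-sided p-value of a t-statistic with df_resid degrees of freedom."""
    return 2 * stats.t.sf(abs(t_value), df_resid)


df_resid = 62 - 3 - 1   # n - k - 1 for the dummy-variable model

for name, t_value in [("urban dummy", 0.291),
                      ("wage income", 13.978),
                      ("other income", 7.666)]:
    print(f"{name:12s} t = {t_value:7.3f}  p = {two_sided_p(t_value, df_resid):.3f}")
```

A p-value above 0.05 corresponds to $\left|t\right|$ falling below the 5% critical value, so the two ways of reading the test always agree.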

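Taken together, the estimated relationship between per-capita wage income, other income and living consumption expenditure for China's rural and urban households in 2013 is:

$Y=1783.7377+0.5477X_1+0.5589X_2+140.8608D$, where $D=1$ for urban households and $D=0$ for rural households.

Taken at face value, this means that, holding other factors fixed, average consumption expenditure of urban households is 140.8608 yuan higher than that of rural households. With a standard error of 483.598 and a p-value of 0.772, however, this difference is not statistically distinguishable from zero.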

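Restricted regression

When building a regression model, economic theory sometimes calls for restrictions on the regression coefficients, for example requiring two coefficients $\beta_1$ and $\beta_2$ to satisfy $\beta_1+\beta_2=1$ or $\beta_1=\beta_2$. A model estimated under such restrictions is called a restricted regression.

First, build the unrestricted regression model:

$\ln(Q)=\beta_0+\beta_1\ln(X/P_0)+\beta_2(P_1/P)+\beta_3(P_2/P)+\beta_4P_{01}+\beta_5P_{02}+\beta_6P_{03}$

where $Q$ is egg consumption (kg), $X$ is per-capita consumption expenditure (yuan), $P_0$ is the consumer price index, and $P$, $P_1$, $P_2$, $P_{01}$, $P_{02}$, $P_{03}$ are the price indices of eggs, meat and poultry, aquatic products, grain, oils and fats, and vegetables.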

import statsmodels.api as sm

data4 = pd.read_excel(r'./计量经济学数据.xlsx', sheet_name='Sheet4')

# Pull out the columns used in the model
Q = data4["蛋类消费量Q(千克)"]
X = data4["人均消费支出X(元)"]
P0 = data4["居民消费价格指数P0"]
P = data4["蛋类P(价格指数)"]
P1 = data4["肉禽类P1(价格指数)"]
P2 = data4["水产类P2(价格指数)"]
P01 = data4["粮食P3(价格指数)"]
P02 = data4["油脂P4(价格指数)"]
P03 = data4["蔬菜P5(价格指数)"]

# Build the regressors: real expenditure in logs, relative prices, price indices
df = pd.DataFrame({"log(X/P0)": np.log(X/P0),
                   "P1/P": P1/P,
                   "P2/P": P2/P,
                   "P01": P01,
                   "P02": P02,
                   "P03": P03})
df = sm.add_constant(df)  # add the intercept column
fit = sm.OLS(np.log(Q), df).fit()
print(fit.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:            蛋类消费量Q(千克)   R-squared:                       0.527
Model:                            OLS   Adj. R-squared:                  0.409
Method:                 Least Squares   F-statistic:                     4.462
Date:                Tue, 05 Jan 2021   Prob (F-statistic):            0.00361
Time:                        18:00:28   Log-Likelihood:                -19.335
No. Observations:                  31   AIC:                             52.67
Df Residuals:                      24   BIC:                             62.71
Df Model:                           6
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        -11.1535     12.746     -0.875      0.390     -37.461      15.154
log(X/P0)      1.3283      0.326      4.078      0.000       0.656       2.001
P1/P          -1.4528      4.210     -0.345      0.733     -10.141       7.235
P2/P           5.1265      2.281      2.248      0.034       0.419       9.834
P01            0.0150      0.077      0.196      0.846      -0.144       0.174
P02            0.0051      0.076      0.068      0.946      -0.151       0.161
P03            0.0101      0.033      0.310      0.759      -0.057       0.078
==============================================================================
Omnibus:                        0.203   Durbin-Watson:                   1.391
Prob(Omnibus):                  0.904   Jarque-Bera (JB):                0.405
Skew:                          -0.094   Prob(JB):                        0.817
Kurtosis:                       2.473   Cond. No.                     2.63e+04
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.63e+04. This might indicate that there are
strong multicollinearity or other numerical problems.

Model tests:

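$H_0:\beta_j=0\ (j=1,\dots,6)$

$H_1:$ the $\beta_j$ are not all zero

Goodness of fit:

The adjusted coefficient of determination is only $\bar{R}^{2}=0.409$. Since this regression is not meant to be used as a forecasting model, the coefficient of determination need not be a major concern.

F-test:

The F-statistic is 4.462 with $P=0.00361$, so the linear relationship is significant at the 5% level and the null hypothesis is rejected.

t-test:

Only log(X/P0) and P2/P reject the null hypothesis at the 5% level; none of the other variables passes the t-test. Holding other factors fixed, higher rural per-capita consumption expenditure significantly raises egg consumption. In addition, when aquatic-product prices rise faster than egg prices, rural consumers tend to buy more eggs. In other words, aquatic products and eggs are to some extent substitutes in rural consumption.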

Building the restricted regression model

The restrictions are $H_0:\beta_2=\beta_4=\beta_5=\beta_6=0$, so the regression model becomes $\ln(Q)=\beta_0+\beta_1\ln(X/P_0)+\beta_3(P_2/P)$

                            OLS Regression Results
==============================================================================
Dep. Variable:            蛋类消费量Q(千克)   R-squared:                       0.517
Model:                            OLS   Adj. R-squared:                  0.482
Method:                 Least Squares   F-statistic:                     14.98
Date:                Tue, 05 Jan 2021   Prob (F-statistic):           3.77e-05
Time:                        18:12:23   Log-Likelihood:                -19.671
No. Observations:                  31   AIC:                             45.34
Df Residuals:                      28   BIC:                             49.64
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -8.9767      2.364     -3.797      0.001     -13.819      -4.134
log(X/P0)      1.2843      0.288      4.456      0.000       0.694       1.875
P2/P           4.8805      2.062      2.367      0.025       0.657       9.104
==============================================================================
Omnibus:                        0.178   Durbin-Watson:                   1.377
Prob(Omnibus):                  0.915   Jarque-Bera (JB):                0.313
Skew:                          -0.156   Prob(JB):                        0.855
Kurtosis:                       2.619   Cond. No.                         152.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

These results show that, under the restrictions, both the test of the overall linear relationship (F-test) and the tests on the individual coefficients (t-tests) are more significant at the 5% level: F rises from 4.462 to 14.98, and the t-statistics of log(X/P0) and P2/P rise from 4.078 and 2.248 to 4.456 and 2.367. The grounds for rejecting the null hypothesis are therefore stronger, which further supports the conclusions drawn from the unrestricted regression.
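The restrictions themselves can also be tested. Because both regressions have the same dependent variable, the usual F-statistic for $q$ linear restrictions can be written in terms of the two $R^2$ values: $F=\frac{(R_U^2-R_R^2)/q}{(1-R_U^2)/(n-k_U-1)}$, where $U$ and $R$ denote the unrestricted and restricted models. The sketch below evaluates it from the rounded values in the two outputs above ($R_U^2=0.527$, $R_R^2=0.517$, $q=4$, 24 residual degrees of freedom). The function name is illustrative, and the result is approximate because the inputs are rounded.

```python
from scipy import stats


def restriction_f_test(r2_unrestricted, r2_restricted, q, df_resid_unrestricted):
    """F-test of q linear restrictions, computed from the R^2 of the
    unrestricted and restricted fits (same dependent variable in both)."""
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / df_resid_unrestricted
    f_stat = numerator / denominator
    p_value = stats.f.sf(f_stat, q, df_resid_unrestricted)
    return f_stat, p_value


f_stat, p_value = restriction_f_test(0.527, 0.517, q=4, df_resid_unrestricted=24)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
```

An F-statistic far below the 5% critical value means the four restrictions cannot be rejected, which is the formal justification for dropping those regressors. With the fitted statsmodels results at hand, `compare_f_test` on the unrestricted result should give the same test without rounding error.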
