
Logistic Regression Classification in R

Published: 2024/10/8

iris is a dataset built into R.

head(iris)  # the Python (pandas) equivalent is iris.head()

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

# Number of rows and columns
dim(iris)

[1] 150   5

# Storage mode of the object
mode(iris)

[1] "list"
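
The "list" result can look surprising for a table. As a minimal sketch (the variable names `storage` and `cls` are only illustrative), comparing mode() with class() shows that a data frame is stored as a list of columns while its class is data.frame:

```r
# mode() reports how the object is stored; class() reports what it is treated as
storage <- mode(iris)     # "list": a data frame is a list of equal-length columns
cls     <- class(iris)    # "data.frame"

storage
cls

# Class of each individual column
sapply(iris, class)
```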

# Column names
names(iris)

[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

# R's data.frame corresponds to pandas.DataFrame in Python
str(iris)

'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

# Attributes of the dataset
attributes(iris)

# Summary statistics
summary(iris)

  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
 Median :5.800   Median :3.000   Median :4.350   Median :1.300  
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
       Species  
 setosa    :50  
 versicolor:50  
 virginica :50  

# Number of samples in each class
table(iris$Species)

    setosa versicolor  virginica 
        50         50         50 

# Histogram of sepal length
hist(iris$Sepal.Length)

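# Density plot of sepal length
plot(density(iris$Sepal.Length))

# Scatter plot of sepal length against sepal width
plot(iris$Sepal.Length, iris$Sepal.Width)

# Scatter-plot matrix of all columns
plot(iris)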

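# Logistic regression with a binomial family separates only two classes,
# so first locate the rows of the third class
a <- which(iris$Species == 'virginica')
head(a)  # row indices of the virginica samples

[1] 101 102 103 104 105 106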
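
Indices returned by which() are typically used to drop rows with negative indexing. The sketch below shows an alternative using logical subsetting, which also behaves sensibly when nothing matches. The names `two_class` and `no_match` and the label "not-a-species" are illustrative assumptions, not part of the original code.

```r
# Logical subsetting keeps every row whose species is not "virginica"
two_class <- iris[iris$Species != "virginica", ]

# Negative which() indices only work when at least one row matches:
# with no match, which() returns integer(0) and the subset is empty
no_match <- which(iris$Species == "not-a-species")
nrow(iris[-no_match, ])   # 0 rows rather than all 150

# Drop the now-unused "virginica" level so the factor has exactly two levels
two_class$Species <- droplevels(two_class$Species)
levels(two_class$Species)
table(two_class$Species)
```

Dropping the unused level is optional for glm(), but it keeps later tables and comparisons limited to the two classes that are actually present.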

# Keep the other two classes
myir <- iris[-a, ]

# Split the data into training and test sets
s <- sample(100, 80)  # draw 80 of the 100 row positions
s <- sort(s)          # sort the sampled positions
ir_trian <- myir[s, ]
head(ir_trian)

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
7          4.6         3.4          1.4         0.3  setosa
9          4.4         2.9          1.4         0.2  setosa
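
Because sample() draws different rows on every run, the split above cannot be reproduced exactly. A minimal sketch of a repeatable 80/20 split follows; the seed value 42 and the names `two_class`, `train_idx`, `train` and `test` are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary fixed seed (assumption); any constant makes the draw repeatable

# Two-class subset of iris, with the unused "virginica" level dropped
two_class <- droplevels(iris[iris$Species != "virginica", ])
n <- nrow(two_class)

# Sample 80% of the row positions for training; the rest form the test set
train_idx <- sort(sample(n, size = round(0.8 * n)))
train <- two_class[train_idx, ]
test  <- two_class[-train_idx, ]

# Check how many samples of each class landed in each part
table(train$Species)
table(test$Species)
```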

# The rows that were not sampled form the test set
ir_test <- myir[-s, ]

# Fit a logistic regression of Species on all other columns
model <- glm(Species ~ ., family = binomial(link = "logit"), data = ir_trian)
summary(model)

Call:
glm(formula = Species ~ ., family = binomial(link = "logit"), 
    data = ir_trian)

Deviance Residuals: 
       Min          1Q      Median          3Q         Max  
-1.570e-05  -2.110e-08   2.110e-08   2.110e-08   1.865e-05  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)      4.691 681526.322       0        1
Sepal.Length    -9.568 216769.252       0        1
Sepal.Width     -7.254  99870.123       0        1
Petal.Length    18.946 153746.614       0        1
Petal.Width     25.341 222619.596       0        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.1070e+02  on 79  degrees of freedom
Residual deviance: 1.0579e-09  on 75  degrees of freedom
AIC: 10

Number of Fisher Scoring iterations: 25

# Predicted probabilities on the training set
a <- predict(model, type = "response")
# Probabilities above 0.5 are labeled 1 (versicolor), otherwise 0 (setosa)
res_train <- ifelse(a > 0.5, 1, 0)

# Predicted probabilities and labels on the test set
b <- predict(model, type = "response", newdata = ir_test)
res_test <- ifelse(b > 0.5, 1, 0)

# The first fit used all 25 default iterations; raise the limit and refit
model <- glm(Species ~ ., family = binomial(link = "logit"), data = ir_trian, control = list(maxit = 100))
summary(model)

Call:
glm(formula = Species ~ ., family = binomial(link = "logit"), 
    data = ir_trian, control = list(maxit = 100))

Deviance Residuals: 
       Min          1Q      Median          3Q         Max  
-9.535e-06  -2.110e-08   2.110e-08   2.110e-08   1.132e-05  

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)   5.292e+00  1.125e+06       0        1
Sepal.Length -1.013e+01  3.577e+05       0        1
Sepal.Width  -7.501e+00  1.645e+05       0        1
Petal.Length  1.988e+01  2.534e+05       0        1
Petal.Width   2.634e+01  3.667e+05       0        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.1070e+02  on 79  degrees of freedom
Residual deviance: 3.8911e-10  on 75  degrees of freedom
AIC: 10

Number of Fisher Scoring iterations: 26
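
The code above computes predicted labels but never compares them with the true species. The sketch below shows one way to do that with a confusion table and an accuracy figure. It rebuilds its own split so that it runs on its own; the seed and the names `two_class`, `train`, `test`, `fit`, `prob`, `pred`, `confusion` and `accuracy` are illustrative assumptions. Setosa and versicolor are perfectly separable in iris, which is consistent with the very large standard errors in the summaries above, so glm() is expected to warn here and the warnings are suppressed on purpose.

```r
set.seed(1)  # arbitrary seed (assumption) so the split is repeatable

# Two-class data and an 80/20 split
two_class <- droplevels(iris[iris$Species != "virginica", ])
idx   <- sort(sample(nrow(two_class), 80))
train <- two_class[idx, ]
test  <- two_class[-idx, ]

# The classes are perfectly separable, so glm() warns about non-convergence
# and fitted probabilities of 0 or 1; suppress those expected warnings
fit <- suppressWarnings(
  glm(Species ~ ., family = binomial(link = "logit"), data = train)
)

# With a factor response, the first level ("setosa") is treated as 0,
# so the predicted probability refers to "versicolor"
prob <- predict(fit, newdata = test, type = "response")
pred <- factor(ifelse(prob > 0.5, "versicolor", "setosa"),
               levels = levels(test$Species))

# Confusion table and overall accuracy on the held-out rows
confusion <- table(predicted = pred, actual = test$Species)
accuracy  <- mean(pred == test$Species)
confusion
accuracy
```

Mapping the 0/1 labels back to species names before tabulating keeps the rows and columns of the confusion table directly comparable.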
