当前位置:
首页 >
基于ThinkPHP5 使用QueryList爬取 并存入mysql数据库
发布时间:2023/12/29
42
豆豆
生活随笔
收集整理的这篇文章主要介绍了
基于ThinkPHP5 使用QueryList爬取 并存入mysql数据库
小编觉得挺不错的,现在分享给大家,帮大家做个参考.
QueryList4教程 地址:
https://doc.querylist.cc/site/index/doc/45在ThinkPHP5代码根目录执行composer命令安装QueryList:
composer require jaeger/querylist如果出现 以下错误
Loading composer repositories with package information Updating dependencies (including require-dev)Authentication required (packagist.phpcomposer.com):Username:出现这样的 情况
使用
composer config -g repo.packagist composer https://packagist.laravel-china.org1-下面演示在Index控制器中使用QueryList:
use QL\QueryList;public function qulist(){$data = QueryList::get('http://maoyan.com/board/4')// 设置采集规则->rules([// 爬取图片地址"src"=>array(".board-wrapper dd img.board-img","data-src"),// 爬取电影名"name"=>array(".board-wrapper dd .movie-item-info .name","html"),// 爬取电影主演信息"star"=>array(".board-wrapper dd .movie-item-info .star","html"),// 爬取上映时间"releasetime"=>array(".board-wrapper dd .movie-item-info .releasetime","html"),])->query()->getData();$excel_array=$data->all();$city = [];foreach($excel_array as $k=>$v) {$city[$k]['src'] = $v['src'];$city[$k]['name'] = $v['name'];$city[$k]['star'] = $v['star'];$city[$k]['releasetime'] = $v['releasetime'];}Db::name("article")->insertAll($city);}如果没有错的 则插入到数据库的
截图
2-如果想继续抓取下一页 数据 则要根据规律来取数据
public function qulist(){for($i=0;$i<20;$i++){$page=$i*10;$data = QueryList::get('http://maoyan.com/board/4?offset='.$page)// 设置采集规则->rules([// 爬取图片地址"src"=>array(".board-wrapper dd img.board-img","data-src"),// 爬取电影名"name"=>array(".board-wrapper dd .movie-item-info .name","html"),// 爬取电影主演信息"star"=>array(".board-wrapper dd .movie-item-info .star","html"),// 爬取上映时间"releasetime"=>array(".board-wrapper dd .movie-item-info .releasetime","html"),])->query()->getData();$excel_array=$data->all();$city = [];foreach($excel_array as $k=>$v) {$city[$k]['src'] = $v['src'];$city[$k]['name'] = $v['name'];$city[$k]['star'] = $v['star'];$city[$k]['releasetime'] = $v['releasetime'];}Db::name("article")->insertAll($city);}}3- 继续抓取下一页 数据 并判断数据库是否存在 存在不爬虫 不存在继续填满 ,并且休眠几秒再爬取
public function qulist(){set_time_limit(0); //防止程序响应30秒后 报错for($i=0;$i<20;$i++){$page=$i*10;$data = QueryList::get('http://maoyan.com/board/4?offset='.$page)// 设置采集规则->rules([// 爬取图片地址"src"=>array(".board-wrapper dd img.board-img","data-src"),// 爬取电影名"name"=>array(".board-wrapper dd .movie-item-info .name","html"),// 爬取电影主演信息"star"=>array(".board-wrapper dd .movie-item-info .star","html"),// 爬取上映时间"releasetime"=>array(".board-wrapper dd .movie-item-info .releasetime","html"),])->query()->getData();$excel_array=$data->all();$city = [];foreach($excel_array as $k=>$v) {$find = Db::name("article")->where("src", "=",$excel_array[$k]["src"])->find();if (empty($find)) {$city[$k]['src'] = $v['src'];$city[$k]['name'] = $v['name'];$city[$k]['star'] = $v['star'];$city[$k]['releasetime'] = $v['releasetime'];}}if (!empty($city)){Db::name("article")->insertAll($city);}/** 暂停时间 为2秒执行*/sleep(2);}}总结
以上是生活随笔为你收集整理的基于ThinkPHP5 使用QueryList爬取 并存入mysql数据库的全部内容,希望文章能够帮你解决所遇到的问题。
- 上一篇: 翟婉明院士:中国高铁发展面临的科技挑战与
- 下一篇: excel 瀵煎叆mysql_瀵煎叆fu