Notes on my approach to writing a crawler for Huya LOL streamers

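1. Define the purpose of the crawler

A crawler needs a clear purpose; crawling without one is just fooling around! My purpose this time is to find out whether the Huya LOL streamers can be scraped from the web page.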

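2. How do we crawl it?

a. First, find a tag that uniquely identifies each entry. On this page, each streamer sits in its own list item: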

<li class="game-live-item" gid="1">
  <a href="http://www.huya.com/baozha" class="video-info new-clickstat " target="_blank" report="{&quot;eid&quot;:&quot;click/position&quot;,&quot;position&quot;:&quot;lol/0/1/1&quot;,&quot;game_id&quot;:&quot;1&quot;,&quot;ayyuid&quot;:&quot;17363578&quot;}">
    <img class="pic" data-original="//screenshot.msstatic.com/yysnapshot/1801cfa4fc99aabc841eb9e25fa43f15a608b02d1055?imageview/4/0/w/338/h/190/blur/1" src="//screenshot.msstatic.com/yysnapshot/1801cfa4fc99aabc841eb9e25fa43f15a608b02d1055?imageview/4/0/w/338/h/190/blur/1/format/webp" onerror="this.onerror=null; this.src='//a.msstatic.com/huya/main/assets/img/default/338x190.jpg';" alt="炸姐ADC的直播" title="炸姐ADC的直播">
    <em class="tag tag-recommend">大神推荐</em>
    <div class="item-mask"></div>
    <i class="btn-link__hover_i"></i>
    <p class="tag-right">
      <!-- 蓝光 -->
      <!-- 热舞 -->
      <!-- 存活人数 -->
    </p>
  </a>
  <a href="http://www.huya.com/baozha" class="title new-clickstat" report="{&quot;eid&quot;:&quot;click/position&quot;,&quot;position&quot;:&quot;lol/0/1/1&quot;,&quot;game_id&quot;:&quot;1&quot;,&quot;ayyuid&quot;:&quot;17363578&quot;}" title="S8定位赛开始了11-0 裁决已解决" target="_blank">S8定位赛开始了11-0 裁决已解决</a>
  <span class="txt">
    <span class="avatar fl">
      <img data-original="//huyaimg.msstatic.com/avatar/1095/83/2aa2f6905fe4382221d08b66d7cdcb_180_135.jpg" src="//huyaimg.msstatic.com/avatar/1095/83/2aa2f6905fe4382221d08b66d7cdcb_180_135.jpg" onerror="this.onerror=null; this.src='//a.msstatic.com/huya/main/assets/img/default/84x84.jpg';" alt="炸姐ADC" title="炸姐ADC">
      <i class="nick" title="炸姐ADC">炸姐ADC</i>
    </span>
    <span class="num"><i class="num-icon"></i><i class="js-num">67.0万</i></span>
  </span>
</li>
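To make the "unique tag" idea concrete, here is a minimal sketch of how the list item above could be parsed. It is not the original crawler, only an illustration under a few assumptions: it uses Python's standard-library `html.parser` rather than any particular scraping library, the names `SAMPLE_HTML`, `StreamerParser` and `parse_streamers` are hypothetical, and the sample is a trimmed copy of the snippet above. It also assumes the streamer name lives in `<i class="nick">` and the popularity figure in `<i class="js-num">`, as in that snippet; downloading the live page is not shown, and the real page may differ or be rendered by JavaScript.

```python
from html.parser import HTMLParser

# Trimmed copy of the list item shown above; attributes that the
# parser does not need have been left out (assumption for brevity).
SAMPLE_HTML = """
<li class="game-live-item" gid="1">
  <a href="http://www.huya.com/baozha" class="title new-clickstat" target="_blank">S8定位赛开始了11-0 裁决已解决</a>
  <span class="txt">
    <span class="avatar fl">
      <i class="nick" title="炸姐ADC">炸姐ADC</i>
    </span>
    <span class="num"><i class="num-icon"></i><i class="js-num">67.0万</i></span>
  </span>
</li>
"""


class StreamerParser(HTMLParser):
    """Collect the nickname and viewer count from each li.game-live-item."""

    def __init__(self):
        super().__init__()
        self.streamers = []    # one dict per list item found
        self._current = None   # dict being filled, None outside an item
        self._field = None     # key that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "game-live-item" in classes:
            # The unique tag: every streamer has exactly one of these.
            self._current = {}
        elif self._current is not None and tag == "i":
            if "nick" in classes:
                self._field = "name"
            elif "js-num" in classes:
                self._field = "viewers"

    def handle_data(self, data):
        if self._current is not None and self._field is not None:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag == "i":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.streamers.append(self._current)
            self._current = None


def parse_streamers(html):
    """Return a list of {'name': ..., 'viewers': ...} dicts."""
    parser = StreamerParser()
    parser.feed(html)
    parser.close()
    return parser.streamers


if __name__ == "__main__":
    print(parse_streamers(SAMPLE_HTML))
    # [{'name': '炸姐ADC', 'viewers': '67.0万'}]
```

The viewer count is kept as the raw string from the page ("67.0万", i.e. 670,000); converting it to a number would be a separate step.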
