Python Web Scraping for Beginners (35): Scrapy Framework Basics (3), Selector (3)

Date: 2021-12-25 Category: Programmer Life

Let's move on to CSS selectors. We are still using the same sample page as before, so I'll skip the chatter and go straight to the example:

>>> response.css('a')
[<Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image1.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image2.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image3.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image4.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image5.html">Name: My image ...'>]

As with XPath, we can select by attribute and select nested elements:

>>> response.css('a[href="http://www.likecs.com/image1.html"]').extract()
['<a href="http://www.likecs.com/image1.html">Name: My image 1 <br><img src="http://www.likecs.com/image1_thumb.jpg"></a>']
>>> response.css('a[href="http://www.likecs.com/image1.html"] img').extract()
['<img src="http://www.likecs.com/image1_thumb.jpg">']

Getting text and attribute values, however, works a little differently from XPath:

>>> response.css('a[href="http://www.likecs.com/image1.html"]::text').extract()
['Name: My image 1 ']
>>> response.css('a[href="http://www.likecs.com/image1.html"] img::attr(src)').extract()
['http://www.likecs.com/image1_thumb.jpg']

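To get text and attribute values you need the ::text and ::attr() pseudo-elements. These are extensions provided by Scrapy's selectors, not part of standard CSS.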

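Of course, CSS selectors can be nested just like XPath selectors, and the two can be mixed in one chain. Here is a small example to get a feel for it: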

>>> response.xpath('//a').css('img').xpath('@src').extract()
['http://www.likecs.com/image1_thumb.jpg',
 'http://www.likecs.com/image2_thumb.jpg',
 'http://www.likecs.com/image3_thumb.jpg',
 'http://www.likecs.com/image4_thumb.jpg',
 'http://www.likecs.com/image5_thumb.jpg']

That's all for Selector for now. For more details and usage, see the official documentation: https://docs.scrapy.org/en/latest/topics/selectors.html

This article contains very little code, so no separate sample code is provided.

Please credit the source when reposting: https://www.heiqu.com/zwjzdp.html
