Next, let's look at CSS selectors. We'll stick with the same sample page as above, so I'll skip the chatter and go straight to an example:
>>> response.css('a')
[<Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image1.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image2.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image3.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image4.html">Name: My image ...'>,
 <Selector xpath='descendant-or-self::a' data='<a href="http://www.likecs.com/image5.html">Name: My image ...'>]

As with XPath, we can select by attribute and select nested elements:
>>> response.css('a[href="http://www.likecs.com/image1.html"]').extract()
['<a href="http://www.likecs.com/image1.html">Name: My image 1 <br><img src="http://www.likecs.com/image1_thumb.jpg"></a>']
>>> response.css('a[href="http://www.likecs.com/image1.html"] img').extract()
['<img src="http://www.likecs.com/image1_thumb.jpg">']

Getting text and attribute values works slightly differently from XPath:
>>> response.css('a[href="http://www.likecs.com/image1.html"]::text').extract()
['Name: My image 1 ']
>>> response.css('a[href="http://www.likecs.com/image1.html"] img::attr(src)').extract()
['http://www.likecs.com/image1_thumb.jpg']

To get text, append the ::text pseudo-element to the selector; to get an attribute value, append ::attr() with the attribute name inside the parentheses.
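Of course, CSS selectors can be chained just like XPath selectors, and the two kinds can be mixed in a single chain. Here is a small example to get a feel for it: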
>>> response.xpath('//a').css('img').xpath('@src').extract()
['http://www.likecs.com/image1_thumb.jpg',
 'http://www.likecs.com/image2_thumb.jpg',
 'http://www.likecs.com/image3_thumb.jpg',
 'http://www.likecs.com/image4_thumb.jpg',
 'http://www.likecs.com/image5_thumb.jpg']

That's all for Selector for now. For more features and usage, see the official documentation: https://docs.scrapy.org/en/latest/topics/selectors.html
This post contains very little code, so no sample code is attached.