Scrapy: create yield fields dynamically with a variable

Posted by 8yoxcaq7 7 months ago in Other


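I want to scrape all the bullet points from an Amazon product page with Scrapy (e.g. Amazon link), but the number of bullets varies from page to page. I ended up with something like this: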

def parse(self, response):
    t = response
    url = t.request.url
    yield {
        'bullets_no': len(t.xpath('//div[@id="feature-bullets"]//li/span/text()')),
        'bullet_1': t.xpath('//div[@id="feature-bullets"]//li/span/text()')[0].get().strip(),
        'bullet_2': t.xpath('//div[@id="feature-bullets"]//li/span/text()')[1].get().strip(),
        'bullet_3': t.xpath('//div[@id="feature-bullets"]//li/span/text()')[2].get().strip(),
        'bullet_4': t.xpath('//div[@id="feature-bullets"]//li/span/text()')[3].get().strip(),
        'bullet_5': t.xpath('//div[@id="feature-bullets"]//li/span/text()')[4].get().strip(),
        ...
    }

But in plain Python I could simply do something like this, and it would adjust to the number of bullets automatically:

row = {}
bullets = t.xpath('//div[@id="feature-bullets"]//li/span/text()')
for i, bullet in enumerate(bullets):
    row[f'Bullet_{i+1}'] = bullet.strip()

Is it possible to create yield fields like this in Scrapy?

scrapy

Source: https://stackoverflow.com/questions/77035459/scrapy-create-yield-field-dynamically-with-a-variable

2 Answers


wooyq4lh1#

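Yes. This is covered in detail in the Scrapy tutorial, which I strongly recommend reading.
A call to response.css or response.xpath returns a SelectorList object. You can work with it much as you would an ordinary Python list.
From the tutorial: "The result of running response.css('title') is a list-like object called SelectorList, which represents a list of Selector objects that wrap around XML/HTML elements and allow you to run further queries to fine-grain the selection or extract the data."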
Applied to your example, you could do this:

def parse(self, response):
    item = {'url': response.url}
    bullets = response.xpath('//div[@id="feature-bullets"]//li/span/text()')
    for i, bullet in enumerate(bullets, start=1):
        item[f'bullet_{i}'] = bullet.get().strip()
    # len() stays valid when no bullets match; reusing i would raise NameError
    item['bullet_no'] = len(bullets)
    yield item
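The key-building step can also be tried without Scrapy at all. The following is a sketch only, assuming the bullet texts have already been extracted as plain strings; build_item is a hypothetical helper name, and dropping whitespace-only strings is an extra choice that the answer above does not make.

```python
def build_item(url, bullets):
    """Return a dict with one numbered key per non-empty bullet string."""
    item = {"url": url}
    # Assumption: whitespace-only text nodes are noise and can be skipped
    cleaned = [b.strip() for b in bullets if b.strip()]
    for i, text in enumerate(cleaned, start=1):
        item[f"bullet_{i}"] = text
    item["bullet_no"] = len(cleaned)
    return item


example = build_item("https://example.com/product", ["  Fast  ", "\n", "Light"])
empty = build_item("https://example.com/product", [])
```

A page with no bullets still produces a valid item here, with bullet_no set to 0 and no bullet_N keys.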

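As the other answer mentions, there is also a getall method that can be called on a selector list. From the tutorial:
"The other thing is that the result of calling .getall() is a list: it is possible that a selector returns more than one result, so we extract them all."
I suggest reading the "Extracting data" and "Extracting quotes and authors" sections of the Scrapy docs tutorial to learn more.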


lp0sw83n2#

In this case, use getall(), which gives you a list of strings:

def parse(self, response):
    # getall() returns a plain list of strings, one per matched text node
    bullets = response.xpath('//div[@id="feature-bullets"]//li/span/text()').getall()
    yield {f"bullet_{i}": el.strip() for i, el in enumerate(bullets, start=1)}

