BeautifulSoup returns empty data from a website

qrjkbowd  posted on 2021-07-13  in  Java

I am running this code:

import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'

}

r = requests.get('https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2')

soup = BeautifulSoup(r.content, 'lxml')

print(soup.find('div', class_='price').text)

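I want to get the price of the product on this site: https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2. When I run the code, all I get is empty data. Am I doing something wrong, or is the site doing something special to stop me from scraping the price?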

gv8xihay1#

As mentioned in the comments, you can get the product data from the shop's API.
Here's how:

import requests
from bs4 import BeautifulSoup

# The x-requested-with header marks the POST below as an AJAX request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/78.0.3904.108 Safari/537.36',
    "x-requested-with": "XMLHttpRequest",
}

# The product page carries the product ID in an input named "d-session-product".
product_url = "https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2"
page_content = requests.get(product_url).content
soup = BeautifulSoup(page_content, 'lxml')
product_id = soup.find("input", {"name": "d-session-product"})["value"]

# Form data sent to the price-and-stock endpoint.
payload = {
    "debug": "off",
    "ajax": "1",
    "product_list": product_id,
    "action": "init",
    "showStockStatusAndShoppingcart": "1",
    "enablePickupAtNearbyStores": "yes",
}

# The endpoint answers with JSON that contains the price.
endpoint = "https://www.bohus.no/lite.cgi/module/priceAndStock"
product_data = requests.post(endpoint, data=payload, headers=headers).json()
print(product_data["price"][0]["salesPriceNormal"])

Output:

8799
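
The lookup `product_data["price"][0]["salesPriceNormal"]` raises an exception if the endpoint returns an empty `price` list or leaves out the key. Below is a minimal defensive sketch, assuming the response has the shape used above. The helper name `extract_sales_price` and the sample payloads are hypothetical, made up for illustration and not captured from the site:

```python
from typing import Any, Mapping, Optional


def extract_sales_price(product_data: Mapping[str, Any]) -> Optional[Any]:
    """Return price[0]["salesPriceNormal"], or None if any part is missing."""
    prices = product_data.get("price")
    # The answer indexes into a list, so anything else counts as "no price".
    if not isinstance(prices, list) or not prices:
        return None
    first = prices[0]
    if not isinstance(first, dict):
        return None
    return first.get("salesPriceNormal")


# Hypothetical payloads that only mimic the keys used in the answer.
sample_ok = {"price": [{"salesPriceNormal": 8799}]}
sample_empty = {"price": []}

print(extract_sales_price(sample_ok))
print(extract_sales_price(sample_empty))
```

In the script above, this would replace the final `print` line with `print(extract_sales_price(product_data))`.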

vsikbqxv2#

Alternatively, you can use the requests-html library:

from requests_html import HTMLSession

session = HTMLSession()
r = session.get('https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2')

# Run the page's JavaScript so the price element is filled in
# (the first call to render() downloads Chromium).
r.html.render()

sel = 'div.price-data > div.price'
print(r.html.find(sel, first=True).text)
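
The rendered element gives you the price as display text, not as a number. The exact text format on the site is not shown here, so the sketch below assumes something like `8 799,-` (whole kroner with optional thousands separators). `parse_price` is a hypothetical helper name:

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[int]:
    """Return the first number in a price string as an int, or None.

    Assumes whole-unit prices: spaces, non-breaking spaces and dots
    between digits are treated as thousands separators, not decimals.
    """
    match = re.search(r"\d[\d\s.]*", text)
    if match is None:
        return None
    # Keep only the digits of the matched run.
    return int(re.sub(r"\D", "", match.group()))


print(parse_price("8 799,-"))
```

If the site shows decimals, the separator handling would need to change accordingly.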
