Python 3.x: Scraping book titles with BeautifulSoup returns 0

Posted by yqhsw0fo 4 months ago in Python

I am trying to scrape book titles from a website, but the output is 0 every time.

import requests
from bs4 import BeautifulSoup

url = "https://www.blinkist.com/en/content/topics/listening-en"

response = requests.get(url)
html = response.text
# print (html)

soup = BeautifulSoup(html, 'html.parser')

book_titles = soup.find_all("a", {"data-test-id":"book-card-title"})
print(len(book_titles))

for title in book_titles:
    print(title.text)

I expected the book titles as output, but I am clearly missing something here.

nhjlsmyf #1

If Selenium is an option:

import json
from selenium import webdriver
from selenium.webdriver.common.by import By

url = "https://www.blinkist.com/en/content/topics/listening-en"

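with webdriver.Chrome() as driver:
    driver.get(url)

    # The answer reads the book list from the page's JSON-LD structured data
    css = "script[type='application/ld+json']"

    # Take the first matching <script> tag and parse its contents as JSON
    data = json.loads(driver.find_element(
        By.CSS_SELECTOR, css).get_attribute("innerHTML"))

# Each entry in "hasPart" is one book; "name" holds its title
titles = [d["name"] for d in data["hasPart"]]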

Output:

# len(titles) # 34

[
    'Crucial Conversations',
    'Just Listen',
    'The Art of Communicating',
    'Talk Lean',
    'The 11 Laws of Likability',
    'Nonviolent Communication',
    ...
]
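
The parsing step in the answer (find the `application/ld+json` script, load it as JSON, read `name` from each `hasPart` entry) can also be sketched on a plain HTML string, without a browser. This is only a sketch using the standard library: `JsonLdCollector`, `extract_titles`, and `sample_html` are hypothetical names and data made up for illustration. Whether the raw response from `requests.get(url)` actually contains this script tag for the URL above is an assumption that has not been verified here.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._chunks))
            self._in_json_ld = False


def extract_titles(html):
    """Return the "name" of each item in the first JSON-LD block that has "hasPart"."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip blocks that are not valid JSON
        if isinstance(data, dict) and "hasPart" in data:
            return [part["name"] for part in data["hasPart"]]
    return []


# Hypothetical sample that mimics the structure the answer relies on.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@type": "Collection", "hasPart": [{"name": "Crucial Conversations"}, {"name": "Just Listen"}]}
</script>
</head><body></body></html>
"""

print(extract_titles(sample_html))
```

If `extract_titles(requests.get(url).text)` returns an empty list for the real page, the structured data is probably not in the initial response, and a browser-driven approach such as the Selenium code above is the safer choice.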
