python-beautifulsoup: How do I save HTML to a database?

bvhaajcl  posted on 2021-06-20  in  Mysql
Follows (0) | Answers (3) | Views (318)

I am trying to save a product description to a MySQL database. So far I have tried changing the column data type to BLOB, LONGBLOB, TEXT, and LONGTEXT, but none of them work.

import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
import mysql.connector

cnx = mysql.connector.connect(user='root', password='Kradz579032!!',
                              host='127.0.0.1',
                              database='aliexpressapidb')
cursor = cnx.cursor()

add_data = ("INSERT INTO productdetails"
               "(description) "
               "VALUES (%s)")

my_url = 'https://www.aliexpress.com/item/Cheap-Necklace-Jewelry-Alloy-Men-Vintage-Personality-Pendant-Creativity-Simple-Accessories-Symbol-Necklace-Wholesale-Fashion/32879629913.html?spm=a2g01.11147086.layer-iabdzn.4.4a716140Ix00VA&scm=1007.16233.91830.0&scm_id=1007.16233.91830.0&scm-url=1007.16233.91830.0&pvid=acdbf117-c0fb-458f-b8a9-ea73bc0d174b'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
description = page_soup.findAll("div", {"class": "ui-box-body"})

# print(description)

data_insert = description
cursor.execute(add_data, data_insert)

cnx.commit()

cursor.close()
cnx.close()

I keep getting this error:
File "/users/reezalaq/.conda/envs/untitled3/lib/python3.6/site-packages/mysql/connector/conversion.py", line 160, in to_mysql
    return getattr(self, "_{0}_to_mysql".format(type_name))(value)
AttributeError: 'MySQLConverter' object has no attribute '_tag_to_mysql'
    "MySQL type".format(type_name))
TypeError: Python 'tag' cannot be converted to a MySQL type

a2mppw5e  #1

The problem was solved by converting the HTML data to a string, as @micahb suggested.
tostring = str(data)
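A minimal sketch of what that conversion does, assuming a small hypothetical snippet of markup in place of the downloaded page (`sample_html` and the other variable names are illustrative, not from the question). It shows that each match is a bs4 `Tag`, that `str()` on one tag gives its markup, and that `str()` on the whole `find_all` result gives a list-style representation with brackets and commas, which may not be what you want to store.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded product page.
sample_html = (
    '<div class="ui-box-body"><p>Alloy necklace</p></div>'
    '<div class="ui-box-body"><p>Vintage pendant</p></div>'
)
page_soup = BeautifulSoup(sample_html, "html.parser")
matches = page_soup.find_all("div", {"class": "ui-box-body"})

# Each match is a bs4 Tag object, not a str, which is why the
# MySQL converter in the traceback has no conversion for it.
first_type = type(matches[0]).__name__

# str() on a single Tag returns that element's markup as a plain string.
first_html = str(matches[0])

# str() on the whole result set returns a list-style representation,
# including the surrounding brackets and the comma separators.
all_as_list_repr = str(matches)

print(first_type)
print(first_html)
print(all_as_list_repr)
```

If only one element is needed, converting that single tag is probably the cleaner choice.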

fjnneemd  #2

maindiv = str(soup.find("body"))

This returns the body tag's HTML as a string.

sql = "UPDATE `abc` SET `html` = %s WHERE `abc`.`id` = %s"
val = (maindiv,1)
mycursor.execute(sql, val)
mydb.commit()

The code above updates the record.
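The same parameterized update can be tried without a MySQL server. The sketch below is only an approximation under stated assumptions: it swaps in Python's built-in `sqlite3` with an in-memory database, so the placeholders are `?` rather than the `%s` that mysql.connector uses, and `maindiv` is a hard-coded string standing in for `str(soup.find("body"))`.

```python
import sqlite3

# Assumption: sqlite3 is used only so the sketch runs without a MySQL server.
# It uses "?" placeholders; mysql.connector uses "%s" as in the answer above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE abc (id INTEGER PRIMARY KEY, html TEXT)")
cur.execute("INSERT INTO abc (id, html) VALUES (?, ?)", (1, ""))

# Stand-in for str(soup.find("body")): a plain Python string of markup.
maindiv = "<body><p>It's a \"quoted\" description</p></body>"

# The values are passed separately from the SQL text, so quotes inside
# the HTML do not need manual escaping.
cur.execute("UPDATE abc SET html = ? WHERE id = ?", (maindiv, 1))
conn.commit()

# Read the row back to confirm the markup was stored unchanged.
cur.execute("SELECT html FROM abc WHERE id = ?", (1,))
stored = cur.fetchone()[0]
print(stored == maindiv)
conn.close()
```

Passing the values as a tuple, rather than formatting them into the SQL string, is what keeps markup containing quotes from breaking the statement.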

cigdeys3  #3

description = page_soup.findAll("div", {"class": "ui-box-body"})

^ Here findAll returns a list of matching tags, but you need to pass a string to cursor.execute(). So you should build a string from those matching tags. Here is one way:

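description = page_soup.findAll("div", {"class": "ui-box-body"})
# get_text() keeps only the text of the first match; use str(description[0]) to keep the HTML markup
data_insert = description[0].get_text()
# the parameters must be a tuple, list, or dict, so wrap the single value in a tuple
cursor.execute(add_data, (data_insert,))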
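As a rough illustration of the options, the sketch below compares `get_text()` with `str()` on the first match and joins all matches into one string. The sample markup and the names `text_only`, `html_first`, `html_all`, and `params` are hypothetical; the final tuple is the shape that would be passed as the second argument to `cursor.execute()`.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded product page.
sample_html = (
    '<div class="ui-box-body"><p>Alloy <b>necklace</b></p></div>'
    '<div class="ui-box-body"><p>Vintage pendant</p></div>'
)
page_soup = BeautifulSoup(sample_html, "html.parser")
description = page_soup.find_all("div", {"class": "ui-box-body"})

# get_text() drops the tags and keeps only the text content.
text_only = description[0].get_text()

# str() keeps the markup of the first match.
html_first = str(description[0])

# To store every match, join their markup into a single string.
html_all = "".join(str(tag) for tag in description)

# A one-element tuple: the trailing comma is what makes it a tuple.
params = (html_all,)

print(text_only)
print(html_first)
print(len(params))
```

Which variant to store depends on whether the database should hold the rendered text or the original HTML, which the question's title suggests is the latter.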
