python-3.x: pymongo inserts all data in string format

o2g1uqev  posted 4 months ago  in Python

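I am using pymongo to import a CSV file and insert it into MongoDB, but for some reason every field ends up stored as a string, whereas I want double. Below is the Python code.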

def saveToMongo():
    print("inside saveToMongo")
    collection = mydb['country']

    header = ['country_id','country_name','zone_id','minLat','maxLat','minLong','maxLong']
    csvFile = open('country.csv', 'r')
    reader = csv.DictReader(csvFile)

    print(reader)

    for each in reader:
        row = {}
        for field in header:
            row[field] = each[field]

        #print(row)
        collection.insert(row)

Here is the CSV file:

country_id,country_name,zone_id,minLat,maxLat,minLong,maxLong
2,Bangladesh,1,20.6708832870000,26.4465255803000,88.0844222351000,92.6727209818000
3,"Sri Lanka",1,5.9683698592300,9.8240776636100,79.6951668639000,81.7879590189000
4,Pakistan,1,23.6919650335000,37.1330309108000,60.8742484882000,77.8374507995000
5,Bhutan,1,26.7194029811000,28.2964385035000,88.8142484883000,92.1037117859000


I don't understand why Python stores the data as strings.
When I try to insert the data with mongoimport, I get the error below:

fatal error: unrecognized DWARF version in .debug_info at 6

runtime stack:
panic during panic

runtime stack:
stack trace unavailable

Answer #1 (iszxjhcz)

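DictReader simply parses every value as a string.
If you want other types, you can write your own function that converts the values to the types you need.
Since what you call a double is a float in Python, you could do something like this: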

from pymongo import MongoClient 
import csv
myclient = MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydbname"]

def csvToMongo():
  with open('country.csv','r') as myFile :
    reader = csv.DictReader(myFile, delimiter=",")
    myParsedData = [
      {
        'country_id' : int(elem['country_id']),
        'country_name' : elem['country_name'],
        'zone_id' : int(elem['zone_id']),
        'minLat' : float(elem['minLat']),
        'maxLat' : float(elem['maxLat']),
        'minLong' : float(elem['minLong']),
        'maxLong' : float(elem['maxLong']),
      }
      for elem in reader
    ]
    collection = mydb['country']
    collection.insert_many(myParsedData)

#execute function
csvToMongo()
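
To see the behaviour described in this answer without a running MongoDB server, here is a minimal sketch that feeds two rows of the question's CSV to `csv.DictReader` through an in-memory `io.StringIO` buffer. The buffer and the helper name `parse_row` are assumptions made for illustration; the real code reads `country.csv`.

```python
import csv
import io

# Two rows copied from the question's CSV; StringIO stands in for the real file.
SAMPLE = (
    "country_id,country_name,zone_id,minLat,maxLat,minLong,maxLong\n"
    "2,Bangladesh,1,20.6708832870000,26.4465255803000,88.0844222351000,92.6727209818000\n"
    '3,"Sri Lanka",1,5.9683698592300,9.8240776636100,79.6951668639000,81.7879590189000\n'
)

def parse_row(elem):
    """Convert one DictReader row (all values are str) into typed values."""
    return {
        'country_id': int(elem['country_id']),
        'country_name': elem['country_name'],
        'zone_id': int(elem['zone_id']),
        'minLat': float(elem['minLat']),
        'maxLat': float(elem['maxLat']),
        'minLong': float(elem['minLong']),
        'maxLong': float(elem['maxLong']),
    }

raw_rows = list(csv.DictReader(io.StringIO(SAMPLE)))
typed_rows = [parse_row(r) for r in raw_rows]

print(type(raw_rows[0]['minLat']).__name__)    # str
print(type(typed_rows[0]['minLat']).__name__)  # float
```

The conversion happens entirely on the Python side, so whatever types the dicts hold at insert time are the types the driver sends to MongoDB.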


Answer #2 (0ve6wy6x)

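Python reads all CSV data as strings. This has nothing to do with MongoDB or with *pymongo inserting data in string format*.
If you want type inference and automatic conversion from CSV data, use something like Pandas read_csv.
Otherwise, do the type conversion in a similar loop driven by header: create a mapping from field name to field type, then convert each field to that type: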

header = {
    'country_id': int, 'country_name': str, 'zone_id': int,
    'minLat': float, 'maxLat': float, 'minLong': float, 'maxLong': float
}

for row in reader:
    doc = {}
    for field, ftype in header.items():
        doc[field] = ftype(row[field])
    
    # the `doc=` + loop lines can also be written in one line as:
    # doc = {field: ftype(row[field]) for field, ftype in header.items()}

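    # insert the converted doc, not the raw all-string row;
    # insert_one replaces the deprecated Collection.insert
    collection.insert_one(doc)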


(Without the type conversion, the loop that reads each row and assigns every key into a new dict is redundant; it could simply be collection.insert(each).)
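
For the type-inference route mentioned earlier in this answer, a rough sketch with `pandas.read_csv` might look like the following. It assumes pandas is installed and uses an in-memory `io.StringIO` sample in place of `country.csv`; the variable names are illustrative only.

```python
import io

import pandas as pd

# Two rows copied from the question's CSV; StringIO stands in for the real file.
SAMPLE = (
    "country_id,country_name,zone_id,minLat,maxLat,minLong,maxLong\n"
    "2,Bangladesh,1,20.6708832870000,26.4465255803000,88.0844222351000,92.6727209818000\n"
    '3,"Sri Lanka",1,5.9683698592300,9.8240776636100,79.6951668639000,81.7879590189000\n'
)

# read_csv infers a dtype per column instead of leaving everything as text.
df = pd.read_csv(io.StringIO(SAMPLE))
print(df.dtypes)

# One plain dict per row, a candidate argument for insert_many.
records = df.to_dict(orient="records")
```

Depending on the pandas version, the values in `records` may be NumPy scalars rather than built-in `int` and `float`, so it is worth checking their types before handing them to pymongo.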

By the way, if your file is not very large, read and convert all the rows first, then use insert_many.
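
The `insert_many` suggestion can be sketched as below, assuming `collection` is a PyMongo collection such as `mydb['country']`. The function name `load_csv` and the `FIELD_TYPES` map are hypothetical names introduced for this example; `insert_many` is the only PyMongo call it relies on.

```python
import csv

# Field name -> converter, mirroring the header mapping in this answer.
FIELD_TYPES = {
    'country_id': int, 'country_name': str, 'zone_id': int,
    'minLat': float, 'maxLat': float, 'minLong': float, 'maxLong': float,
}

def load_csv(lines, collection):
    """Convert every CSV row to typed values, then insert them in one call.

    `lines` is any iterable of CSV lines (an open file works);
    `collection` is assumed to be a PyMongo collection.
    """
    reader = csv.DictReader(lines)
    docs = [
        {field: ftype(row[field]) for field, ftype in FIELD_TYPES.items()}
        for row in reader
    ]
    if docs:  # insert_many rejects an empty list, so skip the call
        collection.insert_many(docs)
    return docs
```

It could be called with an open file, for example `load_csv(csvFile, mydb['country'])`. Building the whole list first keeps the round trips to the server low, at the cost of holding every row in memory, which is why it suits files that are not very large.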
