Python — speed up code that makes many API calls and stores all the data in one DataFrame

ig9co6j1 · posted 2021-08-25 in Java

I wrote some code that takes an identifier number, makes a request to a particular API, and returns the data associated with that identifier. The code iterates over a DataFrame, takes each identifier number (about 600 of them), fetches the related information and converts it to a DataFrame. Finally, all the DataFrames are concatenated into a single one. The code runs very slowly. Is there a way to make it faster? I'm not confident with Python, so I'd appreciate it if you could share solution code. The code:

import json

import pandas as pd
import requests

file = dataframe  # DataFrame holding the identifier numbers in column 'id'

list_df = []

for index, row in file.iterrows():

    url = "https://some_api/?eni_number="+str(row['id'])+"&apikey=some_apikey"

    headers = {
      'Accept': 'application/json'
    }

    # one synchronous request per identifier
    response = requests.get(url, headers=headers)

    a = json.loads(response.text)

    # flatten the 'vessels' records of the JSON response into a DataFrame
    df_index = pd.json_normalize(a, 'vessels')
    df_index['eni_number'] = row['id']

    list_df.append(df_index)

total = pd.concat(list_df)
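As a side note on the final concatenation step: each per-identifier DataFrame carries its own 0-based index, so the combined frame ends up with duplicate index labels unless you pass `ignore_index=True`. A minimal illustration with two hypothetical per-id frames shaped like the ones the loop produces:

```python
import pandas as pd

# Two stand-in frames like those appended to list_df in the loop above
df_a = pd.DataFrame({"name": ["Vessel A"], "eni_number": [101]})
df_b = pd.DataFrame({"name": ["Vessel B"], "eni_number": [102]})

# Without ignore_index=True both rows would keep index label 0;
# ignore_index=True renumbers the combined frame 0..n-1
total = pd.concat([df_a, df_b], ignore_index=True)
print(total.index.tolist())  # → [0, 1]
```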

qyswt5oh1#

The bottleneck here seems to be that the HTTP requests are executed synchronously, one after another, so most of the time is wasted waiting for the server's responses.
You will likely get better results with an asynchronous approach, for example using grequests to execute all the HTTP requests in parallel:

import json

import grequests
import pandas as pd

ids = dataframe["id"].to_list()
urls = [f"https://some_api/?eni_number={id}&apikey=some_apikey" for id in ids]

headers = {'Accept': 'application/json'}
# note: don't name this generator 'requests', or it shadows the requests module
reqs = (grequests.get(url, headers=headers) for url in urls)

responses = grequests.map(reqs)  # execute all requests in parallel

list_df = []
for id, response in zip(ids, responses):
    a = json.loads(response.text)
    df_index = pd.json_normalize(a, 'vessels')
    df_index['eni_number'] = id

    list_df.append(df_index)

total = pd.concat(list_df)
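If installing grequests is not an option, the standard library's `concurrent.futures.ThreadPoolExecutor` gives you the same order-preserving parallel pattern: `executor.map` runs the calls concurrently but yields results in input order, so each result still lines up with its id. A minimal sketch, with a stand-in `fetch` function whose body would in practice be the real `requests.get(...).json()` call against the API above:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(id):
    # Stand-in for the real call; in practice this would be e.g.
    #   requests.get(f"https://some_api/?eni_number={id}&apikey=some_apikey",
    #                headers={'Accept': 'application/json'}).json()
    return {"eni_number": id, "vessels": []}

ids = [101, 102, 103]

# Up to 20 requests in flight at once; results come back in input order,
# so zip(ids, results) pairs each response with the right identifier
with ThreadPoolExecutor(max_workers=20) as executor:
    results = list(executor.map(fetch, ids))

print([r["eni_number"] for r in results])  # → [101, 102, 103]
```

Unlike grequests, this needs no third-party dependency and no monkey-patching of the socket layer, which makes it the safer choice when the rest of the script also uses requests.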
