python-3.x: Make nested loops as fast as possible

Posted by mzsu5hc0 4 months ago in Python


spatial_data = [[1,2,3,4,5,6], ..... , [1,2,3,4,50,60]] #length 10 million
reference = [[9, 39, 22, 28, 25, 5], ...... , [5, 16, 12, 34, 3, 9]] # length 100

for x in reference:
    result = [c for c in spatial_data if len(set(x).intersection(c)) <= 2]

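Each run takes about 5 minutes, and I need to do this 1000 times, so I expect it to take about 3.5 days. Please help me speed this up. Thanks.
I also tried writing this without a list comprehension, and each run still took about 5 minutes.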

python-3.x

Source: https://stackoverflow.com/questions/77787114/make-nested-loops-as-fast-as-possible

1 Answer


eyh26e7m1#

A simple parallel version of the code using multiprocessing.Pool looks like this:

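from functools import partial
from multiprocessing import Pool, cpu_count

spatial_data = [...]  # length 10 million
reference = [...]  # length 100

def f(x, data):
    # Keep the rows of data that share at most two values with x
    return [c for c in data if len(set(x).intersection(c)) <= 2]

# spatial_data is the same at each iteration, so bind it once
my_func = partial(f, data=spatial_data)

# The guard is needed on platforms that spawn worker processes (e.g. Windows)
if __name__ == "__main__":
    with Pool(cpu_count()) as p:
        # One result list per row of reference, in the same order
        result = p.map(my_func, reference)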

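In short, you define a small function that does the inner-loop computation. Then you use Pool.map() to spread the calls over a Pool() with as many worker processes as cpu_count() reports (the number of CPUs your machine has).
That's the gist of it.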
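Before handing the small function to Pool.map(), it can be checked serially on toy data. The sketch below is only an illustration: the toy rows are made up, and it builds set(x) once per call rather than once per row, which is a small change from the code above and not part of the original answer.

```python
def f(x, data):
    """Keep the rows of `data` that share at most two values with `x`."""
    x_set = set(x)  # built once per call, not once per row
    return [c for c in data if len(x_set.intersection(c)) <= 2]

# Hypothetical toy rows standing in for the 10-million-row list
toy_data = [
    [1, 2, 3, 4, 5, 6],
    [9, 39, 22, 7, 8, 10],
    [5, 16, 12, 34, 3, 9],
]
toy_reference = [[9, 39, 22, 28, 25, 5], [5, 16, 12, 34, 3, 9]]

# Serial stand-in for Pool.map: one result list per reference row
results = [f(x, toy_data) for x in toy_reference]
```

Pool.map() returns the same structure, a list with one entry per reference row, so the serial output is a convenient reference when checking the parallel run.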
For a one-off computation, the parallel version is enough. A faster (and better) approach, however, is to use numpy. See this.
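One way the numpy idea might look is sketched below. It is not necessarily the approach the answer links to. It assumes the data fits in a 2-D integer array and that the values within each row are distinct; with repeated values in a row, the count would differ from len(set(x).intersection(c)). The function name rows_with_small_overlap and the sample rows are hypothetical.

```python
import numpy as np

def rows_with_small_overlap(data, x, max_common=2):
    """Return the rows of `data` sharing at most `max_common` values with `x`.

    Assumes the values inside each row are distinct.
    """
    data = np.asarray(data)
    # np.isin flags every element of `data` that also occurs in `x`;
    # summing along each row gives the number of shared values.
    common = np.isin(data, x).sum(axis=1)
    return data[common <= max_common]

# Hypothetical toy rows standing in for the 10-million-row list
spatial_data = np.array([
    [1, 2, 3, 4, 5, 6],
    [9, 39, 22, 7, 8, 10],
    [5, 16, 12, 34, 3, 9],
])
x = [9, 39, 22, 28, 25, 5]

result = rows_with_small_overlap(spatial_data, x)
```

The per-row Python loop is replaced by two array operations, so each reference row costs one pass over the array in compiled code. The function can still be mapped over reference, serially or through a Pool.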

Answered 4 months ago
