从Python中删除heapq需要O(logn)

4smxwvx5  于 5个月前  发布在  Python
关注(0)|答案(5)|浏览(59)

我有一个这样的堆(Python,heapq模块)-

>>> h = []
>>> heappush(h, (5, 'write code'))
>>> heappush(h, (7, 'release product'))
>>> heappush(h, (1, 'write spec'))
>>> heappush(h, (3, 'create tests'))

字符串
我如何在O(logn)中删除项目值为“创建测试”的元组并保留堆属性?
这是我能想到的(不是O(logn))

for i in range(len(h)):
   if h[i][1] == "create tests":
      h[i], h[-1] = h[-1], h[i]
      popped = h.pop()
      heapq.heapify(h)
      break

muk1a3rh

muk1a3rh1#

如果您确实需要从heap中取出一个项目,但又想保留heap,您可以延迟执行,并在项目自然出现时丢弃它,而不是在列表中搜索它。
如果您将要删除的项目存储在黑名单set中,则每次heapq.heappop检查该项目是否在set中。如果存在,则丢弃它,然后再次heappop,直到您获得未列入黑名单的项目,或者heap为空

alen0pnh

alen0pnh2#

如果多个被删除的元素具有相同的值,那么黑名单集将是有问题的。相反,使用tombstone-counting-dict实现heap_remove

def heap_remove(heap, value):
  tombstones[value] = tombstones.get(value, 0) + 1
  while len(heap) and heap[0] in tombstones and tombstones[heap[0]]:
      heappop(heap)

字符串
正如预期的那样,你已经摊销了O(1)的移除时间,并且只要你不在其他地方的堆上popping,堆的top总是准确的。
考虑这个数字列表,它们首先被全部推入堆,然后以相同的顺序“移除”(而不是弹出):
[三、三、二、七、一、四、二]
按预期工作:

After inserting 3 into heap, top = 3
After inserting 3 into heap, top = 3
After inserting 2 into heap, top = 2
After inserting 7 into heap, top = 2
After inserting 1 into heap, top = 1
After inserting 4 into heap, top = 1
After inserting 2 into heap, top = 1


但是移除是通过增加对象的tombstone来完成的。如果堆的顶部设置了tombstone,那么移除对象并递减tombstone计数器。

Called remove( 3 )
  Marking 3 for deletion
Called remove( 3 )
  Marking 3 for deletion
Called remove( 2 )
  Marking 2 for deletion
Called remove( 7 )
  Marking 7 for deletion
Called remove( 1 )
  Marking 1 for deletion
  Deleting top 1
    Updated heap is: [2, 2, 3, 7, 3, 4]
  Deleting top 1
    Updated heap is: [2, 3, 3, 7, 4]
Called remove( 4 )
  Marking 4 for deletion
Called remove( 2 )
  Marking 2 for deletion
  Deleting top 2
    Updated heap is: [3, 3, 4, 7]
  Deleting top 3
    Updated heap is: [3, 7, 4]
  Deleting top 3
    Updated heap is: [4, 7]
  Deleting top 4
    Updated heap is: [7]
  Deleting top 7
    Updated heap is: []


请注意,当第二个heap_remove(3)被调用时,@GP89的解决方案会崩溃,因为3已经在tombstone集中。
您可以探索此示例here

j9per5c4

j9per5c43#

有了以上两个想法,这里是一个完整的演示:我会尽快使它干净简洁。

from heapq import heappush, heappop

class Solution:

  def demo():

    deleted = {}
    h = [0]

    heappush(h, 789)
    heappush(h, 101)
    heappush(h, 101)

    self.remove(h, 101, deleted)

    max_val = self.peek(h, deleted)

  def remove(self, h, y, deleted):
    deleted[y] = deleted.get(y, 0) + 1
    while len(h) > 0 and h[0] == y and deleted[y] > 0:
        heappop(h)
        deleted[y] -= 1

  def peek(self, h, deleted):
    while len(h) > 0 and deleted.get(h[0],0) > 0:
        deleted[h[0]] -= 1
        heappop(h)
    return h[0]

字符串

ycl3bljg

ycl3bljg4#

在这种方法中,我基本上是在字典中跟踪元素。所以,无论何时,我删除它,搜索过程都变成了O(1)。

import heapq
import collections
import itertools

class RemoveHeap:
    def __init__(self):
        self.h = []
        self.track = collections.defaultdict(collections.deque)
        self.counter = itertools.count()

    def insert_item(self, val):
        count = next(self.counter)
        item = [val, count, 'active']
        self.track[val].append(item)
        heapq.heappush(self.h, item)

    def delete_item(self, val):
        if val in self.track:
            items = self.track[val]
            for item in items:
                if item[2] == 'active':
                    item[2] = 'deleted'
                    break

    def pop_item(self):
        while len(self.h) > 0:
            item = heapq.heappop(self.h)
            item_track = self.track[item[0]]
            item_track.popleft()
            if len(item_track) == 0:
                del self.track[item[0]]
            else:
                self.track[item[0]] = item_track
            if item[2] == 'active':
                return item[0]

    def peek_item(self):
        item = self.h[0]
        if item[2] == 'deleted':
            x = self.pop_item()
            self.insert_item(x)
            return x
        return item[0]

字符串

dwbf0jvd

dwbf0jvd5#

恐怕只有heapq没有这样的方法。因为从堆中搜索元素需要O(n)
但是,您可以将它与dict之类的东西一起使用,这将为O(1)提供搜索条目的时间。

已删除:

我尝试使用字典记账,但我如何才能得到“创建测试”的索引时,它被插入?- Prakhar 3小时前
一种简单的方法是:

# remember to update this hdict when updating the heap.
hdict = { h[i][1]: i for i in range(len(h)) }

字符串
然后你可以通过访问这个hdict而不是你的O(n)线性搜索来获得给定字符串的索引。

相关问题