oraclesql:过滤“假”(重复)行,这些行只相差很小的时间

dzjeubhm  于 2021-07-24  发布在  Java
关注(0)|答案(2)|浏览(203)

几天前我也发布了一个类似的问题,但不幸的是,我误解了客户关于我必须解决的问题的陈述(最确切地说,我忘记了其中的一部分)。这就是为什么这篇文章看起来像是重复的,但事实并非如此。
这是我最初的一篇文章:oraclesql:过滤只相差很小时间的重复行
我们再来一次:
我有一个oracle表,其中包含参数触发的事件警报。警报的状态为打开/关闭,当警报在很短的时间内打开(pkn\u eventstatus='open')和关闭(pkn\u eventstatus='close')时,这两个事件行(打开/关闭)被视为“假”报警(参数“错误”触发的“故障报警”),因此两者都应删除。
我需要创建一个oraclesql查询来选择所有的“假”报警,这样我就可以删除它们了。同样,在“打开”和“关闭”状态之间时间(receiveddate)有微小差异的报警。
我开始创建一个可以工作的查询,但是它非常慢,所以我甚至不能测试它,因为它需要很长时间。我很确定它可以被优化,但是现在找不到如何优化,所以我希望有人能帮助我。
我目前的查询速度非常慢:

select t1.ID, t1.PKN_EVENTNAME, t1.PKN_EVENTSTATUS, t1.RECEIVEDDATE
from PARQUIMETERS_ALARMS t1
where
  exists
     (select 'x'
      from   PARQUIMETERS_ALARMS t2
      where  t1.PKN_EVENTNAME = t2.PKN_EVENTNAME
             and ((t1.PKN_EVENTSTATUS = 'Open' and t2.PKN_EVENTSTATUS = 'Close'
             and abs(t1.RECEIVEDDATE - t2.RECEIVEDDATE) * 24 * 60 * 60 < 30) -- < 30 sec
             or (t1.PKN_EVENTSTATUS = 'Close' and t2.PKN_EVENTSTATUS = 'Open'
             and abs(t2.RECEIVEDDATE - t1.RECEIVEDDATE) * 24 * 60 * 60 < 30))) -- < 30 sec
xcitsw88

xcitsw881#

使用两个可能更快 exists 条件而不是一个:

select t1.id, t1.pkn_eventname, t1.pkn_eventstatus, t1.receiveddate
from parquimeters_alarms t1
where
    (
        t1.pkn_eventstatus = 'Open'
        and exists (
            select 1
            from parquimeters_alarms t2
            where 
                t2.pkn_eventname = t1.pkn_eventname 
                and t2.pkn_eventstatus = 'Close'
                and t2.receiveddate < t1.receiveddate + 30 / 60 / 60 / 24
        )
    )
    or (
        t1.pkn_eventstatus = 'Close'
        and exists (
            select 1
            from parquimeters_alarms t2
            where 
                t2.pkn_eventname = t1.pkn_eventname 
                and t2.pkn_eventstatus = 'Open'
                and t1.receiveddate < t2.receiveddate + 30 / 60 / 60 / 24
        )
    )

此查询可能利用上的索引 (pkn_eventname, pkn_eventstatus, receiveddate) .
你也可以考虑 union all ,这样就不需要 or 条件:

select t1.id, t1.pkn_eventname, t1.pkn_eventstatus, t1.receiveddate
from parquimeters_alarms t1
where
    t1.pkn_eventstatus = 'Open'
    and exists (
        select 1
        from parquimeters_alarms t2
        where 
            t2.pkn_eventname = t1.pkn_eventname 
            and t2.pkn_eventstatus = 'Close'
            and t2.receiveddate < t1.receiveddate + 30 / 60 / 60 / 24
    )
union all
select t1.id, t1.pkn_eventname, t1.pkn_eventstatus, t1.receiveddate
from parquimeters_alarms t1
where 
    t1.pkn_eventstatus = 'Close'
    and exists (
        select 1
        from parquimeters_alarms t2
        where 
            t2.pkn_eventname = t1.pkn_eventname 
            and t2.pkn_eventstatus = 'Open'
            and t1.receiveddate < t2.receiveddate + 30 / 60 / 60 / 24
    )
nszi6y05

nszi6y052#

您可以使用pivot()简化操作—只需扫描一次,而无需对同一个表进行额外请求:

select 
  level, 
  'EVENT'||trunc(dbms_random.value(1,5)), 
  case when dbms_random.value>0.5 then 'CLOSE' else 'OPEN' end case,
  date'2020-01-01' + numtodsinterval( trunc(level*dbms_random.value(1,60)) ,'second')
from dual
connect by level<=50
)
,v_pivot as (
   select 
       ID
     , PKN_EVENTNAME
     , OPEN
     , CLOSE
     , lag(open)over(partition by PKN_EVENTNAME order by coalesce(open,close)) last_open
   from (
      select *
      from PARQUIMETERS_ALARMS v
      pivot (
         max(RECEIVEDDATE)
         for (PKN_EVENTSTATUS) in ('OPEN' as open,'CLOSE' as close)
      )
   )
)
select
 *
from v_pivot
where close is null or
abs(close - last_open) * 24 * 60 * 60 > 30;

关于随机生成的数据的完整示例(因为它是生成的,所以它包括未关闭和未打开的事件):

with PARQUIMETERS_ALARMS(ID, PKN_EVENTNAME, PKN_EVENTSTATUS, RECEIVEDDATE) as (
select 
  level, 
  'EVENT'||trunc(dbms_random.value(1,5)), 
  case when dbms_random.value>0.5 then 'CLOSE' else 'OPEN' end case,
  date'2020-01-01' + numtodsinterval( trunc(level*dbms_random.value(1,60)) ,'second')
from dual
connect by level<=50
)
,v_pivot as (
   select 
       ID
     , PKN_EVENTNAME
     , OPEN
     , CLOSE
     , lag(open)over(partition by PKN_EVENTNAME order by coalesce(open,close)) last_open
   from (
      select *
      from PARQUIMETERS_ALARMS v
      pivot (
         max(RECEIVEDDATE)
         for (PKN_EVENTSTATUS) in ('OPEN' as open,'CLOSE' as close)
      )
   )
)
select
 *
from v_pivot
where close is null or
abs(close - last_open) * 24 * 60 * 60 > 30
/

相关问题