@SmartSi 2023-02-28 00:03 采纳率: 0%
浏览 22

Hive Range Between 结果错误问题

Hive 版本:2.3.4


WITH behavior AS (
  SELECT 'a' AS uid, '20230211' AS dt, 1 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230212' AS dt, 2 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230213' AS dt, 3 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230214' AS dt, 4 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230215' AS dt, 5 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230216' AS dt, 6 AS score
)
SELECT
  uid, dt, score,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS s1,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 3 PRECEDING AND 1 PRECEDING) AS s2,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS s3,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS s4
FROM behavior;

如上代码输出结果如下所示:

img

但是个人理解如下:

我们以 dt = '20230214' 这一行数据为例,ORDER BY 指定值比较字段为 score,即当前值为 4:

  • UNBOUNDED PRECEDING AND 1 PRECEDING 表示的范围为 (-∞, 3] s1 结果应该为 [1, 2, 3],但结果为 [1,2] 不符合预期
  • 3 PRECEDING AND 1 PRECEDING 表示的范围为 [1, 3],s2 结果应该为 [1, 2, 3],但结果为 [1,2] 不符合预期
  • 1 FOLLOWING AND 2 FOLLOWING 表示的范围为 [5, 6],s3 结果应该为 [5 ,6],但结果为 [6] 不符合预期
  • 1 FOLLOWING AND UNBOUNDED FOLLOWING 表示的范围为 [5, +∞),s4 结果应该为 [5 ,6],但结果为 [6] 不符合预期

帮忙解释一下那个地方理解错了?

  • 写回答

1条回答 默认 最新

  • simple_chenc 2023-02-28 09:19
    关注

    PRECEDING表示的是当前行的前一行,不包括前一行所以结果就是[1,2]

    评论

报告相同问题?

问题事件

  • 创建了问题 2月28日

悬赏问题

  • ¥20 有关区间dp的问题求解
  • ¥15 多电路系统共用电源的串扰问题
  • ¥15 slam rangenet++配置
  • ¥15 有没有研究水声通信方面的帮我改俩matlab代码
  • ¥15 对于相关问题的求解与代码
  • ¥15 ubuntu子系统密码忘记
  • ¥15 信号傅里叶变换在matlab上遇到的小问题请求帮助
  • ¥15 保护模式-系统加载-段寄存器
  • ¥15 电脑桌面设定一个区域禁止鼠标操作
  • ¥15 求NPF226060磁芯的详细资料