@SmartSi 2023-02-28 00:03 采纳率: 0%
浏览 22

Hive Range Between 结果错误问题

Hive 版本:2.3.4


WITH behavior AS (
  SELECT 'a' AS uid, '20230211' AS dt, 1 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230212' AS dt, 2 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230213' AS dt, 3 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230214' AS dt, 4 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230215' AS dt, 5 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230216' AS dt, 6 AS score
)
SELECT
  uid, dt, score,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS s1,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 3 PRECEDING AND 1 PRECEDING) AS s2,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS s3,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS s4
FROM behavior;

如上代码输出结果如下所示:

img

但是个人理解如下:

我们以 dt = '20230214' 这一行数据为例,ORDER BY 指定值比较字段为 score,即当前值为 4:

  • UNBOUNDED PRECEDING AND 1 PRECEDING 表示的范围为 (-∞, 3] s1 结果应该为 [1, 2, 3],但结果为 [1,2] 不符合预期
  • 3 PRECEDING AND 1 PRECEDING 表示的范围为 [1, 3],s2 结果应该为 [1, 2, 3],但结果为 [1,2] 不符合预期
  • 1 FOLLOWING AND 2 FOLLOWING 表示的范围为 [5, 6],s3 结果应该为 [5 ,6],但结果为 [6] 不符合预期
  • 1 FOLLOWING AND UNBOUNDED FOLLOWING 表示的范围为 [5, +∞),s4 结果应该为 [5 ,6],但结果为 [6] 不符合预期

帮忙解释一下那个地方理解错了?

  • 写回答

1条回答 默认 最新

  • simple_chenc 2023-02-28 09:19
    关注

    PRECEDING表示的是当前行的前一行,不包括前一行所以结果就是[1,2]

    评论

报告相同问题?

问题事件

  • 创建了问题 2月28日

悬赏问题

  • ¥15 训练的多模态特征融合模型准确度很低怎么办
  • ¥15 kylin启动报错log4j类冲突
  • ¥15 超声波模块测距控制点灯,灯的闪烁很不稳定,经过调试发现测的距离偏大
  • ¥15 import arcpy出现importing _arcgisscripting 找不到相关程序
  • ¥15 onvif+openssl,vs2022编译openssl64
  • ¥15 iOS 自定义输入法-第三方输入法
  • ¥15 很想要一个很好的答案或提示
  • ¥15 扫描项目中发现AndroidOS.Agent、Android/SmsThief.LI!tr
  • ¥15 怀疑手机被监控,请问怎么解决和防止
  • ¥15 Qt下使用tcp获取数据的详细操作