@SmartSi 2023-02-28 00:03 采纳率: 0%
浏览 24

Hive Range Between 结果错误问题

Hive 版本:2.3.4


WITH behavior AS (
  SELECT 'a' AS uid, '20230211' AS dt, 1 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230212' AS dt, 2 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230213' AS dt, 3 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230214' AS dt, 4 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230215' AS dt, 5 AS score
  UNION ALL
  SELECT 'a' AS uid, '20230216' AS dt, 6 AS score
)
SELECT
  uid, dt, score,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS s1,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 3 PRECEDING AND 1 PRECEDING) AS s2,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS s3,
  COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS s4
FROM behavior;

如上代码输出结果如下所示:

img

但是个人理解如下:

我们以 dt = '20230214' 这一行数据为例,ORDER BY 指定值比较字段为 score,即当前值为 4:

  • UNBOUNDED PRECEDING AND 1 PRECEDING 表示的范围为 (-∞, 3] s1 结果应该为 [1, 2, 3],但结果为 [1,2] 不符合预期
  • 3 PRECEDING AND 1 PRECEDING 表示的范围为 [1, 3],s2 结果应该为 [1, 2, 3],但结果为 [1,2] 不符合预期
  • 1 FOLLOWING AND 2 FOLLOWING 表示的范围为 [5, 6],s3 结果应该为 [5 ,6],但结果为 [6] 不符合预期
  • 1 FOLLOWING AND UNBOUNDED FOLLOWING 表示的范围为 [5, +∞),s4 结果应该为 [5 ,6],但结果为 [6] 不符合预期

帮忙解释一下那个地方理解错了?

  • 写回答

1条回答 默认 最新

  • simple_chenc 2023-02-28 09:19
    关注

    PRECEDING表示的是当前行的前一行,不包括前一行所以结果就是[1,2]

    评论

报告相同问题?

问题事件

  • 创建了问题 2月28日

悬赏问题

  • ¥15 Stata链式中介效应代码修改
  • ¥15 latex投稿显示click download
  • ¥15 请问读取环境变量文件失败是什么原因?
  • ¥15 在若依框架下实现人脸识别
  • ¥15 网络科学导论,网络控制
  • ¥100 安卓tv程序连接SQLSERVER2008问题
  • ¥15 利用Sentinel-2和Landsat8做一个水库的长时序NDVI的对比,为什么Snetinel-2计算的结果最小值特别小,而Lansat8就很平均
  • ¥15 metadata提取的PDF元数据,如何转换为一个Excel
  • ¥15 关于arduino编程toCharArray()函数的使用
  • ¥100 vc++混合CEF采用CLR方式编译报错