Hive 版本:2.3.4
WITH behavior AS (
SELECT 'a' AS uid, '20230211' AS dt, 1 AS score
UNION ALL
SELECT 'a' AS uid, '20230212' AS dt, 2 AS score
UNION ALL
SELECT 'a' AS uid, '20230213' AS dt, 3 AS score
UNION ALL
SELECT 'a' AS uid, '20230214' AS dt, 4 AS score
UNION ALL
SELECT 'a' AS uid, '20230215' AS dt, 5 AS score
UNION ALL
SELECT 'a' AS uid, '20230216' AS dt, 6 AS score
)
SELECT
uid, dt, score,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS s1,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 3 PRECEDING AND 1 PRECEDING) AS s2,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS s3,
COLLECT_SET(score) OVER (ORDER BY score RANGE BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS s4
FROM behavior;
如上代码输出结果如下所示:

但是个人理解如下:
我们以 dt = '20230214' 这一行数据为例,ORDER BY 指定值比较字段为 score,即当前值为 4:
UNBOUNDED PRECEDING AND 1 PRECEDING表示的范围为(-∞, 3]s1 结果应该为[1, 2, 3],但结果为[1,2]不符合预期3 PRECEDING AND 1 PRECEDING表示的范围为[1, 3],s2 结果应该为[1, 2, 3],但结果为[1,2]不符合预期1 FOLLOWING AND 2 FOLLOWING表示的范围为[5, 6],s3 结果应该为[5 ,6],但结果为[6]不符合预期1 FOLLOWING AND UNBOUNDED FOLLOWING表示的范围为[5, +∞),s4 结果应该为[5 ,6],但结果为[6]不符合预期
帮忙解释一下那个地方理解错了?