河源市文章资讯

SQL偏移类窗口函数 LAG、LEAD的用法小结

2026-03-25 23:46:02 浏览次数:0
详细信息

LAG()LEAD() 的用法:

一、基本概念

这两个函数用于访问结果集中当前行之前或之后的行数据,而无需使用自连接。

1. LAG() - 获取前一行的值

LAG(column, offset, default_value) OVER (
    [PARTITION BY partition_expression]
    ORDER BY order_expression
)

2. LEAD() - 获取后一行的值

LEAD(column, offset, default_value) OVER (
    [PARTITION BY partition_expression]
    ORDER BY order_expression
)

二、参数详解

参数 说明
column 要访问的列名
offset 偏移量(默认1,可选)
default_value 当没有前/后行时的默认值(可选)
PARTITION BY 分区子句,在每个分区内独立计算
ORDER BY 排序子句,决定行的前后顺序

三、使用示例

示例数据

CREATE TABLE sales (
    sale_date DATE,
    product VARCHAR(50),
    amount DECIMAL(10,2)
);

INSERT INTO sales VALUES
('2024-01-01', 'A', 100),
('2024-01-02', 'A', 150),
('2024-01-03', 'A', 120),
('2024-01-01', 'B', 200),
('2024-01-02', 'B', 180),
('2024-01-03', 'B', 220);

1. 基本用法

-- 获取前一天的销售额
SELECT 
    sale_date,
    product,
    amount,
    LAG(amount) OVER (PARTITION BY product ORDER BY sale_date) as prev_amount,
    -- 计算日环比增长率
    ROUND((amount - LAG(amount) OVER (PARTITION BY product ORDER BY sale_date)) 
          / LAG(amount) OVER (PARTITION BY product ORDER BY sale_date) * 100, 2) as growth_rate
FROM sales
ORDER BY product, sale_date;

2. 指定偏移量

-- 获取前两天(偏移量为2)的数据
SELECT 
    sale_date,
    product,
    amount,
    LAG(amount, 2) OVER (PARTITION BY product ORDER BY sale_date) as amount_2days_ago,
    LEAD(amount, 2, 0) OVER (PARTITION BY product ORDER BY sale_date) as amount_2days_later
FROM sales;

3. 设置默认值

SELECT 
    sale_date,
    product,
    amount,
    -- 当没有前一天数据时,使用0作为默认值
    LAG(amount, 1, 0) OVER (PARTITION BY product ORDER BY sale_date) as prev_amount,
    -- 当没有后一天数据时,使用NULL作为默认值
    LEAD(amount, 1, NULL) OVER (PARTITION BY product ORDER BY sale_date) as next_amount
FROM sales;

四、实际应用场景

1. 计算环比/同比

-- 计算月度环比
WITH monthly_sales AS (
    SELECT 
        DATE_TRUNC('month', sale_date) as month,
        SUM(amount) as total_amount
    FROM sales
    GROUP BY DATE_TRUNC('month', sale_date)
)
SELECT 
    month,
    total_amount,
    LAG(total_amount) OVER (ORDER BY month) as prev_month_amount,
    ROUND((total_amount - LAG(total_amount) OVER (ORDER BY month)) 
          / LAG(total_amount) OVER (ORDER BY month) * 100, 2) as mom_growth
FROM monthly_sales;

2. 查找连续变化

-- 查找销售额连续下降的产品
WITH sales_trend AS (
    SELECT 
        sale_date,
        product,
        amount,
        LAG(amount) OVER (PARTITION BY product ORDER BY sale_date) as prev_amount
    FROM sales
)
SELECT *
FROM sales_trend
WHERE amount < prev_amount;  -- 当前销售额小于前一天

3. 计算时间间隔

-- 计算用户连续登录的时间间隔
SELECT 
    user_id,
    login_date,
    LAG(login_date) OVER (PARTITION BY user_id ORDER BY login_date) as last_login,
    DATEDIFF(day, 
             LAG(login_date) OVER (PARTITION BY user_id ORDER BY login_date), 
             login_date) as days_between_logins
FROM user_logins;

4. 数据填充(向前/向后填充)

-- 向前填充缺失值
SELECT 
    date,
    COALESCE(value, 
             LAG(value) IGNORE NULLS OVER (ORDER BY date)) as filled_value
FROM data_with_nulls;

五、与相关函数的对比

函数 方向 是否必须排序 典型用途
LAG() 向前看(之前) 访问前一行的值
LEAD() 向后看(之后) 访问后一行的值
FIRST_VALUE() 分区第一行 可选 获取分区第一个值
LAST_VALUE() 分区最后一行 可选 获取分区最后一个值

六、性能优化建议

合理使用索引:确保 PARTITION BYORDER BY 的列有索引 避免多层嵌套:尽量减少窗口函数的嵌套层级 使用CTE或子查询:复杂计算分步骤进行 限制窗口范围:使用 ROWS BETWEEN 子句限制计算范围

七、常见注意事项

ORDER BY 是关键:必须指定 ORDER BY 才能确定行的前后顺序 默认值处理:注意 NULL 值的处理,使用 COALESCE() 或默认值参数 分区边界LAG()LEAD() 不会跨越分区边界 性能影响:大量数据时,窗口函数可能有性能开销

八、复杂示例:计算移动平均

-- 计算3天的移动平均
SELECT 
    sale_date,
    product,
    amount,
    AVG(amount) OVER (
        PARTITION BY product 
        ORDER BY sale_date 
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) as moving_avg_3days,
    -- 使用LAG和LEAD手动计算
    (LAG(amount, 2, 0) OVER (PARTITION BY product ORDER BY sale_date) +
     LAG(amount, 1, 0) OVER (PARTITION BY product ORDER BY sale_date) +
     amount) / 3.0 as manual_moving_avg
FROM sales;

这两个函数是时间序列分析、趋势计算和数据对比的强大工具,掌握它们能大大简化许多复杂的SQL查询。

相关推荐