Redission分布式锁源码篇

小奥2024-01-182024-03-21

Redisson分布式锁源码篇

Redisson 是一个简单的Redis Java客户端，具有内存数据网格功能。

它不仅提供了一系列的分布式的Java常用对象，还提供了许多分布式服务，其中就包含了各种分布式锁的实现。

GitHub地址：https://github.com/redisson/redisson

Redisson分布式锁不仅是相对成熟的分布式锁方案，而且在很多企业中都会去使用的，所以了解一下底层的实现还是很有必要的。

一、使用Redisson分布式锁

Redisson分布式锁的使用非常简单，我们只需要在maven中引入依赖，然后直接调用相关API即可。

1.1 引入依赖

<dependency>
          <groupId>org.redisson</groupId>
          <artifactId>redisson</artifactId>
          <version>3.13.6</version>
      </dependency>

1.2 调用API

// 创建RedissonClient
      RedissonClient redissonClient = Redisson.create();
      // 创建分布式锁
      RLock lock = redissonClient.getLock("lock");
      // 加锁
      lock.lock();
      // 尝试获取锁，无参表示不等待立即返回
      boolean b = lock.tryLock();
      // 解锁
      lock.unlock();

二、源码解析

下面我就带大家去分析一下Redisson的可重入和可重试的源码。

2.1 可重入锁原理

(1) 原理解释

Redisson分布式锁的可重入锁的原理：

使用Redis中Hash的key-value结构，存储线程标识和重入次数；
每次获取锁的时候，如果线程标识相同，就给Hash的value加1；
每次释放锁的时候，并不会删除锁，而是重入次数减1；
当所有逻辑执行完，如果重入次数为0，则删除锁。

逻辑流程图

(2) 源码分析

① 尝试获取锁

// 尝试获取锁，无参表示不等待立即返回
   boolean b = lock.tryLock();

org.redisson.RedissonLock

@Override
   public boolean tryLock() {
       return get(tryLockAsync());
   }

@Override
public RFuture<Boolean> tryLockAsync() {
    return tryLockAsync(Thread.currentThread().getId());
}

@Override
public RFuture<Boolean> tryLockAsync(long threadId) {
    // tryAcquireOnceAsync有三个参数 
    // long waitTime 锁重试的最大等待时间
    // long leaseTime 锁自动释放时间
    // TimeUnit unit 时间单位 
    //  long threadId 线程id
    return tryAcquireOnceAsync(-1, -1, null, threadId);
}

注意，下面的代码我们只需要关注有注释的部分，剩余的部分会在后面讲到。

   // 根据tryAcquireOnceAsync命名可知，这里获取锁只会获取一次，不会进行重试
private RFuture<Boolean> tryAcquireOnceAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId) {
       // 判断leaseTime是否为-1（leaseTime为锁自动释放的时间，如果不传默认为-1）
       if (leaseTime != -1) {
           return tryLockInnerAsync(waitTime, leaseTime, unit, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
       }
       // tryLockInnerAsync是尝试获取锁的方法
       RFuture<Boolean> ttlRemainingFuture = tryLockInnerAsync(waitTime,
                                                   commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(),
                                                   TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
       ttlRemainingFuture.onComplete((ttlRemaining, e) -> {
           if (e != null) {
               return;
           }

           // lock acquired
           if (ttlRemaining) {
               scheduleExpirationRenewal(threadId);
           }
       });
       return ttlRemainingFuture;
   }

tryLockInnerAsync()方法是获取锁的主要实现，我们可以看到，其实都是固定的Lua脚本逻辑。

   // 尝试获取锁的逻辑（重点）
<T> RFuture<T> tryLockInnerAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
       internalLockLeaseTime = unit.toMillis(leaseTime);

       return evalWriteAsync(getName(), LongCodec.INSTANCE, command,
               "if (redis.call('exists', KEYS[1]) == 0) then " +
                       "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                       "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                       "return nil; " +
                       "end; " +
                       "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                       "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                       "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                       "return nil; " +
                       "end; " +
                       "return redis.call('pttl', KEYS[1]);",
               Collections.singletonList(getName()), internalLockLeaseTime, getLockName(threadId));
   }

我们把这段Lua脚本摘出来分析一下逻辑：

-- 判断锁是否存在
if (redis.call('exists', KEYS[1]) == 0) then 
    -- 等于0表示不存在，创建锁，设置过期时间
    redis.call('hincrby', KEYS[1], ARGV[2], 1);
    redis.call('pexpire', KEYS[1], ARGV[1]); 
    return nil; 
end; 
-- 判断锁标识是否相同 
if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then 
    -- 等于1表示是自己的锁，重试次数加1，设置过期时间
    redis.call('hincrby', KEYS[1], ARGV[2], 1); 
    redis.call('pexpire', KEYS[1], ARGV[1]); 
    return nil; 
"end; 
return redis.call('pttl', KEYS[1]);

② 释放锁

   // 释放锁    
lock.unlock();

org.redisson.RedissonLock

@Override
public void unlock() {
    try {
        get(unlockAsync(Thread.currentThread().getId()));
    } catch (RedisException e) {
        if (e.getCause() instanceof IllegalMonitorStateException) {
            throw (IllegalMonitorStateException) e.getCause();
        } else {
            throw e;
        }
    }

同样地，我们只需要跟着有注释的代码往下看。

@Override
public RFuture<Void> unlockAsync(long threadId) {
    RPromise<Void> result = new RedissonPromise<Void>();
    // unlockInnerAsync 是释放锁的方法
    RFuture<Boolean> future = unlockInnerAsync(threadId);
		
    future.onComplete((opStatus, e) -> {
        cancelExpirationRenewal(threadId);

        if (e != null) {
            result.tryFailure(e);
            return;
        }

        if (opStatus == null) {
            IllegalMonitorStateException cause = new IllegalMonitorStateException("attempt to unlock lock, not locked by current thread by node id: "
                    + id + " thread-id: " + threadId);
            result.tryFailure(cause);
            return;
        }

        result.trySuccess(null);
    });

    return result;
}

unlockInnerAsync()方法是释放锁的主要逻辑。我们可以看到，其实也都是固定的Lua脚本逻辑。

protected RFuture<Boolean> unlockInnerAsync(long threadId) {
    return evalWriteAsync(getName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
            "if (redis.call('hexists', KEYS[1], ARGV[3]) == 0) then " +
                    "return nil;" +
                    "end; " +
                    "local counter = redis.call('hincrby', KEYS[1], ARGV[3], -1); " +
                    "if (counter > 0) then " +
                    "redis.call('pexpire', KEYS[1], ARGV[2]); " +
                    "return 0; " +
                    "else " +
                    "redis.call('del', KEYS[1]); " +
                    "redis.call('publish', KEYS[2], ARGV[1]); " +
                    "return 1; " +
                    "end; " +
                    "return nil;",
            Arrays.asList(getName(), getChannelName()), LockPubSub.UNLOCK_MESSAGE, internalLockLeaseTime, getLockName(threadId));
}

我们把这段Lua脚本摘出来分析一下逻辑：

   -- 判断锁是否存在
if (redis.call('hexists', KEYS[1], ARGV[3]) == 0) then
   	-- 等于0表示不存在，直接返回
   	return nil;
   end; 
-- 锁重入次数减1
   local counter = redis.call('hincrby', KEYS[1], ARGV[3], -1); 
   if (counter > 0) then 
   	-- 如果锁重入次数大于0，则设置过期时间 
       redis.call('pexpire', KEYS[1], ARGV[2]);
       return 0; 
   else
   	-- 如果锁重入次数等于0，则删除锁，并发布通知
       redis.call('del', KEYS[1]);
       redis.call('publish', KEYS[2], ARGV[1]);
       return 1; 
   end; 
   return nil;

2.2 锁重试和watch dog原理

(1) 原理解释

Redisson分布式锁的锁重试和watch dog的原理：

获取锁

首先尝试根据线程标识获取锁
判断锁的ttl是否为null（为null表示成功获取到锁，不为null表示获取锁失败）
- 获取锁成功，判断锁自动释放的时间是否未设置(-1)
  - 未设置，开启watch dog，进行自动续约，默认值是30s
  - 设置则使用设置的时间
- 获取锁失败，判断剩余等待时间是否大于0
  - 大于0，订阅并等待释放锁的信号，并进行重试逻辑
  - 小于等于0，则获取锁失败，返回

释放锁

首先尝试根据线程标识释放锁
- 释放锁成功，发送释放锁的消息，并关闭watch dog自动续约
- 释放锁失败，记录异常，返回

(2) 源码分析

① 重试获取锁

// 尝试获取锁，并设定获取锁的最大等待时长为1s
boolean b = lock.tryLock(1, TimeUnit.SECONDS);

org.redisson.RedissonLock

@Override
public boolean tryLock(long waitTime, TimeUnit unit) throws InterruptedException {
    return tryLock(waitTime, -1, unit);
}

    @Override
    public boolean tryLock(long waitTime, long leaseTime, TimeUnit unit) throws InterruptedException {
        long time = unit.toMillis(waitTime); // 将等待时间转化为ms
        long current = System.currentTimeMillis(); // 当前时间 
        long threadId = Thread.currentThread().getId(); // 线程ID
        // tryAcquire是尝试获取锁的方法，ttl表示锁的剩余有效期 
        Long ttl = tryAcquire(waitTime, leaseTime, unit, threadId);
        // lock acquired
        if (ttl == null) {
            // 如果ttl为null，则表示获取锁成功，返回true
            return true; 
        }
        // 否则获取锁失败，需要重试
        
        // 这里减去获取锁消耗的时间，防止获取锁时间过长超过了等待时间
        time -= System.currentTimeMillis() - current;
        if (time <= 0) {
            // 如果等待时间小于0，则返回false，获取锁失败
            acquireFailed(waitTime, unit, threadId);
            return false;
        }
        // 这里等待时间还有剩余，可以进行继续尝试获取锁
        current = System.currentTimeMillis(); // 当前时间
        // 订阅释放锁的信号（如果锁被释放，则这里会收到通知，通知的逻辑在释放锁的lua脚本中执行，publish操作）
        RFuture<RedissonLockEntry> subscribeFuture = subscribe(threadId);
        // await方法为等待，等待时间为time
        if (!subscribeFuture.await(time, TimeUnit.MILLISECONDS)) {
            // 如果等待time时间，还没有收到释放锁通知
            if (!subscribeFuture.cancel(false)) {
                subscribeFuture.onComplete((res, e) -> {
                    if (e == null) {
                        unsubscribe(subscribeFuture, threadId); // 取消释放锁信号的订阅
                    }
                });
            }
            acquireFailed(waitTime, unit, threadId);
            return false; // 获取锁失败，返回false
        }

        // 这里已经获取到释放锁的通知
        
        try {
            // 减去等待消耗的时间
            time -= System.currentTimeMillis() - current; 
            if (time <= 0) {
                // 如果等待时间小于0，则返回false，获取锁失败
                acquireFailed(waitTime, unit, threadId);
                return false;
            }
        
            while (true) {
                long currentTime = System.currentTimeMillis();
                // 重试获取锁的逻辑，与上面类似
                ttl = tryAcquire(waitTime, leaseTime, unit, threadId);
                // lock acquired
                if (ttl == null) {
                    return true;
                }

                time -= System.currentTimeMillis() - currentTime;
                if (time <= 0) {
                    acquireFailed(waitTime, unit, threadId);
                    return false;
                }

                // waiting for message
                currentTime = System.currentTimeMillis();
                // 以信号量机制的方式尝试获取
                // ttl为锁剩余时间，time为等待时间
                if (ttl >= 0 && ttl < time) {
                    // 如果ttl < time，只会等待ttl时间 
                    subscribeFuture.getNow().getLatch().tryAcquire(ttl, TimeUnit.MILLISECONDS);
                } else {
                    // 否则等待time
                    subscribeFuture.getNow().getLatch().tryAcquire(time, TimeUnit.MILLISECONDS);
                }

                time -= System.currentTimeMillis() - currentTime;
                if (time <= 0) {
                    acquireFailed(waitTime, unit, threadId);
                    return false;
                }
                // 时间充足，继续重试
            }
        } finally {
            unsubscribe(subscribeFuture, threadId); // 取消订阅
        }
//        return get(tryLockAsync(waitTime, leaseTime, unit));
    }

private Long tryAcquire(long waitTime, long leaseTime, TimeUnit unit, long threadId) {
    // 这里的get其实就是阻塞等待返回锁的剩余有效期（如果存在，不存在返回null）
    return get(tryAcquireAsync(waitTime, leaseTime, unit, threadId));
}

private <T> RFuture<Long> tryAcquireAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId) {
    if (leaseTime != -1) {
        return tryLockInnerAsync(waitTime, leaseTime, unit, threadId, RedisCommands.EVAL_LONG);
    }
    // RFuture，表示这里使用异步的方式去执行获取锁的逻辑
    RFuture<Long> ttlRemainingFuture = tryLockInnerAsync(waitTime,
                             // 如果不设置默认时间，则在这里给定默认值，默认值为30s
                             // private long lockWatchdogTimeout = 30 * 1000;
				     commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(),
                                            TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_LONG);
    // 当RFuture完成以后
    ttlRemainingFuture.onComplete((ttlRemaining, e) -> {
        // ttlRemaining表示剩余有效期，e表示异常
        if (e != null) {
            return;
        }

        // ttlRemaining == null，表示获取锁成功 
        if (ttlRemaining == null) {
            // 解决有效期问题，即过期时间续约
            scheduleExpirationRenewal(threadId);
        }
    });
    return ttlRemainingFuture;
}

<T> RFuture<T> tryLockInnerAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
    // 将leaseTime转化为ms赋值给内部的成员变量
    internalLockLeaseTime = unit.toMillis(leaseTime);

    return evalWriteAsync(getName(), LongCodec.INSTANCE, command,
            "if (redis.call('exists', KEYS[1]) == 0) then " +
                    "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                    "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                    "return nil; " +
                    "end; " +
                    "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                    "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                    "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                    "return nil; " +
                    "end; " +
                    // 这里返回的是 锁的剩余有效期，单位为ms
                    "return redis.call('pttl', KEYS[1]);",
            Collections.singletonList(getName()), internalLockLeaseTime, getLockName(threadId));
}

② watch dog自动续约

定时更新有效期的方法是在获取锁成功之后调用的，方法名为scheduleExpirationRenewal。


private static final ConcurrentMap<String, ExpirationEntry> EXPIRATION_RENEWAL_MAP = new ConcurrentHashMap<>();

// name是锁的名称
public RedissonLock(CommandAsyncExecutor commandExecutor, String name) {
       ...
       this.id = commandExecutor.getConnectionManager().getId(); // 当前连接id
	...
       this.entryName = id + ":" + name;
	...
   }

protected String getEntryName() {
       return entryName;
   }

// 解决有效期问题，即过期时间续约
private void scheduleExpirationRenewal(long threadId) {
       ExpirationEntry entry = new ExpirationEntry();
       // EXPIRATION_RENEWAL_MAP是一个ConcurrentMap，key可以理解为就是锁的名称，entry就是锁的实例
       ExpirationEntry oldEntry = EXPIRATION_RENEWAL_MAP.putIfAbsent(getEntryName(), entry);
       if (oldEntry != null) {
           // 同一个线程重入的情况
           oldEntry.addThreadId(threadId);
       } else {
           // 线程第一次来
           entry.addThreadId(threadId);
           // 更新过期时间
           renewExpiration();
       }
   }

   // 更新过期时间
private void renewExpiration() {
       ExpirationEntry ee = EXPIRATION_RENEWAL_MAP.get(getEntryName());
       if (ee == null) {
           return;
       }
	// 定时任务，有三个参数
       // TimerTask 定时任务
       // long delay 延迟时间
       // TimeUnit unit 时间单位 
       Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
           @Override
           public void run(Timeout timeout) throws Exception {
               // 获取ExpirationEntry
               ExpirationEntry ent = EXPIRATION_RENEWAL_MAP.get(getEntryName());
               if (ent == null) {
                   return;
               }
               // 取出线程ID
               Long threadId = ent.getFirstThreadId();
               if (threadId == null) {
                   return;
               }
			// 刷新有效期
               RFuture<Boolean> future = renewExpirationAsync(threadId);
               future.onComplete((res, e) -> {
                   if (e != null) {
                       log.error("Can't update lock " + getName() + " expiration", e);
                       return;
                   }

                   if (res) {
                       // 重新执行自己的逻辑，这样的话，如果锁存在，就会自动续约，10s执行一次
                       renewExpiration();
                   }
               });
           }
           // internalLockLeaseTime是内部锁释放时间（尝试获取锁的逻辑中赋值，是一个成员变量，即wtach dog默认时间30s）
           // 这里表示该定时任务10s后执行
       }, internalLockLeaseTime / 3, TimeUnit.MILLISECONDS);

       ee.setTimeout(task);
   }

   // 刷新有效期
protected RFuture<Boolean> renewExpirationAsync(long threadId) {
       return evalWriteAsync(getName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
               "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                       // 如果锁是当前线程的，则更新有效时间
                       "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                       "return 1; " +
                       "end; " +
                       "return 0;",
               Collections.singletonList(getName()),
               internalLockLeaseTime, getLockName(threadId));
   }

那么什么时候停止定时更新呢？

当然是在释放锁的过程中，我们回过来看释放锁的剩余代码：

@Override
public RFuture<Void> unlockAsync(long threadId) {
    RPromise<Void> result = new RedissonPromise<Void>();
    // unlockInnerAsync 是释放锁的方法
    RFuture<Boolean> future = unlockInnerAsync(threadId);
		
    // 释放锁完成后 
    future.onComplete((opStatus, e) -> {
        // 取消到期更新任务
        cancelExpirationRenewal(threadId);

        if (e != null) {
            result.tryFailure(e);
            return;
        }

        if (opStatus == null) {
            IllegalMonitorStateException cause = new IllegalMonitorStateException("attempt to unlock lock, not locked by current thread by node id: "
                    + id + " thread-id: " + threadId);
            result.tryFailure(cause);
            return;
        }

        result.trySuccess(null);
    });

    return result;
}

void cancelExpirationRenewal(Long threadId) {
    // 从map中获取到当前锁的定时任务
    ExpirationEntry task = EXPIRATION_RENEWAL_MAP.get(getEntryName());
    if (task == null) {
        return;
    }
    
    if (threadId != null) {
        // 从任务中移除线程id
        task.removeThreadId(threadId);
    }

    if (threadId == null || task.hasNoThreads()) {
        Timeout timeout = task.getTimeout();
        if (timeout != null) {
            // 将定时任务取消
            timeout.cancel();
        }
        // 最后将entry移除
        EXPIRATION_RENEWAL_MAP.remove(getEntryName());
    }
}

这样，在锁被释放锁，所有的定时任务、订阅都被取消了。