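Reposted

The Secrets of MyBatis (Part 6): Caching

Caching

As is well known, MyBatis ships with a two-level cache. The first-level cache is on by default, while the second-level cache has to be enabled explicitly.

Next, let's explore how MyBatis caching works.

First, the official documentation lists the cache-related settings: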

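Global setting: cacheEnabled : globally enables or disables every cache configured in the mappers. Default: true.
Global setting: localCacheScope : the scope of the first-level (local) cache, either SESSION or STATEMENT. Default: SESSION. With STATEMENT, the local cache is used only for the duration of a single statement execution.
Mapper setting: useCache : whether the statement's results are stored in the second-level cache. Default: true for select statements.
Mapper setting: flushCache : whether the local cache and the second-level cache are both flushed whenever the statement is called. Default: true for non-select statements (insert, update, delete) and false for select.
Mapper setting: <cache> : enables the second-level cache for the namespace. For the built-in implementation you can also configure the flush interval, the cache size, the eviction policy, and whether the cache is read-only.
Mapper setting: <cache-ref> : shares a cache between namespaces by referencing the cache of the specified namespace.

Those are all of the cache-related settings in MyBatis.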

First-Level Cache

Let's start with the first-level cache. Its code lives mainly in BaseExecutor:

MyBatis implements the first-level cache with a HashMap.

@Override
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
    ErrorContext.instance().resource(ms.getResource()).activity("executing a query").object(ms.getId());
    if (closed) {
        throw new ExecutorException("Executor was closed.");
    }
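    // Flush the local cache if this statement requires it.
    // queryStack tracks the depth of nested queries (for example nested selects issued
    // while mapping results); it is not a multi-threading guard. The cache is cleared
    // only for the outermost query so that a nested query in progress keeps its entries.
    // A SqlSession and its Executor are not thread-safe, so a plain HashMap is enough.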
    if (queryStack == 0 && ms.isFlushCacheRequired()) {
        clearLocalCache();
    }
    List<E> list;
    try {
        queryStack++;
        // As mentioned earlier, the cache is bypassed when a custom ResultHandler is supplied
        list = resultHandler == null ? (List<E>) localCache.getObject(key) : null;
        // Cache hit
        if (list != null) {
            // Handle the output parameters of stored procedures
            handleLocallyCachedOutputParameters(ms, key, parameter, boundSql);
        }
        // Cache miss: query the database
        else {
            list = queryFromDatabase(ms, parameter, rowBounds, resultHandler, key, boundSql);
        }
    } finally {
        queryStack--;
    }
    // Once the outermost query has finished
    if (queryStack == 0) {
        for (DeferredLoad deferredLoad : deferredLoads) {
            deferredLoad.load();
        }
        // issue #601
        deferredLoads.clear();
        // If the local cache scope is STATEMENT, clear the cache
        if (configuration.getLocalCacheScope() == LocalCacheScope.STATEMENT) {
            // issue #482
            clearLocalCache();
        }
    }
    return list;
}

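From the code above, the flow is roughly: look in the cache first, and go to the database only on a miss.

Also, if the cache scope is STATEMENT, the local cache is cleared after every execution. So where does a STATEMENT-scoped cache actually live?

Keep in mind that an Executor belongs to a SqlSession and is shared by the whole session, whereas a Statement is created for each method call. With STATEMENT scope the same local cache is still used, but it is cleared as soon as the outermost query completes, so entries survive only for nested queries within one statement execution. Across calls, a STATEMENT-scoped first-level cache is therefore effectively the same as a disabled one.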
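To make the difference between the two scopes concrete, here is a minimal sketch. It is not MyBatis code: LocalCacheSketch, its Scope enum, and the databaseHits counter are hypothetical names invented for illustration, and the database is stood in for by a Supplier. It only mirrors the queryStack and clear-on-STATEMENT logic shown above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the local cache handling in BaseExecutor; not MyBatis code.
class LocalCacheSketch {
    enum Scope { SESSION, STATEMENT }

    private final Map<String, Object> localCache = new HashMap<>();
    private final Scope scope;
    private int queryStack = 0;   // depth of nested queries
    int databaseHits = 0;         // how many times the "database" was asked

    LocalCacheSketch(Scope scope) {
        this.scope = scope;
    }

    Object query(String key, Supplier<Object> database) {
        Object value;
        try {
            queryStack++;
            value = localCache.get(key);
            if (value == null) {
                // Cache miss: ask the database and remember the answer.
                databaseHits++;
                value = database.get();
                localCache.put(key, value);
            }
        } finally {
            queryStack--;
        }
        // Only the outermost query clears a STATEMENT-scoped cache.
        if (queryStack == 0 && scope == Scope.STATEMENT) {
            localCache.clear();
        }
        return value;
    }
}

class LocalCacheSketchDemo {
    public static void main(String[] args) {
        LocalCacheSketch session = new LocalCacheSketch(LocalCacheSketch.Scope.SESSION);
        session.query("selectUser:1", () -> "alice");
        session.query("selectUser:1", () -> "alice");
        System.out.println("SESSION database hits: " + session.databaseHits);

        LocalCacheSketch statement = new LocalCacheSketch(LocalCacheSketch.Scope.STATEMENT);
        statement.query("selectUser:1", () -> "alice");
        statement.query("selectUser:1", () -> "alice");
        System.out.println("STATEMENT database hits: " + statement.databaseHits);
    }
}
```

With SESSION scope the second identical query is served from the map; with STATEMENT scope the map is emptied after each top-level call, so the second query reaches the database again.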

Second-Level Cache

First, let's see where cacheEnabled takes effect:

public Executor newExecutor(Transaction transaction, ExecutorType executorType) {
    executorType = executorType == null ? defaultExecutorType : executorType;
    executorType = executorType == null ? ExecutorType.SIMPLE : executorType;
    Executor executor;
    if (ExecutorType.BATCH == executorType) {
        executor = new BatchExecutor(this, transaction);
    } else if (ExecutorType.REUSE == executorType) {
        executor = new ReuseExecutor(this, transaction);
    } else {
        executor = new SimpleExecutor(this, transaction);
    }
    // If cacheEnabled is set to true
    if (cacheEnabled) {
        // wrap the executor created above in a CachingExecutor
        executor = new CachingExecutor(executor);
    }
    executor = (Executor) interceptorChain.pluginAll(executor);
    return executor;
}

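As you can see, this is the fourth Executor mentioned earlier: the wrapper class.

The sole job of CachingExecutor is to handle caching before or after the delegate Executor does its work.

Now for the concrete implementation of CachingExecutor.

CachingExecutor has two fields: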

// The wrapped executor
private final Executor delegate;
// Transactional cache manager
private final TransactionalCacheManager tcm = new TransactionalCacheManager();

Because the second-level cache can be shared across sessions, it has to take transaction commit and rollback into account.

@Override
  public <E> List<E> query(MappedStatement ms, Object parameterObject, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql)
      throws SQLException {  
    Cache cache = ms.getCache();
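    // If a second-level cache is configured for this statement's namespace
    if (cache != null) {
      // First flush the cache if the statement requires it
      flushCacheIfRequired(ms);
      // Only when useCache is set and no custom ResultHandler is supplied
      if (ms.isUseCache() && resultHandler == null) {
        // Reject stored procedures with OUT parameters; caching them is not supported
        ensureNoOutParams(ms, boundSql);
        // Look the key up in the cache
        @SuppressWarnings("unchecked")
        List<E> list = (List<E>) tcm.getObject(cache, key);
        // On a miss, let the delegate Executor run the query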
        if (list == null) {
          list = delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
          tcm.putObject(cache, key, list); // issue #578 and #116
        }
        return list;
      }
    }
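    // No usable second-level cache (none configured, useCache is false, or a custom ResultHandler is used): query directly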
    return delegate.query(ms, parameterObject, rowBounds, resultHandler, key, boundSql);
  }

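As you can see, the basic logic is to check the cache and, on a miss, let the delegate query the database.

This is much the same as the first-level cache. The one difference is the caching strategy: because the second-level cache is affected by transaction rollback, commits and rollbacks need dedicated handling.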
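Since CachingExecutor wraps an executor that has its own local cache, a query checks the second-level cache first, then the first-level cache, then the database. The sketch below illustrates that order only. TwoLevelLookupSketch and its fields are hypothetical names, and as a deliberate simplification it writes to the shared map immediately, whereas MyBatis stages the write until commit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical simplification of a CachingExecutor wrapping a BaseExecutor-like delegate.
class TwoLevelLookupSketch {
    private final Map<String, String> secondLevel;                  // shared across sessions
    private final Map<String, String> firstLevel = new HashMap<>(); // private to this session
    private final Function<String, String> database;
    int databaseHits = 0;

    TwoLevelLookupSketch(Map<String, String> secondLevel, Function<String, String> database) {
        this.secondLevel = secondLevel;
        this.database = database;
    }

    String query(String key) {
        String value = secondLevel.get(key);   // 1. second-level cache
        if (value == null) {
            value = queryDelegate(key);        // 2. first-level cache, then database
            // Simplification: real MyBatis defers this write until the transaction commits.
            secondLevel.put(key, value);
        }
        return value;
    }

    private String queryDelegate(String key) {
        String value = firstLevel.get(key);
        if (value == null) {
            databaseHits++;
            value = database.apply(key);
            firstLevel.put(key, value);
        }
        return value;
    }
}

class TwoLevelLookupSketchDemo {
    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        TwoLevelLookupSketch sessionA = new TwoLevelLookupSketch(shared, key -> "row for " + key);
        TwoLevelLookupSketch sessionB = new TwoLevelLookupSketch(shared, key -> "row for " + key);
        sessionA.query("selectUser:1");
        sessionB.query("selectUser:1");
        System.out.println("A hits: " + sessionA.databaseHits + ", B hits: " + sessionB.databaseHits);
    }
}
```

The second session never reaches the database because the shared map already holds the result, which is exactly why the real implementation has to be careful about when entries become visible.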

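MyBatis handles transactional caching in a very simple way: the data produced during a transaction is first held in a temporary buffer. On commit() the buffered data is written into the real cache, and on rollback() the buffer is simply cleared.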

TransactionalCache#commit()

public void commit() {
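    // If clear() was called during the transaction (for example by a statement that flushes the cache), clear the real cache first
    if (clearOnCommit) {
        delegate.clear();
    }
    // Write the staged entries into the real cache;
    // keys that missed and never received a value are stored as null
    flushPendingEntries();
    // Reset the state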
    reset();
}

TransactionalCache#rollback()

public void rollback() {
    // Remove the keys that missed during this transaction from the underlying cache
    unlockMissedEntries();
    // Reset the state
    reset();
  }

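One detail is puzzling: rollback does not just discard the staged entries. It also calls unlockMissedEntries(), which removes the keys that missed during the transaction from the underlying cache: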

TransactionalCache#unlockMissedEntries()

private void unlockMissedEntries() { 
    for (Object entry : entriesMissedInCache) {
        delegate.removeObject(entry);
    }
}

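The answer lies in one implementation of the Cache interface, BlockingCache. It takes a lock on a key when the key is looked up and, on a miss, keeps holding it until a value is put for that key. (This is an in-memory lock on the cache key, not a database lock.) After a rollback no value will ever be put, so the lock has to be released, and that is what the removeObject() call here does.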
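The staging scheme described above can be sketched in a few lines. This is a hypothetical simplification, not the TransactionalCache source: TransactionalBufferSketch and its method names are invented, the shared cache is a plain Map, and the missed-key tracking and lock release are left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified take on the transactional buffer: writes are staged and
// reach the shared cache only on commit. Missed-key tracking and locking are omitted.
class TransactionalBufferSketch {
    private final Map<String, String> shared;                     // the real second-level cache
    private final Map<String, String> pending = new HashMap<>();  // staged until commit
    private boolean clearOnCommit = false;

    TransactionalBufferSketch(Map<String, String> shared) {
        this.shared = shared;
    }

    String get(String key) {
        // After clear() this transaction must no longer see the shared entries.
        return clearOnCommit ? null : shared.get(key);
    }

    void put(String key, String value) {
        pending.put(key, value);
    }

    void clear() {
        // The shared cache is left alone until commit; only a flag is set.
        clearOnCommit = true;
        pending.clear();
    }

    void commit() {
        if (clearOnCommit) {
            shared.clear();
        }
        shared.putAll(pending);
        reset();
    }

    void rollback() {
        // Nothing was written to the shared cache, so dropping the buffer is enough.
        reset();
    }

    private void reset() {
        clearOnCommit = false;
        pending.clear();
    }
}

class TransactionalBufferSketchDemo {
    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        TransactionalBufferSketch tx = new TransactionalBufferSketch(shared);
        tx.put("selectUser:1", "alice");
        System.out.println("visible before commit: " + shared.containsKey("selectUser:1"));
        tx.commit();
        System.out.println("visible after commit: " + shared.containsKey("selectUser:1"));
    }
}
```

Other sessions read the shared map directly, so an entry staged by an uncommitted transaction is invisible to them until commit() runs.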

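At this point it is natural to think about transaction isolation in the database. Database transactions can suffer from dirty reads, non-repeatable reads, and phantom reads, and databases typically guard against them with locking. Could the MyBatis second-level cache undermine that isolation when transactions are involved?

Dirty reads first:
As described above, MyBatis does not write to the second-level cache right away. Entries reach the real second-level cache only when the transaction actually commits. That rules out dirty reads through the cache, because other transactions can never see data from an uncommitted transaction.
With dirty reads out of the way, on to non-repeatable reads:

Non-repeatable reads are also easy to handle. MyBatis has a first-level cache, and a single transaction necessarily uses a single SqlSession, so a repeated query with the same key hits the first-level cache and returns the same result.
Phantom reads

Likewise, a phantom read happens within one transaction, where the repeated query hits the first-level cache, so the phantom read is avoided.

Does that mean the MyBatis cache alone completely solves these isolation problems?

Definitely not. A cache hit is never guaranteed: MyBatis caches the results of select statements only, and by default every other operation clears the cache.

So if a method queries, then inserts, then queries again, the cache cannot shield it from isolation problems: by the time of the second query the cache has been cleared, and the query goes to the database again.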

Having seen how the cache is used, let's look at how it is actually implemented.

The cache interface in MyBatis is Cache:

public interface Cache {

  String getId();

  void putObject(Object key, Object value);

  Object getObject(Object key);

  Object removeObject(Object key);

  void clear();

  int getSize();

  default ReadWriteLock getReadWriteLock() {
    return null;
  }
}

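As you can see, it is simply put, get, remove, and clear.

The interface is simple, but it has many implementations, because MyBatis offers many cache options out of the box.

The Cache implementations are: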

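BlockingCache : a blocking cache. On a cache miss it locks the key so that only one thread queries the database while the others wait; once the value has been cached, the waiting threads read it from the cache. This guards against a stampede of identical queries hitting the database at once.

For some reason the official MyBatis site does not seem to describe it, but it can indeed be configured, with <cache blocking="true"/>
FifoCache : first-in, first-out eviction
LoggingCache : adds logging to the cache
LruCache : evicts the least recently used entry

Implemented on top of a LinkedHashMap
ScheduledCache : clears the cache at a fixed interval
SerializedCache : stores values in serialized form
SoftCache : holds values through soft references
SynchronizedCache : guards every cache operation with a lock ( synchronized )

Why not use ConcurrentHashMap? For better decoupling, perhaps?
TransactionalCache : a cache with transactional behavior
WeakCache : holds values through weak references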

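Each of the implementations above provides a different capability, and the configuration lets you choose which ones to enable. MyBatis combines them with the decorator pattern.

The class that actually stores the cached data is PerpetualCache , which is backed by a HashMap.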
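As an illustration of how such decorators stack, here is a small sketch with a HashMap-backed store wrapped by an LRU layer (built on LinkedHashMap in access order) and a hit-counting layer. All names (MiniCache, MapCache, LruDecorator, HitCountingDecorator) are hypothetical and the interface is much smaller than the real Cache; treat it as one possible shape of the pattern rather than the MyBatis code.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, trimmed-down cache interface; the real Cache has more methods.
interface MiniCache {
    void put(Object key, Object value);
    Object get(Object key);
    void remove(Object key);
    int size();
}

// Plays the role of PerpetualCache: the only class that actually stores data.
class MapCache implements MiniCache {
    private final Map<Object, Object> map = new HashMap<>();
    public void put(Object key, Object value) { map.put(key, value); }
    public Object get(Object key) { return map.get(key); }
    public void remove(Object key) { map.remove(key); }
    public int size() { return map.size(); }
}

// LRU eviction layered on top of any MiniCache.
class LruDecorator implements MiniCache {
    private final MiniCache delegate;
    private final Map<Object, Object> keyOrder;
    private Object eldestKey;

    LruDecorator(MiniCache delegate, final int capacity) {
        this.delegate = delegate;
        // accessOrder = true: iteration order runs from least to most recently accessed.
        this.keyOrder = new LinkedHashMap<Object, Object>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Object, Object> eldest) {
                boolean tooBig = size() > capacity;
                if (tooBig) {
                    eldestKey = eldest.getKey();
                }
                return tooBig;
            }
        };
    }

    public void put(Object key, Object value) {
        delegate.put(key, value);
        keyOrder.put(key, key);
        if (eldestKey != null) {
            // The key map dropped its eldest key; drop it from the store too.
            delegate.remove(eldestKey);
            eldestKey = null;
        }
    }

    public Object get(Object key) {
        keyOrder.get(key); // touch the key so it counts as recently used
        return delegate.get(key);
    }

    public void remove(Object key) {
        keyOrder.remove(key);
        delegate.remove(key);
    }

    public int size() { return delegate.size(); }
}

// Counts requests and hits, loosely in the spirit of a logging layer.
class HitCountingDecorator implements MiniCache {
    private final MiniCache delegate;
    int requests = 0;
    int hits = 0;

    HitCountingDecorator(MiniCache delegate) { this.delegate = delegate; }

    public void put(Object key, Object value) { delegate.put(key, value); }

    public Object get(Object key) {
        requests++;
        Object value = delegate.get(key);
        if (value != null) {
            hits++;
        }
        return value;
    }

    public void remove(Object key) { delegate.remove(key); }
    public int size() { return delegate.size(); }
}
```

A chain is assembled from the inside out, for example new HitCountingDecorator(new LruDecorator(new MapCache(), 2)). Each layer only knows the interface, so layers can be added or left out independently.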

The code in MyBatis that builds the chain of decorators is shown below:

private Cache setStandardDecorators(Cache cache) {
    try {
        MetaObject metaCache = SystemMetaObject.forObject(cache);
        // If the cache has a size property, apply the configured size
        // (the default is 1024;
        // the cache passed in is an LruCache by default)
        if (size != null && metaCache.hasSetter("size")) {
            metaCache.setValue("size", size);
        }
        // If a flush interval is set, wrap the cache in a ScheduledCache
        if (clearInterval != null) {
            cache = new ScheduledCache(cache);
            ((ScheduledCache) cache).setClearInterval(clearInterval);
        }
        // If cached objects may be modified by callers (readOnly = false),
        // values are serialized so that each caller gets its own copy
        if (readWrite) {
            cache = new SerializedCache(cache);
        }
        // Add logging;
        // set the log level to DEBUG to see the output
        cache = new LoggingCache(cache);
        // Add locking
        cache = new SynchronizedCache(cache);

        // Block on cache misses if requested
        if (blocking) {
            cache = new BlockingCache(cache);
        }
        return cache;
    } catch (Exception e) {
        throw new CacheException("Error building standard cache decorators.  Cause: " + e, e);
    }
}

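From this we can see that the MyBatis second-level cache serializes values with Java serialization by default, so domain classes cached by MyBatis must implement Serializable ; otherwise an error is raised.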
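The following sketch shows why a serializing layer needs Serializable values and what it buys: every read returns a fresh copy, so one caller's changes cannot leak to another. SerializingCacheSketch and UserRecord are hypothetical names for this example, and the exception types are my own choice rather than what MyBatis throws.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a serializing cache layer; error handling is simplified.
class SerializingCacheSketch {
    private final Map<String, byte[]> store = new HashMap<>();

    void put(String key, Object value) {
        if (value != null && !(value instanceof Serializable)) {
            throw new IllegalArgumentException("Not serializable: " + value.getClass().getName());
        }
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            // Only the bytes are kept, never the caller's object.
            store.put(key, bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    Object get(String key) {
        byte[] data = store.get(key);
        if (data == null) {
            return null;
        }
        // Each read deserializes a brand-new object.
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deserialization failed", e);
        }
    }
}

// Hypothetical domain object used only for this example.
class UserRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;

    UserRecord(String name) {
        this.name = name;
    }
}

class SerializingCacheSketchDemo {
    public static void main(String[] args) {
        SerializingCacheSketch cache = new SerializingCacheSketch();
        cache.put("user:1", new UserRecord("alice"));
        UserRecord first = (UserRecord) cache.get("user:1");
        first.name = "changed by caller";
        UserRecord second = (UserRecord) cache.get("user:1");
        System.out.println(second.name);
    }
}
```

The trade-off is the cost of serializing on every write and deserializing on every read, which is why a read-only cache can skip this layer and hand out the same instance.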

MyBatis also lets you plug in your own second-level cache implementation:

<cache type="com.domain.something.MyCustomCache"/>

However, for various reasons, MyBatis applies some of the decorators only to its built-in cache class:

// issue #352, do not apply decorators to custom caches
    // Decorators are applied only to the built-in cache class
    if (PerpetualCache.class.equals(cache.getClass())) {
      for (Class<? extends Cache> decorator : decorators) {
        cache = newCacheDecoratorInstance(decorator, cache);
        setCacheProperties(cache);
      }
      cache = setStandardDecorators(cache);
    } 
    // Otherwise, wrap it in a single LoggingCache layer
    else if (!LoggingCache.class.isAssignableFrom(cache.getClass())) {
      cache = new LoggingCache(cache);
    }

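With the Cache implementations covered, one question remains: the Key. How does MyBatis decide that two queries are the same SQL?

The obvious approach is to compare the SQL text, but it is not that simple. In MyBatis the cache key is wrapped in a CacheKey :

CacheKey has five main fields: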

private final int multiplier;
  private int hashcode;
  private long checksum;
  private int count;
  private List<Object> updateList;

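As MyBatis builds the key, every value that can affect the result of the SQL is passed to update() , which updates the four mutable fields together ( multiplier is a constant):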

public void update(Object object) {
    int baseHashCode = object == null ? 1 : ArrayUtil.hashCode(object);

    count++;
    checksum += baseHashCode;
    baseHashCode *= count;

    hashcode = multiplier * hashcode + baseHashCode;

    updateList.add(object);
}

For example, when the key is created:

@Override
  public CacheKey createCacheKey(MappedStatement ms, Object parameterObject, RowBounds rowBounds, BoundSql boundSql) {
    if (closed) {
      throw new ExecutorException("Executor was closed.");
    }
    CacheKey cacheKey = new CacheKey();
    // statement id
    // (statements in the same namespace share one cache)
    cacheKey.update(ms.getId());
    // offset
    cacheKey.update(rowBounds.getOffset());
    // limit
    cacheKey.update(rowBounds.getLimit());
    // SQL text
    cacheKey.update(boundSql.getSql());
    List<ParameterMapping> parameterMappings = boundSql.getParameterMappings();
    // parameters and their actual values
    TypeHandlerRegistry typeHandlerRegistry = ms.getConfiguration().getTypeHandlerRegistry();
    // mimic DefaultParameterHandler logic
    for (ParameterMapping parameterMapping : parameterMappings) {
      if (parameterMapping.getMode() != ParameterMode.OUT) {
        Object value;
        String propertyName = parameterMapping.getProperty();
        if (boundSql.hasAdditionalParameter(propertyName)) {
          value = boundSql.getAdditionalParameter(propertyName);
        } else if (parameterObject == null) {
          value = null;
        } else if (typeHandlerRegistry.hasTypeHandler(parameterObject.getClass())) {
          value = parameterObject;
        } else {
          MetaObject metaObject = configuration.newMetaObject(parameterObject);
          value = metaObject.getValue(propertyName);
        }
        cacheKey.update(value);
      }
    }
    if (configuration.getEnvironment() != null) {
      // issue #176
      // the database environment also affects the key
      cacheKey.update(configuration.getEnvironment().getId());
    }
    return cacheKey;
  }
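To see how such a key behaves, here is a small re-creation of the idea. CacheKeySketch is a hypothetical class, not the MyBatis CacheKey: the update() arithmetic follows the snippet shown earlier, while the starting values 37 and 17 and the equals() logic are assumptions made for this sketch, and array parameters are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical re-creation of the CacheKey idea; not the MyBatis class itself.
class CacheKeySketch {
    private static final int MULTIPLIER = 37; // assumed constant for this sketch
    private int hashcode = 17;                // assumed starting value for this sketch
    private long checksum;
    private int count;
    private final List<Object> updateList = new ArrayList<>();

    CacheKeySketch update(Object object) {
        int baseHashCode = object == null ? 1 : object.hashCode();
        count++;
        checksum += baseHashCode;
        baseHashCode *= count;                // position-dependent, so order matters
        hashcode = MULTIPLIER * hashcode + baseHashCode;
        updateList.add(object);
        return this;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CacheKeySketch)) {
            return false;
        }
        CacheKeySketch that = (CacheKeySketch) other;
        // Cheap numeric checks first, then the element-by-element comparison.
        return hashcode == that.hashcode
            && checksum == that.checksum
            && count == that.count
            && updateList.equals(that.updateList);
    }

    @Override
    public int hashCode() {
        return hashcode;
    }
}

class CacheKeySketchDemo {
    static CacheKeySketch keyFor(String statementId, String sql, Object parameter) {
        return new CacheKeySketch()
            .update(statementId)
            .update(0)                  // offset
            .update(Integer.MAX_VALUE)  // limit
            .update(sql)
            .update(parameter);
    }

    public static void main(String[] args) {
        String sql = "select * from user where id = ?";
        CacheKeySketch first = keyFor("UserMapper.selectById", sql, 1);
        CacheKeySketch same = keyFor("UserMapper.selectById", sql, 1);
        CacheKeySketch other = keyFor("UserMapper.selectById", sql, 2);
        System.out.println(first.equals(same));
        System.out.println(first.equals(other));
    }
}
```

Two keys are equal only when the statement id, paging bounds, SQL text, and every parameter value match in the same order, which is why the same SQL with a different argument never shares a cache entry.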

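That is roughly how MyBatis caching is implemented.

Summary

A brief summary:

MyBatis caching consists of a first-level and a second-level cache, both ultimately backed by a HashMap
The first-level cache is scoped to the SESSION by default; it can be changed to STATEMENT , which effectively turns it off (every method call creates a new Statement)
MyBatis uses three layers of caching to address transaction isolation, although a query that follows an insert, update, or delete still goes back to the database
MyBatis caches can be configured with a range of features, implemented with the decorator pattern
By default, the MyBatis second-level cache requires domain classes to implement Serializable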