Some users reported that while using nacos, Java threads kept being created as the program ran, reaching two to three thousand, which pushed the CPU load metric to 100%.
Looking at the nacos source, the threads being created in bulk are ultimately tied to the NacosConfigService object:
public NacosConfigService(Properties properties) throws NacosException {
    String encodeTmp = properties.getProperty(PropertyKeyConst.ENCODE);
    if (StringUtils.isBlank(encodeTmp)) {
        encode = Constants.ENCODE;
    } else {
        encode = encodeTmp.trim();
    }
    initNamespace(properties);
    agent = new MetricsHttpAgent(new ServerHttpAgent(properties));
    agent.start();
    worker = new ClientWorker(agent, configFilterChainManager, properties);
}
The object the threads actually hang off, however, is ClientWorker:
@SuppressWarnings("PMD.ThreadPoolCreationRule")
public ClientWorker(final HttpAgent agent, final ConfigFilterChainManager configFilterChainManager,
        final Properties properties) {
    this.agent = agent;
    this.configFilterChainManager = configFilterChainManager;

    // Initialize the timeout parameter
    init(properties);

    executor = Executors.newScheduledThreadPool(1, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setName("com.alibaba.nacos.client.Worker." + agent.getName());
            t.setDaemon(true);
            return t;
        }
    });

    executorService = Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors(), new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setName("com.alibaba.nacos.client.Worker.longPolling." + agent.getName());
            t.setDaemon(true);
            return t;
        }
    });

    executor.scheduleWithFixedDelay(new Runnable() {
        @Override
        public void run() {
            try {
                checkConfigInfo();
            } catch (Throwable e) {
                LOGGER.error("[" + agent.getName() + "] [sub-check] rotate check error", e);
            }
        }
    }, 1L, 10L, TimeUnit.MILLISECONDS);
}
So my first suspicion was that the user had created a large number of NacosConfigService objects.
User jmap data
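The data shows that the JVM currently holds more than two thousand ClientWorker objects, and from the nacos source analysis above we know that every ClientWorker object holds its own thread pools.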
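To illustrate why the object count translates directly into a thread count, here is a minimal sketch. `FakeWorker` and the `demo.Worker.` thread-name prefix are hypothetical stand-ins, not nacos classes; the sketch only assumes what the constructor above shows, namely that each instance builds its own scheduled pool and immediately schedules a task on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WorkerThreadLeakDemo {

    /** Hypothetical stand-in for ClientWorker: every instance owns its own scheduler. */
    static class FakeWorker {
        final ScheduledExecutorService executor;

        FakeWorker(final int id) {
            executor = Executors.newScheduledThreadPool(1, r -> {
                Thread t = new Thread(r);
                t.setName("demo.Worker." + id);
                t.setDaemon(true);
                return t;
            });
            // Scheduling a periodic task makes the pool start its core thread.
            executor.scheduleWithFixedDelay(() -> { }, 1L, 10L, TimeUnit.MILLISECONDS);
        }
    }

    /** Counts live threads whose name starts with the given prefix. */
    static long countThreads(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    /** Creates n workers, keeps them reachable in sink, and returns the live demo thread count. */
    static long createWorkers(int n, List<FakeWorker> sink) {
        for (int i = 0; i < n; i++) {
            sink.add(new FakeWorker(i));
        }
        return countThreads("demo.Worker.");
    }

    public static void main(String[] args) {
        List<FakeWorker> workers = new ArrayList<>();
        System.out.println("live worker threads: " + createWorkers(50, workers));
        // Unlike the leaking scenario, the demo cleans up after itself.
        for (FakeWorker w : workers) {
            w.executor.shutdownNow();
        }
    }
}
```

In this sketch the thread count grows with the number of workers created, and nothing shrinks it until the pools are shut down. The real ClientWorker additionally creates a second pool sized by `availableProcessors()`, so its per-instance cost can be higher.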
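First, users were asked to check whether they had created a large number of NacosConfigService instances themselves. At this point some users confirmed that their own mistake had indeed led to many NacosConfigService objects being created.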
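For this kind of self-check, one option is to count live threads by name prefix from inside the application. The sketch below is only an assumption about how one might do it: the two prefixes are taken from the ClientWorker constructor shown earlier, while the class and method names (`ThreadCensus`, `census`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ThreadCensus {

    // Prefixes from the ClientWorker thread factories. The longer one comes first
    // because the shorter one is a prefix of it and the first match wins.
    static final List<String> NACOS_PREFIXES = Arrays.asList(
            "com.alibaba.nacos.client.Worker.longPolling.",
            "com.alibaba.nacos.client.Worker.");

    /** Counts thread names per prefix; each name is attributed to the first prefix it matches. */
    static Map<String, Integer> census(Collection<String> threadNames, List<String> prefixes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String prefix : prefixes) {
            counts.put(prefix, 0);
        }
        for (String name : threadNames) {
            for (String prefix : prefixes) {
                if (name.startsWith(prefix)) {
                    counts.merge(prefix, 1, Integer::sum);
                    break;
                }
            }
        }
        return counts;
    }

    /** Collects the names of all threads currently alive in this JVM. */
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        census(liveThreadNames(), NACOS_PREFIXES)
                .forEach((prefix, count) -> System.out.println(prefix + "* -> " + count));
    }
}
```

If the first ClientWorker pool has one thread per instance, as the constructor suggests, the count for the plain `Worker.` prefix roughly tracks the number of live ClientWorker objects, which can be cross-checked against the jmap histogram.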
Spring-Cloud-Alibaba component check
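Other users, however, said that they only depended on the spring-cloud-alibaba-nacos component and never touched NacosConfigService themselves, yet still saw large numbers of threads being created. In the end, feedback from one user's own investigation confirmed a bug in spring-cloud-alibaba-nacos: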
@ConfigurationProperties(NacosConfigProperties.PREFIX)
public class NacosConfigProperties {

    ...

    private ConfigService configService;

    ...

    @Deprecated
    public ConfigService configServiceInstance() {
        if (null != configService) {
            return configService;
        }

        Properties properties = new Properties();
        ...

        try {
            configService = NacosFactory.createConfigService(properties);
            return configService;
        } catch (Exception e) {
            log.error("create config service error!properties={},e=,", this, e);
            return null;
        }
    }
}
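This configuration class caches a ConfigService instance. The intent was for the class to maintain a singleton itself, but in practice the NacosConfigProperties bean is recreated every time the spring-cloud context is refreshed. So the chain is: a config update -> Context refresh -> NacosConfigProperties recreated -> cached ConfigService lost -> ConfigService created again.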
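Because of this chain of cause and effect, the ConfigService cache stops working as soon as the Context is refreshed, and every refresh leaves behind another ConfigService together with its ClientWorker thread pools.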
The fix moves the instance into a ServiceHolder singleton inside NacosConfigManager, so that it is created only once and survives a Context refresh:

public class NacosConfigManager implements ApplicationContextAware {

    private ConfigService configService;

    public ConfigService getConfigService() {
        // Before the fix: return configService;
        return ServiceHolder.getInstance().getService();
    }

    @Override
    public void setApplicationContext(ApplicationContext applicationContext)
            throws BeansException {
        NacosConfigProperties properties = applicationContext
                .getBean(NacosConfigProperties.class);
        // Before the fix: configService = properties.configServiceInstance();
        ServiceHolder holder = ServiceHolder.getInstance();
        if (!holder.alreadyInit) {
            // Only the first initialization creates a ConfigService.
            ServiceHolder.getInstance().setService(properties.configServiceInstance());
        }
    }

    static class ServiceHolder {
        private ConfigService service = null;
        private boolean alreadyInit = false;
        private static final ServiceHolder holder = new ServiceHolder();

        ServiceHolder() {
        }

        static ServiceHolder getInstance() {
            return holder;
        }

        void setService(ConfigService service) {
            alreadyInit = true;
            this.service = service;
        }

        ConfigService getService() {
            return service;
        }
    }
}
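The difference between the two caching strategies can be reproduced without Spring. The following is a rough simulation under stated assumptions: `FakeService`, `PropsBean`, and `Holder` are hypothetical stand-ins for ConfigService, NacosConfigProperties, and ServiceHolder, and a "refresh" is modeled simply as constructing a new `PropsBean`.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefreshCacheDemo {

    /** Hypothetical stand-in for ConfigService; counts how many instances get built. */
    static class FakeService {
        static final AtomicInteger CREATED = new AtomicInteger();

        FakeService() {
            CREATED.incrementAndGet();
        }
    }

    /** Mimics the properties bean: the cache lives in an instance field. */
    static class PropsBean {
        private FakeService service;

        FakeService serviceInstance() {
            if (service == null) {
                service = new FakeService();
            }
            return service;
        }
    }

    /** Process-wide holder that outlives any single bean instance. */
    static final class Holder {
        private static FakeService service;

        static synchronized FakeService getOrInit(PropsBean props) {
            if (service == null) {
                service = props.serviceInstance();
            }
            return service;
        }

        static synchronized void reset() {
            service = null;
        }
    }

    /** Simulates the given number of context refreshes; returns how many services were created. */
    static int simulate(int refreshes, boolean useHolder) {
        FakeService.CREATED.set(0);
        Holder.reset();
        for (int i = 0; i < refreshes; i++) {
            PropsBean props = new PropsBean(); // the bean is rebuilt on every refresh
            if (useHolder) {
                Holder.getOrInit(props);
            } else {
                props.serviceInstance();
            }
        }
        return FakeService.CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println("instance-field cache: " + simulate(5, false));
        System.out.println("static holder:        " + simulate(5, true));
    }
}
```

With five simulated refreshes, the instance-field cache creates five services while the static holder creates one. The `synchronized` methods are a choice made for this sketch and are not part of the ServiceHolder shown above.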