SpringCloud Alibaba系列——2Nacos核心源码分析(上)
作者 | 宇木木兮
来源 |今日头条
学习目标
- Nacos中有哪几种节点,它们的区别是什么
- Nacos作为注册中心用的是什么一致性协议,不同的节点分别用的是什么协议
- Nacos中一致性协议有什么区别
- Nacos配置中心用的是Push模型还是Pull模型又或者是其他的模型?
- 不同模型的区别及优劣
- Nacos的节点是如何保证存活的
- Nacos注册的流程
- Nacos配置的流程
第1章 注册中心
1.1 流程推导
- 服务提供者把自己的协议地址注册到Nacos server
- 服务消费者需要从Nacos Server上去查询服务提供者的地址(根据服务名称)
- Nacos Server需要感知到服务提供者的上下线的变化
- 服务消费者需要动态感知到Nacos Server端服务地址的变化
1.2 源码分析
1.2.1 服务注册
1.查看包
spring-cloud-starter-alibaba-nacos-discover中spring.factories中自动装配的类,有个NacosServiceRegistryAutoConfiguration类
2.NacosServiceRegistryAutoConfiguration初始化。
@Bean
@ConditionalOnBean({AutoServiceRegistrationProperties.class})
public NacosAutoServiceRegistration nacosAutoServiceRegistration(NacosServiceRegistry registry, AutoServiceRegistrationProperties autoServiceRegistrationProperties, NacosRegistration registration) {
return new NacosAutoServiceRegistration(registry, autoServiceRegistrationProperties, registration);
}
AutoServiceRegistrationProperties为服务注册配置表
NacosAutoServiceRegistration类继承与AbstractAutoServiceRegistration
public NacosAutoServiceRegistration(ServiceRegistry<Registration> serviceRegistry, AutoServiceRegistrationProperties autoServiceRegistrationProperties, NacosRegistration registration) {
super(serviceRegistry, autoServiceRegistrationProperties);
this.registration = registration;
}
3.AbstractAutoServiceRegistration实现了ApplicationListener事件,最终回调用
public void onApplicationEvent(WebServerInitializedEvent event) {
this.bind(event);
}
4.调用bind中start方法
public void bind(WebServerInitializedEvent event) {
ApplicationContext context = event.getApplicationContext();
if (!(context instanceof ConfigurableWebServerApplicationContext) || !"management".equals(((ConfigurableWebServerApplicationContext)context).getServerNamespace())) {
this.port.compareAndSet(0, event.getWebServer().getPort());
this.start();
}
}
5.最终调用到start中的register方法
public void start() {
if (!this.isEnabled()) {
if (logger.isDebugEnabled()) {
logger.debug("Discovery Lifecycle disabled. Not starting");
}
} else {
if (!this.running.get()) {
this.context.publishEvent(new InstancePreRegisteredEvent(this, this.getRegistration()));
this.register();
if (this.shouldRegisterManagement()) {
this.registerManagement();
}
this.context.publishEvent(new InstanceRegisteredEvent(this, this.getConfiguration()));
this.running.compareAndSet(false, true);
}
}
}
6.最终调用到NacosServiceRegistry的register方法
public void register(Registration registration) {
if (StringUtils.isEmpty(registration.getServiceId())) {
log.warn("No service to register for nacos client...");
} else {
String serviceId = registration.getServiceId();
String group = this.nacosDiscoveryProperties.getGroup();
//服务实例消息
Instance instance = this.getNacosInstanceFromRegistration(registration);
try {
//进行注册
this.namingService.registerInstance(serviceId, group, instance);
log.info("nacos registry, {} {} {}:{} register finished", new Object[]{group, serviceId, instance.getIp(), instance.getPort()});
} catch (Exception var6) {
log.error("nacos registry, {} register failed...{},", new Object[]{serviceId, registration.toString(), var6});
ReflectionUtils.rethrowRuntimeException(var6);
}
}
}
7.调用注册实例方法
public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
if (instance.isEphemeral()) { //判断是不是临时节点,如果是临时节点,添加心跳检测
BeatInfo beatInfo = new BeatInfo();
beatInfo.setServiceName(NamingUtils.getGroupedName(serviceName, groupName));
beatInfo.setIp(instance.getIp());
beatInfo.setPort(instance.getPort());
beatInfo.setCluster(instance.getClusterName());
beatInfo.setWeight(instance.getWeight());
beatInfo.setMetadata(instance.getMetadata());
beatInfo.setScheduled(false);
beatInfo.setPeriod(instance.getInstanceHeartBeatInterval());
//添加心跳检测
this.beatReactor.addBeatInfo(NamingUtils.getGroupedName(serviceName, groupName), beatInfo);
}
//往注册中心注册
this.serverProxy.registerService(NamingUtils.getGroupedName(serviceName, groupName), groupName, instance);
}
Nacos临时节点与永久节点区别:
临时实例向Nacos注册,Nacos不会对其进行持久化存储,只能通过心跳方式保活。默认模式是:客户端心跳上报Nacos实例健康状态,默认间隔5秒,Nacos在15秒内未收到该实例的心跳,则会设置为不健康状态,超过30秒则将实例删除。
持久化实例向Nacos注册,Nacos会对其进行持久化处理。当该实例不存在时,Nacos只会将其健康状态设置为不健康,但并不会对将其从服务端删除。
另外,可以使用实例的ephemeral来判断健康检查模式,ephemeral为true对应的是client模式(客户端心跳),为false对应的是server模式(服务端检查)。
设置
spring.cloud.nacos.discovery.ephemeral=false 为永久节点
为什么要设计两种模式? 上面说了两种模式的不同和处理上的区别,那么Nacos为什么设计两种模式,它们是为了应对什么样的场景而存在呢?
对于临时实例,健康检查失败,则直接可以从列表中删除。这种特性就比较适合那些需要应对流量突增的场景,服务可以进行弹性扩容。当流量过去之后,服务停掉即可自动注销了。
对于持久化实例,健康检查失败,会被标记成不健康状态。它的好处是运维可以实时看到实例的健康状态,便于后续的警告、扩容等一些列措施。
8.registerService封装
public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {
LogUtils.NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", new Object[]{this.namespaceId, serviceName, instance});
Map<String, String> params = new HashMap(9);
params.put("namespaceId", this.namespaceId);
params.put("serviceName", serviceName);
params.put("groupName", groupName);
params.put("clusterName", instance.getClusterName());
params.put("ip", instance.getIp());
params.put("port", String.valueOf(instance.getPort()));
params.put("weight", String.valueOf(instance.getWeight()));
params.put("enable", String.valueOf(instance.isEnabled()));
params.put("healthy", String.valueOf(instance.isHealthy()));
params.put("ephemeral", String.valueOf(instance.isEphemeral()));
params.put("metadata", JSON.toJSONString(instance.getMetadata()));
//调用api,url为NACOS_URL_INSTANCE,即NACOS_URL_INSTANCE = NACOS_URL_BASE + "/instance";
this.reqAPI(UtilAndComs.NACOS_URL_INSTANCE,即, params, "POST");
}
9.最终会调用到reqAPI
public String reqAPI(String api, Map<String, String> params, String body, List<String> servers, String method) throws NacosException {
params.put("namespaceId", this.getNamespaceId());
if (CollectionUtils.isEmpty(servers) && StringUtils.isEmpty(this.nacosDomain)) {
throw new NacosException(400, "no server available");
} else {
NacosException exception = new NacosException();
if (servers != null && !servers.isEmpty()) {
//从nacos地址中随机选择一个地址请求
Random random = new Random(System.currentTimeMillis());
int index = random.nextInt(servers.size());
int i = 0;
while(i < servers.size()) {
String server = (String)servers.get(index);
try {
//调用nacos接口,将服务注册到nacos服务器
return this.callServer(api, params, body, server, method);
} catch (NacosException var13) {
exception = var13;
if (LogUtils.NAMING_LOGGER.isDebugEnabled()) {
LogUtils.NAMING_LOGGER.debug("request {} failed.", server, var13);
}
index = (index + 1) % servers.size();
++i;
}
}
}
if (StringUtils.isNotBlank(this.nacosDomain)) {
int i = 0;
while(i < 3) {
try {
return this.callServer(api, params, body, this.nacosDomain, method);
} catch (NacosException var12) {
exception = var12;
if (LogUtils.NAMING_LOGGER.isDebugEnabled()) {
LogUtils.NAMING_LOGGER.debug("request {} failed.", this.nacosDomain, var12);
}
++i;
}
}
}
LogUtils.NAMING_LOGGER.error("request: {} failed, servers: {}, code: {}, msg: {}", new Object[]{api, servers, exception.getErrCode(), exception.getErrMsg()});
throw new NacosException(exception.getErrCode(), "failed to req API:/api/" + api + " after all servers(" + servers + ") tried: " + exception.getMessage());
}
}
1.2.2 临时节点心跳监听
1.我们在刚才registerInstance注册实例方法中;添加心跳addBeatInfo
public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
LogUtils.NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
String key = this.buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort());
BeatInfo existBeat = null;
if ((existBeat = (BeatInfo)this.dom2Beat.remove(key)) != null) {
existBeat.setStopped(true);
}
this.dom2Beat.put(key, beatInfo);preserved.heart.beat.interval
//我们发现是开启线程,调用BeatTask的run,beatInfo.getPeriod()为监听间隔,默认5s,可配preserved.heart.beat.interval
this.executorService.schedule(new BeatReactor.BeatTask(beatInfo), beatInfo.getPeriod(), TimeUnit.MILLISECONDS);
MetricsMonitor.getDom2BeatSizeMonitor().set((double)this.dom2Beat.size());
}
2.进入BeatTask的run方法
public void run() {
if (!this.beatInfo.isStopped()) {
long nextTime = this.beatInfo.getPeriod();
try {
//发送心跳,心跳检测
JSONObject result = BeatReactor.this.serverProxy.sendBeat(this.beatInfo, BeatReactor.this.lightBeatEnabled);
long interval = (long)result.getIntValue("clientBeatInterval");
boolean lightBeatEnabled = false;
if (result.containsKey("lightBeatEnabled")) {
lightBeatEnabled = result.getBooleanValue("lightBeatEnabled");
}
BeatReactor.this.lightBeatEnabled = lightBeatEnabled;
if (interval > 0L) {
nextTime = interval;
}
int code = 10200;
if (result.containsKey("code")) {
code = result.getIntValue("code");
}
//如果不存在就注册
if (code == 20404) {
Instance instance = new Instance();
instance.setPort(this.beatInfo.getPort());
instance.setIp(this.beatInfo.getIp());
instance.setWeight(this.beatInfo.getWeight());
instance.setMetadata(this.beatInfo.getMetadata());
instance.setClusterName(this.beatInfo.getCluster());
instance.setServiceName(this.beatInfo.getServiceName());
instance.setInstanceId(instance.getInstanceId());
instance.setEphemeral(true);
try {
BeatReactor.this.serverProxy.registerService(this.beatInfo.getServiceName(), NamingUtils.getGroupName(this.beatInfo.getServiceName()), instance);
} catch (Exception var10) {
}
}
} catch (NacosException var11) {
LogUtils.NAMING_LOGGER.error("[CLIENT-BEAT] failed to send beat: {}, code: {}, msg: {}", new Object[]{JSON.toJSONString(this.beatInfo), var11.getErrCode(), var11.getErrMsg()});
}
//5s后继续检测
BeatReactor.this.executorService.schedule(BeatReactor.this.new BeatTask(this.beatInfo), nextTime, TimeUnit.MILLISECONDS);
}
}
3.心跳检测方法
public JSONObject sendBeat(BeatInfo beatInfo, boolean lightBeatEnabled) throws NacosException {
if (LogUtils.NAMING_LOGGER.isDebugEnabled()) {
LogUtils.NAMING_LOGGER.debug("[BEAT] {} sending beat to server: {}", this.namespaceId, beatInfo.toString());
}
Map<String, String> params = new HashMap(8);
String body = "";
if (!lightBeatEnabled) {
try {
body = "beat=" + URLEncoder.encode(JSON.toJSONString(beatInfo), "UTF-8");
} catch (UnsupportedEncodingException var6) {
throw new NacosException(500, "encode beatInfo error", var6);
}
}
params.put("namespaceId", this.namespaceId);
params.put("serviceName", beatInfo.getServiceName());
params.put("clusterName", beatInfo.getCluster());
params.put("ip", beatInfo.getIp());
params.put("port", String.valueOf(beatInfo.getPort()));
String result = this.reqAPI(UtilAndComs.NACOS_URL_BASE + "/instance/beat", params, body, "PUT");
return JSON.parseObject(result);
}
4.Nacos服务端在15秒内如果没收到客户端的心跳请求,会将该实例设置为不健康,在30秒内没收到心跳,会将这个临时实例摘除。永久实例不会摘除
1.2.3 服务端接收请求
1.在服务端源码中找到接收客户端服务注册的地方,最终找到了InstanceController中的register或beat方法;
这里只看register方法
@CanDistro
@RequestMapping(value = "", method = RequestMethod.POST)
public String register(HttpServletRequest request) throws Exception {
String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
//这个方法进行注册
serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
return "ok";
}
根据代码可以看出。注册方法中只做调用ServiceManager中的registerInstance方法,并在调用前会将客户端传过来的参数信息封装成一个Instance实例
2.点击进入registerInstance方法,处理了两个流程,创建了一个服务,添加了一个实例
public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
//创建服务
createEmptyService(namespaceId, serviceName, instance.isEphemeral());
Service service = getService(namespaceId, serviceName);
if (service == null) {
throw new NacosException(NacosException.INVALID_PARAM,
"service not found, namespace: " + namespaceId + ", service: " + serviceName);
}
//增加实例
addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}
3.createEmptyService创建服务
public void createServiceIfAbsent(String namespaceId, String serviceName, boolean local, Cluster cluster) throws NacosException {
Service service = getService(namespaceId, serviceName);
if (service == null) {
Loggers.SRV_LOG.info("creating empty service {}:{}", namespaceId, serviceName);
service = new Service();
service.setName(serviceName);
service.setNamespaceId(namespaceId);
service.setGroupName(NamingUtils.getGroupName(serviceName));
// now validate the service. if failed, exception will be thrown
service.setLastModifiedMillis(System.currentTimeMillis());
service.recalculateChecksum();
if (cluster != null) {
cluster.setService(service);
service.getClusterMap().put(cluster.getName(), cluster);
}
service.validate();
if (local) {
//将服务放入了一个serviceMap中,该map维护了一个键为namespace,值为一个存放服务名称为键,服务为值的map
putService(service);
//处理与客户端的健康检查
service.init();
//用于一致性处理的
consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), true), service);
consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), false), service);
} else {
addOrReplaceService(service);
}
}
}
4.addInstance增加实例
public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {
String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
Service service = getService(namespaceId, serviceName);
//获取所有的实例列表
List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
Instances instances = new Instances();
instances.setInstanceList(instanceList);
//处理所有实例的流程
consistencyService.put(key, instances);
}
获取实例列表暂时不描述,点击处理实例的方法consistencyService.put(),consistencyService这里有三个实现,这里会根据key的值的参数判断最终是走的哪个实现,最终会走到DistroConsistencyServiceImpl
打开distroConsistencyServiceImpl的put方法,一个是处理实例的方法,一个是处理实例最终一致性的同步任务
@Override
public void put(String key, Record value) throws NacosException {
onPut(key, value);
taskDispatcher.addTask(key);
}
public void onPut(String key, Record value) {
if (KeyBuilder.matchEphemeralInstanceListKey(key)) {
//实例最终放入到Datum当中并将Datum放入到DataStore当中
Datum<Instances> datum = new Datum<>();
datum.value = (Instances) value;
datum.key = key;
datum.timestamp.incrementAndGet();
dataStore.put(key, datum);
}
if (!listeners.containsKey(key)) {
return;
}
notifier.addTask(key, ApplyAction.CHANGE);
}
1.2.4 服务发现
1.通过dubbo进行服务发现,调用到
public List<ServiceInstance> getInstances(String serviceId) {
try {
return serviceDiscovery.getInstances(serviceId);
}
catch (Exception e) {
throw new RuntimeException(
"Can not get hosts from nacos server. serviceId: " + serviceId, e);
}
}
调用NamingService,根据serviceId、group获得服务实例列表。 然后把instance转化为ServiceInstance对象
public List<ServiceInstance> getInstances(String serviceId) throws NacosException {
String group = discoveryProperties.getGroup();
List<Instance> instances = discoveryProperties.namingServiceInstance()
.selectInstances(serviceId, group, true);
return hostToServiceInstanceList(instances, serviceId);
}
2.NacosNamingService.selectInstances
selectInstances首先从hostReactor获取serviceInfo,然后再从serviceInfo.getHosts()剔除非healty、非enabled、weight小于等于0的instance再返回;如果subscribe为true,则执行
hostReactor.getServiceInfo获取serviceInfo,否则执行hostReactor.getServiceInfoDirectlyFromServer获取serviceInfo
public List<Instance> selectInstances(String serviceName, String groupName, List<String> clusters, boolean healthy, boolean subscribe) throws NacosException {
ServiceInfo serviceInfo;
if (subscribe) {
//是否订阅服务地址的变化,默认为true
serviceInfo = hostReactor.getServiceInfo(NamingUtils.getGroupedName(serviceName, groupName), StringUtils.join(clusters, ","));
} else {
serviceInfo = hostReactor.getServiceInfoDirectlyFromServer(NamingUtils.getGroupedName(serviceName, groupName), StringUtils.join(clusters, ","));
}
return selectInstances(serviceInfo, healthy);
}
3.HostReactor.getServiceInfo
public ServiceInfo getServiceInfo(final String serviceName, final String
clusters) {
NAMING_LOGGER.debug("failover-mode: " + failoverReactor.isFailoverSwitch());
//拼接服务名称+集群名称(默认为空)
String key = ServiceInfo.getKey(serviceName, clusters);
if (failoverReactor.isFailoverSwitch()) {
return failoverReactor.getService(key);
}
//从ServiceInfoMap中根据key来查找服务提供者列表,ServiceInfoMap是客户端的服务地址的本地缓存
ServiceInfo serviceObj = getServiceInfo0(serviceName, clusters);
if (null == serviceObj) {//如果为空,表示本地缓存不存在
//如果找不到则创建一个新的然后放入serviceInfoMap,同时放入updatingMap,执行updateServiceNow,再从updatingMap移除;
serviceObj = new ServiceInfo(serviceName, clusters);
serviceInfoMap.put(serviceObj.getKey(), serviceObj);
updatingMap.put(serviceName, new Object());
updateServiceNow(serviceName, clusters);
updatingMap.remove(serviceName);
} else if (updatingMap.containsKey(serviceName)) {
//如果从serviceInfoMap找出来的serviceObj在updatingMap中则等待UPDATE_HOLD_INTERVAL;
if (UPDATE_HOLD_INTERVAL > 0) {
// hold a moment waiting for update finish
synchronized (serviceObj) {
try {
serviceObj.wait(UPDATE_HOLD_INTERVAL);
} catch (InterruptedException e) {
NAMING_LOGGER.error("[getServiceInfo] serviceName:" +
serviceName + ", clusters:" + clusters, e);
}
}
}
}
//如果本地缓存中存在,则通过scheduleUpdateIfAbsent开启定时任务,再从serviceInfoMap取出
serviceInfo
scheduleUpdateIfAbsent(serviceName, clusters);
return serviceInfoMap.get(serviceObj.getKey());
}
上述代码中,有两个逻辑,分别是
- updateServiceNow, 立马从Nacos server中去加载服务地址信息
- scheduleUpdateIfAbsent 开启定时调度,每10s去查询一次服务地址
4.updateServiceNow
public void updateServiceNow(String serviceName, String clusters) {
ServiceInfo oldService = getServiceInfo0(serviceName, clusters);
try {
String result = serverProxy.queryList(serviceName, clusters, pushReceiver.getUDPPort(), false);
if (StringUtils.isNotEmpty(result)) {
processServiceJSON(result);
}
} catch (Exception e) {
NAMING_LOGGER.error("[NA] failed to update serviceName: " + serviceName, e);
} finally {
if (oldService != null) {
synchronized (oldService) {
oldService.notifyAll();
}
}
}
}
5.queryList
调用
/nacos/v1/ns/instance/list ,从Nacos server端获取服务地址信息。
public String queryList(String serviceName, String clusters, int udpPort,
boolean healthyOnly)
throws NacosException {
final Map<String, String> params = new HashMap<String, String>(8);
params.put(CommonParams.NAMESPACE_ID, namespaceId);
params.put(CommonParams.SERVICE_NAME, serviceName);
params.put("clusters", clusters);
params.put("udpPort", String.valueOf(udpPort));
params.put("clientIP", NetUtils.localIP());
params.put("healthyOnly", String.valueOf(healthyOnly));
return reqAPI(UtilAndComs.NACOS_URL_BASE + "/instance/list", params,
HttpMethod.GET);
}