Why does my Spark cluster keep trying to connect to the old Master after an IP change?

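The Spark cluster was migrated and its IP addresses changed. I updated four config files under conf (hdfs-site.xml, slaves, spark-defaults.conf, spark-env.sh) and two under hadoopconf (hdfs-site.xml and yarn-site.xml). After restarting, jps -m shows these processes: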

187466 Worker --webui-port 18081 spark://spark-11.server:7077

179369 Worker --webui-port 18083 spark://spark-11.server:7077

174577 Jps -m

172118 Worker --webui-port 18082 spark://spark-11.server:7077

170921 Worker --webui-port 18084 spark://spark-11.server:7077

177586 Master --ip spark-11.server --port 7077 --webui-port 18080

But after a while the processes disappear.

The logs show this error:

18/10/10 14:09:20 INFO Utils: Successfully started service 'WorkerUI' on port 18081.

18/10/10 14:09:20 INFO WorkerWebUI: Started WorkerWebUI at http://172.168.12.11:18081

18/10/10 14:09:20 INFO Worker: Connecting to master spark-11:7077...

18/10/10 14:09:28 INFO Worker: Retrying connection to master (attempt # 1)

18/10/10 14:09:28 INFO Worker: Connecting to master spark-11:7077...

18/10/10 14:09:36 INFO Worker: Retrying connection to master (attempt # 2)

18/10/10 14:09:36 INFO Worker: Connecting to master spark-11:7077...

18/10/10 14:09:44 INFO Worker: Retrying connection to master (attempt # 3)

18/10/10 14:09:44 INFO Worker: Connecting to master spark-11:7077...

18/10/10 14:09:52 INFO Worker: Retrying connection to master (attempt # 4)

18/10/10 14:09:52 INFO Worker: Connecting to master spark-11:7077...

18/10/10 14:10:00 INFO Worker: Retrying connection to master (attempt # 5)

18/10/10 14:10:00 INFO Worker: Connecting to master spark-11:7077...

18/10/10 14:10:00 WARN Worker: Failed to connect to master spark-11:7077

java.io.IOException: Failed to connect to spark-11:7077

at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)

at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)

at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)

at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)

at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.nio.channels.UnresolvedAddressException

at sun.nio.ch.Net.checkAddress(Net.java:101)

at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)

at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:209)

at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:207)

at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1097)

at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:471)

at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:456)

at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)

at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:471)

at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:456)

at io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:50)

at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:471)

at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:456)

at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:438)

at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:908)

at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:203)

at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:166)

at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)

at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)

at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)

... 1 more

spark-11 is still the master from before the change. Why didn't my changes take effect? I have already changed spark.master in spark-defaults.conf to the new hostname. Could someone help take a look?
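
The `Caused by: java.nio.channels.UnresolvedAddressException` in the trace indicates that the worker could not resolve the name `spark-11` to an address at all, which is a different failure from a refused connection. A minimal sketch for checking this from a worker node, assuming Python is available there; the function name `unresolved_hosts` and the injectable `resolver` parameter are illustrative and not part of Spark:

```python
import socket


def unresolved_hosts(hosts, port=7077, resolver=socket.getaddrinfo):
    """Return the hosts from `hosts` that the resolver cannot resolve.

    `resolver` defaults to the system resolver (socket.getaddrinfo);
    it is a parameter only so the check can be exercised without DNS.
    """
    failed = []
    for host in hosts:
        try:
            resolver(host, port)
        except socket.gaierror:
            # Name resolution failed, the same condition the JVM reports
            # as UnresolvedAddressException.
            failed.append(host)
    return failed
```

Calling `unresolved_hosts(["spark-11", "spark-11.server"])` on each worker would show which of the two names from the log and the `jps -m` output the machine can actually resolve after the migration.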

Tags: Spark, cluster
2023-08-25 14:14:14
安静的狗粮

One value in spark-env.sh had not been updated.
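
The answer does not say which value was missed. Variables such as `SPARK_MASTER_HOST` (`SPARK_MASTER_IP` in older releases) are plausible candidates, but that is a guess. A rough sketch for finding leftover references to the old hostname in a config directory, assuming the files are plain text; the function name `find_host_references` is hypothetical:

```python
import re
from pathlib import Path


def find_host_references(conf_dir, host):
    """Return (file name, line number, line) for lines that mention `host`.

    Lines starting with '#' are skipped. The lookarounds keep a short name
    such as 'spark-11' from matching inside 'spark-11.server'.
    """
    pattern = re.compile(r"(?<![\w.-])" + re.escape(host) + r"(?![\w.-])")
    hits = []
    for path in sorted(Path(conf_dir).iterdir()):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            stripped = line.strip()
            if stripped.startswith("#"):
                continue
            if pattern.search(stripped):
                hits.append((path.name, lineno, stripped))
    return hits
```

Running `find_host_references("conf", "spark-11")` on every node lists the remaining uncommented occurrences of the old name. Since the workers are started with settings read from `spark-env.sh`, a stale entry there can plausibly win over the updated `spark.master` in `spark-defaults.conf`.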


2023-08-25 16:07:52