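Why does a Spark cluster keep connecting to the old Master after its IP address was changed?
The Spark cluster was migrated and its IP addresses changed. I updated four configuration files under conf (hdfs-site.xml, slaves, spark-defaults.conf, and spark-env.sh) and two under hadoopconf (hdfs-site.xml and yarn-site.xml). After restarting, jps -m shows these processes: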
187466 Worker --webui-port 18081 spark://spark-11.server:7077
179369 Worker --webui-port 18083 spark://spark-11.server:7077
174577 Jps -m
172118 Worker --webui-port 18082 spark://spark-11.server:7077
170921 Worker --webui-port 18084 spark://spark-11.server:7077
177586 Master --ip spark-11.server --port 7077 --webui-port 18080
But after a while the processes disappear.
The log shows this error:
18/10/10 14:09:20 INFO Utils: Successfully started service 'WorkerUI' on port 18081.
18/10/10 14:09:20 INFO WorkerWebUI: Started WorkerWebUI at http://172.168.12.11:18081
18/10/10 14:09:20 INFO Worker: Connecting to master spark-11:7077...
18/10/10 14:09:28 INFO Worker: Retrying connection to master (attempt # 1)
18/10/10 14:09:28 INFO Worker: Connecting to master spark-11:7077...
18/10/10 14:09:36 INFO Worker: Retrying connection to master (attempt # 2)
18/10/10 14:09:36 INFO Worker: Connecting to master spark-11:7077...
18/10/10 14:09:44 INFO Worker: Retrying connection to master (attempt # 3)
18/10/10 14:09:44 INFO Worker: Connecting to master spark-11:7077...
18/10/10 14:09:52 INFO Worker: Retrying connection to master (attempt # 4)
18/10/10 14:09:52 INFO Worker: Connecting to master spark-11:7077...
18/10/10 14:10:00 INFO Worker: Retrying connection to master (attempt # 5)
18/10/10 14:10:00 INFO Worker: Connecting to master spark-11:7077...
18/10/10 14:10:00 WARN Worker: Failed to connect to master spark-11:7077
java.io.IOException: Failed to connect to spark-11:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622)
at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:209)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:207)
at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1097)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:471)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:456)
at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:471)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:456)
at io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:50)
at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:471)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:456)
at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:438)
at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:908)
at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:203)
at io.netty.bootstrap.Bootstrap$2.run(Bootstrap.java:166)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
spark-11 is the master's hostname from before the change. Why did my changes not take effect? I have already set spark.master in spark-defaults.conf to the new hostname. Could anyone help take a look?
One value in spark-env.sh had not been changed.
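
A leftover like this is easy to miss by eye, so one way to catch it is to search the config directories for the old hostname. The following is only a rough sketch: the function name find_stale_host is made up for illustration, and the post does not say which spark-env.sh value was stale (a master host variable such as SPARK_MASTER_IP or SPARK_MASTER_HOST would be a plausible candidate, but that is a guess). Because the old name here (spark-11) is a prefix of the new one (spark-11.server), lines that contain the new name are filtered out.

```shell
#!/usr/bin/env bash
# Sketch: list config lines that still mention the old master hostname.
# find_stale_host is a hypothetical helper, not part of Spark or Hadoop.
# Usage: find_stale_host OLD_HOST NEW_HOST DIR...
find_stale_host() {
  local old_host="$1" new_host="$2"
  shift 2
  # -r: recurse into directories, -n: print line numbers,
  # -F: treat the hostname as a literal string, not a regex.
  # The second grep drops lines that contain the new hostname, since
  # the old name may be a prefix of it. Limitation: a line (or file
  # path) that contains both names is hidden as well.
  grep -rnF -- "$old_host" "$@" 2>/dev/null | grep -vF -- "$new_host"
}
```

It could be called as `find_stale_host spark-11 spark-11.server "$SPARK_HOME/conf" "$HADOOP_CONF_DIR"`, where both paths are assumptions and should point at the real config directories on every node. Any line it prints is a place where the old name survives. The UnresolvedAddressException in the log also suggests that spark-11 no longer resolves on the Worker's host, so a check such as `getent hosts spark-11` on that node may help confirm it.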