HDFS file system cannot be accessed from an external network
阿新 • Published: 2018-06-06
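The local test machine and the servers are not on the same LAN, and the Hadoop configuration files as installed use internal IPs for communication between machines. In this setup we can reach the namenode machine, and the namenode gives us the IP address of the machine that holds the data so that we can use its data transfer service. The address it returns, however, is the datanode's internal IP, and we cannot reach the datanode server at that IP.
The error is as follows: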
2018-06-06 17:01:44,555 [main] WARN [org.apache.hadoop.hdfs.BlockReaderFactory] - I/O error constructing remote block reader.
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3450)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:777)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:694)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:355)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:665)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:874)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:926)
at java.io.DataInputStream.read(DataInputStream.java:149)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at com.feiyangshop.recommendation.HdfsHandler.main(HdfsHandler.java:36)
2018-06-06 17:01:44,560 [main] WARN [org.apache.hadoop.hdfs.DFSClient] - Failed to connect to /192.168.1.219:50010 for block, add to deadNodes and continue. java.net.ConnectException: Connection timed out: no further information
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3450)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:777)
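The address in the warning, 192.168.1.219, lies in a private range, which is why a client outside the LAN times out when it tries to connect. As a small illustration, the Java standard library can confirm this. The class and method names below are made up for this sketch and are not part of Hadoop.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class PrivateAddressCheck {
    /** Returns true if the IP literal is a private (site-local) address such as 192.168.x.x. */
    static boolean isPrivate(String ipLiteral) {
        try {
            // For a numeric IP literal, getByName only parses the text; no DNS lookup is made.
            InetAddress addr = InetAddress.getByName(ipLiteral);
            return addr.isSiteLocalAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IP literal: " + ipLiteral, e);
        }
    }

    public static void main(String[] args) {
        // Address taken from the DFSClient warning above.
        System.out.println(isPrivate("192.168.1.219")); // prints: true
    }
}
```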
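To let the development machine reach HDFS, we can access it by hostname (domain name) instead. Have the namenode return each datanode's hostname, map each datanode's hostname to its external IP in the development machine's hosts file, and add the following code to the program that talks to HDFS: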
import org.apache.hadoop.conf.Configuration;
Configuration conf = new Configuration();
// Connect to datanodes by hostname rather than by the IP the namenode reports
conf.set("dfs.client.use.datanode.hostname", "true");
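Before running the HDFS client, it can help to confirm that the hosts entry works and that the datanode's transfer port is reachable from the development machine. The following is a minimal sketch using only the Java standard library. The hostname `datanode1.example.com` is a placeholder, the class and method names are hypothetical, and port 50010 is taken from the warning in the log above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

class DatanodeReachability {
    /** Resolves the host and tries a TCP connect; returns a short status string instead of throwing. */
    static String probe(String host, int port, int timeoutMillis) {
        InetAddress addr;
        try {
            // Name resolution goes through the system resolver, which consults the hosts file.
            addr = InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            return "UNRESOLVED";
        }
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(addr, port), timeoutMillis);
            return "REACHABLE " + addr.getHostAddress();
        } catch (IOException e) {
            // Covers timeouts and refused connections alike.
            return "UNREACHABLE " + addr.getHostAddress();
        }
    }

    public static void main(String[] args) {
        // Placeholder hostname; pass the real datanode hostname as the first argument.
        String host = args.length > 0 ? args[0] : "datanode1.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50010;
        System.out.println(host + ":" + port + " -> " + probe(host, port, 3000));
    }
}
```

If the result is UNRESOLVED, the hosts entry is probably missing or misspelled. If it is UNREACHABLE and the printed address is still an internal IP, the hosts entry likely maps the hostname to the wrong address.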
Windows bug
There is one more fairly common bug:
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V
at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(Native Method)
at org.apache.hadoop.util.NativeCrc32.verifyChunkedSums(NativeCrc32.java:59)
at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:301)
at org.apache.hadoop.hdfs.RemoteBlockReader2.readNextPacket(RemoteBlockReader2.java:231)
at org.apache.hadoop.hdfs.RemoteBlockReader2.read(RemoteBlockReader2.java:152)
at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:767)
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:823)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:883)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:926)
at java.io.DataInputStream.read(DataInputStream.java:149)
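The files in the bin directory under HADOOP_HOME on Windows are 32-bit builds and should be replaced with 64-bit builds for Windows. I have a Hadoop package compiled for 64-bit Windows; if you do not have one, you can download it from the link below.
It was presumably built for 64-bit Windows 7, but it also works on my 64-bit Windows 10 machine.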
Link: https://pan.baidu.com/s/13Mf3m2fXt0TnXwgsiDejEg Password: pajo
You also need to set the following environment variables in IDEA:
HADOOP_HOME=E:\hadoop\hadoop-2.6.1
PATH=%PATH%;E:\hadoop\hadoop-2.6.1\bin
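A quick way to see whether the run configuration actually picked up these variables is to check them from Java. The sketch below assumes that `winutils.exe` and `hadoop.dll` are the native files that matter in `bin`. That assumption comes from how Hadoop is typically set up on Windows; the original post does not name them. The class and method names are hypothetical. The check only confirms that the files exist, not that they are 64-bit.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class HadoopHomeCheck {
    /** Returns the names of the expected native files that are missing from hadoopHome/bin. */
    static List<String> missingNativeFiles(String hadoopHome) {
        // Assumption: these two files are the ones usually involved in native-library errors on Windows.
        String[] expected = {"winutils.exe", "hadoop.dll"};
        Path bin = Paths.get(hadoopHome, "bin");
        List<String> missing = new ArrayList<>();
        for (String name : expected) {
            if (!Files.isRegularFile(bin.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null || home.isEmpty()) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        System.out.println("HADOOP_HOME=" + home + ", missing from bin: " + missingNativeFiles(home));
    }
}
```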
FIN