HDFS的Java客戶端編寫
阿新 • • 發佈:2019-01-14
總結: 之前在教材上看hdfs的Java客戶端編寫,只有關鍵程式碼,呵呵……。閒話不說,上正文。
1. Hadoop 的Java客戶端編寫建議在linux系統上開發
2. 可以使用eclipse,idea 等IDE工具,目前比較流行的是idea
3. 新建專案之後需要新增很多jar包,win,linux下新增jar方式略有不同
4. 使用程式碼會出現檔案格式不認識,許可權等問題。
具體:
1.首先測試從hdfs中下載檔案:
下載檔案的程式碼:(將hdfs://localhost:9000/jdk-7u65-linux-i586.tar.gz檔案下載到本地/opt/download/doload.tgz)
package cn.qlq.hdfs; import java.io.FileOutputStream; import java.io.IOException; import org.apache.commons.compress.utils.IOUtils; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FSDataInputStream; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; public class HdfsUtil { public static void main(String a[]) throws IOException { //to upload a file Configuration conf = new Configuration(); FileSystem fs = FileSystem.get(conf); Path path = new Path("hdfs://localhost:9000/jdk-7u65-linux-i586.tar.gz"); FSDataInputStream input = fs.open(path); FileOutputStream output = new FileOutputStream("/opt/download/doload.tgz"); IOUtils.copy(input, output); } }
直接執行報錯:
原因是程式不認識 hdfs://localhost:9000/jdk-7u65-linux-i586.tar.gz 這樣的目錄
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/jdk-7u65-linux-i586.tar.gz, expected: file:/// at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643) at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79) at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137) at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764) at cn.qlq.hdfs.HdfsUtil.main(HdfsUtil.java:21)
解決辦法:
- 第一種: 將hadoop安裝目錄下的etc目錄中的core-site.xml拷貝到eclipse的src目錄下。這樣就不會報錯
執行結果:
[[email protected] download]# ll total 140224 -rw-r--r--. 1 root root 143588167 Apr 20 05:55 doload.tgz [[email protected] download]# pwd /opt/download [[email protected] download]# ll total 140224 -rw-r--r--. 1 root root 143588167 Apr 20 05:55 doload.tgz
- 第二種:直接在程式中修改
我們先檢視hdfs-site.xml中的內容:
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://localhost:9000</value> </property> <property> <name>hadoop.tmp.dir</name> <value>/opt/hadoop/hadoop-2.4.1/data/</value> </property> </configuration>
程式碼改為:
public static void main(String a[]) throws IOException { //to upload a filed Configuration conf = new Configuration(); //set hdfs root dir conf.set("fs.defaultFS", "hdfs://localhost:9000"); FileSystem fs = FileSystem.get(conf); Path path = new Path("hdfs://localhost:9000/jdk-7u65-linux-i586.tar.gz"); FSDataInputStream input = fs.open(path); FileOutputStream output = new FileOutputStream("/opt/download/doload.tgz"); IOUtils.copy(input, output); }
2.下面程式碼演示了hdfs的基本操作:
package cn.qlq.hdfs; import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io.FileOutputStream; import java.io.IOException; import java.net.URI; import java.net.URISyntaxException; import org.apache.commons.compress.utils.IOUtils; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FSDataInputStream; import org.apache.hadoop.fs.FSDataOutputStream; import org.apache.hadoop.fs.FileStatus; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.LocatedFileStatus; import org.apache.hadoop.fs.Path; import org.apache.hadoop.fs.RemoteIterator; import org.junit.Before; import org.junit.Test; public class HdfsUtil { private FileSystem fs = null; @Before public void befor() throws IOException, InterruptedException, URISyntaxException{ //讀取classpath下的xxx-site.xml 配置檔案,並解析其內容,封裝到conf物件中 Configuration conf = new Configuration(); //也可以在程式碼中對conf中的配置資訊進行手動設定,會覆蓋掉配置檔案中的讀取的值 conf.set("fs.defaultFS", "hdfs://localhost:9000/"); //根據配置資訊,去獲取一個具體檔案系統的客戶端操作例項物件 fs = FileSystem.get(new URI("hdfs://localhost:9000/"),conf,"root"); } /** * 上傳檔案,比較底層的寫法 * * @throws Exception */ @Test public void upload() throws Exception { Path dst = new Path("hdfs://localhost:9000/aa/qingshu.txt"); FSDataOutputStream os = fs.create(dst); FileInputStream is = new FileInputStream("/opt/download/haha.txt"); IOUtils.copy(is, os); } /** * 上傳檔案,封裝好的寫法 * @throws Exception * @throws IOException */ @Test public void upload2() throws Exception, IOException{ fs.copyFromLocalFile(new Path("/opt/download/haha.txt"), new Path("hdfs://localhost:9000/aa/qingshu2.txt")); } /** * download file * @throws IOException */ @Test public void download() throws IOException { Path path = new Path("hdfs://localhost:9000/jdk-7u65-linux-i586.tar.gz"); FSDataInputStream input = fs.open(path); FileOutputStream output = new FileOutputStream("/opt/download/doload.tgz"); IOUtils.copy(input, output); } /** * 下載檔案 * @throws Exception * @throws IllegalArgumentException */ @Test public void download2() throws Exception { fs.copyToLocalFile(new Path("hdfs://localhost:9000/aa/qingshu2.txt"), new Path("/opt/download/haha2.txt")); } /** * 檢視檔案資訊 * @throws IOException * @throws IllegalArgumentException * @throws FileNotFoundException * */ @Test public void listFiles() throws FileNotFoundException, IllegalArgumentException, IOException { // listFiles列出的是檔案資訊,而且提供遞迴遍歷 RemoteIterator<LocatedFileStatus> files = fs.listFiles(new Path("/"), true); while(files.hasNext()){ LocatedFileStatus file = files.next(); Path filePath = file.getPath(); String fileName = filePath.getName(); System.out.println(fileName); } System.out.println("---------------------------------"); //listStatus 可以列出檔案和資料夾的資訊,但是不提供自帶的遞迴遍歷 FileStatus[] listStatus = fs.listStatus(new Path("/")); for(FileStatus status: listStatus){ String name = status.getPath().getName(); System.out.println(name + (status.isDirectory()?" is dir":" is file")); } } /** * 建立資料夾 * @throws Exception * @throws IllegalArgumentException */ @Test public void mkdir() throws IllegalArgumentException, Exception { fs.mkdirs(new Path("/aaa/bbb/ccc")); } /** * 刪除檔案或資料夾 * @throws IOException * @throws IllegalArgumentException */ @Test public void rm() throws IllegalArgumentException, IOException { fs.delete(new Path("/aa"), true); } }