Table of Contents
- 1. Starting the Hadoop Cluster
- 2. Using the HDFS Shell
- 3. Using the HDFS Web UI
- 4. Installing the Eclipse IDE
  - 4.1 Uploading a File
  - 4.2 Querying File Locations
  - 4.3 Creating a Directory
  - 4.4 Reading File Contents
  - 4.5 Writing to a File
1. Starting the Hadoop Cluster
Cluster installation guide: https://cloud.tencent.com/developer/article/1872854
Startup commands:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
# The third command above is reported as deprecated; use the following one instead
mapred --daemon start historyserver
2. Using the HDFS Shell
- Create a local folder and two files
[dnn@master ~]$ mkdir /opt/hadoop-3.3.0/HelloHadoop
[dnn@master ~]$ vim /opt/hadoop-3.3.0/HelloHadoop/file1.txt
File contents:
hello hadoop
i am Michael
[dnn@master ~]$ vim /opt/hadoop-3.3.0/HelloHadoop/file2.txt
File contents:
learning BigData
very cool
- Create an HDFS directory
hadoop fs -mkdir -p /InputData
(the -p flag creates intermediate directories as needed)
- Check that the directory was created
[dnn@master ~]$ hadoop fs -ls /
Found 4 items
drwxr-xr-x - dnn supergroup 0 2021-03-13 06:50 /InputData
drwxr-xr-x - dnn supergroup 0 2021-03-12 06:53 /InputDataTest
drwxr-xr-x - dnn supergroup 0 2021-03-12 07:12 /OutputDataTest
drwxrwx--- - dnn supergroup 0 2021-03-12 06:19 /tmp
- Upload and view the files
[dnn@master ~]$ hadoop fs -put /opt/hadoop-3.3.0/HelloHadoop/* /InputData
[dnn@master ~]$ hadoop fs -cat /InputData/file1.txt
hello hadoop
i am Michael
[dnn@master ~]$ hadoop fs -cat /InputData/file2.txt
learning BigData
very cool
- View overall filesystem information with hdfs dfsadmin -report (a Java version of the headline numbers follows the transcript below)
[dnn@master ~]$ hdfs dfsadmin -report
Configured Capacity: 36477861888 (33.97 GB)
Present Capacity: 23138791499 (21.55 GB)
DFS Remaining: 23136948224 (21.55 GB)
DFS Used: 1843275 (1.76 MB)
DFS Used%: 0.01%
Replicated Blocks:
Under replicated blocks: 12
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Low redundancy blocks with highest priority to recover: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (2):
Name: 192.168.253.128:9866 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 929792 (908 KB)
Non DFS Used: 6669701120 (6.21 GB)
DFS Remaining: 11568300032 (10.77 GB)
DFS Used%: 0.01%
DFS Remaining%: 63.43%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 13 06:57:09 CST 2021
Last Block Report: Sat Mar 13 06:49:24 CST 2021
Num of Blocks: 12
Name: 192.168.253.129:9866 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 18238930944 (16.99 GB)
DFS Used: 913483 (892.07 KB)
Non DFS Used: 6669369269 (6.21 GB)
DFS Remaining: 11568648192 (10.77 GB)
DFS Used%: 0.01%
DFS Remaining%: 63.43%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Mar 13 06:57:09 CST 2021
Last Block Report: Sat Mar 13 06:45:42 CST 2021
Num of Blocks: 12
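The headline numbers of this report are also exposed through the Java API that section 4 introduces. A minimal sketch, assuming the hdfs://master:9000 address used later; the class name ClusterStatus is hypothetical:

package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class ClusterStatus {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://master:9000");
        FileSystem fs = FileSystem.get(conf);
        // FsStatus reports sizes in bytes, like the first lines of dfsadmin -report
        FsStatus status = fs.getStatus();
        System.out.println("Configured Capacity: " + status.getCapacity());
        System.out.println("DFS Used: " + status.getUsed());
        System.out.println("DFS Remaining: " + status.getRemaining());
        fs.close();
    }
}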
3. Using the HDFS Web UI
On the HDFS Web UI (the NameNode overview page, served on port 9870 by default in Hadoop 3) you can see that the replication factor is 3 and the block size is 128 MB.
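The same two values can be read through the Java API introduced in section 4. A minimal sketch; the class name ShowDefaults is hypothetical, and the values come from your hdfs-site.xml (here the out-of-the-box defaults of 3 and 128 MB):

package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowDefaults {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://master:9000");
        FileSystem fs = FileSystem.get(conf);
        // Defaults are resolved per path; the root is good enough here
        Path root = new Path("/");
        System.out.println("replication = " + fs.getDefaultReplication(root));
        System.out.println("block size  = " + fs.getDefaultBlockSize(root) / (1024 * 1024) + " MB");
        fs.close();
    }
}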
4. Installing the Eclipse IDE
- Download link
- Installation guide
4.1 Uploading a File
Write the upload code:
package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * @author dnn
 */
public class UploadFile {

    /**
     * @param args
     */
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(conf);
        // Copy the local file into the default filesystem
        Path scr = new Path("/opt/hadoop-3.3.0/HelloHadoop/file1.txt");
        Path dest = new Path("file1.txt");
        hdfs.copyFromLocalFile(scr, dest);
        System.out.println("Upload to " + conf.get("fs.defaultFS"));
        // List what ended up at the destination
        FileStatus files[] = hdfs.listStatus(dest);
        for (FileStatus file : files) {
            System.out.println(file.getPath());
        }
    }
}
Run it. The file was not copied into HDFS:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Upload to file:///
file:/home/dnn/eclipse-workspace/HDFS_example/file1.txt
Check the files in HDFS: there is no file1.txt.
[dnn@master ~]$ hadoop fs -ls /
Found 4 items
drwxr-xr-x - dnn supergroup 0 2021-03-13 06:54 /InputData
drwxr-xr-x - dnn supergroup 0 2021-03-12 06:53 /InputDataTest
drwxr-xr-x - dnn supergroup 0 2021-03-12 07:12 /OutputDataTest
drwxrwx--- - dnn supergroup 0 2021-03-12 06:19 /tmp
The fix: set the default filesystem address.
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://192.168.253.130:9000"); // add this line
FileSystem hdfs = FileSystem.get(conf);
Output: correct now, the file was uploaded to HDFS.
Upload to hdfs://192.168.253.130:9000
hdfs://192.168.253.130:9000/user/dnn/file1.txt
[dnn@master Desktop]$ hadoop fs -ls -R /user
drwxr-xr-x - dnn supergroup 0 2021-03-16 07:43 /user/dnn
-rw-r--r-- 3 dnn supergroup 26 2021-03-16 07:43 /user/dnn/file1.txt
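Hardcoding the NameNode IP works but ties the program to one cluster. A more portable variant, sketched below, loads fs.defaultFS from the cluster's own core-site.xml; the class name UploadFilePortable and the config path /opt/hadoop-3.3.0/etc/hadoop are assumptions based on the install location used above.

package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadFilePortable {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Read fs.defaultFS from the cluster's config instead of hardcoding an IP
        conf.addResource(new Path("/opt/hadoop-3.3.0/etc/hadoop/core-site.xml"));
        FileSystem hdfs = FileSystem.get(conf);
        hdfs.copyFromLocalFile(new Path("/opt/hadoop-3.3.0/HelloHadoop/file1.txt"),
                new Path("file1.txt"));
        System.out.println("Upload to " + conf.get("fs.defaultFS"));
        hdfs.close();
    }
}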
- Running on the cluster: first export a jar file from Eclipse, then run it from the shell:
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_uploadfile.jar com.michael.hdfs.UploadFile
Upload to hdfs://192.168.253.130:9000
hdfs://192.168.253.130:9000/user/dnn/file1.txt
[dnn@master Desktop]$ hadoop fs -ls -R /user
drwxr-xr-x - dnn supergroup 0 2021-03-16 07:59 /user/dnn
-rw-r--r-- 3 dnn supergroup 26 2021-03-16 07:59 /user/dnn/file1.txt
4.2 Querying File Locations
package com.michael.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileLoc {

    public static void main(String[] args) throws IOException {
        String uri = "hdfs://master:9000/user/dnn/file1.txt";
        Configuration conf = new Configuration();
        try {
            FileSystem fs = FileSystem.get(URI.create(uri), conf);
            Path fpath = new Path(uri);
            FileStatus filestatus = fs.getFileStatus(fpath);
            // One BlockLocation per block of the file
            BlockLocation[] blklocations = fs.getFileBlockLocations(filestatus, 0, filestatus.getLen());
            int blockLen = blklocations.length;
            for (int i = 0; i < blockLen; i++) {
                String[] hosts = blklocations[i].getHosts();
                System.out.println("block" + i + "_location:" + hosts[0]);
            }
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_filelocation.jar com.michael.hdfs.FileLoc
block0_location:slave2
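FileLoc prints only hosts[0], the first replica of each block. The small variation below (the class name FileLocAll is hypothetical) prints every replica host. With only two live datanodes and a replication factor of 3, expect two hosts per block; the blocks are under-replicated, which matches the "Under replicated blocks: 12" line in the dfsadmin report above.

package com.michael.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileLocAll {
    public static void main(String[] args) throws IOException {
        String uri = "hdfs://master:9000/user/dnn/file1.txt";
        FileSystem fs = FileSystem.get(URI.create(uri), new Configuration());
        FileStatus status = fs.getFileStatus(new Path(uri));
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (int i = 0; i < blocks.length; i++) {
            // getHosts() returns one hostname per replica of this block
            System.out.println("block" + i + ": " + String.join(", ", blocks[i].getHosts()));
        }
        fs.close();
    }
}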
4.3 Creating a Directory
package com.michael.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreatDir {

    public static void main(String[] args) {
        String uri = "hdfs://master:9000";
        Configuration conf = new Configuration();
        try {
            FileSystem fs = FileSystem.get(URI.create(uri), conf);
            Path dfs = new Path("/test");
            // mkdirs creates intermediate directories as needed, like mkdir -p
            boolean flag = fs.mkdirs(dfs);
            System.out.println(flag ? "create success" : "create failure");
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_mkdir.jar com.michael.hdfs.CreatDir
create success
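mkdirs() also returns true when the directory already exists, so "create success" does not tell the two cases apart. FileSystem offers exists() and delete() for that; a minimal sketch for removing the test directory again (the class name RemoveDir is hypothetical, the URI is the one used above):

package com.michael.hdfs;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoveDir {
    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(URI.create("hdfs://master:9000"), new Configuration());
        Path dir = new Path("/test");
        if (fs.exists(dir)) {
            // The second argument enables recursive deletion
            boolean flag = fs.delete(dir, true);
            System.out.println(flag ? "delete success" : "delete failure");
        } else {
            System.out.println("/test does not exist");
        }
        fs.close();
    }
}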
4.4 Reading File Contents
package com.michael.hdfs;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFile {

    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://master:9000");
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("file1.txt");
            FSDataInputStream getIt = fs.open(file);
            BufferedReader d = new BufferedReader(new InputStreamReader(getIt));
            String content = d.readLine();
            System.out.println(content);
            d.close();
            fs.close();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_readfile.jar com.michael.hdfs.ReadFile
hello hadoop
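readLine() returns a single line, so ReadFile only prints the first line of file1.txt. To print the whole file, loop until readLine() returns null. A sketch (the class name ReadWholeFile is hypothetical), using try-with-resources so the streams are closed even if reading fails:

package com.michael.hdfs;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadWholeFile {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://master:9000");
        try (FileSystem fs = FileSystem.get(conf);
             BufferedReader d = new BufferedReader(
                     new InputStreamReader(fs.open(new Path("file1.txt"))))) {
            String line;
            // readLine() returns null once the end of the file is reached
            while ((line = d.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}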
4.5 Writing to a File
package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteFile {

    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://master:9000");
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
            FileSystem fs = FileSystem.get(conf);
            byte[] buffer = "hello Michael !!!".getBytes();
            String filename = "test_file.txt";
            // create() overwrites the file if it already exists
            FSDataOutputStream os = fs.create(new Path(filename));
            os.write(buffer, 0, buffer.length);
            System.out.println("create: " + filename);
            os.close();
            fs.close();
            // Read the file back to verify the write
            ReadFile r = new ReadFile();
            r.read(filename);
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}
ReadFile, extended with the read(String filename) method that WriteFile calls to verify the result:

package com.michael.hdfs;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadFile {

    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://master:9000");
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("test_file.txt");
            FSDataInputStream getIt = fs.open(file);
            BufferedReader d = new BufferedReader(new InputStreamReader(getIt));
            String content = d.readLine();
            System.out.println(content);
            d.close();
            fs.close();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }

    public void read(String filename) {
        try {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://master:9000");
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
            FileSystem fs = FileSystem.get(conf);
            Path file = new Path(filename);
            FSDataInputStream getIt = fs.open(file);
            BufferedReader d = new BufferedReader(new InputStreamReader(getIt));
            // Print the first line of the given file
            String content = d.readLine();
            System.out.println(content);
            d.close();
            fs.close();
        }
        catch (IOException e) {
            e.printStackTrace();
        }
    }
}
[dnn@master Desktop]$ hadoop jar /home/dnn/eclipse-workspace/HDFS_example/hdfs_writefile.jar com.michael.hdfs.WriteFile
create: test_file.txt
hello Michael !!!
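Note that create() overwrites test_file.txt on every run. To add data to the end of an existing file instead, HDFS supports append(); a minimal sketch (the class name AppendFile and the appended text are made up for illustration):

package com.michael.hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendFile {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://master:9000");
        FileSystem fs = FileSystem.get(conf);
        // Open test_file.txt for appending rather than overwriting
        FSDataOutputStream os = fs.append(new Path("test_file.txt"));
        os.write("\ngoodbye Michael !!!".getBytes());
        os.close();
        fs.close();
    }
}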