Example cluster layout:
| IP (Hostname) | Daemons |
| --- | --- |
| 192.168.140.130 (hadoop01) | Nn Dn Snn Nm |
| 192.168.140.131 (hadoop02) | Dn Nm Rm |
| 192.168.140.132 (hadoop03) | Dn Nm |
Node types (Nn = NameNode, Dn = DataNode, Snn = SecondaryNameNode, Nm = NodeManager, Rm = ResourceManager) are covered in: Hadoop-2.节点
The following steps must be carried out on all three machines:
- Synchronize the system clocks (a sketch follows).
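A minimal sketch, assuming CentOS/RHEL 9 hosts (which the JAVA_HOME path later in this guide suggests), where chrony is the stock NTP client; package and service names differ on other distributions:

    # Install and start the chrony NTP daemon on each node (assumption: dnf-based system)
    dnf install -y chrony
    systemctl enable --now chronyd
    # Confirm the clock is being synchronized against an upstream source
    chronyc tracking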
- Edit the hosts file:

      vi /etc/hosts

  and add the cluster entries:

      192.168.140.130 hadoop01
      192.168.140.131 hadoop02
      192.168.140.132 hadoop03
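A quick check, from each machine, that the new names resolve:

    ping -c 1 hadoop02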
- Turn off the firewall (or open the Hadoop ports instead; see the sketch below):

      systemctl stop firewalld.service
      systemctl disable firewalld.service
      firewall-cmd --state
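If disabling the firewall is not acceptable, opening Hadoop's ports is an alternative. A non-exhaustive sketch covering only the ports named in this guide; a real cluster also needs the internal RPC/HTTP ports of the DataNodes and NodeManagers open, which is why the guide simply disables firewalld:

    # Assumption: firewalld stays running; these are only the ports this guide configures
    firewall-cmd --permanent --add-port=9000/tcp    # fs.defaultFS (NameNode RPC)
    firewall-cmd --permanent --add-port=9870/tcp    # NameNode web UI (Hadoop 3)
    firewall-cmd --permanent --add-port=50090/tcp   # SecondaryNameNode HTTP
    firewall-cmd --permanent --add-port=8088/tcp    # ResourceManager web UI
    firewall-cmd --reload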
- Change the hostname:

      vi /etc/hostname

  setting it to hadoop01, hadoop02, and hadoop03 on the respective machines.
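On systemd distributions the same change can be made without editing the file; run on each machine with its own name:

    # Example for the first node; use hadoop02/hadoop03 on the others
    hostnamectl set-hostname hadoop01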
- Configure passwordless SSH login (mainly for the Nn and Rm nodes, whose start scripts log in to the workers):

      ssh-keygen -t rsa
      ssh-copy-id hadoop01
      ssh-copy-id hadoop02
      ssh-copy-id hadoop03
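A quick way to confirm the keys took effect; each command should print the remote hostname without asking for a password:

    ssh hadoop02 hostname
    ssh hadoop03 hostname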
- Install Java and configure its environment variables (see: Linux-yum安装jdk11, and the sketch below).
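A minimal sketch of the yum/dnf route, assuming OpenJDK 11 is in the default repositories; exact package names vary by distribution, and the referenced article covers the details:

    # Install OpenJDK 11 including javac (assumption: CentOS/RHEL 9 with dnf)
    dnf install -y java-11-openjdk-devel
    # Verify the install; the JAVA_HOME set later points at this JVM
    java -version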
- Extract the Hadoop tarball (create /opt/hadoop first if it does not exist):

      tar -zxvf hadoop-3.3.5.tar.gz -C /opt/hadoop
- Edit the configuration files under /opt/hadoop/hadoop-3.3.5/etc/hadoop/:

  - hadoop-env.sh:

        export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.18.0.10-2.el9_1.x86_64
  - core-site.xml (specifies the Nn location, i.e. hadoop01):

        <configuration>
          <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop01:9000</value>
          </property>
          <property>
            <name>hadoop.tmp.dir</name>
            <value>/tmp/hadoop</value>
          </property>
        </configuration>
  - hdfs-site.xml (3 replicas per file, and the Snn location, i.e. hadoop01):

        <configuration>
          <property>
            <name>dfs.replication</name>
            <value>3</value>
          </property>
          <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hadoop01:50090</value>
          </property>
        </configuration>
  - mapred-site.xml (run MapReduce jobs on YARN):

        <configuration>
          <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
          </property>
        </configuration>
  - yarn-site.xml (specifies the Rm location, i.e. hadoop02, and the shuffle service):

        <configuration>
          <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>hadoop02</value>
          </property>
          <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
          </property>
        </configuration>
  - workers (called slaves in Hadoop 2), one worker hostname per line:

        hadoop01
        hadoop02
        hadoop03
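Once the environment variables below are configured, a quick way to confirm Hadoop picks these files up:

    # Query the effective configuration; should print hdfs://hadoop01:9000
    hdfs getconf -confKey fs.defaultFS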
- Format the NameNode (run once, on the Nn node, i.e. hadoop01):

      hdfs namenode -format

  (or the older form: hadoop namenode -format)

- (Optional) Since the configuration is identical on every node, configure one machine and copy the finished directory to the others:

      scp -r /opt/hadoop/hadoop-3.3.5 root@hadoop02:/opt/hadoop/hadoop-3.3.5
      scp -r /opt/hadoop/hadoop-3.3.5 root@hadoop03:/opt/hadoop/hadoop-3.3.5
- Configure the environment variables:

      vi /etc/profile

      export HADOOP_HOME=/opt/hadoop/hadoop-3.3.5
      export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

      source /etc/profile
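A quick sanity check that the variables took effect:

    # Should print the Hadoop version banner (3.3.5) if PATH is correct
    hadoop version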
- Start the cluster (if the scripts are run as root, see the note below):

      start-dfs.sh    (run on the Nn node, i.e. hadoop01)
      start-yarn.sh   (run on the Rm node, i.e. hadoop02)
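Note: Hadoop 3's start scripts refuse to run as root unless the service users are declared. Since the scp commands above use the root account, the following may be needed; a sketch, typically added to hadoop-env.sh:

    # Assumption: all daemons run as root (acceptable for a test cluster, not production)
    export HDFS_NAMENODE_USER=root
    export HDFS_DATANODE_USER=root
    export HDFS_SECONDARYNAMENODE_USER=root
    export YARN_RESOURCEMANAGER_USER=root
    export YARN_NODEMANAGER_USER=root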
- Web UI ports:
  - Nn node (hadoop01): 50070 (Hadoop 2) / 9870 (Hadoop 3)
  - Rm node (hadoop02): 8088
- Test:
  - Run jps on each node to check that the expected daemons started (see the sketch below).
  - Upload a test file:

        hdfs dfs -put /path/to/file /target/path

  - Open hadoop01:9870, then Utilities -> Browse the file system -> Go!, and check that the uploaded file shows 3 replicas in the Replication column.
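Based on the role table at the top, the daemons jps should report on each node are roughly:

    [root@hadoop01 ~]# jps   # expect NameNode, DataNode, SecondaryNameNode, NodeManager (plus Jps)
    [root@hadoop02 ~]# jps   # expect ResourceManager, DataNode, NodeManager (plus Jps)
    [root@hadoop03 ~]# jps   # expect DataNode, NodeManager (plus Jps)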
- Test the pi example (if it fails with a ClassNotFoundException, see the article Hadoop-ClassNotFoundException):

      hadoop jar /opt/hadoop/hadoop-3.3.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar pi 10 10
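If the job completes, the last lines of the output should include an estimate along these lines (with pi 10 10, i.e. 10 maps of 10 samples each, the estimate comes out coarse):

    Job Finished in ... seconds
    Estimated value of Pi is 3.20000000000000000000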