I had already configured Hadoop 1.x once before, but decided to switch to 2.x in order to use YARN, so this time I'm setting everything up again from scratch.
As before, install with brew:
$ brew install hadoop
Then configure JAVA_HOME (in ~/.bash_profile):
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home
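To apply and check the change, something like the following should do (assuming the export line above was added to ~/.bash_profile):
$ source ~/.bash_profile
$ echo $JAVA_HOME
$ java -version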
Next comes the hosts file:
127.0.0.1       localhost
255.255.255.255 broadcasthost
::1             localhost
fe80::1%lo0     localhost
127.0.0.1       XXX
P.S.: XXX is your account name.
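/etc/hosts can only be edited with root privileges, for example (assuming you use vi):
$ sudo vi /etc/hosts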
Configure SSH
In the Mac's System Preferences, under Sharing, check Remote Login.
Set up passwordless login:
$ ssh localhost
If it asks for a password, run:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Try again and you should now be able to log in without a password.
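If ssh localhost still prompts for a password, overly loose permissions on ~/.ssh are a common culprit; tightening them usually fixes it:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys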
For convenience you can also set the HADOOP_HOME environment variable.
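A rough sketch of what this could look like in ~/.bash_profile; the 2.6.0 version number is only a placeholder, use whatever version brew actually installed (with Homebrew the actual distribution lives under libexec):
export HADOOP_HOME=/usr/local/Cellar/hadoop/2.6.0/libexec
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin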
Go to your hadoop directory.
Edit hadoop/etc/hadoop/hadoop-env.sh
Change JAVA_HOME in it to the same value as in .bash_profile.
Edit hadoop/etc/hadoop/yarn-env.sh
Again, change JAVA_HOME.
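In both files the line to change is the JAVA_HOME export, pointing at the same JDK as in .bash_profile, e.g.:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home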
Edit hadoop/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
  </property>
  <!-- fs.default.name: configures the namenode, i.e. the URL of the HDFS filesystem through which
       its contents are accessed. localhost can also be replaced with this machine's IP address; in
       fully distributed mode it must be the IP of the actual namenode machine. If no port is given,
       the default port 8020 is used. -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/tmp/hadooptmp</value>
  </property>
  <!-- hadoop.tmp.dir: Hadoop's default temporary path. It is best to set this explicitly; if a
       DataNode mysteriously fails to start after adding nodes or in other situations, deleting the
       tmp directory configured here usually fixes it. Note that if you delete this directory on the
       NameNode machine, the NameNode has to be re-formatted. The directory must be created manually
       in advance. -->
  <property>
    <name>hadoop.native.lib</name>
    <value>false</value>
    <description>Should native hadoop libraries, if present, be used.</description>
  </property>
</configuration>
Edit hadoop/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/data</value>
  </property>
  <!-- HDFS storage directory where the datanode keeps its data. -->
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/name</value>
  </property>
  <!-- Stores the namenode's filesystem metadata, including the edit log and the filesystem image.
       If this path is changed, the namenode has to be re-formatted with hadoop namenode -format. -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Number of replicas per block. Since there is only one node, set it to 1; the default is 3. -->
</configuration>
Edit hadoop/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <!-- Configures the jobtracker node. localhost can also be replaced with this machine's IP address;
       in a real distributed setup, change it to the IP of the actual jobtracker machine. -->
  <!--
  <property>
    <name>mapred.map.tasks</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>4</value>
  </property>
  -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!--
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>Master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>Master:19888</value>
  </property>
  -->
</configuration>
Edit hadoop/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>