
Hadoop 1.2.1 Installation & Troubleshooting

by leanu 2014. 2. 3.

Installation

 

  1. Check the firewalls among the servers. ( Recommended to open all ports among the cluster servers. )
  2. Create the same account ( TEST_ACCOUNT ) on all servers.
  3. Generate an SSH key pair for TEST_ACCOUNT on each server. ( ssh-keygen )
  4. Add the public key of the master server's TEST_ACCOUNT to the TEST_ACCOUNT of each slave server so that the master account can reach every slave through ssh ( FILE : $HOME/.ssh/authorized_keys ). Make sure the permission of the authorized_keys file is "0600", or it does not work. ( A sketch of these two steps is shown after this list. )
  5. Copy hadoop-1.2.1 to the same directory on each server.
  6. Modify $HADOOP/conf/masters & $HADOOP/conf/slaves to list the master and slave servers. Copying the modified files to each server is recommended. ( Hedged samples of both files are shown after this list. )
  7. Modify /etc/hosts so that each server can resolve the hostname of every other server ( sudo is needed ). Using the same content on all servers is recommended. I used hostnames like "hadoop-master1" or "hadoop-slave3". ( A hedged sample of /etc/hosts is shown after this list. )
  8. Configure the Hadoop settings ( $HADOOP/conf/core-site.xml, hdfs-site.xml & mapred-site.xml ) on each server. ( A hedged sample is shown after this list. )
  9. Run the format ( $HADOOP/bin/hadoop namenode -format ) and start ( $HADOOP/bin/start-all.sh ) on the master server.
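
A minimal sketch of steps 3 and 4, run as TEST_ACCOUNT; the slave hostname is one of the illustrative names used in this post.

    # On each server, generate a key pair for TEST_ACCOUNT ( accept the defaults ).
    ssh-keygen -t rsa

    # On the master, append its public key to authorized_keys on a slave
    # and make sure the file permission is 0600.
    cat ~/.ssh/id_rsa.pub | ssh TEST_ACCOUNT@hadoop-slave1 \
        "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"

    # Repeat for the remaining slaves, then verify the password-less login.
    ssh TEST_ACCOUNT@hadoop-slave1 hostname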
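
Hedged samples for steps 6 and 7, based on the two-master / five-slave layout and the hostname style used in this post; the IP addresses are placeholders, not the original values.

SAMPLE : masters

    hadoop-master1
    hadoop-master2

SAMPLE : slaves

    hadoop-slave1
    hadoop-slave2
    hadoop-slave3
    hadoop-slave4
    hadoop-slave5

SAMPLE : /etc/hosts

    192.168.0.11    hadoop-master1
    192.168.0.12    hadoop-master2
    192.168.0.21    hadoop-slave1
    192.168.0.22    hadoop-slave2
    192.168.0.23    hadoop-slave3
    192.168.0.24    hadoop-slave4
    192.168.0.25    hadoop-slave5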
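
For step 8, a minimal sketch of the three main configuration files, assuming the NameNode and JobTracker run on hadoop-master1 and HDFS uses the mounted directories from the Tips section below; the ports and paths are illustrative assumptions, not the original settings.

    <!-- conf/core-site.xml -->
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://hadoop-master1:9000</value>
      </property>
    </configuration>

    <!-- conf/hdfs-site.xml -->
    <configuration>
      <property>
        <name>dfs.name.dir</name>
        <value>/data1/dfs/name,/data2/dfs/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/data1/dfs/data,/data2/dfs/data,/data3/dfs/data</value>
      </property>
    </configuration>

    <!-- conf/mapred-site.xml -->
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>hadoop-master1:9001</value>
      </property>
    </configuration>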

 

 

Tips

 

Format script

I used two masters with two mounted directories ( /data1 & /data2 ) and five slaves with three mounted directories ( /data1 & /data2 & /data3 ). You can change the directories to match your own layout. Place the script in $HADOOP/bin; a hedged sketch is shown below.
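
The sketch below assumes the DFS directories live under a dfs subdirectory of each mount and that conf/slaves lists the slave hostnames; adjust the paths to match your hdfs-site.xml.

    #!/bin/bash
    # Reformat helper: stop the daemons, wipe the HDFS storage directories on the
    # master and on every slave, then reformat the NameNode.

    HADOOP_HOME=$(cd "$(dirname "$0")/.." && pwd)

    # Stop any running daemons before wiping the storage directories.
    "$HADOOP_HOME/bin/stop-all.sh"

    # Clear the NameNode directories on this master ( /data1 & /data2 ).
    rm -rf /data1/dfs /data2/dfs

    # Clear the DataNode directories on every slave ( /data1 & /data2 & /data3 ).
    for slave in $(cat "$HADOOP_HOME/conf/slaves"); do
        ssh "$slave" "rm -rf /data1/dfs /data2/dfs /data3/dfs"
    done

    # Re-create the HDFS filesystem metadata.
    "$HADOOP_HOME/bin/hadoop" namenode -format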

 

 

Troubleshooting

 

1. Unregistered data node

  • This exception occurred because the contents of $HADOOP/conf/masters and $HADOOP/conf/slaves differed between servers. ( A sketch for keeping them in sync is shown below. )
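
A minimal sketch for keeping the two files identical, run from the master; it assumes the same $HADOOP path and TEST_ACCOUNT on every server.

    # Push the master's masters/slaves files to every slave listed in conf/slaves.
    for slave in $(cat $HADOOP/conf/slaves); do
        scp $HADOOP/conf/masters $HADOOP/conf/slaves $slave:$HADOOP/conf/
    done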

 

2. ERROR org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Error getting localhost name. Using 'localhost'...

  • This issue occurs because the server maps its hostname to localhost in "/etc/hosts", "/etc/HOSTNAME", or "rc.conf". There are several ways to resolve it; I chose removing the line which includes localhost from the hostname file. ( An illustrative example is shown below. )
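
An illustrative example of the offending mapping in /etc/hosts and one way to correct it; the IP and hostname are assumptions, and the exact file to edit depends on your distribution.

    # Before : the server's own hostname is aliased to the loopback address.
    127.0.0.1    localhost hadoop-slave3

    # After : keep only localhost on the loopback line and map the hostname to its real IP.
    127.0.0.1    localhost
    192.168.0.23 hadoop-slave3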

 

3. Bad connect ack with firstBadLink as xxx.xxx.xxx:50010

  • There are several possible causes of this issue. In my case, I turned off the firewall ( for example "/etc/init.d/iptables stop" ) to resolve it. ( !!Not recommended, because turning off the firewall is dangerous for servers that can be accessed from outside. Refer to the following site and add the IPs and PORTs to the firewall rules instead; a hedged example follows. – http://blog.cloudera.com/blog/2009/08/hadoop-default-ports-quick-reference/ )
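
A hedged alternative to stopping the firewall: open the DataNode transfer port ( 50010 ) only to the cluster subnet. The 192.168.0.0/24 range is an assumption; substitute your servers' addresses and repeat for the other Hadoop ports listed in the Cloudera reference.

    # Allow cluster hosts to reach the DataNode data-transfer port.
    iptables -I INPUT -p tcp -s 192.168.0.0/24 --dport 50010 -j ACCEPT

    # Persist the rule ( RHEL/CentOS style; other distributions differ ).
    service iptables save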

 
