Hadoop Cluster - Bookmarks and help URLs

1.5. I have a new node I want to add to a running Hadoop cluster; how do I start services on just one node?

This also applies when a machine has crashed and rebooted and needs to rejoin the cluster. You do not need to shut down or restart the entire cluster in this case.

First, add the new node's DNS name to the conf/slaves file on the master node.
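As a minimal sketch of that first step, the helper below appends a host name to the slaves file only if it is not already listed. The function name, the example host name and the use of $HADOOP_HOME are assumptions for illustration. The daemon commands in the comments assume a Hadoop 1.x layout with bin/hadoop-daemon.sh and are run on the new node, not on the master.

```shell
# Append a host to the slaves file only if it is not already listed.
# Both arguments are illustrative; pass your own path and host name.
add_slave_host() {
  slaves_file="$1"
  host="$2"
  # -x matches the whole line, -F treats the host name as a literal string
  if ! grep -qxF "$host" "$slaves_file" 2>/dev/null; then
    echo "$host" >> "$slaves_file"
  fi
}

# On the master node (hypothetical path and host name):
#   add_slave_host "$HADOOP_HOME/conf/slaves" newnode.example.com
#
# Then, on the new node itself, start only that node's daemons
# (assumes a Hadoop 1.x install; adjust for your version):
#   "$HADOOP_HOME"/bin/hadoop-daemon.sh start datanode
#   "$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker
```

Because the helper checks for an existing entry first, it is safe to run again after a node has crashed and rebooted.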

Fine tuning Apache Hadoop Security Settings

Apache Hadoop is equipped with a robust and scalable security infrastructure. These notes are intended to help cluster administrators fine-tune the security settings of their clusters.


Quality of Protection:

Security infrastructure for Hadoop RPC uses Java SASL APIs. Quality of Protection (QOP) settings can be used to enable encryption for Hadoop RPC protocols.

Java SASL provides the following QOP settings:
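For orientation, the standard Java SASL QOP values are auth (authentication only), auth-int (authentication plus integrity) and auth-conf (authentication plus integrity and confidentiality). The sketch below maps them to the values that Hadoop's hadoop.rpc.protection property in core-site.xml accepts. The function name is hypothetical, and you should check the property name and values against the documentation for your Hadoop version.

```shell
# Map a Java SASL QOP value to the corresponding hadoop.rpc.protection value.
# The function name is illustrative and not part of Hadoop.
qop_to_rpc_protection() {
  case "$1" in
    auth)      echo "authentication" ;;  # authentication only
    auth-int)  echo "integrity" ;;       # authentication + integrity checks
    auth-conf) echo "privacy" ;;         # authentication + integrity + encryption
    *)         echo "unknown QOP: $1" >&2; return 1 ;;
  esac
}

# Example: which value enables RPC encryption?
#   qop_to_rpc_protection auth-conf
```

Only the auth-conf level encrypts RPC traffic, so it is the one to choose when the goal is encryption rather than authentication alone.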

Best Practices Selecting Hadoop Hardware

Excerpts of this article are copyright their respective owners. The original article can be found at: Hortonworks

Apache Hadoop worker node hardware at Yahoo!: many nodes have 6*2TB SATA drives, 24GB RAM and 8 cores in a dual-socket configuration. This has proven to be a pretty good configuration. This year, I’ve seen systems with 12*2TB SATA drives, 48GB RAM and 8 cores in a dual-socket configuration. We will see a move to 3TB drives this year.

Setting up Apache Hadoop on RHEL6/CentOS 6

Setting up Apache Hadoop on RHEL6/CentOS 6 is simple. The recent availability of RPMs for Apache Hadoop makes it much easier to set up a basic Hadoop cluster. This will allow you to focus on how to use the features instead of having to learn how they were implemented.

These instructions DO NOT tune Hadoop settings to make Hadoop fast, but they will get you running a Hadoop cluster fast. We will leave Hadoop optimization for another day.

Installing Java JRE and JDK version 1.7 (1.6) on CentOS 6 and configuring the system alternatives for "java"

You should simply use the packages from Sun.

Step (1) : Visit Sun’s web site and download the latest version of Java (the *.bin file not the *-rpm.bin) (http://java.sun.com/javase/downloads/index.jsp)

(pay close attention if you want the 32bit or 64bit version)

For example, the 64-bit version of the JDK can be found at http://download.oracle.com/otn-pub/java/jdk/7/jdk-7-linux-x64.rpm
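To decide between the 32-bit and 64-bit download, you can check the machine architecture first. The sketch below maps the output of uname -m to an architecture label. The function name is hypothetical. The "x64" label appears in the URL above, while "i586" as the 32-bit label is an assumption based on Oracle's usual file naming.

```shell
# Print the JDK architecture label for a machine type.
# With no argument, the current machine's `uname -m` output is used.
jdk_arch_label() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64) echo "x64" ;;    # 64-bit download
    i?86)         echo "i586" ;;   # 32-bit download (assumed label)
    *)            echo "unsupported architecture" >&2; return 1 ;;
  esac
}

# Example: show the label for this machine
#   jdk_arch_label
```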

Configure Sendmail as a Smarthost and rewriting the from-address

  1. Install both the sendmail and sendmail-cf packages on your Red Hat/CentOS server. Use the command: yum install sendmail sendmail-cf
  2. To start sendmail automatically on system boot, issue the command: chkconfig sendmail on
  3. To start sendmail NOW, issue the command: service sendmail start
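The three steps above can be collected into a small script. This is a sketch, not a tested installer: the run and setup_sendmail names and the DRY_RUN variable are assumptions added for illustration, so that the commands can be previewed before running them as root.

```shell
# Run a command, or just print it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Install sendmail, enable it at boot, and start it now.
# Each step runs only if the previous one succeeded.
setup_sendmail() {
  run yum install sendmail sendmail-cf &&
  run chkconfig sendmail on &&
  run service sendmail start
}

# Preview the commands without changing the system:
#   DRY_RUN=1; setup_sendmail
```

Without DRY_RUN set, yum will still prompt for confirmation, exactly as it does when the command is typed by hand.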

Creating a Centralized Syslog Server

A centralized syslog server was one of the first true SysAdmin tasks that I was given as a Linux Administrator way back in 1997. My boss at the time wanted to pull in log files from various appliances and have me use regexp to search them for certain key words. At the time Linux was still in its infancy, and I had just been dabbling with it in my free time. So, I jumped at the chance to introduce Linux to the company I worked for. Did it work? You bet it did!

How to Fix a "JVM terminated. Exit code=13" Error in Eclipse

    • First, realize that this error occurs when you attempt to start Eclipse with the wrong version of the Java Virtual Machine (JVM). So you need to find out which JVM Eclipse is starting with.
    • If you are using Linux, type "which java" on the command line to see which java binary is first on your PATH. On any platform, type "java -version" to print the Java version.
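One common form of "wrong JVM" is a 32-bit versus 64-bit mismatch between Eclipse and Java; treating that as the cause here is an assumption, not something the steps above state. The sketch below reads the output of java -version and reports the bitness, relying on the fact that 64-bit HotSpot and OpenJDK builds include "64-Bit" in their VM line. The function name is hypothetical.

```shell
# Read `java -version` output on stdin and print 64 or 32.
# 64-bit HotSpot/OpenJDK builds print "64-Bit" in the VM line;
# anything else is treated as 32-bit here.
jvm_bitness() {
  if grep -q '64-Bit'; then
    echo 64
  else
    echo 32
  fi
}

# java -version writes to stderr, so redirect it into the pipe:
#   java -version 2>&1 | jvm_bitness
```

Compare the result with the Eclipse build you downloaded; the two should match.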
