1. General operational expertise: strong troubleshooting skills and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, operating systems, storage, and networking.
2. Hadoop ecosystem skills such as HDFS, YARN + MapReduce2, Hive, HBase, Pig, Sqoop, Oozie, ZooKeeper, Flume, Ambari, Kafka, Knox, Slider, Solr, Spark, etc.
3. The most essential requirements: the ability to deploy a Hadoop cluster, add and remove nodes, keep track of jobs, and monitor the critical parts of the cluster.
4. The ability to configure NameNode high availability, and to schedule, configure, and take backups.
5. Readiness to work on a 24/7 on-call basis.
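The NameNode high-availability requirement in item 4 can be illustrated with a minimal hdfs-site.xml sketch. The nameservice ID `mycluster`, the NameNode IDs `nn1`/`nn2`, and all hostnames below are assumptions for illustration, not values from any particular cluster:

```xml
<!-- hdfs-site.xml (excerpt) - hypothetical hosts for illustration only -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>nn2.example.com:8020</value>
</property>
<!-- Shared edit log kept on a JournalNode quorum -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
<!-- Lets HDFS clients discover which NameNode is currently active -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```

Automatic failover additionally requires a ZooKeeper quorum configured in core-site.xml and ZKFC processes on each NameNode host.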
1. Good knowledge of Linux, as Hadoop runs on Linux.
2. Familiarity with open-source configuration management and deployment tools such as Ambari, and with Linux scripting.
3. Knowledge of troubleshooting core Java applications is a plus.
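The Linux-scripting familiarity mentioned above can be sketched with a small node health-check script of the kind an administrator might cron on each worker. The 80% threshold is an assumption; a real check would also probe the Hadoop daemons themselves:

```shell
#!/usr/bin/env bash
# Hypothetical disk health-check sketch for a cluster node.
# Warns when any mounted filesystem exceeds a usage threshold
# (the 80% figure is an assumption, not a Hadoop default).
set -u
THRESHOLD=80

# df -P gives POSIX-stable columns; column 5 is Use%, column 6 is the mount point.
df -P | awk -v t="$THRESHOLD" '
  NR > 1 {
    use = $5
    sub(/%/, "", use)          # strip the % sign so we can compare numerically
    if (use + 0 >= t)
      printf "WARN: %s is at %s%% capacity\n", $6, use
  }'

echo "disk check complete"
```

Using `df -P` keeps the output parseable even when device names are long enough to wrap in the default format.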