Hadoop Developer

2017-12-16 / Houston, TX

- Design, develop, and test complex MapReduce programs in Java for data analysis on different data formats, using compression codecs such as Snappy, Gzip, or LZO as requirements dictate in order to conserve storage space on the Hadoop cluster (a driver sketch follows this list).
- Import data from Oracle DB into HDFS using Apache Sqoop, automating the process with shell scripts.
- Use Hive external tables to bring the data into the Hive metastore, applying optimization techniques such as bucketing and partitioning to reduce the run time of Hive queries (a DDL sketch follows this list).
- Deep exposure to a Kerberized Hadoop ecosystem, including HDFS, Spark, Sqoop, Flume, Hive, Impala, MapReduce, Sentry, and Navigator.
- Handle data from different data sets, joining and pre-processing them using Hive join operations.
- Experience with Scala or Java Maven projects, in addition to exposure to web application servers, is preferred.
- Stage data in Hive tables to perform transformations, cleaning, and filtering on the imported data using Hive and MapReduce, then load the final data into HDFS.
- Must have experience with ETL, UNIX shell scripting, and the UNIX OS.
- Implement business logic by writing Pig scripts, Pig UDFs, and macros, drawing on Piggybank and other UDF sources (a UDF sketch follows this list).
- Provide architecture and technology support across a range of technologies including, but not limited to, cloud, mobile/web frameworks, .NET, SQL, Tableau, Informatica, TFS, and collaboration platforms.
- Unit test all Pig scripts in local mode using PigUnit (a test sketch follows this list).
- Keep up to date on new technologies, standards, protocols, and tools in areas related to the rapidly changing digital environment.
- Develop Oozie workflows for scheduling and orchestrating the ETL process (a submission sketch follows this list).
- Work with our technology and business partners to develop and refine our software delivery process.
- Monitor the Hadoop cluster using Cloudera Manager.
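By way of illustration, a minimal sketch of such a MapReduce job in Java with Snappy output compression enabled; the class names, job name, and token-counting logic are hypothetical stand-ins for the actual analysis.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class AnalysisDriver {

        public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                // Tokenize each input line and emit (token, 1) pairs.
                for (String tok : value.toString().split("\\s+")) {
                    if (!tok.isEmpty()) {
                        word.set(tok);
                        ctx.write(word, ONE);
                    }
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                // Sum the counts for each token.
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "token-count");
            job.setJarByClass(AnalysisDriver.class);
            job.setMapperClass(TokenMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            // Compress the final output with Snappy to conserve HDFS space;
            // a Gzip or LZO codec class can be substituted the same way.
            FileOutputFormat.setCompressOutput(job, true);
            FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }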
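A sketch of the Hive external-table staging described above, issued over JDBC to HiveServer2; the endpoint, credentials, table name, columns, and bucket count are hypothetical.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class HiveTableSetup {
        public static void main(String[] args) throws Exception {
            // HiveServer2 endpoint, database, and credentials are hypothetical.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hive-host:10000/analytics", "etl_user", "");
                 Statement stmt = conn.createStatement()) {

                // External table over files imported by Sqoop. Partitioning by
                // load date and bucketing by customer_id reduce query run time
                // through partition pruning and bucketed joins.
                stmt.execute(
                    "CREATE EXTERNAL TABLE IF NOT EXISTS orders ("
                    + " order_id BIGINT, customer_id BIGINT, amount DOUBLE)"
                    + " PARTITIONED BY (load_date STRING)"
                    + " CLUSTERED BY (customer_id) INTO 32 BUCKETS"
                    + " STORED AS ORC"
                    + " LOCATION '/data/imported/orders'");
            }
        }
    }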
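A minimal example of the kind of Java Pig UDF mentioned above; the trim-and-uppercase transform is a hypothetical stand-in for real business logic.

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Normalizes a string field; invoked from a Pig script after REGISTERing
    // the jar, the same way Piggybank functions are used.
    public class NormalizeCode extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;  // propagate nulls rather than failing the task
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }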
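A PigUnit sketch matching the testing duty above; PigUnit runs the script in Pig's local mode by default. The script path, alias, and expected tuple are hypothetical.

    import org.apache.pig.pigunit.PigTest;
    import org.junit.Test;

    public class CleanOrdersScriptTest {
        @Test
        public void producesExpectedOutput() throws Exception {
            // Execute the script locally and compare the named alias's
            // contents against the expected tuples.
            PigTest test = new PigTest("src/main/pig/clean_orders.pig");
            test.assertOutput("cleaned", new String[] { "(1001,42,19.99)" });
        }
    }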
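A sketch of submitting such an Oozie workflow from Java through the Oozie client API; the server URL, HDFS application path, and property values are hypothetical.

    import java.util.Properties;
    import org.apache.oozie.client.OozieClient;

    public class SubmitEtlWorkflow {
        public static void main(String[] args) throws Exception {
            // Oozie server URL and workflow application path are hypothetical.
            OozieClient oozie = new OozieClient("http://oozie-host:11000/oozie");
            Properties conf = oozie.createConfiguration();
            conf.setProperty(OozieClient.APP_PATH, "hdfs://nameservice/apps/etl-workflow");
            conf.setProperty("nameNode", "hdfs://nameservice");
            conf.setProperty("jobTracker", "yarn-rm:8032");

            // Submits and starts the workflow; returns the Oozie job id.
            String jobId = oozie.run(conf);
            System.out.println("Started Oozie workflow " + jobId);
        }
    }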
