120 Companies Hiring Hadoop Developers
There has been consistent growth in the demand for people who know Big Data, Hadoop, and related technologies. Much of this demand comes from organizations whose data processing environments are already mature, such as data warehousing and data integration, and that now want to move some of those workloads to Hadoop.
Here are a few of the many job titles that you can apply for in the Hadoop ecosystem:
- Hadoop Developer
- Hadoop Administrator
- Hadoop Tester
- Hadoop Lead Developer
- Hadoop Architect
- Data Scientist
- Hadoop Engineer
Must-have Hadoop Developer Skills
Although there are a number of Hadoop Developer skills that are in demand today, here are a few that you cannot ignore:
- Knowledge of the Hadoop ecosystem and its components
- Ability to write manageable, reliable, and high-performance code
- Expert knowledge of Hadoop, Pig, HBase, and Hive
- Working experience with HQL (Hive Query Language)
- Experience in writing MapReduce jobs and Pig Latin scripts (see the MapReduce sketch after this list)
- Hands-on experience in backend programming using Java, OOAD, JavaScript, and Node.js
- Good knowledge of multi-threading and concurrency
- Analytical and problem-solving skills, and the ability to apply them to Big Data problems
- Good understanding of data loading tools like Flume
- In-depth knowledge of database principles, structures, practices, and theories
- Knowledge of schedulers
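To give a concrete sense of what "writing MapReduce jobs" involves, here is a minimal word-count sketch using the standard Hadoop `org.apache.hadoop.mapreduce` Java API. It is illustrative only: the class names, job name, and the convention of passing HDFS input and output paths as command-line arguments are assumptions for this sketch, not requirements from any particular employer.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts emitted for each word
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // combiner reduces shuffle traffic
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input path (assumed)
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output path (assumed)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

A job like this is typically packaged into a JAR and submitted with the `hadoop jar` command; the output directory must not already exist in HDFS.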
Here is the list of companies hiring Hadoop Developers and Hadoop Engineers.
# | Company | Business | Technical Specs | Uses
1 | Facebook | Social site | 8 cores and 12 TB of storage | Used as a source for reporting and machine learning
2 | Twitter | Social site | | Stores and processes tweets and log files
3 | LinkedIn | Social site | 2X4 and 2X6 cores – 6X2TB SATA | Discovering people you may know
4 | Yahoo! | Online Portal | 4500 nodes – 1TB storage, 16 GB RAM | Used for scaling tests |
5 | AOL | Online portal | Dual-processor machines | ETL-style processing and statistics generation
6 | eBay | Ecommerce | 532-node cluster | Used for Search Optimization and Research
7 | Alibaba | E-Commerce | 15-node cluster | Processes business data for its vertical search engine
8 | Cloudspace | IT developer | Specializes in designing and building web applications | |
9 | FOX Audience Network | Online ad network | 30-70 machine clusters | Used for log analysis and machine learning
10 | Adobe | Publishing and editing software | 30 nodes running HDFS, 5 to 14 nodes HBase | Used in several areas, from social services to structured data storage
11 | Infosys | IT Consulting | Per client requirements | Client projects in finance, telecom and retail. |
12 | Cognizant | IT Consulting | Per client requirements | Client projects in finance, telecom and retail. |
13 | Accenture | IT Consulting | Per client requirements | Client projects in finance, telecom and retail. |
14 | Hulu | Video Delivery | 13 machine clusters – 8 cores, 4 TB | Used for analysis and log storage |
15 | Last.fm | Online FM Music | 100 nodes, 8 TB storage | Calculation of charts and data testing |
16 | IMVU | Social Games | Clusters up to 4 m1.large EC2 instances – 5TB volume | Informs product development decisions |
17 | Cornell University Web Lab | University | 100 nodes – 2 GB RAM, 72 GB Hard Drive | Generates web graphs |
18 | Mercadolibre.com | Ecommerce | 20-node cluster – 53.3 TB storage | Processes customer and operations logs
19 | Ning | Social Network Platform | 8 cores – 16 GB RAM | Used for reporting and analytics
20 | Rackspace | Web hosting services | 30-node cluster – 4-8GB RAM, 1.5TB/node storage | Indexing logs from email hosting system for search
21 | Rakuten | Ecommerce | 69-node cluster | Analyzes logs and mines data
22 | Powerset / Microsoft | Natural Language Search | | Used for data storage
23 | Sling Media | Television service provider | 10-node cluster | Runs algorithms on raw data
24 | Spotify | Digital music platform | 690 node cluster – 38TB RAM, 28 PB storage | Used for content generation, data aggregation |
25 | Quantcast | Audience measurement and analytics | 3000 cores, 3500TB | Customizes data path
26 | A9.com | Product & Visual Search | 1 to 100 nodes | Search Indices |
27 | Accela Communications | Video Management | 10 1U servers, with 4 cores, 4GB RAM and 3 drives | Processing registrations
28 | Adyard | Ad network | 12 nodes running HDFS | Used for log storage and report generation |
29 | Able Grape | Search Engine | 2 nodes @ 8 CPUs/node | Analyzes and indexes textual information
30 | Adknowledge | Ad network | Clusters of 50 to 200 nodes | Builds recommender systems and clickstream analytics
31 | Aguja | E-Commerce | Cluster with 48 cores in total, 4GB RAM and 1 TB storage | Analyzes search logs
32 | ARA.COM.TR | Search Engine | Clusters of 10 to 100 nodes | Used for analytics
33 | Archive.is | Archiving service | 3 nodes (16GB RAM, 6TB storage) | Provides backup for web pages
34 | Atbrox | Search technology | Clusters using Amazon’s Elastic MapReduce | Used for search and information extraction |
35 | BabaCar | Car rental | 4-node cluster | Analyzes rental bookings
36 | Basenfasten | Personal Services | 4-node cluster | Storage for logs and digital assets
37 | Benipal Technologies | Ecommerce | 35-node cluster with 50TB storage | Used for image processing
38 | Beebler | Social site | 14-node cluster | Matches dating profiles
39 | Bixo Labs | Elastic Web Mining | 20-machine clusters | Provides consulting and training
40 | BrainPad | Data mining and analysis | | Summarizes user tracking data for business analytics and solutions
41 | Brilig | Online advertising | 10-node cluster, 24 GB RAM, 6 X 1TB SATA | Used for digital display advertising
42 | Brockmann Consult GmbH | Environmental informatics and Geo information services | 20-node cluster, 112 TB disk space in total | Analyzes environmental Earth Observation data products
43 | Caree.rs | Job site | 15 nodes | Runs Machine learning Algorithms |
44 | CDU now! | Political party | | Used for searching, filtering and indexing
45 | Charleston | Domain registration | 15 nodes | Used for creating Domain names |
46 | Contextweb | Ad Exchange | 50-machine cluster, 400 cores, 140 TB raw storage | Stores ad serving logs
47 | Cooliris | iPhone/iPad app | 15-node cluster, 8 GB RAM, 3-4 TB storage | Browsing photos/videos
48 | CRS4 | Research Centre | 400-node cluster | Promotes study, development and application of innovative solutions
49 | Crowdmedia | Digital Content Marketing | 5 Node cluster | Analyzes trends on social networks |
50 | Datagraph | Cloud based database | Cluster sizes of 1 to 20 nodes | Used for processing large databases
51 | Dataium | Customer analytics | | Analyzes data and company/consumer behaviour
52 | Deepdyve | Commercial Website | clusters with 5-80 nodes | Provides storage service for index shards |
53 | Detikcom | News portal | Uses 9 nodes | Analyzes search logs and most-viewed news
54 | DropFire | IT Developer | | Integrates, analyzes and delivers company data
55 | eCircle | Digital Marketing provider | 60-node clusters, each >1000 cores; 5TB RAM and 1PB storage in total | Handles Market Data
56 | Enet | Newspaper | 5-node cluster | Used for data mining and machine learning
57 | Enormo | Search engine | 4-node cluster – 32 cores, 1 TB | Removes duplicate listings and groups similar ones
58 | ESPOL University (Escuela Superior Politécnica del Litoral) in Guayaquil, Ecuador | University | 4-node cluster | Projects in machine learning, social network analysis and network security
59 | Eyealike | Ecommerce | | Used for image content based advertising
60 | Explore.To Yellow Pages | Telephone directory | Clusters with 5-80 nodes | Used for internal search, filtering and indexing |
61 | Forward3D | Global digital agency | 19 virtual machine cluster | Used for log analysis and machine learning |
62 | Freestylers | Image retrieval engine | Produces original database | Analyzes similarities of user’s behaviour |
63 | GBIF | Non-profit Biodiversity organization | 18 nodes running a mix | Queries against biodiversity data |
64 | GIS.FCU | University | 3 machine cluster | Stores sensor Data |
65 | Gruter Corp. | Next-gen Tech company | Clusters of 30 machines | Used for data indexing
66 | Gewinnspiele | Games site | Clusters 6 nodes | Used for high speed Data Mining |
67 | GumGum | Advertising agency | Clusters 9 nodes | Used for Images and Advertising analytics |
68 | Hadoop Korean User Group | Korean Community page | 50 nodes, Pentium 4 PC, HDFS 4TB Storage | Used for development projects |
69 | Hotels & Accommodation | Search engine for hotels | 3 machine clusters – 4 cores, 2 TB | Data search and aggregation |
70 | Hundeshagen | Law firm | 6 node cluster – 4 dual CPUs, 5 TB storage, 4 GB RAM | Used for high speed Data Mining |
71 | ICCS | University | | Used for blog posts, teaching and general research
72 | IIIT, Hyderabad | Research lab | Clusters 10-30 nodes | Retrieves and extracts information and research projects |
73 | Infochimps | Big Data Enterprise | 30 node – AWS EC2 cluster | Analysis of Data on terascale datasets |
74 | Journey Dynamics | Driver Profiling company | | Analyzes GPS Data
75 | Kalooga | Image gallery services | 20 node cluster | Processing of events and analysis |
76 | Korrelate | Ecommerce | HBase – 5TB data size | Processes events and data for reporting |
77 | Koubei.com | Ecommerce | | Processes whole price Data
78 | Language, Interaction and Computation Laboratory (Clic – CIMeC) | Research Laboratory | 10 nodes – 8 core, 8GB RAM | Studies verbal and non-verbal communication |
79 | Lineberger Comprehensive Cancer Center – Bioinformatics Group | Cancer Centre Research | 8 dual quad core – 48 TB storage | Used for Database |
80 | Markt24 | Ecommerce | 8GB Ram, 4 cores, 1TB | Filter user behaviour, recommendations from external sites |
81 | MicroCode | Domain registration | 18 node cluster – 1 TB Storage | Used for Customer Relation Management |
82 | Media 6 Degrees | Marketing agency | 20 node cluster – 16 GB, 6 TB | Ad optimization and social graph analysis |
83 | MeMo News – Online and Social Media Monitoring | Social Media | | Processes news and unstructured data
84 | Neptune | Online Marketer | 200 nodes – 2 TB storage, 4 GB RAM | Stores large structured Data set |
85 | NetSeer | Ad-Network Technology | 50 node cluster | Used for serving and log analysis |
86 | Openstat | Analytics Services | 50 node – generates 25 GB of reports | Runs web and log analytics |
87 | optivo | Email marketing software | | Analyzes email campaigns
88 | Papertrail | App log management | | Feeds customer logs
89 | PCPhase | Mobile integration company | 4 nodes – 4 cores, 4GB RAM and 500 GB storage | Generates reports for a large mobile web site
90 | Performable | Web Analytics Software | | Processes marketing, CRM and email data
91 | Pharm2Phork Project | Agricultural Traceability | | Monitors and tunes workflow processes
92 | Pressflip | Personalized Persistent Search | | Document processing and data storage
93 | Pronux | Software solutions | 4 nodes cluster – 32 cores, 1 TB | Searches and analyzes book-keeping postings |
94 | PokerTableStats | Game site | 2 nodes cluster – 15 cores, 500 GB | Analyzes poker game history |
95 | PSG Tech, Coimbatore, India | College | 5-10 nodes, 4 GB RAM and 16 GB HDD | Used for solving large scale alignment problems |
96 | Rapleaf | Marketing data and software company | 80 node cluster – 4TB storage, 16GB RAM | Simplifies data flow |
97 | Recruit | Advertisement company | 50 nodes – 2TB*4 disk 16GB RAM | Used for analyzing logs and mining data
98 | Redpoll | Machine learning library | 35 nodes – 10TB disk 16GB RAM | Deals with large-scale data sets
99 | Resu.me | Job site | 5 nodes | Processes user resume data and runs algorithms
100 | Rodacino | Greek news channel | 16-node cluster – 2 quad core CPUs, 6TB storage, 24GB RAM | Used for log and usage analysis
101 | Rovi Corporation | Digital entertainment | 40 nodes with 24 cores at 2.4GHz and 128GB RAM | Used for crawling news sites |
102 | SLC Security Services LLC | Data Information Provider | 18 node cluster – 1TB storage, 4GB RAM | Used for high speed data mining applications |
103 | Specific Media | Ad agency | Clusters of 27-111 nodes | Used for log aggregation, reporting and analysis
104 | Sthenica | Data Solutions provider | 3 node cluster | Monitors social media and personalized marketing |
105 | The Lydia News Analysis Project | University | 17-node and 103-node clusters | Processes daily newspapers as well as historical archives |
106 | Tailsweep | Social media and ad network | 8 node cluster – 8GB RAM, 500GB/node Raid 1 storage | Used for data mining and blog crawling |
107 | Telefonica Research | Research & Development | 6 node cluster – 8GB RAM and 2 TB storage | Used in data mining and user modelling |
108 | Telenav | Mobile phone app | 60-node cluster – 4GB RAM, 13TB storage | Runs learning algorithms for statistical categorization
109 | Tepgo | E-Commerce Data analysis | 3 node cluster – 4GB RAM and 1 TB storage | Analyzes search and usage logs |
110 | Tynt | Content Management System | Cluster 94 nodes – 752 cores | Assembles web publishers’ summaries |
111 | Universidad Distrital Francisco Jose de Caldas (Grupo GICOGE/Grupo Linux UD GLUD/Grupo GIGA) | Free software working group | 5-node cluster | Supports the group's research projects
112 | University of Freiburg – Databases and Information Systems | Database and information systems research | 10-node cluster – 4GB RAM, 3TB/node storage | Runs queries on large RDF graphs
113 | University of Glasgow – Terrier Team | Open source search engine | 30-node cluster – 4GB RAM, 1TB/node storage | Facilitates information retrieval research and experimentation
114 | University of Maryland | University | Used in machine translation, language modelling, image processing etc. | |
115 | University of Nebraska Lincoln, Holland Computing Center | University | one medium-sized cluster | Used for research projects |
116 | University of Twente, Database Group | University | 16 node cluster – 8GB main memory, 1TB disk | Used in computer science master’s program |
117 | Visible Measures Corporation | Video ad campaigns | 128 CPU cores – 100 TB of storage | Used for scalable Data pipeline |
118 | Web Alliance | Web marketing and Ecommerce | | Used to store, index and search data
119 | Webmaster Site | Chat and Ecommerce | 4 node cluster – 2 TB Storage, 32 GB RAM | Used for log analysis and trends prediction |
120 | WorldLingo | Online translator page | 22 nodes, 2TB storage, 8 GB RAM | Stores millions of documents |
Source – Hadoop Wiki
FAQs:
How much do Hadoop developers make?
The average Hadoop Developer salary in the United States is $95,974 as of November 23, 2022, with the range typically falling between $87,437 and $106,139.
What is a Hadoop Developer?
Hadoop developers are responsible for developing and coding Hadoop applications. Hadoop is an open-source framework for storing and processing big data across clusters of commodity hardware. Essentially, a Hadoop developer creates applications to manage and maintain a company's big data.
Are Hadoop developers in demand?
This growing adoption and demand for Hadoop services are creating a huge need for skilled Hadoop experts in the industry. Hadoop Developer is one of the many coveted Hadoop roles in demand right now.