Employer Industry: Data Engineering and Analytics
Why consider this job opportunity
- Opportunity to work with cutting-edge technologies like Trino, Spark, and advanced AI inference systems
- Flexible work-from-home policy to support work-life balance
- Access to generous paid time off (PTO) and paid volunteer time
- Comprehensive benefits and competitive compensation packages
- Opportunities for continued career development and collaboration on open-source projects
- Work on large-scale, distributed systems that influence data solutions for top companies worldwide
What to Expect (Job Responsibilities)
- Design and develop features for parallel and distributed query engines to enhance the Cloudera Data Platform (CDP)
- Drive innovation in Hive and build additional components to support its ecosystem
- Optimize SQL queries for performance and scalability
- Write design documentation and improve code quality through testing and automation
- Understand customer workloads to provide effective technical solutions
What is Required (Qualifications)
- Bachelor’s or Master’s degree in Computer Science or equivalent, with 6 years of experience
- Experience with query optimization tools, such as Apache Calcite
- Strong programming skills with a focus on data structures and algorithms; Java experience is desired
- Good understanding of database internals, query processing, and SQL query optimization
- Strong oral and written communication skills, with the ability to work on cross-functional projects
How to Stand Out (Preferred Qualifications)
- Experience contributing to open-source Apache projects like Hive, Impala, or Calcite
- Familiarity with the Hadoop ecosystem and file formats like Parquet and ORC
- Experience with public cloud infrastructures such as Microsoft Azure, AWS, or Google Cloud Platform
- Recognized contributions to open-source projects