Cloudera

Principal Software Engineer - Apache Spark

Reposted 10 Days Ago
In-Office or Remote
9 Locations
Senior level
The Principal Engineer will lead a team in developing features for Apache Spark and enhancing data processing systems, focusing on distributed data processing and system improvements.
The summary above was generated by AI

Business Area:

Engineering

Seniority Level:

Director

Job Description: 

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry.  Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.

Cloudera is seeking an experienced Principal Engineer with strong distributed systems expertise to work on the Cloudera distribution of Apache Spark. We are looking for senior engineers with experience in large-scale, distributed systems and data processing to help build our enterprise-grade system, designed for customers running Spark on thousands of nodes and processing petabytes of data.

We are looking for a passionate individual who is ready to lead a team that already supports production systems at many of the biggest companies and is looking to expand and take on even more projects to drive the next-generation Data Engineering experience. You will be working with a distributed team spread across the United States and Hungary, including multiple Apache Spark committers.

As a Principal Software Engineer, you will:

  • Architect, design, and implement resilient and scalable solutions for distributed data processing at massive scale, with a focus on fault tolerance, performance optimization, query planning, and resource management

  • Take ownership of critical distributed systems components, solving complex challenges related to network communication, concurrency, data consistency, and system reliability across clusters of thousands of nodes

  • Develop advanced monitoring, debugging, and performance analysis tools for large-scale distributed systems.

  • Act as a tech lead for Cloudera’s Spark team

  • Work with and contribute to the latest open source technologies, including Apache Spark, Iceberg, and Parquet

  • Develop new features in Scala/Java/Python on modern platforms

  • Gain a solid understanding and deep technical knowledge of components across the Cloudera Data Engineering Experience stack, with a focus on Iceberg and Spark

  • Debug system-level deployment issues, perform root cause analysis and system test analysis, and resolve failures

  • Work on improving internal infrastructure

  • Collaborate with other team members and stakeholders

We are excited about you if you have:

  • Bachelor’s degree in Computer Science or equivalent, and 10+ years of experience; OR Master’s degree and 6+ years of experience; OR PhD and 4+ years of experience

  • Experience with systems design and development specifically for large-scale distributed environments

  • Experience leading and delivering complex product enhancements.

  • Strong understanding of at least one of the following languages: Java, Scala, GoLang, Rust, C++, Python. Our projects use Java, Scala, Python, and GoLang, so you should be interested in learning the languages we use.

  • Passion for programming, clean coding habits, attention to detail, and a focus on quality

  • Strong oral and written communication skills.

  • Strong ability to research and solve problems independently without constant supervision

  • (Most importantly) Open-mindedness and a desire to learn new things and build great products.

You may also have:

  • Experience with large-scale, distributed systems design and development with an understanding of scaling, performance, and scheduling

  • In-depth understanding of distributed query processing and planning

  • Experience with using/developing Apache Spark or other related technologies.

  • In-depth understanding of distributed systems concepts like consensus algorithms, distributed transactions, and fault tolerance

  • Experience working with automated query optimization

  • Solid experience with at least one cloud service (AWS, Azure, GCP, OpenShift)

  • Contributions to open-source projects.

This role is not eligible for immigration sponsorship

What you can expect from us:

  • Generous PTO Policy 

  • Support for work-life balance with Unplugged Days

  • Flexible WFH Policy 

  • Mental & Physical Wellness programs 

  • Phone and Internet Reimbursement program 

  • Access to Continued Career Development 

  • Comprehensive Benefits and Competitive Packages 

  • Paid Volunteer Time

  • Employee Resource Groups

EEO/VEVRAA

Top Skills

Spark
AWS
Azure
GCP
Java
OpenShift
Python
Scala
SQL

Cloudera Boston, Massachusetts, USA Office

53 State St, Boston, MA, United States, 02109
