Photon

Data Engineer - Smithfield, RI

Posted Yesterday
Remote
Hiring Remotely in United States
37K-130K Annually
Senior level
Design, build, and maintain scalable data pipelines on AWS using EMR, Spark, and PySpark. Optimize EMR clusters and Spark jobs, implement ETL/ELT processes, ensure data quality, automate ingestion, and collaborate with data scientists and analysts. Document infrastructure, participate in code reviews, and monitor performance and costs.
The summary above was generated by AI

We are seeking a skilled and passionate Data Engineer to join our team and play a vital role in building and maintaining our data infrastructure. The ideal candidate will have extensive experience with AWS cloud services, particularly EMR, and strong proficiency in Spark and PySpark for data processing and transformation. This role will focus on designing, developing, and optimizing data pipelines to support our growing data needs.

Responsibilities:

  • AWS Data Services:
    • Design, implement, and manage data solutions on AWS, leveraging services such as EMR, S3, Glue, and others.
    • Optimize AWS data infrastructure for performance, scalability, and cost-effectiveness.
    • Implement best practices for data security and compliance on AWS.
  • Apache Spark & PySpark:
    • Develop and maintain scalable data pipelines using Apache Spark and PySpark.
    • Perform data extraction, transformation, and loading (ETL/ELT) processes.
    • Optimize Spark jobs for performance and efficiency.
    • Develop and maintain data quality checks and validation processes.
  • Amazon EMR:
    • Configure and manage EMR clusters for large-scale data processing.
    • Troubleshoot and resolve EMR cluster issues.
    • Optimize EMR cluster configurations for performance and cost.
    • Deploy and monitor Spark applications on EMR.
  • Data Pipeline Development:
    • Design and implement robust and reliable data pipelines.
    • Automate data ingestion, processing, and storage processes.
    • Monitor data pipeline performance and troubleshoot issues.
    • Work with various data sources, both structured and unstructured.
  • Collaboration and Communication:
    • Collaborate with data scientists, analysts, and other engineers to understand data requirements.
    • Document data pipelines and infrastructure.
    • Communicate effectively with technical and non-technical stakeholders.
    • Participate in code reviews.
  • Performance Optimization:
    • Analyze query plans and optimize Spark jobs.
    • Monitor and tune data processing performance.
    • Identify and resolve performance bottlenecks.

Qualifications:

  • Bachelor's degree in Computer Science, Data Science, or a related field (or equivalent experience).
  • 6 to 9 years of experience in a Data Engineering role.
  • Strong experience with Amazon Web Services (AWS) data services, particularly EMR.
  • Proficiency in Apache Spark and PySpark for data processing.
  • Experience with data warehousing and data lake concepts.
  • Strong SQL skills.
  • Experience with scripting languages (e.g., Python).
  • Understanding of data modeling and database design principles.
  • Experience with version control systems (e.g., Git).
  • Strong problem-solving and troubleshooting skills.
  • Excellent communication and collaboration skills.
  • Experience with other big data technologies (e.g., Hadoop, Hive, Kafka) is a plus.
  • Experience with data orchestration tools (e.g., Airflow, AWS Step Functions) is a plus.

Compensation, Benefits and Duration

Minimum Compensation: USD 37,000
Maximum Compensation: USD 130,000
Compensation is based on the candidate's actual experience and qualifications. The range above is a reasonable, good-faith estimate for the role.
Medical, vision, and dental benefits, a 401(k) retirement plan, variable pay/incentives, paid time off, and paid holidays are available for full-time employees.
This position is not available to independent contractors.
Applications received more than 120 days after the date of this post will not be considered.


