The Big Data Lead will implement ETL pipelines, ensure data integrity, troubleshoot PySpark applications, and integrate with existing frameworks while leading a team.
Responsibilities:
- Experience with big data processing and distributed computing systems like Spark.
- Implement ETL pipelines and data transformation processes.
- Ensure data quality and integrity in all data processing workflows.
- Troubleshoot and resolve issues related to PySpark applications and workflows.
- Understand source, dependencies and data flow from converted PySpark code.
- Strong programming skills in Python and SQL.
- Experience with big data technologies like Hadoop, Hive, and Kafka.
- Understanding of data warehousing concepts and relational databases like SQL.
- Demonstrate and document code lineage.
- Integrate PySpark code with frameworks such as Ingestion Framework, DataLens, etc.
- Ensure compliance with data security, privacy regulations, and organizational standards.
- Knowledge of CI/CD pipelines and DevOps practices.
- Strong problem-solving and analytical skills.
- Excellent communication and leadership abilities.
Qualifications:
- 4+ years of experience in big data development with the Hadoop, Hive, and Spark frameworks.
- Experience with SAS is good to have.
- Strong Python, PySpark development, and SQL knowledge.
- Certification in big data or cloud technologies is preferred.
Top Skills
Hadoop
Hive
Kafka
PySpark
Python
Spark
SQL
Similar Jobs
Information Technology • Consulting
The Big Data Lead will manage data projects utilizing technologies like Amazon Redshift, Azure Data Factory, and Apache Spark to optimize data processes.
Top Skills:
Amazon Redshift, Spark, Azure Data Factory, Databricks, Google Cloud Platform, Hadoop, Hive, Scala, Snowflake
Information Technology • Consulting
Lead the development of data pipelines and transformations in Azure Databricks, converting Scala programs to PySpark while leveraging various Azure technologies.
Top Skills:
ADF, Azure Data Lake Gen 2, Azure Databricks, Delta Lake, PySpark, Python, Spark, Synapse Analytics
Information Technology • Consulting
The Big Data Lead will manage database development, ETL/ELT processes, and data warehousing, optimizing performance and ensuring data pipelines work reliably.
Top Skills:
AWS, AWS Glue, AWS S3, Azure, Azure Blob, Azure Data Factory, Azure DevOps, Git, Jenkins, Oracle, Snowflake, SQL, Talend, TeamCity
What you need to know about the Boston Tech Scene
Boston is a powerhouse for technology innovation thanks to world-class research universities like MIT and Harvard and a robust pipeline of venture capital investment. Host to the first telephone call and one of the first general-purpose computers ever put into use, Boston is now a hub for biotechnology, robotics and artificial intelligence — though it’s also home to several B2B software giants. So it’s no surprise that the city consistently ranks among the greatest startup ecosystems in the world.
Key Facts About Boston Tech
- Number of Tech Workers: 269,000; 9.4% of overall workforce (2024 CompTIA survey)
- Major Tech Employers: Thermo Fisher Scientific, Toast, Klaviyo, HubSpot, DraftKings
- Key Industries: Artificial intelligence, biotechnology, robotics, software, aerospace
- Funding Landscape: $15.7 billion in venture capital funding in 2024 (PitchBook)
- Notable Investors: Summit Partners, Volition Capital, Bain Capital Ventures, MassVentures, Highland Capital Partners
- Research Centers and Universities: MIT, Harvard University, Boston College, Tufts University, Boston University, Northeastern University, Smithsonian Astrophysical Observatory, National Bureau of Economic Research, Broad Institute, Lowell Center for Space Science & Technology, National Emerging Infectious Diseases Laboratories
