Tiger Analytics is a fast-growing advanced analytics consulting firm. Our consultants bring deep expertise in Data Science, Machine Learning and AI. We are the trusted analytics partner for multiple Fortune 500 companies, enabling them to generate business value from data. Our business value and leadership have been recognized by various market research firms, including Forrester and Gartner. We are looking for top-notch talent as we continue to build the best analytics consulting team in the world.
As a Lead Data Engineer, you will be responsible for designing, building, and maintaining scalable data pipelines on AWS cloud infrastructure. You will work closely with cross-functional teams to support data analytics, machine learning, and business intelligence initiatives. The ideal candidate will have strong experience with AWS services, Databricks, and Apache Airflow.
Key Responsibilities:
- Design, develop, and deploy end-to-end data pipelines on AWS cloud infrastructure using services such as Amazon S3, AWS Glue, AWS Lambda, and Amazon Redshift.
- Implement data processing and transformation workflows using Databricks, Apache Spark, and SQL to support analytics and reporting requirements.
- Build and maintain orchestration workflows using Apache Airflow to automate data pipeline execution, scheduling, and monitoring.
- Lead the migration of legacy data systems to modern cloud-based architectures.
- Develop and maintain CI/CD pipelines for data workflows.
- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver scalable data solutions.
- Optimize data pipelines for performance, reliability, and cost-effectiveness, leveraging AWS best practices and cloud-native technologies.
Requirements:
- 10+ years of experience building and deploying large-scale data processing pipelines in a production environment.
- Hands-on experience in designing and building data pipelines on AWS cloud infrastructure.
- Strong proficiency in AWS services such as Amazon S3, AWS Glue, AWS Lambda, and Amazon Redshift.
- Experience leading the design, development, and optimization of large-scale data pipelines and data lakehouse architectures using Databricks.
- Experience architecting and implementing batch and real-time streaming solutions with Apache Spark on Databricks.
- Hands-on experience with Apache Airflow for orchestrating and scheduling data pipelines.
- Solid understanding of data modeling, database design principles, SQL, and Spark SQL.
- Experience with version control systems (e.g., Git) and CI/CD pipelines.
- Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
- Strong problem-solving skills and attention to detail.
This position offers an excellent opportunity for significant career development in a fast-growing and challenging entrepreneurial environment with a high degree of individual responsibility.