Capstone Integrated Solutions

Senior Data Engineer (AWS)

Posted 17 Days Ago
Remote
Hiring Remotely in USA
Senior level
Lead design and implementation of an AWS-based data lake and pipelines (batch and streaming), implement entity resolution, support Azure-to-AWS migration, ensure data quality and governance, and collaborate with cross-functional teams, including AI/ML engineers working with Bedrock and SageMaker, to deliver analytics solutions.
The summary above was generated by AI

Capnexus is a comprehensive services provider. Our team consists of outstanding professionals who are highly experienced in designing, building, and supporting retail software. We see ourselves as a build-as-a-service provider that follows a repeatable business pattern applicable to a variety of platforms and verticals. With a culture built on outcomes and delivery at its core, Capnexus provides customers with a complete suite of services for software development, system analysis, integration, implementation, and support, as well as the option to engage a single team to perform all the services they require.

Who You Are and What You'll Do: 

Capnexus is looking for a highly skilled Senior AWS Data Engineer to lead data architecture, pipeline development, and data integrations. This is an exciting opportunity to apply advanced cloud data engineering skills on a platform that leverages generative AI to automate and modernize enterprise workflows. 

Responsibilities:

  • Participate in data discovery workshops to inventory source systems including property management platforms, marketing channels, and CRM data, and translate findings into data lake architecture requirements.
  • Design and implement a multi-zone enterprise data lake on Amazon S3 (raw, conformed, enriched, aggregated) with ingest, cleansing, and business layers including schema versioning, checksum validation, business rule validation, and quarantine/notify workflows on failure.
  • Build batch and streaming data ingestion pipelines using AWS Glue, Amazon Kinesis, and containerized ingestion applications across CDP, marketing, and property management data sources.
  • Write PySpark and Python ETL code for AWS Glue jobs to transform, cleanse, and enrich data at scale; apply Apache Iceberg table format for ACID-compliant, schema-evolving data lake tables.
  • Implement data transformation and orchestration frameworks using AWS Glue ETL and AWS Step Functions; configure AWS Glue Data Catalog with crawlers for automated metadata management and discovery.
  • Implement AWS Lake Formation for fine-grained data governance including table-level and column-level permissions, data filters, and resource links — not just IAM-level access controls.
  • Configure Amazon Athena for serverless SQL querying across the data lake with performance optimization (Parquet format, partitioning, column pruning, file size management, caching); implement Amazon DynamoDB for sub-second customer profile lookups, with DAX where latency requirements demand it.
  • Develop and deploy AWS Lambda functions using AWS Lambda Powertools for structured logging, handler routing, and observability; implement error handling patterns including exponential backoff, retries, dead-letter queues, and CloudWatch alarms.
  • Write and maintain Terraform (or CloudFormation/CDK) modules to provision and deploy AWS data infrastructure as part of the CI/CD pipeline — data engineers own their infrastructure deployment, not DevOps.
  • Integrate CI/CD pipelines using GitHub Actions for automated deployment of Glue jobs, Lambda functions, and Step Functions workflows with lint checks and validation gates.
  • Support Azure Data Lake migration: conduct discovery of ADLS assets, schemas, and transformation logic; provision AWS target environments; execute migration via AWS DataSync; perform row-count reconciliation, schema validation, and checksum comparison post-migration.
  • Design and implement entity resolution pipelines to identify, deduplicate, and merge customer records into unified golden records using deterministic and fuzzy matching with lineage tracking and manual review pathways.
  • Build and maintain data models to support Customer 360 views and executive analytics dashboards via Amazon QuickSight.
  • Ensure data quality, validation, and integrity across all pipeline stages; support UAT for data-dependent features.
  • Collaborate with Full Stack, DevOps/MLOps, and AI/ML team members working with Bedrock and SageMaker; contribute to architecture documentation, pipeline runbooks, and data governance documentation.

Qualifications:

  • 5+ years of hands-on data engineering experience, including at least 2 years in AWS cloud environments.
  • Strong proficiency in Python and SQL; hands-on PySpark or Scala coding experience for AWS Glue ETL — this is a coding role, not a configuration role.
  • Hands-on experience with AWS Glue (jobs, crawlers, Data Catalog), AWS Step Functions, AWS Lambda, and Amazon S3 data lake architecture.
  • Proficiency with AWS Lambda Powertools for structured logging, handler management, and observability in production serverless workloads.
  • Working knowledge of Apache Iceberg table format including schema evolution, time travel, and partition management.
  • Hands-on experience with Terraform, AWS CloudFormation, or AWS CDK for infrastructure as code integrated into CI/CD pipelines — candidates who have only consumed pre-made DevOps templates will not meet this requirement.
  • Experience with AWS Lake Formation for fine-grained access control including table-level and column-level permissions, data filters, and resource links.
  • Solid understanding of DynamoDB data modeling and key design patterns for sub-second lookups; familiarity with DAX for caching.
  • Experience with Amazon Athena performance tuning: file formats, partitioning strategies, query optimization, and understanding of when Athena is and is not the right tool.
  • Experience with GitHub Actions or comparable CI/CD tooling for automated deployment of data pipeline code.
  • Strong understanding of data quality patterns: schema validation, checksum validation, business rule validation, quarantine workflows, and lineage tracking.
  • Strong analytical, problem-solving, and communication skills; comfortable working in Agile/Scrum teams alongside AWS Professional Services.

Nice to Have:

  • Experience with Azure Data Lake Storage (ADLS) and Azure-to-AWS migration using AWS DataSync.
  • Familiarity with AWS Entity Resolution service — specifically matching workflows, rule-based and ML-based matching, and output schema features.
  • Exposure to Amazon Bedrock or Amazon SageMaker in a data engineering support capacity (pipeline integration, feature stores, inference data prep).
  • Knowledge of Amazon QuickSight for dataset preparation, SPICE optimization, and embedded dashboard development.
  • Familiarity with Kiro CLI or AI-assisted development tooling for pipeline automation.
  • AWS Certification (Data Analytics Specialty, Database Specialty, or Solutions Architect).
  • Background in real estate, property management, marketing technology, or CRM data platforms.

Our Culture:

At Capstone, the central principles that we all adhere to, and the glue that holds us together, are our keystones. Our four keystones are: 

"A Customer-Obsessed, Delivery-Focused Culture"

  • We’re driven to exceed our customers’ expectations by listening, leading, solving problems, and delivering what we promise.
  • We aim to be the most dependable and trusted partner serving our customers. TRUST = CONSISTENCY x TIME

"A Culture of Learning and Sharing" 

  • We value “Lifetime Learners”: those who are hungry, competitive, curious, and self-motivated in their pursuit of knowledge.
  • Personal and professional growth depends on teamwork and continuous learning. By sharing knowledge, skills, ideas, and effort, we benefit our customers, ourselves, and our communities. 
  • We recognize that the thoughts, feelings, and backgrounds of others are as important as our own. Everyone has something to learn and everyone has something they can teach. 
  • Knowledge and ability are valued. Sharing knowledge and helping others learn new capabilities is valued exponentially. 

"A Culture of Growth and Scalability" 

  • Growth comes from not establishing barriers in your role. Cross-functional skill sets are valued and help us deliver to our customers in a truly agile fashion. It comes with understanding that when asked to do something new, you will need support, have questions, and make some mistakes along the way.
  • The most elegant solution is a simple solution. Simple doesn’t mean easy. It’s often more difficult to break a complex problem down into simple, scalable terms. We don’t appreciate, or value, over-architected solutions or superfluous coding.
  • Time is one of our most precious commodities. Scalability implies being respectful of this and passionate about making the most efficient use of each and every one of our team members’ time.

"All Work is Strategic" 

  • No matter how small a project or assignment appears, every single engagement is an opportunity for us to prove ourselves, build trust, and develop relationships that last and grow.
  • Every task, interaction, and commitment matters.
  • Big or small, we execute our plans and strategies with focus, commitment, and passion.

 

We offer: 

Job Type: Full-time, 1099

Benefits: 

  • Remote work 

 

Capnexus is an equal opportunity employer. We embrace and celebrate diversity and are committed to creating an inclusive and safe environment for all employees. Experience comes in many forms, and we’re dedicated to adding new perspectives to the team. We encourage you to apply even if your experience doesn’t perfectly align with what we have listed. We look forward to hearing from you. 

No Agencies Please! 
