Inrupt

Site Reliability Engineer

Reposted 15 Days Ago

Hybrid

Boston, MA, USA

Mid level

We're seeking an experienced Site Reliability Engineer to take ownership of our AWS-based Kubernetes infrastructure. You'll be responsible for the operational excellence, security, and scalability of the development and production systems that support our Enterprise Solid Server (ESS) technology for enterprise clients. You'll have significant autonomy to establish best practices, implement reliability improvements, and build the foundation for our growing infrastructure needs.

Inrupt is headquartered in Boston, MA. This role is ideally based in Boston. Our team operates on a hybrid schedule, working from the office two days a week and enjoying remote flexibility on the remaining days.

Key Responsibilities

Manage day-to-day operations of AWS EKS clusters across development, staging, and production environments
Monitor system health, triage alerts, and respond to incidents (15-minute SLO)
Perform regular patching, upgrades, and maintenance of infrastructure components
Maintain and optimize our technology stack: EKS, MSK, RDS, ArgoCD, Traefik, Sysdig, Mezmo, Terraform
Manage AWS services, including VPC, RDS, MSK (Kafka), S3, and networking infrastructure
Implement and maintain comprehensive monitoring dashboards, alerting, and centralized logging
Maintain Terraform-based infrastructure automation and practice GitOps principles
Manage data infrastructure lifecycle: RDS databases, Kafka clusters, Redis caching, S3 buckets
Implement security baselines, manage RBAC, and conduct vulnerability scanning and remediation
Design and test disaster recovery strategies with defined RTO/RPO
Support ArgoCD deployments and troubleshoot application deployment issues
Create and maintain documentation and troubleshooting guides
Provide architectural reviews and capacity planning aligned with business objectives
Optimize infrastructure costs while maintaining performance and reliability
Establish on-call rotation and incident response procedures with post-mortem analysis
Work closely with engineers to ensure operational requirements are built into our products
Work closely with engineers to ensure that proposed architecture, design, and development choices meet non-functional requirements

About You

Required:

Experience managing production Kubernetes clusters, preferably AWS EKS
Deep knowledge of cloud platform services (e.g., EC2, EKS, VPC, RDS, S3, IAM, CloudWatch)
Strong Terraform experience for infrastructure automation
Experience with monitoring platforms (Sysdig, Datadog, or similar) and logging systems
Hands-on experience with ArgoCD or similar tools
Strong understanding of networking: VPCs, security groups, load balancers, DNS
Database administration experience (PostgreSQL), including backups and performance tuning
Experience with message queue systems (Kafka/MSK preferred)
Proficiency in Python, Bash, or Go for automation
Excellent communication skills with the ability to explain complex technical concepts clearly
Ownership mindset with strong problem-solving and analytical skills
Experience with security best practices and compliance frameworks (SOC2, GDPR)

Preferred:

Service mesh experience (Istio, Linkerd, Consul)
FinOps practices and cost optimization experience
Chaos engineering and resilience testing
Multi-region infrastructure experience
AWS certifications (Solutions Architect, DevOps Engineer, or Security)
CKA (Certified Kubernetes Administrator) certification
Experience supporting government or highly regulated industries
