Censys

Senior Site Reliability Engineer

Posted Yesterday

Remote

Hiring Remotely in US

145K-190K Annually

Senior level

As a Senior Site Reliability Engineer, you'll design, build, and deploy tools to enhance developer efficiency, manage cloud services, and ensure production reliability through collaboration with development teams.

The summary above was generated by AI

Company Background

Censys’ mission is to be the one place to understand everything on the internet. Frustrated by the lack of trustworthy Internet intelligence, we set out to create the industry’s most comprehensive, accurate, and up-to-date map of the Internet. Today, Censys delivers real-time Internet intelligence and actionable threat insights to global governments, over 50% of the Fortune 500, and leading threat intelligence providers worldwide.

Location

This is a fully remote position within the United States.

Role Summary

As a Senior Site Reliability Engineer (SRE) on the Infrastructure and Ops platform team, you will help design, build, and deploy the tools that empower our development teams and production applications. We’re looking for talented engineers who want to help grow our operational maturity and who equally enjoy mastering cloud-native technologies to support the growth and reliability of our microservice architecture.

As an SRE focused on Developer Efficiency and Experience, you will help improve the efficiency of our engineering and development teams by supporting our developers' SDLC and workflows. The work ranges from writing supporting application code and automation to, most importantly, empowering developers to create, deploy, and manage their services end-to-end inside the platform with confidence.

What you'll do

Build and maintain tooling to support our applications in Kubernetes and on Google Cloud Platform.
Work with development teams to help them build, ship, and deploy services and applications with ease and confidence, and promote service resilience and reliability.
Help ensure smooth operation of our production environments, and work with developers to debug complex issues as they arise. This includes setting up and facilitating the capture and monitoring of the four golden signals in our applications.
Help create a self-service platform by working with the rest of the SRE and infrastructure team to accelerate developer velocity, including service catalogs, repository tooling, and documentation. We believe in the self-service model and treat development teams as our internal customers: we listen to feedback, seek out improvements, and iterate quickly to keep providing value.
Participate in a shared on-call rotation. We believe in end-to-end service ownership, so both development teams and SRE participate in on-call. Our SRE team is responsible for maintaining and being on call for our infrastructure environments and for ensuring primary site uptime.

Required Qualifications

5+ years of experience in an SRE role or similar.
Experience deploying, managing, and debugging applications in a Kubernetes environment. We leverage Helm and Crossplane heavily to deploy our applications.
Experience building, securing, and managing container images.
Experience working with cloud-based environments and interacting with cloud services such as CloudSQL databases, Pub/Sub, Memorystore, and others.
Familiarity with infrastructure-as-code tools, such as Terraform, Crossplane, or similar.
Experience with tools and solutions used to monitor the four golden signals (latency, traffic, errors, and saturation), including Prometheus, Grafana, and OpenTelemetry.
Familiarity with a monorepo, trunk-based development model with monolithic build tooling and CI/CD, with a strong desire to achieve Continuous Deployment. Familiarity with CI/CD systems such as GitHub Actions, ArgoCD, or similar.
Ability to communicate with developers empathetically and support them in their day-to-day roles, seeking ways to automate and promote self-service so that developers can move with higher velocity and confidence through the entire SDLC.

Preferred Qualifications

Experience building and supporting a gRPC microservice architecture. Familiarity with a Kubernetes service mesh, such as Istio or similar, to support observability, multi-cluster routing, and network efficiency in our microservice architecture is highly desirable.
Ability to work with application code to help introduce best practices, golden path standardization, shared libraries, etc. The majority of our applications are written in Go. Python and Scala are present to a lesser degree.
Familiarity with Application Security tooling, such as dependency scanning, static analysis, and other linting tooling to help shift security left in the SDLC and CI process, and bridge engineering practices with our Security Operations team.
Familiarity and comfort with Linux-based environments.

Qualities

A passion for clean, concise architecture and enjoyment of working in a GitOps-based environment.
Comfort with projects that have a large degree of uncertainty and risk.
A desire to collaborate with and advise product management and leadership on balancing long-term software maintainability against rapid development, and to clearly communicate BCDR implications.
Understanding and practicing the principles of continuous delivery to ensure quick, safe, and sustainable development in the face of changing priorities and uncertainty.

What will make you stand out

Basic understanding of infrastructure operations, including load-balancers, ingresses, routing, DNS, and VPC design. We operate several data center environments across the globe in addition to our cloud infrastructure.
Willingness to dig into code to understand how our applications work, whether to facilitate testing, integration, and development environments, to help instrument metrics, or to improve service reliability.
Deep understanding of how to optimize and support web-based applications and help protect public-facing assets with tooling such as anti-DDoS and Web Application Firewall technologies.

For high cost of living areas (Seattle, San Francisco Bay Area, or New York City), our target salary range for this role is between $166,000 USD and $1203,000 USD + bonus eligibility and equity.

For all other US locations, the expected salary range for this position is $145,000 USD - $190,000 USD, plus bonus eligibility and equity.

In addition to our great compensation package, our benefits are effective on day one and include, but are not limited to: 401k match, health, vision, dental, and more! Please see our careers page for more details.

Note to external recruiters/agencies: We are not currently engaging with third-party agencies for this role and will not accept unsolicited outreach. We kindly ask that you do not submit resumes or candidate profiles to our team.

California Privacy Rights Notice
Pursuant to the California Consumer Privacy Act (CCPA), we are providing you with notice that we collect personal information from job applicants for business purposes, including evaluating your candidacy for employment, conducting interviews, and, if applicable, completing the hiring process. The categories of information we may collect include identifiers (such as name and contact information), professional or employment-related information (such as work history, education, and references), and other information you provide in your application. We do not sell or share your personal information. For more information on how we use and protect your personal information, and your rights under the CCPA, please refer to our Privacy Policy.

Top Skills

ArgoCD

CloudSQL

Crossplane

GitHub Actions

Google Cloud Platform

Grafana

Helm

Kubernetes

Memorystore

OpenTelemetry

Prometheus

Pub/Sub

Python

Scala

Terraform
