Socure

Staff Software Engineer-SRE

Reposted 2 Days Ago

Remote

Hiring Remotely in USA

180K-215K Annually

Senior level

Why Socure?

At Socure, we’re on a mission—to verify 100% of good identities in real time and eliminate identity fraud from the internet.

Using predictive analytics and advanced machine learning trained on billions of signals to power RiskOS™, Socure has created the most accurate identity verification and fraud prevention platform in the world. Trusted by thousands of leading organizations—from top banks and fintechs to government agencies—we solve real, high-impact problems at scale. Come join us!

Job Overview

We are looking for a Site Reliability Engineer (SRE) to support our Identity Graph initiatives.

Identity Graph Intelligence at Socure builds and maintains the core layer that connects and resolves identities across billions of data points. Our work powers Socure’s industry-leading identity verification and fraud prevention solutions by creating a unified, accurate, and real-time view of individuals. The team focuses on scalability, reliability, and advanced data engineering to support mission-critical applications for our customers.

At Socure, you’ll join a high-performing engineering team dedicated to driving the reliability, scalability, and performance of our systems. You will collaborate cross-functionally with software engineers, technical support, and security teams to build and maintain robust, automated, and resilient infrastructure powering our critical applications. SREs play an essential role in architectural decision-making, incident response, and promoting a culture of continuous improvement, automation, and operational excellence.

What you'll do:

Design, build, and maintain scalable infrastructure to support high availability and performance.
Develop tools and automation to eliminate manual operations and increase system reliability.
Monitor production systems, respond to incidents, conduct root cause analyses, and lead post-mortem reviews.
Collaborate with development and platform teams to implement best practices for deployment, observability, and reliability.
Drive incident management and participate in an on-call rotation to ensure 24/7 availability for mission-critical platforms.
Establish and improve SLAs, SLOs, and SLIs to track and enhance system reliability and performance.
Champion a culture of continuous improvement, resilience, and automation across engineering and operations.
Build and maintain CI/CD pipelines and infrastructure-as-code to streamline deployments and accelerate development cycles.
Develop monitoring dashboards and alerts using tools such as Prometheus, Grafana, Datadog, or Splunk.
Support security and compliance efforts by implementing infrastructure hardening and best practices aligned with frameworks (SOC 2, ISO 27001).
Mentor junior engineers and act as a technical resource for improving reliability within cross-functional teams.

What you bring:

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
8+ years of experience in software development, site reliability engineering, DevOps, or infrastructure engineering—preferably in high-scale, high-availability environments.
Proficiency in programming, scripting, or infrastructure-as-code languages (Python, Java, Go, Bash, or Terraform).
Proven hands-on experience building tools and automation for infrastructure and operations.
Deep understanding of microservices architecture, RESTful APIs, and cloud platforms such as AWS.
Expertise in Kubernetes, Docker, and container orchestration in production environments.
Experience with observability tools (Prometheus, Grafana, Datadog, ELK stack).
Strong knowledge of distributed systems, performance optimization, and operational excellence.
Experience with SQL and NoSQL databases, caching layers, and troubleshooting complex production issues.
Familiarity with CI/CD pipelines (Jenkins, GitHub Actions, ArgoCD, CircleCI) and infrastructure-as-code (Terraform).
Solid understanding of networking, security, and compliance frameworks.
Excellent communication and collaboration skills; able to work effectively across engineering, operations, and support teams.
Strong problem-solving skills, detail orientation, and a proactive mindset for continual reliability improvement.

Socure is an equal opportunity employer and values diversity of all kinds at our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Top Skills

AWS
Elasticsearch
Java
Kafka
Kubernetes
Python
Scala
Spark
SQS
Vespa
