Socure

Staff Software Engineer-SRE

Reposted 2 Days Ago
Remote
Hiring Remotely in USA
180K-215K Annually
Senior level
Lead the architecture and development of Entity Resolution APIs; design batch and streaming data pipelines; collaborate on integrated machine learning models and high-performance distributed systems.
The summary above was generated by AI
Why Socure?

At Socure, we’re on a mission—to verify 100% of good identities in real time and eliminate identity fraud from the internet.

Using predictive analytics and advanced machine learning trained on billions of signals to power RiskOS™, Socure has created the most accurate identity verification and fraud prevention platform in the world. Trusted by thousands of leading organizations—from top banks and fintechs to government agencies—we solve real, high-impact problems at scale. Come join us!

Job Overview

We are looking for a Site Reliability Engineer (SRE) to support our Identity Graph initiatives.

Identity Graph Intelligence at Socure builds and maintains the core layer that connects and resolves identities across billions of data points. Our work powers Socure’s industry-leading identity verification and fraud prevention solutions by creating a unified, accurate, and real-time view of individuals. The team focuses on scalability, reliability, and advanced data engineering to support mission-critical applications for our customers.

At Socure, you’ll join a high-performing engineering team dedicated to driving the reliability, scalability, and performance of our systems. You will collaborate cross-functionally with software engineers, technical support, and security teams to build and maintain robust, automated, and resilient infrastructure powering our critical applications. SREs play an essential role in architectural decision-making, incident response, and promoting a culture of continuous improvement, automation, and operational excellence.

What you'll do:
  • Design, build, and maintain scalable infrastructure to support high availability and performance.

  • Develop tools and automation to eliminate manual operations and increase system reliability.

  • Monitor production systems, respond to incidents, conduct root cause analyses, and lead post-mortem reviews.

  • Collaborate with development and platform teams to implement best practices for deployment, observability, and reliability.

  • Drive incident management and participate in an on-call rotation to ensure 24/7 availability for mission-critical platforms.

  • Establish and improve SLAs, SLOs, and SLIs to track and enhance system reliability and performance.

  • Champion a culture of continuous improvement, resilience, and automation across engineering and operations.

  • Build and maintain CI/CD pipelines and infrastructure-as-code to streamline deployments and accelerate development cycles.

  • Develop monitoring dashboards and alerts using tools such as Prometheus, Grafana, Datadog, or Splunk.

  • Support security and compliance efforts by implementing infrastructure hardening and best practices aligned with frameworks (SOC 2, ISO 27001).

  • Mentor junior engineers and act as a technical resource for improving reliability within cross-functional teams.

What you bring:
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

  • 8+ years of experience in software development, site reliability engineering, DevOps, or infrastructure engineering—preferably in high-scale, high-availability environments.

  • Proficiency in programming/scripting languages (Python, Java, Go, Bash, or Terraform).

  • Proven hands-on experience building tools and automation for infrastructure and operations.

  • Deep understanding of microservices architecture, RESTful APIs, and cloud platforms such as AWS.

  • Expertise in Kubernetes, Docker, and container orchestration in production environments.

  • Experience with observability tools (Prometheus, Grafana, Datadog, ELK stack).

  • Strong knowledge of distributed systems, performance optimization, and operational excellence.

  • Experience with SQL and NoSQL databases, caching layers, and troubleshooting complex production issues.

  • Familiarity with CI/CD pipelines (Jenkins, GitHub Actions, ArgoCD, CircleCI) and infrastructure-as-code (Terraform).

  • Solid understanding of networking, security, and compliance frameworks.

  • Excellent communication and collaboration skills; able to work effectively across engineering, operations, and support teams.

  • Strong problem-solving skills, detail orientation, and a proactive mindset for continual reliability improvement.

Socure is an equal opportunity employer and values diversity of all kinds at our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Top Skills

AWS
Elasticsearch
Go
Java
Kafka
Kubernetes
Python
Scala
Spark
SQS
Vespa
