Invisible AI

Senior Site Reliability Engineer (SRE)

Reposted Yesterday

Remote

110K-170K

Senior level

As a Senior Site Reliability Engineer, you will develop and maintain scalable infrastructure, automate processes, manage deployments, and ensure system reliability across devices.

At Invisible AI, we are building the future of computer vision. Today, our core focus is on developing an end-to-end platform that can digitize manufacturing operations. We deploy edge AI cameras to digitize every step of manual assembly work, which helps people-driven manufacturing stay accurate, reliable, and safe. Coming from the world of self-driving cars, the founders of Invisible AI have years of experience in building and deploying large-scale AI & Machine Learning pipelines. Join us and help build a company that will deliver the endless possibilities of computer vision to real-world customers!

As a Site Reliability Engineer, you will build the technology to enable our platform to deploy, run, and monitor Invisible AI’s software at scale across tens of independent deployments and thousands of devices. The SRE works closely with all other engineering teams and owns internal tools to enable faster development and deployment, like secure ephemeral debug environments, streamlined access controls, CI/CD systems, and a custom in-house device management platform for device configuration and software releases.

Responsibilities:

Design, build, and maintain scalable and resilient infrastructure on the edge.
Develop automation and infrastructure-as-code solutions using Terraform, Ansible, and scripting languages (Python, Bash).
Deploy and manage containerized applications using Docker and related technologies.
Ensure system observability by building and optimizing monitoring systems, particularly using Prometheus.
Troubleshoot and optimize Linux-based systems (e.g., Red Hat, CentOS, Ubuntu).
Collaborate with security teams to implement robust security controls and ensure compliance with best practices.
Work closely with software engineers to improve system performance, reliability, and deployment pipelines.
Support and maintain networking infrastructure, including troubleshooting protocols and configurations.
Manage cloud and on-premise infrastructure, with a focus on automation and scalability.
Contribute to incident response, postmortems, and process improvements.

Requirements:

8+ years of experience in Site Reliability Engineering and building/managing infrastructure at scale, particularly on edge devices.
Strong experience with Python scripting (able to read and write code fluently).
Comfortable working with Linux systems, Docker, and infrastructure-as-code tools like Terraform and Ansible.
Hands-on experience with observability stacks (e.g., Prometheus, Grafana).
Deep understanding of SLAs/SLOs/SLIs and how to operationalize them.
Strong systems thinking: understands how distributed systems work and how to make them resilient.
Experience with CI/CD pipelines, incident management, and system hardening.
Deep understanding of networking concepts and protocols.
Familiarity with cloud platforms (AWS, Azure, Google Cloud) is a plus.
Experience with Windows Services/VMs is a plus.
Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent experience.

Our compensation package plays a big part in how we value your impact on our mission. Our base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The estimated base salary guideline range for this role is $110,000 to $170,000 and may be modified. Actual pay will vary based on various factors, including market and individual qualifications objectively assessed during the interview process. In addition to base salary, your compensation package will include components such as equity, sales incentive pay (for sales roles), and benefits.

Invisible AI is an equal-opportunity employer. We do not discriminate based on age, ethnicity, gender, nationality, religious belief, or sexual orientation.

Top Skills

Ansible

AWS

Azure

Docker

GCP

Grafana

Linux

Prometheus

Python

Terraform
