Site Reliability Engineer
Who We Are – MassMutual
At MassMutual, we help millions of people find financial freedom, offer financial protection, and plan for the future. This passion has driven our approach to developing meaningful experiences for our customers. We build trust with our customers by being knowledgeable problem solvers and prioritizing their needs above all else. We Live Mutual. MassMutual was named a Top Place to Work in 2019 by The Boston Globe and No. 84 on the Fortune 500 list of largest companies.
Are you a creative thinker and strong problem solver? Would you describe yourself as someone who constantly analyzes every change for its risk, for its impact down the road and not just today, and for what it means for the larger infrastructure? If yes, keep reading: this opportunity is for you!
Meet the Team – Site Reliability Engineering
It is an exciting time at MassMutual! MassMutual is going through a Digital Transformation, continuing to be a leading customer-centric company. The Site Reliability Engineering team is composed of highly skilled problem solvers who are motivated to create innovative solutions that exceed the changing needs of our customers and move MassMutual – and the industry – forward! The team culture is collaborative and cross-functional, combines old and new technologies, and supports work/life balance.
To continue our digital transformation, we are looking for a software- and systems-savvy engineer to join the Site Reliability team, which is tasked with designing, building, and integrating solutions across technical and business capability domains with cost and strategic implications. Solutions may consist of proven or unproven technologies, or multiple implementation technologies at once, within domains that experience rapid change.
Our Site Reliability Engineer needs advanced knowledge of infrastructure, scripting skills, and engineering disciplines. This role requires equal parts development and operations with a software engineer mindset. Using technical and operational skills, this role will increase application reliability at scale. This role is also responsible for deploying and supporting infrastructure within a multi-cloud provider strategy.
Objectives and Responsibilities
- Write clean, high-performance, and well-tested infrastructure code with a focus on reusability and automation (e.g., Shell, Python, GoLang, Puppet)
- Develop monitoring and define SLAs, SLOs, and error budgets for mission-critical platforms while helping to coordinate product launches and reliability exercises
- Collaborate with other enterprise teams and contribute to the company’s Cloud journey, including its impact on infrastructure, networks, and security
- Work closely with Architects and provide support to senior staff, ensuring designs align with the technological and business directions across the company
- Support IT deployments involving Platform as a Service (PaaS), Software as a Service (SaaS), or Infrastructure as a Service (IaaS)
- Manage central platforms as a service for growth and scale
- Implement enhancements to the company's digital and data infrastructure, supporting internal customers' operational needs
Qualifications
- Bachelor's Degree in Computer Science or equivalent and 2+ years of relevant work experience
- 1 - 3 years with cloud environments and provisioning automation
- Prior experience with Linux, troubleshooting and coding/scripting using high-level languages
- Prior experience with infrastructure systems that support enterprise data science and analytics capabilities, including streaming and real-time analytics (Kafka, Spark Streaming, and Snowplow).
- Deep understanding of common scripting languages (Ruby, Python, Bash, Go). PowerShell is a plus.
- Experience working with at least one object-oriented language (e.g., Java, C#)
- Involvement in at least one on-premises-to-cloud migration
- Experience managing a full application stack with high availability requirements is preferred.
- Full-stack troubleshooting experience, including networking, operating systems (Debian, CentOS), Apache, HAProxy, Nginx, and RDBMS, is preferred.
- Experience using monitoring tools such as Splunk, New Relic, and Nagios for troubleshooting is preferred.
- Experience with the AWS and/or Azure stack, particularly in the areas of networking (VPCs, security groups), VMs (EC2), databases (RDS), and load balancing (ELB, ALB), is preferred.
- Experience with Linux kernel performance tuning is preferred.
- Strong written and verbal communication skills
- Able to thrive in a collaborative and cross-functional environment
Why Join Us
We’ve been around since 1851. During our history, we’ve learned a few things about making sure our customers are our top priority. To meet and exceed their expectations, we must have the best people providing the best thinking, products, and services. To accomplish this, we celebrate an inclusive, vibrant, and diverse culture that encourages growth, openness, and opportunities for everyone. A career with MassMutual means you will be part of a strong, stable, and ethical business with industry-leading pay and benefits. And your voice will always be heard.
Recognized as a 2019 World’s Most Ethical Company by Ethisphere, MassMutual is guided by a single purpose: We help people secure their future and protect the ones they love. As a company owned by our policyowners, we are defined by mutuality and our vision to put customers first. It’s more than our company structure – it’s our way of life. We are a company of people protecting people. Our company exists because people are willing to share risk and resources, and rely on each other when it counts. At MassMutual, we Live Mutual.
MassMutual is an Equal Employment Opportunity employer: Minority/Female/Sexual Orientation/Gender Identity/Individual with Disability/Protected Veteran. We welcome all persons to apply. Note: Veterans are welcome to apply, regardless of their discharge status.