Cloud Engineering – Site Reliability Engineer (SRE)

Sorry, this job was removed at 10:26 a.m. (EST) on Monday, March 26, 2018

The Opportunity: We are seeking highly motivated individuals to help us run modern infrastructure services in the cloud. As we build cloud native services, you will be responsible for owning and operating the services as they scale in the healthcare space.
Position Summary:
Site Reliability Engineering (SRE) is designed to combine software and systems engineering to build and run modern infrastructure. The role is designed to increase the reliability of core services that are used both internally and externally. You will focus on automation and scripting to build rapid and repeatable processes and work with product owners to deliver feedback to influence our roadmap based on the customer experience.
Responsibilities may include, but are not limited to:
35% [Primary Function] Technical Execution

  • Take on responsibility for the end-to-end lifecycle of modern infrastructure services
  • Maintain and support infrastructure services across development, integration and production environments
  • Review services before they go live in production
  • Enforce rigor in incident response and postmortems; build a culture of retrospection on both successes and failures
  • Design proactive monitoring and metrics for supported environments
  • Focus on automation to improve scale and reliability
  • Produce accurate, unambiguous technical design specifications to the appropriate detail
  • Deliver customer value in the form of high-quality hardware, software components and services in adherence with IaaS and Release Engineering policies on security, performance, longevity and integration.
  • Identify and propose alternative technology in order to create scalable implementations and achieve results.
  • Coordinate and troubleshoot complex technical issues until resolution.
  • Accurately estimate the effort of development tasks; guide the team and provide feedback to help it estimate more accurately.
  • Understand and follow engineering conventions, architectures, and best practices; implement new conventions where necessary, teaching those methodologies to more junior members of the team.
  • Provide high-level T-shirt sizing for the work required to build smaller software components and services.
  • Scale systems to meet business demand.
  • Deploy systems to meet availability targets (HA/DR).
  • Develop automated tests utilizing test infrastructure to validate code, when applicable.
  • Adhere to the story definition of done (DoD), including unit tests, functional testing, code reviews, no regressions, bug fixes and documentation, and follow coding best practices.
  • Perform peer code reviews in order to ensure quality standards.
  • Identify and prioritize what technical debt will be eliminated.


30% Contributions to the Team

  • Act as the subject matter expert for area of assignment
  • Identify opportunities to influence the roadmap of infrastructure services.
  • Lead agile ceremonies to improve team performance.
  • Participate in the team member interview process as needed; influence final hiring decisions.
  • Act as a scrum master for agile scrum teams as needed.


20% Mentorship of Others

  • Advise and mentor more junior team members to maximize overall productivity and effectiveness of the team.


15% Cross-functional Coordination and Communication

  • Foster collaboration across the Technology and Product organizations.
  • Coordinate efforts within your own team and with immediate team members.
  • Cultivate strong relationships with business stakeholders.
  • Explain solutions in a way that both technical and product audiences can grasp; share insights with peers.
  • Share business and technical learnings with the broader dev and product organizations.
  • Collaborate with members of product and UX teams to design solutions, as appropriate.


Education, Experience, & Skills Required:

  • 1-5 years of experience in an engineering role
  • Hands-on experience in the public cloud, specifically Amazon Web Services (AWS)
  • Experience in an Agile environment preferred
  • Bachelor’s Degree or equivalent
  • Significant software engineering skills and computer science experience
  • Knowledge of scripting in Python/Bash
  • Experience with container schedulers such as Kubernetes, Mesosphere, Docker Swarm or ECS
  • Experience with modern logging stacks such as Elasticsearch or Graylog
  • Understanding of metrics collectors such as Graphite or Prometheus
  • Experience with DevOps tooling


Behaviors & Abilities Required:

  • Ability to learn and adapt in a fast-paced environment, while producing quality code
  • Ability to work collaboratively on a cross-functional team with a wide range of experience levels
  • Ability to analyze existing services and identify technical debt to work toward increasing sustainability
  • Ability to find creative ways to execute even when there is no historical context or known path forward
  • Ability to design roadmaps and relevant solutions for end-users to access interfaces
  • Ability to assess the benefits, risks and success factors of potential applications
  • Strong mentoring and coaching skills that encourage growth for more junior members

Location

311 Arsenal Street, Watertown, MA 02472

Learn more about athenahealthFind similar jobs