Site Reliability Engineer
As an Application Performance Management (APM) Analyst on our Production Support Level 2 team, you will be responsible for all aspects of managing and monitoring the application performance of our Managed Accounts platform. This position requires a great eye for detail and a desire to coordinate with internal operations and technology teams to improve platform reliability for our clients. The APM Analyst will work with multiple internal teams to maintain and enhance our monitoring paradigm with an eye toward scalability and proactive issue detection and prevention.
Responsibilities:
- Use industry-standard APM tools to quickly identify the source of client issues
- Perform initial triage and troubleshooting of application issues
- Perform detailed postmortem analysis of APM anomalies, including recommendations for additional monitoring instrumentation to prevent recurrence
- Translate client issues into well-documented, actionable bugs for engineering resolution
- Identify product improvements that enable higher-quality system monitoring
- Participate in the Client Outage Review board to find gaps in existing monitoring
- Propose changes, as needed, to instrumentation of existing application performance monitoring tools
- Evangelize proactive application monitoring across all departments
- Mentor other Production Support teams so they understand how to use APM tools
Required Experience and Skills:
- Familiarity with dashboarding and querying in Splunk
- SQL knowledge (T-SQL preferred)
- Eye for detail and exceptional troubleshooting skills
- Excellent documentation and communication skills
- Ability to work effectively with multiple groups of varying technical skill levels
- Ability to handle problem situations quickly, inventively, and resourcefully
Preferred Skills:
- Experience with, or a minor in, Statistics
- MS in Computer Science is a plus
- Experience as a member of a Site Reliability team
- Experience with database and/or application monitoring tools (AppDynamics, SolarWinds Ignite, Grafana, etc.)
- Database Management experience
- Tomcat and/or JBoss experience
- Previous SaaS application support experience
- Relevant financial industry systems experience (e.g. Portfolio Management, Trading, Middle/Back Office Operations)
Education:
- BS degree in Computer Science, Information Systems, Data Analytics, or equivalent
The Team:
Level 2 Support is a small team that acts as the primary escalation point for the Production Support team on platform configuration issues. The team is responsible for helping maintain VestmarkOne applications for production clients. We work closely together to resolve complex configuration issues and ensure that configuration changes are persisted correctly by following established code management practices. The team handles many aspects of product configuration and provides the opportunity to grow professional skills in many areas.