Director, Network Operations
About the role:
Everbridge is looking for a highly skilled and proven individual to lead our global network operations centers (NOCs). The Everbridge NOCs are responsible for the 24x7 availability, performance and security management of the Everbridge Critical Event Management (CEM) platform and applications. The Everbridge CEM platform is used by over 4,500 global customers to keep people safe and businesses running during public safety threats such as active shooter situations, terrorist attacks or severe weather conditions, as well as critical business events such as IT outages, cyber-attacks, and supply-chain interruptions. As a result, the candidate must have prior demonstrable success in leading regional NOC teams for web-based or software-as-a-service solutions in a fast-paced, high-profile company. The successful candidate will have a proven track record of identifying, resolving, and preventing customer-impacting problems in an environment where every second of customer impact counts. Candidates should also have experience designing, planning, supporting, tuning and operating technology solutions including enterprise-level monitoring and trending services, modern data-driven analysis techniques, excellent triage capabilities, and creative scaling techniques.
About the team:
As the leader of our Network Operations Centers, you will join a team of dedicated, intelligent, fast-paced engineers who are customer-focused. You’ll work in a cutting-edge cloud environment that will power our company’s impressive growth. You will bring a data-driven approach and sense of urgency to solving problems with a focus on customers. You will lead a geographically distributed team and be a change agent to take the team to the next level.
- Lead, grow, mentor and inspire our existing regional NOC engineers.
- Analyze and architect world-class telemetry systems with an emphasis on empowering self-service solutions.
- Support and champion workstream improvements to actively improve signal-to-noise ratios and make every monitoring alert “actionable”.
- Build a training program and incorporate weekly planned and unplanned drills to continually improve the NOCs' operational mean-time-to-identify, mean-time-to-escalate/communicate, and mean-time-to-repair for degradations.
- Always act with a sense of urgency to resolve platform problems knowing that the use of our CEM platform to manage emergencies can save lives.
- Own our technology vendor/provider management program and hold our vendors and service providers accountable for their availability, performance, and security SLAs so that Everbridge meets its own SLAs to our customers.
- Improve our ITIL-based service level and problem management processes and procedures with emphasis on data-driven decisions.
- Partner closely with the site reliability engineering, software development, product management and technical support teams to continually achieve our internal Service Level Objectives and increase our focus on customer satisfaction.
- Participate in the evaluation of new software, automation, and infrastructure solutions.
- Champion improvements to our ITIL-based standards and focus the team to reduce incident and problem ticket counts and severities.
- Proven experience leading large regional NOCs in a fast-paced, high-profile SaaS or web-scale technology company.
- Proven track record leading NOC engineers in identifying, resolving, and preventing customer-impacting problems in an environment where every second of customer impact really counts.
- Strong skill in monitoring & telemetry solutions.
- Strong skill in multi-tiered & multi-tenant architectures.
- Strong understanding of networking fundamentals and global routing techniques.
- Skill in data analysis techniques and teaching teams to use data to implement solutions and quickly triage problems.
- Skill in writing and implementing SQL or NoSQL queries.
- Experience with large-scale Linux production environments as part of a software-as-a-service offering.
- Skill in Amazon Web Services and supporting a public/private cloud environment.
- Ability to set and track team progress towards achieving platform availability, performance and security SLOs on a daily and weekly basis.
- Experience designing, deploying, extending and scaling monitoring & trending solutions (e.g. DataDog, SumoLogic, Graphite, Grafana, Elasticsearch, Logstash, Kibana, InfluxDB, OpenTSDB, Graylog, Nagios)
- Ability to manage competing priorities in a complex environment
- U.S. citizen
- Able to pass a federal drug screening
- Bachelor's degree or equivalent
- Application virtualization, containerization, and service-oriented-architecture technologies (Terraform, Consul, Nomad, Vault (HashiCorp suite), Docker, Kubernetes, Mesos, CoreOS/rkt)
- Experience with big data systems and distributed systems
- Hands-on experience with infrastructure as code tools and concepts (e.g. Saltstack)
Our team makes a difference during the most difficult times and challenging situations. Our people are dedicated to solving problems. Our software was built to save lives. Our unifying mission is to keep people safe and businesses running.
Headquartered in the great cities of Boston and Los Angeles, with operations across the world, our team of 750+ dedicated employees supports more than 4,200 global customers every day in their most crucial moments. During public safety threats such as active shooter situations, terrorist attacks, or severe weather conditions, as well as during critical business events like IT outages or cyber-attacks, customers rely on our SaaS-based platform to quickly and reliably aggregate and assess threat data, locate employees and first responders, automate pre-defined communications processes, and track progress on those response plans.
Our culture is all about “Making a Difference,” and we are proud to serve:
- 9 of the 10 largest U.S. cities
- 8 of the 10 largest U.S.-based investment banks
- 7 of the top 10 U.S. technology and telecom companies
- 25 of the 25 busiest North American airports
- 7 of the 10 largest U.S. healthcare systems
- 6 of the 10 largest U.S. retailers
As we continue to grow and transform the field of critical event management, we need passionate, committed individuals to help us carry out our mission. If you think you have what it takes to make a difference, apply to be a part of our award-winning team.
Everbridge is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, creed, color, religion, sex (including sexual orientation and gender identity), national origin, disability, protected veteran status, or any other characteristic protected by applicable federal, state, or local law.
Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities
The contractor will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by the employer, or (c) consistent with the contractor’s legal duty to furnish information.