The role involves monitoring and managing Cloudflare's global network, maintaining data center operations, troubleshooting issues, and coordinating with teams and contractors.
Available Locations: Bengaluru
About The Role
In this role, you will be focused on monitoring, managing and maintaining the Cloudflare global network. You'll work closely with Cloudflare's SRE (Site Reliability Engineering) team, Network Engineering team, Network Deployment Engineering team and with various vendors and partners (including hardware vendors, datacenter and network providers, and ISPs) to maintain and improve our global infrastructure. This role ensures the maximum uptime, performance, and security of critical systems and services. This is a highly visible position that requires a deep technical understanding of datacenter infrastructure and physical networking, and basic experience with data analysis.
To be successful in this position, you should have excellent technical and communication skills, and be able to navigate a range of challenges and constraints (e.g. schedule adherence, time zones, and cultures). You will have the opportunity to maintain a faster, safer Internet for our millions of users and the billions of web surfers that visit their sites each month.
Who you are
You are detail-oriented, eager to learn, and excited to grow your career in a fast-paced, global infrastructure environment. You bring foundational knowledge of data center environments, networking, and Linux systems, and are motivated to support mission-critical operations. You're comfortable following established processes, working across time zones, and collaborating with global teams. You will be working with partners to support infrastructure in a number of remote locations. You will have experience managing operational environments and be used to developing new approaches to improve efficiency or operational stability.
What You'll Do
- Monitor for network and data center issues, working with Infrastructure, Network, and SRE teams to support the day-to-day health of data center operations.
- Identify and respond to incidents, outages and performance issues to ensure data center and network availability through proactive support and remote coordination.
- Perform first-level troubleshooting of issues by following SOPs, and help coordinate and track tasks with remote hands/contractors (e.g. hardware support, cabling checks).
- Conduct root cause analysis for recurring issues and recommend preventive measures.
- Create and maintain documentation related to SOPs and participate in the development and refinement of monitoring best practices.
- Support and reconfigure network infrastructure where required.
- Use tools like JIRA to update task status and progress reports.
- Provide feedback to internal teams to support internal tools and external vendor partnerships.
Required Experience
- English language proficiency (written and verbal) is mandatory
- Over 2 years of experience in a technical support, IT operations, or data center environment (internship or junior role experience acceptable)
- Exposure to basic networking concepts (cabling, ports, troubleshooting). Experience with Juniper, Cisco and DWDM network equipment
- Familiarity with Linux-based systems and command-line tools
- Experience working with or coordinating third-party contractors (e.g. remote hands, field engineers)
- Familiarity with work required to stand up infrastructure in remote colocation facilities
- Experience running and improving operational processes.
- Familiarity with day-to-day tasks common to Data Center Operations (e.g. decommissioning and power)
- Comfortable handling basic program management responsibilities (prioritization, planning, scheduling, status reporting) using tools such as JIRA
- Incident management
Other Responsibilities May Include
- Assist in improving documentation and procedures for remote site operations
- Participate in on-call rotations or incident response support
- Collaborate with global teammates across time zones and cultures
- Assist with the definition, documentation and implementation of consistent processes across all regions
- Limited travel may be required for team offsites
Examples of desirable skills, knowledge and experience
- Bachelor's degree; technical background in engineering, computer science, or MIS
- Direct experience executing on complex data center/infrastructure projects
- Previous experience installing / maintaining data center (and other IT) infrastructure and DCIM tools
- Experience running and improving operational processes in a rapidly changing environment
- Strong verbal and written communication skills, problem-solving skills, attention to detail, and interpersonal skills
- Must be proactive with proven ability to learn fast and execute on multiple tasks simultaneously
- Ability to work with MS Excel and Google spreadsheets
- Comfortable handling multiple responsibilities (prioritization, planning, scheduling, status reporting) using tools such as JIRA
- Must be a team player
Bonus Points
- Multi-lingual; experience working with infrastructure in multiple countries
- Comfortable with remote "lights-out" and out-of-band access to data center resources
- Linux certifications (RHCSA etc.)
- Network certifications (CCNA, JNCIA or higher)
- Configuration management systems such as Saltstack, Chef, Puppet or Ansible
- Scripting or software development experience in Bash, Python or Go-lang
- Familiarity with load balancing and reverse proxies such as Nginx, Varnish, HAProxy, Apache
- Experience working at a large-scale SaaS vendor
Top Skills
Ansible
Apache
Bash
Chef
Cisco
DWDM
Go-Lang
HAProxy
JIRA
Juniper
Linux
Nginx
Puppet
Python
Saltstack
Varnish