Site Reliability Engineer (Production Systems Engineer) The Rubicon Project
THIS JOB HAS EXPIRED
WE ARE CURRENTLY CONSIDERING QUALIFIED APPLICANTS FOR THIS ROLE WHO ARE BASED IN LOS ANGELES.
The mission of the Technical Operations team is to equip the business with framework support. We are responsible for building and maintaining data centers, inventory management, monitoring services development, platform integration and management, capacity planning, global traffic management, release management, application and development support, and network support and troubleshooting.
A member of the Technical Operations team must be able to ask the right questions, understand the business and technical requirements of internal and external clients, and then translate those needs into actionable items for execution in conjunction with other cross-functional team members. We are geared to support stability, scalability and outstanding service across the entire organization.
The Site Reliability Engineer (Production Systems Engineer) supports our global high-performance infrastructure in a DevOps environment. Our low-latency networking requirements call for an understanding of, and a passion for, high-performance computing. You will be part of a team that supports a real-time, high-volume, complex advertising and bidding network, along with a large Hadoop data warehouse.
Ensure the operational integrity of the global production infrastructure
Build and expand infrastructure capacity at remote Data Centers and POPs
Perform deployments and maintenance to implement code, architecture and configuration changes
Support other teams and diagnose issues related to our infrastructure
Participate in 24/7 on-call rotation with other team engineers
Develop and maintain new health checks for system and application-level monitoring
Comfortable operating in a Linux environment (LAMP stack)
Strong verbal and written communication skills
Ability to adapt to a rapidly evolving code base on a global system
Comfortable collaborating and supporting a diverse team of engineers
Respond to, diagnose, and troubleshoot system-reported incidents
Scripting in Bash, Perl, Python or a Unix-like shell
Flexible working hours and 24/7 on-call support
Large Installation Systems Administration with CentOS or Red Hat Linux
Monitoring tools such as Graphite, Nagios, and Opsview
Scripting in Bash, Perl, Python or other scripting languages
System Management with Puppet, Chef or other managed systems
System installs with PXE, kickstart, and Cobbler
Managing code stored in a revision control system
Exposure to Hadoop Environment
Atlassian tools including Bamboo, Fisheye, Crucible, Crowd, Jira, and Confluence
VMware or other virtual system management
Experience writing code in any language
Los Angeles, CA