Site Reliability Operations Analyst Apigee
THIS JOB HAS EXPIRED
The Site Reliability Operations Analyst will be part of a great team responsible for ensuring high site availability, service reliability and optimal service performance. Our team is responsible for every service offered, involving some of the largest deployments of cutting-edge technologies in the world. Through small teams based in Palo Alto, California and Bangalore, India, the SROA team provides 24/7 oversight and support of the infrastructure and services that power the Apigee platform service.
Learning new and exciting things is part of your DNA. The Apigee technology stack employs many interesting open source technologies, offering plenty of opportunities to help solve complex problems and deliver cutting-edge solutions.
The Site Reliability Operations Analyst will...
- Perform front line monitoring/support and initial response for automated and manually generated events for all Apigee properties
- Execute against Site Reliability work queues to SLA
- Collaborate with fellow SREs and other teams on investigating and resolving complex problems
- Provide input to the design and improvement of automation and tools for systems management to support the SROA charter
- Document work in clear, concise English: ticket updates, runbook articles and email responses
- Respond to ambiguous situations and assist with adding definition or deriving solutions to fill gaps
Think you might be our next Site Reliability Operations Analyst? You bring to the table...
- Troubleshooting experience in solving application and hardware issues
- Ability to communicate effectively at all levels to ensure problems are characterized properly and transitioned as needed for timely resolution of issues and requests
- Experience with distributed Unix/Linux systems administration and performance tuning (1+ years)
- Experience with AWS or a similar cloud service provider a plus (1+ years)
- Working knowledge of TCP/IP networking and switching
- Working knowledge of Bash scripting
- Knowledge of, and experience working with, network management systems and monitoring tools such as Nagios and Graphite a plus
- Troubleshooting skills that range from diagnosing low-level hardware issues to large-scale failures within or across datacenter clusters
- Working experience with Incident Management, Change Management and Problem Management
- Ability to work independently
- Prior experience in a fast-paced, high-stress environment, handling multiple interruption-driven priorities simultaneously, preferred
- Flexibility to perform periodic on-call duty
- BA/BS degree in computer science or related field (2+ years relevant work experience in lieu of degree)
About Apigee, The API Company
Apps are changing the way we live, and APIs are the secret ingredient that makes apps work. Apigee gives businesses and developers everything they need to be successful in the app economy. Hundreds of companies including AT&T, eBay, Pearson, Gilt Groupe, and Walgreens use Apigee to reach new customers and drive innovation through APIs.
Apigee's API Platform enables businesses and developers to deliver well-designed, scalable APIs and apps, drive developer adoption, and extract business value from their API ecosystem.
About Apigee People
Apigee hires smart people who love to solve hard problems and have fun. We're passionate. We love APIs, we love our customers, and we love application developers. We work as a team, fast and focused, learning as we go. We respect one another, our customers and everyone we do business with.
Palo Alto, CA