Changing the world through digital experiences is what Adobe’s all about. We give everyone from emerging artists to global brands everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.
We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.
We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!
As a Site Reliability Engineer with the Ethos Operations Team, you will develop and support Ethos, Adobe's internal Kubernetes-based container deployment and hosting platform, which powers microservices in all parts of Adobe's business, including Creative Cloud.
Our team strives to ensure the highest level of availability for Adobe's services hosted on our platform in Kubernetes clusters around the world, and provides support for both pre-production and production services.
There are many opportunities to delve into complex distributed systems issues, reviewing logs and metrics to determine root cause and identify ways to improve the platform.
What you'll do
Create and extend automation for large-scale platform management and contribute to internal and external code-bases.
Our team employs everything-as-code methodologies across configuration, infrastructure, orchestration, monitoring, and elsewhere.
We are part of a global operations team that provides 24x7 on-call platform support for both alerts and critical client issues, investigating and resolving them while minimizing impact.
You will participate in blameless post-mortems to share takeaways, discover gaps, embrace transparency, and improve reliability across our services.
We work closely with internal clients, building positive and collaborative relationships across teams within Adobe that use the platform.
What you need to succeed
Bachelor's degree in Computer Science or a related field, or equivalent practical experience and demonstrated ability.
A proven understanding of public cloud infrastructure and services (including Microsoft Azure and Amazon Web Services), containerization, and Linux operating systems and networking.
Ability to discover and master new technologies as well as scripting and programming languages to improve and extend the reliability and functionality of the platform.
Initiative to identify areas for improvement in the platform and procedures and propose workable solutions.
Communication skills are vital. We need team members who document their work and ensure that they understand and are understood when working with colleagues in other time zones.