Sensei ML DevOps Engineer
Adobe
San Jose
2 days ago

Our Company

Changing the world through digital experiences is what Adobe’s all about. We give everyone from emerging artists to global brands everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.

We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.

We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!

The challenge

Machine learning is a critical part of Adobe’s Cloud offering. Adobe Clouds enable customers to create and manage digital content.

In Creative Cloud, creative professionals and novice users alike need to manage the lifecycle of their digital assets, libraries, and documents, from brushes to colors, images, photos, videos, 3D assets and beyond.

In Experience Cloud, it is all about optimizing the digital experience and digital transformations for enterprises. Adobe Cloud also includes the Adobe Stock image marketplace and the Behance community that leverage deep machine learning to enable content quality, search, discovery, organization, contributor moderation, and more.

We are building a new machine learning platform, called Adobe Sensei, that powers machine learning and AI across our Adobe Cloud product lines.

This platform will span thousands of applied researchers, millions of users, and billions of digital assets. Become part of this growing team at Adobe and make a phenomenal impact in the area of computer vision, user understanding, language understanding, and digital experience optimization.

The objective is to make machine learning offerings a world-class, leading-edge, differentiating technology in the Adobe Cloud ecosystem.

How can you participate? We’re looking for an Operations Engineer who is passionate about building highly reliable and scalable frameworks for ML applications.

The ideal candidate will be a hands-on person who has strong technical and communication skills and will provide innovative technical solutions promptly.

We are looking for someone with a proven record of automating and optimizing large-scale cloud infrastructure for efficient and reliable ML / AI processing.

This is an opportunity to make a huge impact in a fast-paced, startup-like environment in a great company. Join us!

Responsibilities

  • Drive and improve complete ML lifecycle operations, from inception and design through deployment and refinement
  • Automate the SLC processes by building and maintaining software modules, scripts, deployment frameworks, tracers, monitors, and self-healing / auto-remediation tools
  • Maintain business continuity by identifying and driving opportunities to make systems highly resilient and minimize human intervention
  • Assist our ML and software engineering team to ensure accurate monitoring and metrics are being built into applications before going to production
  • Maintain up-to-date documentation on deployments, processes, and standard operating procedures / runbooks
  • Collaborate with research, architects, and product management to define and establish product improvements
  • Explore and research new and emerging ML technologies for optimizations and bring them to the Adobe Sensei platform

What you need to succeed

  • BS / MS in Computer Science or a related field with 5+ years of industry experience in a DevOps / SRE role
  • Hands-on experience programming (Python, Java, or C++) and maintaining and monitoring large-scale distributed ML platforms / systems
  • Experience in building, troubleshooting, and automating SLC processes for cloud applications, microservices or REST APIs.
  • Experience using Kubernetes along with orchestration, automation, configuration management solutions and monitoring stacks like Jenkins, Terraform, Prometheus, and Grafana
  • Experience in public cloud infrastructure (Azure / AWS preferred), particularly in the areas of networking (VPCs, security groups), VMs (EC2), databases (HBase, RDS), and load balancing (ELB, ALB)
  • Experience in one or more state-of-the-art machine learning and deep learning processes, tools, and pipelines (Airflow, MLflow, Kubeflow)
  • Strong leadership with proficient verbal and written communication skills.