Sr. Automation Engineer
Heredia, Heredia, Costa Rica
2 days ago

Sr. Site Reliability Engineer

Job Description:

The Sr. SRE will be dedicated full-time to creating software that improves the reliability of systems in production, fixing issues, responding to incidents, and usually taking on on-call responsibilities.


Implementing an SRE team will greatly benefit both IT operations and software development teams. Not only can SRE bring greater reliability to systems in production, but it will likely help IT, support, and development teams spend less time on support escalations and give them more time to build new features and services.

The Sr. SRE will be mainly responsible for the following activities:

  • Build, Deploy, and Release Support
  • The Sr. SRE is responsible for writing build and deploy automation for various applications and infrastructure using Maven, Ant, Groovy, Terraform, Ansible, etc.

    The Sr. SRE must create automated CI/CD and CT pipelines for smooth and frequent software releases, and will play a hands-on DevOps Automation Engineer role.

  • Building software to help operations and support teams through automation
  • The Sr. SRE will be in charge of proactively building and implementing services that make IT and support teams better at their jobs.

    This can be anything from adjustments to monitoring and alerting to build, deploy, and uptime support for various environments.

    A site reliability engineer can be tasked with building a homegrown tool from scratch to help with weaknesses in software delivery or incident / problem management.

    The Sr. SRE must be hands-on with Terraform, Ansible, Python, Jenkins, and CI/CD concepts to help build reliable DevOps tools.

  • Fixing support escalation issues
  • Similar to the point above, a site reliability engineer can expect to spend time fixing support escalation cases. But as your SRE operations mature, your systems will become more reliable and you’ll see fewer critical incidents in production, leading to fewer support escalations.

    Because an SRE team touches so many different parts of the engineering and IT organization, it can be a great source of knowledge and can help route issues to the right people and teams.

    The Sr. SRE will be responsible for fulfilling this need.

  • Optimizing on-call rotations and processes
  • More often than not, site reliability engineers will need to take on on-call responsibilities. The Sr. SRE is responsible for improving system reliability by optimizing on-call processes.

    SRE teams will help add automation and context to alerts, leading to better real-time collaborative response from on-call responders.

    Additionally, the Sr. SRE will update runbooks, tools, and documentation to help prepare on-call teams for future incidents.

  • Documenting tribal knowledge
  • SRE teams gain exposure to systems in both staging and production, as well as to all technical teams. They take part in software development, support, IT operations, and on-call duties, meaning they build up a great amount of historical knowledge over time.

    Instead of siloing this knowledge in the mind of one team or one person, site reliability engineers can be tasked with documenting much of what they know.

    Constant upkeep of documentation and runbooks can ensure that teams get the information they need right when they need it.

    The Sr. SRE will be responsible for handling this and for using automation to ease these kinds of tasks.

  • Conducting post-incident reviews
  • Without thorough post-incident reviews, you have no way to identify what’s working and what’s not. The Sr. SRE needs to keep teams honest and ensure that everyone, software developers and IT professionals alike, is conducting post-incident reviews, documenting findings, and acting on what they learn.

    Then, site reliability engineers are often tasked with action items for building or optimizing some part of the SDLC or incident lifecycle to bolster the reliability of their service.

Requirements:

  • Bachelor's degree in Computer Engineering.
  • Hands-on experience configuring and administering SCM (Git, SVN), build (Maven, CMake, Makefiles), CI (Jenkins), and CD automation tools.
  • Configuring and maintaining SDLC environments.
  • Experience with Oracle, MongoDB, etc.
  • Experience in Agile Methodologies and processes.
  • Strong automation and analytical skills, and the ability to follow through to completion.
  • Strong verbal and written interpersonal skills.
  • Automation experience with build and deployment, software configuration, continuous integration, continuous delivery, and release management tasks in a JavaEE / C++ environment.
  • Experience in automating manual processes using Python, Ruby, Unix shell (bash, ksh), Perl, Ant, etc.
  • Installing, configuring, administering, and tuning JavaEE application servers and web servers.
  • Installing, maintaining, and administering software on Unix, Linux, and Windows servers.
  • Experience with cloud platforms and virtualization technologies.
  • Deploying and automating JavaEE applications in cloud environments using Chef, Ansible, RPM, etc.
Must have:

  • Must have 8+ years of experience, including 4-5+ years with DevOps technologies such as Terraform, Ansible, Maven, Ant, Docker, Git, etc.
  • Must be very hands-on in writing Ansible, Python, Terraform, Maven, Ant, Groovy, shell scripts, etc.
  • Must be very hands-on in CI/CD pipeline creation using Jenkins.
  • Good to have hands-on experience working on AWS cloud and / or Google Cloud.
  • Good to have hands-on experience with Chef / Puppet.
  • Must have good working experience building and deploying Docker containers, microservices, and enterprise applications using automation.