Data Engineer
San Jose, Costa Rica
1 day ago

Job Description

Cloudera Data Engineers work internally on Cloudera’s Data Warehouse implementation, ingesting data for Business Operations and Analytics.

A Data Engineer is responsible for the design and implementation of data pipelines, drawing on a range of Big Data technologies to support data ingestion and transformation workflows of varying complexity, requirements, and SLAs.

A Data Engineer at Cloudera will build on and develop fluency in Big Data ingestion and transformation, and will be responsible for the data quality and integrity of the pipelines.

Data Engineers are expected to deliver data pipelines that meet specifications developed by our Data Architects, while maintaining reusability and flexibility for changes over time in an agile development environment.

Data models from these pipelines will inform the daily actions of business groups across the Company.

A successful candidate will contribute both to the management of Cloudera’s internal data asset and to methodologies and tools for use by Cloudera solutions architects at customer sites.

Key Responsibilities

Work with the Data Architects to understand the data needs of the lines-of-business.

Work with the data owners to get access to data and ingest it into the data store in a reliable way.

Build transformations primarily using SQL to enrich data, following the specifications and standards from the Data Architecture team.

Create real-time data flows to meet the needs of operational systems.

Monitor the flow of data to keep data services reliable.


Requirements

5+ years of experience as a Data Engineer

Strong written and oral communication skills required

Comfortable coding in SQL, scripting languages, Java, and Python

Some experience with ETL, Business Intelligence, and Data Processing

Proven ability to architect and implement reliable data pipelines

Hands-on experience with Hadoop technologies, including at least one of Impala, Hive, Kafka, MapReduce, or Spark

Experience monitoring critical data pipelines for correctness

Debugging skills to address issues in data pipelines

Nice to have

Past exposure to advanced numerical methods

Experience optimizing data storage in HDFS / Parquet / Avro, Kudu, or HBase

Experience with third-party tools such as StreamSets and Trifacta

Experience with Cloudera Labs’ Envelope Spark Framework

  • Posted 17 Days Ago
  • 200807