We are looking for a Senior Big Data Engineer II who will use industry knowledge, technical skills, and a passion for data to work closely with executive stakeholders and our data engineering team to develop solutions that support and optimize business operations.
We are solving a variety of Big Data challenges, modernizing legacy data loaders, and exploring the benefits and tradeoffs of cutting-edge tools.
In this role, you will work with Apache Spark / PySpark and Hadoop platform components, develop a deep understanding of our data, design and develop applications, and ensure the accuracy and consistency of data.
Requirements :
5+ years of experience in data analysis and / or management in an enterprise environment within finance, operations, or analytics.
Experience with Apache Spark, PySpark, data warehouses, and other big data tools.
Experience with tools such as MS SQL Server, Oracle, Postgres, Hadoop, Hive, and Presto.
Experience with at least one programming language (e.g., Python, C#, Scala, Java).
Experience with Linux / Unix.
Data streaming experience with Kafka / Kinesis or similar tools.
Experience with storage platforms such as HDFS and S3.
Experience with NoSQL databases such as HBase / Cassandra / MongoDB.
Experience with cloud platforms for data processing such as Cloudera Public Cloud / Azure Databricks / AWS.
Other Skills :
Strong knowledge of statistics, including hands-on experience with Python, R, SAS, MATLAB, machine learning, and AI.
Knowledge of C#, object-oriented programming, and the GraphQL query API.
Knowledge of Microsoft SQL Server and stored procedures.
Experience working with large datasets (1B+ records).
Knowledge of Tableau / BI reporting tools.
Responsibilities :
Identify and onboard new data sources. Collaborate with data vendors and internal stakeholders to define requirements and build interfaces.
Troubleshoot and resolve issues with data feeds.
Build Spark data pipelines using Python / PySpark / Hadoop / Hive / Presto.
Review your coworkers' projects and mentor other members of the team.
Oversee and help architect ETL pipelines and data warehouse / data mart solutions.
Understand and be able to troubleshoot the various platforms and technologies that run our ETL processes.
Develop and maintain comprehensive controls to ensure data quality and completeness.
Develop logical data models and processes to transform, cleanse, and normalize raw data into high-quality datasets aligned with our analytical requirements.