Data Engineer - DE22-01176


**Remote**
New York, New York

Last Day to Apply: May 10, 2022

Position: Data Engineer 
Location: Remote Role
Duration: 9 Months (Possible Extension)

Responsibilities:

  • The Team: The Data Platform team in our Company's Animal Health IT (MAHI-IT) designs and implements end-to-end data solutions to support customer-facing applications in animal traceability, monitoring, well-being, and more.
  • We seek a Data Engineer to help the team set up, maintain, optimize, and scale data pipelines from multiple sources and across different functional teams in a cloud environment.
  • Assist in developing best practices for deploying, monitoring, and scaling data pipelines in the cloud
  • Identify requirements for ingestion, transformation, and storage of data
  • Design and implement optimal and scalable data pipelines
  • Use cloud tools to integrate data from multiple data sources into the data lake, and design and implement ways to expose it
  • Identify opportunities for automation and optimization of data pipelines and for re-design of data architecture and infrastructure for greater scalability and optimal delivery
  • Implement the cloud/data infrastructure required to extract, transform, and load data from multiple sources
  • Identify required security and governance procedures to keep the data safe in a cloud environment
  • Assist in developing and executing testing plans to help with QA efforts.

Requirements/Qualifications:

  • Bachelor's degree in Data Engineering, Computer Science, or related field.
  • Experience designing and implementing data engineering pipelines.
  • Advanced knowledge of Python and PySpark.
  • Working knowledge of one or more SQL dialects.
  • 3+ years of hands-on experience with developing data warehouse solutions and data products.
  • 1+ year of hands-on experience developing a distributed data processing platform with Hadoop, Hive, Spark, Airflow, Kafka, etc.
  • 3+ years of hands-on experience in modeling and designing data schemas.
  • Advanced experience with programming languages: Python, PySpark, Scala, etc.
  • Knowledge of scripting languages: Perl, Shell, etc.
  • Hands-on experience working with, processing, and loading large data sets.
  • Experience with cloud tools for ingesting and processing data.

Preferred Experience and Skills:

  • Experience with AWS big data tools and platforms: S3, EMR, EKS, Lambda, etc.
  • Experience with data ingestion and transformation tools such as StreamSets and Databricks
  • Experience working with DevOps teams
  • Experience with container technologies such as Docker and Kubernetes
  • Experience with data warehousing tools such as Snowflake and Redshift