Data Engineer (Python or Scala)

Posted on Indeed on May 29, 2021
Job Summary
As a Data Engineer here at Horizon, your objective will be to help build out Horizon’s data platform. Using your skills with Python and AWS, you will create automated, highly robust, and performant data pipelines that scale with the ever-increasing data needs of data analysts and client teams. These data pipelines will serve vital business operations such as client reporting, analytics/data science, and activation.

Job Duties
  • (45%) Build robust and scalable data integration (ETL) pipelines using SQL, EMR, Python and Spark
    • Design solutions based on needs gathered from discussions with end users/stakeholders
    • Code/implement solutions based on the design, adhering to department best practices and processes
    • Drive solutions from initial implementation through testing/QA to final delivery to end users/stakeholders
    • Maintain delivered solutions in response to changing requirements or unexpected failures
  • (30%) Mentor and manage junior members of the team
    • Perform code reviews on work submitted by junior developers
    • Advocate and enforce MTD coding standards (e.g. "Clean Code") and industry best practices
    • Ensure junior developers are on target with deliverables and do not have obstacles blocking them
  • (15%) Technical knowledge and leadership
    • Participate with other senior engineers and tech leaders to drive the evolution of tech standards/processes
    • Work to understand new practices and technologies relevant to the data engineering field and help drive adoption within the team
  • (10%) Collaborate with Software Solution team members and other staff to validate desired outcomes for code prior to, during, and post development
    • Train other technical staff to access and use delivered solutions
    • Flesh out test cases and "edge"/outlier cases to test for
  • (5%) Help onboard new engineers

Job Requirements
  • Bachelor’s degree in Computer Science or a related major from a four-year university is required; a master’s degree is preferred
  • 3 years of experience building data pipelines and implementing feeds for a data warehouse
  • Strong communication skills, both written and verbal
    • Strong technical understanding, with the ability to contribute to meetings on best practices and/or technical solutions to business problems
    • Able to communicate effectively with non-technical co-workers and stakeholders, explaining technical concepts and issues at a level that non-technical people can understand
    • Able to understand requirements and business needs from client teams and stakeholders and translate them into technical requirements
  • Strong background coding in Python 3
  • Extensive experience processing large datasets utilizing Spark and EMR Clusters
  • AWS
    • EMR
    • S3
    • EC2
  • Deep understanding of database design and data structures
  • Code repositories (GitHub, Bitbucket)
  • Linux/Shell scripting

Nice to Haves
  • Experience with Snowflake
  • Experience with Airflow
  • Experience with AWS
    • CloudFormation
    • Athena
    • Lambda
