Your Role at MediaLab
MediaLab Engineering supports a growing portfolio of applications across multiple business verticals, each acquired through a careful selection process and integrated with our internal ad platform, which processes billions of impressions a month. The technical effort involved is massive, and we need go-getters looking to make a substantial impact.
As a Data Engineer, you will be a key member of the team that builds and maintains data platform solutions at MediaLab. The Data team works with all of the MediaLab brands (Genius, Imgur, Kik, Worldstar Hip-Hop, Amino, Whisper) as well as business teams across the organization. We own a streaming pipeline that handles thousands of events per second and billions per day; a BigQuery instance comprising dozens of schemas, thousands of tables, and over a petabyte of storage; and hundreds of DAGs in Airflow and Dataform. We work in Python, SQL, and Go, and strive to write clean, efficient, reusable code, using common tooling to ensure code is well-formed and that tests pass before deployment.
What You'll Do
- Maintain and optimize Go-based pub/sub consumers that handle complex transformation logic
- Improve the efficiency of multiple repositories dedicated to ingesting and parsing data from a wide variety of sources
- Help improve our monitoring efforts and troubleshoot problems with ETL jobs and event pipelines as they arise
- Support requests from business teams to find and ingest sources of data that we currently lack
- Work with the engineering teams across all of our brands to integrate their application data into the data warehouse/data lake
- Contribute to setting up and adjusting deployment workflows
What We're Searching For
- 1-2 years of professional experience as a data engineer, or software engineer with significant exposure to working with data
- Must be comfortable with Python; Go experience is a big plus
- Must have experience with at least one major cloud provider (GCP and/or AWS preferred)
- Must have experience using containers; Kubernetes experience is preferred
- Very solid SQL skills; must understand how to make complex queries run efficiently
- Must be able to write clear documentation
- Experience with a CI/CD platform
- Experience with Terraform or other infrastructure-as-code tooling
- Experience with at least one major data warehouse (BigQuery, Redshift, Snowflake, etc.) is a big plus
- Experience working with at least one pub/sub tool (Kafka, Kinesis, Google Pub/Sub, RabbitMQ, etc.) is a plus
- Experience in the ad tech or digital publishing industries is a nice-to-have