Are you an experienced Data Engineer who wants to work with really Big Data?
Opera’s petabyte-scale data is gathered with distributed data ingestion APIs, processed on Google Cloud Platform using services such as Dataflow and Dataproc, and stored in BigQuery. Apache Airflow helps us orchestrate the Data Engineering jobs needed to turn all of that anonymized data into product improvements for our 360+ million users globally.
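For a flavor of the day-to-day work, orchestration here means Airflow DAGs along the lines of the minimal sketch below. This is illustrative only, assuming Airflow 2.x with the Google Cloud provider installed; the DAG, project, dataset, and table names are hypothetical, not Opera's actual pipeline.

```python
# Minimal, hypothetical Airflow DAG: one daily BigQuery aggregation job.
# All names (dag_id, project, dataset, table) are placeholders for illustration.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="daily_product_metrics",   # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Aggregate anonymized events into a daily metrics table (illustrative SQL).
    aggregate = BigQueryInsertJobOperator(
        task_id="aggregate_daily_metrics",
        configuration={
            "query": {
                "query": """
                    SELECT event_date, COUNT(*) AS events
                    FROM `my-project.analytics.events`
                    GROUP BY event_date
                """,
                "useLegacySql": False,
            }
        },
    )
```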
Are you interested in joining the team that enables all data-related activities in Opera?
We offer an open, collaborative, multicultural environment and look forward to meeting candidates who can bring new ideas, skills, and solutions to Opera.
Responsibilities:
Help operate the entire data platform
Build data collection systems for Opera's products
Build automation tools
Take part in overall design and architecture discussions
Adapt to and learn new technologies
Be part of on-call operations
Requirements:
Expert knowledge of Python
Knowledge of SQL
Experience in Linux (Debian) administration
Experience with distributed systems and/or parallel processing
Ability to quickly iterate on new ideas with a fail-fast approach
Fluency in English, both written and spoken
Nice to have:
Experience with Big Data concepts and tools (Google Pub/Sub, Apache Kafka, Apache Beam, Apache Spark, MapReduce, Apache Airflow, Apache Avro)
Working knowledge of cloud platforms and virtualization/containerization
Familiarity with configuration management tools, preferably Puppet
Familiarity with Java
Experience with monitoring solutions (Nagios, Icinga2, Grafana, InfluxDB, Google Stackdriver)