Your tasks:
Development and maintenance of distributed systems that process large amounts of data (most of it in real time) for the needs of our RTB platform;
Optimization of the software we develop for efficiency and resource consumption;
Ensuring the reliability and scalability of the solutions built;
Creating performance and correctness tests for new system components;
Evaluation of new technologies for their applicability in production;
Development of tools for monitoring and analyzing the operation of the production system;
Continuous optimization of existing tools and processes.
Selected technologies used:
Kafka Streams, Flume, Logstash;
Docker, Jenkins, Graphite;
Google BigQuery, Elasticsearch.
Selected issues we have dealt with recently:
Replacement of the framework in the data processing component (transition from Storm to Kafka Streams);
Creating a data stream merger based on the Kafka Client API;
Creating a user profile synchronizer between data centers (DCs) based on Kafka Streams;
Creating a component that calculates aggregates based on the Kafka Client API and Bloom filters;
Implementation of Logstash for loading and Elasticsearch for querying indexed data (transition from Flume + Solr);
Creating end-to-end monitoring of data correctness and delay;
Replacement of the component streaming data to BigQuery and HDFS (transition from Flume to an in-house solution based on the Kafka Client API);
Continuous system maintenance, detection and resolution of performance problems, and scaling to keep up with the growing volume of data.
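To give a flavor of the "data stream merger" item above: the core idea is combining several independently produced, timestamp-ordered event streams into one globally ordered stream. The sketch below is an illustrative Python version over in-memory iterators, not our production component (which is built on the Kafka Client API in Java); the function name `merge_streams` and the `(timestamp, payload)` event shape are assumptions for the example.

```python
import heapq

def merge_streams(*streams):
    """Merge several timestamp-ordered event streams into one ordered
    stream, consuming each input lazily. Every event is a
    (timestamp, payload) tuple, and each input stream must already be
    sorted by timestamp."""
    heap = []
    iterators = [iter(s) for s in streams]
    # Seed the heap with the head event of every stream.
    for idx, it in enumerate(iterators):
        head = next(it, None)
        if head is not None:
            # The stream index breaks timestamp ties, so payloads
            # themselves are never compared.
            heapq.heappush(heap, (head[0], idx, head[1]))
    while heap:
        ts, idx, payload = heapq.heappop(heap)
        yield ts, payload
        # Refill from the stream whose event we just emitted.
        nxt = next(iterators[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], idx, nxt[1]))
```

Only one pending event per stream is held in memory at a time, which is what makes the same pattern viable when the "streams" are Kafka partitions rather than Python lists.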
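The Bloom-filter item above can likewise be sketched: a Bloom filter lets an aggregating component skip duplicate event ids in constant memory, accepting a small false-positive rate (an occasional undercount) instead of keeping every id it has ever seen. This is a minimal Python illustration, not the production Java/Kafka component; the `BloomFilter` class and `count_unique` helper are hypothetical names for the example.

```python
import hashlib

class BloomFilter:
    """A tiny Bloom filter: fixed memory, no false negatives,
    a tunable false-positive rate."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode()).digest()
        for i in range(self.num_hashes):
            chunk = digest[i * 4:(i + 1) * 4]
            yield int.from_bytes(chunk, "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def count_unique(event_ids):
    """Approximate distinct count: ids already 'seen' by the filter
    are skipped, so duplicates are not counted twice."""
    seen = BloomFilter()
    count = 0
    for event_id in event_ids:
        if event_id not in seen:
            seen.add(event_id)
            count += 1
    return count
```

With the default parameters the filter occupies 8 KiB regardless of stream length; false positives make the count slightly low, which is often an acceptable trade-off in high-volume aggregation.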
Our expectations:
Proficiency in programming;
Excellent understanding of how complex IT systems work (from the hardware level, through software, to algorithms);
Good knowledge of the basic methods of building concurrent programs and distributed systems (from the thread level to the continental level);
Practical ability to observe, monitor, and analyze the operation of production systems (and to draw valuable conclusions from it);
The ability to critically analyze the solutions you build for efficiency (from estimating the theoretical performance of a design to detecting and removing actual performance problems in production);
Readiness to work in the DevOps model.
Additional advantages will be:
Experience in creating distributed systems;
Good knowledge of selected Big Data technologies such as Hadoop, Kafka, Storm, Spark or Flink;
Knowledge of application profiling methods and tools (preferably for Java, both at the JVM level and at the Linux level).
We offer:
Work in a team of enthusiasts who are willing to share their knowledge and experience;
Extremely flexible working conditions: no core hours, no holiday limits, and the option to work fully remotely;
Access to the latest technologies and the opportunity to apply them in a large-scale, highly dynamic project;
The hardware and software you need.
Do you have questions about the project, the team, or our style of work? Visit our tech blog: http://techblog.rtbhouse.com/jobs/