Site Reliability Engineer
PubNub
Katowice, Poland
4 days ago

PubNub powers apps that bring people together in realtime for remote work, play, learning, and health. Thousands of companies use PubNub’s Realtime Communication Platform and its APIs as the foundation for online chat, live events, geolocation, remote control, and live updates, at massive global scale.

Since 2010, PubNub has invested in the tools and global infrastructure required to serve customers like Adobe, DocuSign, Peloton, and RingCentral, delivering SOC 2 Type 2 security and reliability while meeting regulatory needs like HIPAA and GDPR.

PubNub has raised over $70M from notable investors like Sapphire, Scale, Relay, Cisco, Bosch, Ericsson, and HPE.

We are an all-star technical team composed of folks who have been part of successful acquisitions at enterprise and consumer software companies.

If you like hyperscale systems and engineering projects that redefine limits, PubNub is for you.

PubNub is proud to be an EEO employer.

Job Summary:

As a member of PubNub's Engineering organization, you will work alongside Engineers and Architects to design, develop, operate, and scale PubNub’s global Data Stream Network, with a focus on improving its reliability, scale, and efficiency.

The infrastructure you will manage creates billions of events and produces terabytes of data every day. You will have the unique opportunity to help architect PubNub's infrastructure to solve challenging problems in distributed systems, real-time messaging, and large-scale data management.

Responsibilities:

  • Design processes for improving operational stability of PubNub services
  • Identify, document, and help resolve performance and operational efficiency challenges
  • Assist in rationalizing PubNub's infrastructure as code and automation tooling
  • Create tooling with documentation to scale our distributed systems
  • Enforce application and network security best practices
  • Participate in incident management on-call rotation and drive root cause analysis
  • Collaborate with engineering teams, product owners and other stakeholders to develop tooling and CI/CD patterns
  • Help define Service Level Objectives to assess release readiness of all services
  • Support, monitor and manage cloud infrastructure and environments (AWS EC2, DNS, load balancers, and databases)

Experience & Skills Required:

  • 3+ years of cloud platform experience, AWS preferred
  • 3+ years of programming experience (Python, Go, Java, or equivalent)
  • Experience with configuration management and automation tools such as Ansible and Terraform
  • Experience with CI/CD tools and implementing best practices
  • Solid grounding in cloud fundamentals such as networking, load balancing, DNS, and security
  • BS or MS in Computer Science or a related technical field

Preferred:

  • Containerization experience (Docker, etc.)
  • Experience managing container orchestration systems (Kubernetes, etc.)
  • Experience developing, supporting or operating large-scale, distributed SaaS products
  • Desire to automate tedious tasks and eliminate inefficiencies
  • A passion for system stability, performance, scalability or customer success
  • Previous participation in Incident Management teams