Senior Site Reliability Engineer (SRE) for Cloud Services

Samsung R&D Institute Poland

Location
Warsaw +1

Hexjobs Insights

Site Reliability Engineer wanted for the Samsung Ads team. Key responsibilities include designing scalable systems, monitoring SLOs, automation, and collaboration with other teams. Knowledge of Kubernetes, AWS, and Terraform is required.

Keywords

Site Reliability Engineer
DevOps
Kubernetes
AWS
Terraform
CI/CD
Linux
Observability
Software engineering
Agile

What you will do

About our Team

We seek a highly skilled and motivated Site Reliability Engineer (SRE) to join our Samsung Ads team. As an embedded SRE, you will collaborate with other SREs and development teams to enhance the reliability, scalability, and performance of diverse business services. You will act as a subject matter expert, driving best practices in observability, automation, and system design within a dynamic, multi-team environment.

Samsung Ads is a globally distributed organization that has evolved into a dynamic and innovative business over the past six years. As a key part of Samsung’s ecosystem, it drives growth through cutting-edge advertising solutions. Join us to contribute to the future of our ad tech platforms.

Role and Responsibilities

  • Design and implement systems with built-in reliability, self-healing capabilities, and scalable architectures.
  • Define and monitor Service Level Objectives (SLOs) and error budgets to ensure system resilience.
  • Collaborate with Product Owners and development teams to translate requirements into technical solutions prioritizing operability.
  • Enhance observability systems (metrics, logging, tracing) to proactively identify and address system health issues.
  • Develop automation tools and CI/CD pipelines to streamline deployment and operational workflows.
  • Participate in on-call rotations and lead blameless post-incident reviews to drive improvement.
  • Optimize resource utilization and performance across hardware, software, and cloud environments.
  • Establish backup and disaster recovery plans, and conduct regular readiness exercises.
  • Manage technical relationships with vendors and evaluate new technologies for scalability.

Technologies in use

  • AWS
  • Kubernetes, Rancher and AWS EKS
  • Terraform
  • Grafana, Sloth, Loki, Tempo
  • Okta, Prometheus, Sumo Logic, HashiCorp Vault
  • GitHub Actions, Argo CD, Argo Rollouts

What we offer

Team:

  • Friendly working atmosphere
  • Wide range of training
  • Opportunity to work on multiple projects
  • Working with the latest technologies on the market
  • Monthly integration budget
  • Opportunity to attend local and international conferences
  • Flexible start time between 7 a.m. and 10 a.m.

Equipment:

  • PC workstation or laptop plus 2 external monitors

Benefits:

  • Private medical care (family members can be added for free)
  • Multisport card
  • Life insurance
  • Lunch card
  • Partial reimbursement of English language course costs
  • Free Korean language lessons
  • Variety of discounts (Samsung products, theaters, restaurants)
  • Unlimited free access to the Copernicus Science Center for you and your friends
  • Opportunity to test new Samsung products

Location:

  • Office in Warsaw Spire, near a metro station
  • Hybrid model: 3 days per week in the office
  • Attractive relocation package

Requirements

Skills and Qualifications

  • At least 6 years of experience in DevOps or SRE roles.
  • Expertise in Kubernetes administration (CKA/CKAD/CKS preferred) and AWS.
  • Proficiency in software engineering (Go, Python, Java, or similar) and Bash scripting.
  • Experience with Infrastructure as Code (Terraform) and CI/CD tools (GitHub Actions, Jenkins, etc.).
  • Strong Linux system administration and troubleshooting skills.
  • Knowledge of basic network protocols and distributed systems.
  • Practical experience with observability tools (Prometheus, Sumo Logic) and microservices.
  • Strong analytical skills and ability to solve complex problems creatively.
  • Self-organized with the ability to manage multiple priorities in a fast-paced environment.
  • Deep understanding of SRE, DevOps, and Agile principles.
  • Excellent communication skills in English (verbal and written) for global collaboration.
  • B.Sc. or M.Sc. in Informatics, Telecommunications, or a related field.
  • Flexibility, collaboration, resourcefulness, and a positive attitude.

Published about 18 hours ago
Expires in 29 days
