Senior Site Reliability Engineer (SRE) for Cloud Services

What you will do

About our Team

We seek a highly skilled and motivated Site Reliability Engineer (SRE) to join our Samsung Ads team. As an embedded SRE, you will collaborate with other SREs and development teams to enhance the reliability, scalability, and performance of diverse business services. You will act as a subject matter expert, driving best practices in observability, automation, and system design within a dynamic, multi-team environment.

Samsung Ads is a globally distributed organization that has evolved into a dynamic and innovative business over the past six years. As a key part of Samsung’s ecosystem, it drives growth through cutting-edge advertising solutions. Join us to contribute to the future of our ad tech platforms.

Role and Responsibilities

Design and implement systems with built-in reliability, self-healing capabilities, and scalable architectures.
Define and monitor Service Level Objectives (SLOs) and error budgets to ensure system resilience.
Collaborate with Product Owners and development teams to translate requirements into technical solutions that prioritize operability.
Enhance observability systems (metrics, logging, tracing) to proactively identify and address system health issues.
Develop automation tools and CI/CD pipelines to streamline deployment and operational workflows.
Participate in on-call rotations and lead blameless post-incident reviews to drive improvement.
Optimize resource utilization and performance across hardware, software, and cloud environments.
Establish backup and disaster recovery plans, and conduct regular readiness exercises.
Manage technical relationships with vendors and evaluate new technologies for scalability.

Technologies in use

AWS
Kubernetes, Rancher and AWS EKS
Terraform
Grafana, Sloth, Loki, Tempo
Okta, Prometheus, Sumo Logic, HashiCorp Vault
GitHub Actions, Argo CD, Argo Rollouts

What we offer

Team:

o Friendly working atmosphere

o Wide range of training courses

o Opportunity to work on multiple projects

o Working with the latest technologies on the market

o Monthly integration budget

o Possibility to attend local and foreign conferences

o Start work any time between 7 a.m. and 10 a.m.

Equipment:

o PC workstation or laptop, plus 2 external monitors

Benefits:

o Private medical care (possibility to add family members for free)

o Multisport card

o Life insurance

o Lunch card

o Partial reimbursement of the cost of an English language course

o Possibility to learn Korean for free

o Variety of discounts (Samsung products, theaters, restaurants)

o Unlimited free access to Copernicus Science Center for you and your friends

o Possibility to test new Samsung products

Location:

o Office in Warsaw Spire, near a metro station

o Hybrid model: 3 days in the office per week

o Attractive relocation package

Requirements

Skills and Qualifications

At least 6 years of experience in DevOps or SRE roles.
Expertise in Kubernetes administration (CKA/CKAD/CKS preferred) and AWS.
Proficiency in software engineering (Go, Python, Java, or similar) and Bash scripting.
Experience with Infrastructure as Code (Terraform) and CI/CD tools (GitHub Actions, Jenkins, etc.).
Strong Linux system administration and troubleshooting skills.
Knowledge of basic network protocols and distributed systems.
Practical experience with observability tools (Prometheus, Sumo Logic) and microservices.
Strong analytical skills and ability to solve complex problems creatively.
Self-organized with the ability to manage multiple priorities in a fast-paced environment.
Deep understanding of SRE, DevOps, and Agile principles.
Excellent communication skills in English (verbal and written) for global collaboration.
B.Sc. or M.Sc. in Informatics, Telecommunication, or a related field.
Flexibility, collaboration, resourcefulness, and a positive attitude.
