
Data Engineer (Spark) (Remote)

Addepto

Warszawa +6 more locations
21,000 PLN/month
Remote
Python
Spark
Airflow
Docker
Big Data

Requirements

Expected technologies

Python

Spark

Airflow

Docker

Big Data

Optional technologies

Java

Scala

Kubeflow

MLflow

Databricks

dbt

Kafka

Iceberg

Kubernetes

Operating system

Windows

macOS

Our requirements

  • At least 3 years of commercial experience implementing, developing, or maintaining Big Data systems, data governance and data management processes.
  • Strong programming skills in Python (or Java/Scala): writing clean code, OOP design.
  • Hands-on experience with Big Data technologies such as Spark, Cloudera Data Platform, Airflow, NiFi, Docker, Kubernetes, Iceberg, Hive, Trino, or Hudi.
  • Excellent understanding of dimensional data and data modeling techniques.
  • Experience implementing and deploying solutions in cloud environments.
  • Consulting experience with excellent communication and client-management skills, including prior direct client-facing work as a consultant.
  • Ability to work independently and take ownership of project deliverables.
  • Fluent in English (at least C1 level).
  • Bachelor’s degree in technical or mathematical studies.

Optional

  • Experience with an MLOps framework such as Kubeflow or MLflow.
  • Familiarity with Databricks, dbt or Kafka.

Your responsibilities

  • Develop and maintain a high-performance data processing platform for automotive data, ensuring scalability and reliability.
  • Design and implement data pipelines that process large volumes of data in both streaming and batch modes.
  • Optimize data workflows to ensure efficient data ingestion, processing, and storage using technologies such as Spark, Cloudera, and Airflow.
  • Work with data lake technologies (e.g., Iceberg) to manage structured and unstructured data efficiently.
  • Collaborate with cross-functional teams to understand data requirements and ensure seamless integration of data sources.
  • Monitor and troubleshoot the platform, ensuring high availability, performance, and accuracy of data processing.
  • Leverage cloud services (AWS) for infrastructure management and scaling of processing workloads.
  • Write and maintain high-quality Python (or Java/Scala) code for data processing tasks and automation.
Views: 21
Published: about a month ago
Expires: in 12 days
Work mode: Remote
