Senior Data Engineer (Remote)

Simon-Kucher Core Business Services Sp. z o. o.

Warszawa, Mokotów
23,500 PLN/month
Remote

Requirements

Expected technologies

Python

SQL

Apache Spark

Kafka

AWS

Optional technologies

Spark

Flink

Microsoft Azure

Our requirements

  • 6+ years in data engineering at scale; strong Python/SQL; Spark or Flink; Parquet/columnar formats.
  • Experience with big data processing frameworks like Apache Spark and messaging systems like Kafka.
  • Orchestration (Airflow/Argo/Step Functions); IaC (Terraform/CDK) basics.
  • Data modeling for analytics/ML; partitioning, z-ordering/clustering; performance tuning.
  • Experience with probabilistic linkage/fuzzy matching (e.g., blocking, string similarity, Fellegi–Sunter-style models).
  • Proficiency with cloud-native services on AWS for data processing and storage. Knowledge of Azure is nice to have.
  • Experience with a range of SQL and NoSQL databases.
  • Security & governance (PII handling, encryption, IAM, row/column-level policies).
  • Open table formats (Iceberg/Delta/Hudi); dbt; Kafka/MSK/Kinesis; Great Expectations/Deequ.
  • Ability to analyze complex datasets and design efficient data solutions.
  • Strong ability to collaborate effectively with cross-functional teams.
  • Excellent communication skills to explain technical concepts to both technical and non-technical stakeholders.
  • Strong understanding of AI/ML concepts for effective collaboration with data scientists.

Your responsibilities

  • Develop and maintain data architecture: design and manage architectures that support high-volume, high-throughput SaaS applications, with a focus on reliability and scalability.
  • Design and implement batch/stream pipelines (CDC, API, files) with schema evolution, idempotency, and data quality gates.
  • Integrate internal and external data sources, structured and unstructured (e.g., pricing databases, market benchmarks, CRM).
  • Model core entities and features; choose storage layouts and partitioning; build reusable data products.
  • Implement entity resolution and fuzzy matching; evaluate and tune matching quality.
  • Implement ETL/ELT processes that extract, transform, and load data from multiple sources into data warehouses or data lakes for analytical use.
  • Ensure data quality and security: implement data validation and cleansing routines, along with encryption and access controls, to maintain data accuracy, privacy, and regulatory compliance.
  • Own orchestration, lineage, and observability; define SLAs and error budgets.
  • Partner with the Product Owner to translate customer needs into scalable data and ML solutions.
  • Partner with ML/MLOps on feature pipelines.
  • Work with the Cloud Platform Engineer to deploy and manage services securely.
Published: 24 days ago
Expires: in about 3 hours
Work mode: Remote