Senior Data Engineer

Webellian Sp. z o.o.

Warsaw, Masovian
Hybrid
B2B
Data Engineering
SQL
PostgreSQL
Databricks
PySpark
Apache Airflow
Python
Data Governance
Data Integration
ETL

Hexjobs Insights

Role: Senior Data Engineer. Responsibilities include building data pipelines, maintaining Databricks workflows, and architecting PostgreSQL models. Requirements: 6+ years of experience, SQL, Databricks, Python. Hybrid work in Warsaw.

Your responsibilities

  • Design and build scalable data pipelines for ingestion, transformation, and serving of structured and unstructured data — supporting both batch and real-time AI workloads.
  • Develop and maintain Databricks-based data processing workflows: Delta Lake table management, PySpark transformations, notebook orchestration, and Unity Catalog governance.
  • Architect and optimise PostgreSQL data models: schema design, indexing strategies, partitioning, query performance tuning, and integration patterns for AI service consumption.
  • Build and maintain data orchestration workflows using Apache Airflow, Databricks Workflows, or equivalent — ensuring reliable scheduling, dependency management, and failure recovery (see the first sketch after this list).
  • Implement data quality frameworks: validation rules, anomaly detection, data contracts, and automated alerting on pipeline health and data freshness (see the second sketch after this list).
  • Design and manage feature engineering pipelines: transforming raw data into ML-ready feature sets, integrating with feature stores, and versioning feature definitions.
  • Own data integration patterns between operational PostgreSQL databases and the Databricks lakehouse: CDC (Change Data Capture), event-driven ingestion via Kafka, and batch export strategies.
  • Implement data governance standards: lineage tracking, cataloguing, access control, PII handling, data retention policies, and audit logging.
  • Collaborate with ML Engineers to design and deliver data pipelines supporting model training, batch inference, and real-time feature serving.
  • Monitor and operate data infrastructure: pipeline observability dashboards, SLA tracking, incident response, and root-cause analysis for data issues.
  • Champion Claude Code as an active daily tool for pipeline development, SQL generation, data exploration scripting, and documentation.
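
A first sketch of the orchestration work above: a minimal Apache Airflow DAG with daily scheduling, explicit task dependencies, and retry-based failure recovery. It is illustrative only: the dag_id, task names, and callables are hypothetical, and an Airflow 2.4+ deployment is assumed.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        ...  # e.g. pull a batch from an operational PostgreSQL database

    def transform():
        ...  # e.g. clean and reshape the batch with pandas or PySpark

    def load():
        ...  # e.g. write the result to a Delta Lake bronze table

    with DAG(
        dag_id="orders_daily",              # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",                  # reliable scheduling
        catchup=False,
        default_args={
            "retries": 3,                   # failure recovery: retry each task
            "retry_delay": timedelta(minutes=5),
        },
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Dependency management: tasks run in order; downstream tasks do not
        # run if an upstream task ultimately fails.
        extract_task >> transform_task >> load_task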
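
And a second sketch, for the data-quality alerting above: a PySpark freshness check that fails the pipeline run when data goes stale. The table path, column name, and two-hour threshold are hypothetical assumptions, and timestamps are assumed to be stored as naive UTC.

    from datetime import datetime, timedelta

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical Delta table of cleaned orders.
    orders = spark.read.format("delta").load("/lake/silver/orders")

    # Freshness rule: the newest record must be under two hours old.
    latest = orders.agg(F.max("order_ts").alias("latest_ts")).first()["latest_ts"]

    if latest is None or latest < datetime.utcnow() - timedelta(hours=2):
        # Raising makes the orchestrator mark the run as failed and alert on it.
        raise RuntimeError(f"orders table is stale; latest record: {latest}")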

Our requirements

  • 6+ years of professional data engineering experience, with a strong track record of delivering production data pipelines at scale.
  • Expert-level SQL and strong PostgreSQL expertise: advanced query optimisation, schema design, indexing, partitioning, and understanding of MVCC and connection management.
  • Strong Databricks experience: Delta Lake, PySpark, Databricks Workflows, Unity Catalog, and performance tuning of large-scale Spark jobs.
  • Proficiency in Python for data pipeline development: pandas, PySpark, data validation libraries (Great Expectations or equivalent), and scripting for automation.
  • Experience with data orchestration frameworks: Apache Airflow, Databricks Workflows, or equivalent DAG-based scheduling tools.
  • Solid understanding of data integration patterns: CDC with Debezium or equivalent, Kafka-based event streaming, and batch ingestion strategies.
  • Hands-on experience with data lakehouse architecture: medallion architecture (Bronze/Silver/Gold), Delta Lake ACID transactions, and table optimisation (see the sketch after this list).
  • Experience implementing data quality frameworks and data contracts in production pipelines.
  • Familiarity with Azure data services: Azure Data Factory, Azure Event Hubs, Azure Data Lake Storage, or equivalent cloud-native data tooling.
  • Hands-on proficiency with Claude Code: using it daily for pipeline development, SQL authoring, data exploration, and documentation tasks.
  • Strong communication skills: able to collaborate with data consumers (ML Engineers, analysts, product teams) to understand requirements and deliver reliable data products.
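
For the lakehouse requirement above, a minimal Bronze/Silver/Gold flow on Delta Lake with PySpark. This is a sketch under stated assumptions: the storage paths, column names, and aggregation are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Bronze: raw ingested events, kept as delivered.
    bronze = spark.read.format("delta").load("/lake/bronze/orders")

    # Silver: deduplicated, typed, and filtered to valid records.
    silver = (
        bronze
        .dropDuplicates(["order_id"])
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .filter(F.col("order_id").isNotNull())
    )
    silver.write.format("delta").mode("overwrite").save("/lake/silver/orders")

    # Gold: aggregated and serving-ready, e.g. for analysts or ML features.
    gold = silver.groupBy("customer_id").agg(F.count("*").alias("order_count"))
    gold.write.format("delta").mode("overwrite").save("/lake/gold/customer_order_counts")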

What we offer

  • Contract under Polish law: B2B or Umowa o Pracę (employment contract)
  • Benefits such as private medical care, group insurance, Multisport card
  • English classes available
  • Hybrid work (at least 1 day/week on-site) in Warsaw (Mokotów)
  • Opportunity to work with excellent professionals
  • High standards of work and a focus on code quality
  • New technologies in use
  • Continuous learning and growth
  • International team
  • Pinball, PlayStation & much more (on-site)

Published: 3 days ago
Expires: in 27 days
Contract type: B2B
Work mode: Hybrid
