In-depth knowledge of programming languages such as Python and Scala, including their use with Spark.
Experience (3+ years) with ETL tools and lakehouse architectures, using Databricks, Apache Spark SQL, or similar.
Strong SQL skills for data manipulation and querying.
Skills in optimizing pipeline efficiency.
Hands-on experience with Agile technical practices, version control, and Agile project-management tools (Azure Repos, GitLab, Azure DevOps, Jira, Confluence, and others).
Database Knowledge: Familiarity with relational and non-relational databases (Oracle, MS SQL Server).
Agile planning skills: Kanban and Scrum release/sprint planning.
Problem-Solving: Proven ability to solve complex data engineering challenges and optimize system performance.
Good communication and interpersonal skills to collaborate effectively with team members and stakeholders.
College degree in a STEM-related discipline.
Your responsibilities
Cross-functional Team: Work within a cross-functional Agile team alongside a Product Owner, Data Architect, and Data Analysts to deliver on release goals for a global program.
Data Workload Design: Design scalable and efficient data workload architectures that integrate our transactional systems with a new lakehouse data model.
Data Integration: Develop, test, and maintain ETL data pipelines that integrate diverse data sources into a unified format, embedding best practices and standards.
Data Management: Manage and optimize data to ensure efficient storage, retrieval, and processing.
Data Quality Management: Implement automated data quality checks and ensure data integrity throughout the migration process.
Documentation: Create and maintain comprehensive documentation for data processes, ensuring knowledge transfer, observability and supportability.
Performance Monitoring: Monitor and optimize data performance to meet defined service-level agreements.
Troubleshooting: Identify and resolve data-related issues in a timely manner, collaborating with relevant teams.