5+ years of proven commercial experience in implementing, developing, or maintaining Big Data systems.
Strong programming skills in Python or Java/Scala: writing clean code, OOP design.
Experience in designing and implementing data governance and data management processes.
Familiarity with Big Data technologies such as Spark, Cloudera, Airflow, NiFi, Docker, Kubernetes, Iceberg, Trino, or Hudi.
Proven expertise in implementing and deploying solutions in cloud environments (with a preference for AWS).
Excellent understanding of dimensional modeling and other data modeling techniques.
Excellent communication skills and consulting experience involving direct client interaction.
Ability to work independently and take ownership of project deliverables.
Master’s or Ph.D. in Computer Science, Data Science, Mathematics, Physics, or a related field.
Fluent English (C1 level) is a must.
Your responsibilities
Design and develop scalable data management architectures, infrastructure, and platform solutions for streaming and batch processing using Big Data technologies such as Apache Spark, Hadoop, and Iceberg.
Design and implement data management and data governance processes and best practices.
Contribute to the development of CI/CD and MLOps processes.
Develop applications to aggregate, process, and analyze data from diverse sources.
Collaborate with the Data Science team on data analysis and Machine Learning projects, including text/image analysis and predictive model building.
Develop and organize data transformations using dbt and Apache Airflow.
Translate business requirements into technical solutions and ensure optimal performance and quality.