Senior Data Engineer (AI Consumer Intelligence Platform) (Remote)

Kratos Growth

New York
💼 B2B
🐍 Python
PySpark
SQL
☁️ Azure
📊 Databricks
🚢 Kubernetes
NLP
🧠 ML
📊 DataOps
ETL/ELT

Summary

Senior Data Engineer role focused on designing data pipelines and infrastructure for NLP/ML systems. Requires extensive data engineering experience and strong command of technologies such as Python and Azure.

Keywords

Python, PySpark, SQL, Azure, Databricks, Kubernetes, NLP, ML, DataOps, ETL/ELT

Benefits

  • Fully remote work
  • Performance bonus
  • Equity opportunities
  • Work with cutting-edge NLP/ML technologies
  • Path to technical leadership

Job description

Our client is hiring Data Engineers. Join a rapidly growing AI Consumer Intelligence Platform delivering insights for the world's biggest brands.

Hiring Company Background

We're an AI-powered consumer intelligence platform that processes 50+ billion data points monthly (Google searches, social conversations, product reviews, and videos) to deliver actionable consumer insights for Fortune 500 brands in days instead of months. Our clients include global leaders in beverages, personal care, and consumer packaged goods.

The Role

As a Senior Data Engineer, you'll architect and scale production data pipelines that power our NLP and ML systems processing billions of multilingual data points daily. Reporting to our newly appointed CTO, you'll own the complete data lifecycle, from ingestion and transformation through deployment and observability, while defining infrastructure standards for a growing engineering team.

This is a high-ownership role at an early stage: no legacy code politics and no entrenched hierarchies. You'll convert MVPs into scalable products, establish DataOps/DevOps standards, and design governance mechanisms that prevent technical debt. Your architectural decisions will directly impact how Fortune 500 companies access real-time consumer intelligence.

Tech Stack

  • Core: Python, PySpark, SQL
  • Cloud & Infrastructure: Azure ecosystem, Databricks
  • Deployment: Kubernetes, containerization, observability tooling
  • NLP/ML: Large Language Models, LLM APIs, Spacy/NLTK/CoreNLP/TextBlob
  • Data: Robust pipelines for multi-language text at scale

What You'll Do

  • Design, build, and maintain production data pipelines processing 10M+ text records daily across multiple languages
  • Architect scalable NLP data infrastructure using PySpark, Databricks, and Azure services
  • Integrate Large Language Model APIs into production pipelines for text analysis and enrichment (an illustrative sketch follows the qualifications below)
  • Establish DataOps standards including CI/CD, testing frameworks, and deployment automation
  • Implement observability and alerting for pipeline health, data quality, and system performance (see the health-check sketch at the end of this description)
  • Collaborate with data scientists to productionize ML models and NLP systems
  • Define data governance frameworks and quality SLAs for enterprise client delivery
  • Mentor team members and contribute to technical hiring as the team scales

Required Qualifications

Experience

  • 5+ years building and maintaining production ETL/ELT data pipelines
  • 2+ years working with text/NLP data (tokenization, embeddings, multilingual processing)
  • 3+ years of data products shipped to production that serve active business users

Technical Skills

  • Python: 4+ years in production environments (required)
  • PySpark: 2+ years (or 1+ years of Spark combined with 4+ years of strong Python)
  • SQL: 3+ years including complex queries and performance optimization
  • Databricks: 1+ years production use (notebooks, Delta Lake, job scheduling)
  • Cloud Platform: 2+ years with Azure (preferred) or equivalent AWS/GCP experience
  • Containers/Kubernetes: Experience deploying containerized applications to Kubernetes

Education

Bachelor's degree in Computer Science, Data Science, Engineering, or related quantitative field.
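For illustration only, the sketch below shows the general shape of the multilingual ingestion-and-enrichment work described under "What You'll Do", assuming a Databricks/PySpark environment with Delta tables. The paths, column names, and the enrich_with_llm stub are hypothetical placeholders, not the platform's actual code.

```python
# Minimal sketch of a multilingual text pipeline with an LLM-enrichment step.
# Assumes a Databricks/PySpark environment where the "delta" format is available;
# all paths, columns, and the enrich_with_llm helper are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("multilingual-text-enrichment").getOrCreate()

# Ingest raw multilingual text records (source path is an assumption).
raw = (
    spark.read.format("delta")
    .load("/mnt/raw/consumer_text")  # e.g. reviews, social posts, search queries
    .select("record_id", "language", "text")
)

# Basic normalization: trim whitespace and drop empty records.
clean = (
    raw.withColumn("text", F.trim(F.col("text")))
       .filter(F.length("text") > 0)
)

def enrich_with_llm(text: str) -> str:
    """Placeholder for an LLM API call (e.g. sentiment or topic extraction).
    A production version would batch requests and handle rate limits/retries."""
    return "neutral"  # stub result; the real API call is omitted here

enrich_udf = F.udf(enrich_with_llm, StringType())

# Enrich each record and persist to a curated Delta table for downstream ML.
enriched = clean.withColumn("llm_label", enrich_udf(F.col("text")))
(
    enriched.write.format("delta")
    .mode("append")
    .save("/mnt/curated/consumer_text_enriched")
)
```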
Preferred Qualifications

  • 3+ years Databricks experience, including Delta Lake architecture and Unity Catalog (strongly preferred)
  • Azure ecosystem depth: Data Factory, Databricks, Blob Storage, DevOps
  • LLM integration experience: OpenAI, Anthropic, or Azure OpenAI API integration in production
  • LLM fine-tuning experience
  • Experience with observability tools: DataDog, Grafana, or Azure Monitor
  • Processing experience at scale: 1B+ records
  • Multilingual text processing: 3+ non-English languages with Unicode and tokenization handling
  • NLP libraries: Spacy, NLTK, CoreNLP, TextBlob (familiarity)
  • Consumer insights, CPG/FMCG, or advertising technology experience
  • Experience at early-stage, high-growth companies

What We Offer

  • Competitive compensation (with performance bonus and equity opportunities)
  • Fully remote (4+ hour overlap with U.S. Eastern Time Zone desired)
  • Modern stack: Work with cutting-edge NLP/ML technologies at scale
  • High ownership and impact: Directly shape architecture decisions for a platform serving Fortune 500 clients
  • Growth trajectory: Join as a foundational team member with a clear path to technical leadership
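To ground the observability and data-quality responsibilities listed under "What You'll Do", here is a second minimal sketch of a pipeline health check, again assuming PySpark with Delta tables. The table path, thresholds, and the send_alert helper are assumptions; a production setup would route alerts to a tool such as DataDog, Grafana, or Azure Monitor rather than printing.

```python
# Minimal sketch of a pipeline health / data-quality check with an alerting hook.
# The curated table path, the 5% threshold, and send_alert are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("pipeline-health-check").getOrCreate()

def send_alert(message: str) -> None:
    """Placeholder alert hook; a real pipeline would post to a monitoring tool or webhook."""
    print(f"ALERT: {message}")

enriched = spark.read.format("delta").load("/mnt/curated/consumer_text_enriched")

# Freshness / volume check: did the latest batch land at all?
row_count = enriched.count()
if row_count == 0:
    send_alert("No records found in curated consumer_text_enriched table")

# Null-rate check on a critical enrichment column, with an assumed 5% threshold.
null_rate = (
    enriched.select(F.avg(F.col("llm_label").isNull().cast("int")).alias("null_rate"))
    .first()["null_rate"]
)
if null_rate is not None and null_rate > 0.05:
    send_alert(f"llm_label null rate {null_rate:.1%} exceeds 5% threshold")
```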


Published: 6 days ago
Expires: in 24 days
Contract type: B2B
