Work Experience
Lead Data Engineer
Blueforte GmbH
Sep 2023 - Present
Leading the design and implementation of scalable data and MLOps architectures for enterprise clients.
- Optimized ML pre-processing workloads with an event-driven architecture on Azure with Databricks, PySpark, and EventHub, reducing pipeline runtimes from 2.5 hours to less than 1 hour.
- Built and hosted ML & GenAI apps with LangChain, LangGraph, and React/Node.js.
- Implemented robust CI/CD pipelines with automated testing in Azure DevOps for Python packages, enforcing software engineering best practices and improving team velocity.
- Developed a custom VS Code plugin for local Python/PySpark job debugging, shortening development cycles for the whole team.
- Built a custom monitoring & alerting solution with Python & Grafana on Azure Cloud.
DevOps, PySpark, Bash, EventHub, Databricks, Terraform, Azure, Python, LangChain, TypeScript
Machine Learning Engineer
Lidl Digital
Mar 2021 - Aug 2023
Developed and productionized ML models for forecasting, pricing, and personalization.
- Trained and fine-tuned ML models with XGBoost, scikit-learn, and TensorFlow/Keras.
- Productionized and served ML models (forecasting, fraud detection, recommendation, search) with FastAPI, Docker, and Helm on Kubernetes in Google Cloud.
- Set up PySpark pipelines for ML training workloads and feature stores in Databricks and MLflow.
Python, FastAPI, PySpark, Databricks, Kubernetes, TensorFlow, scikit-learn, Airflow, MLflow, Docker, Helm
Data Engineer
Olympus Europa SE & Co. KG
Feb 2018 - Feb 2021
Developed data integration workflows and reporting pipelines for European operations.
- Developed ETL/ELT pipelines with Python, Airflow, and BigQuery.
- Wrote web apps with Django and Flask for internal use.
- Set up data modelling in the DWH (Data Vault 2.0).
- Created CI/CD pipelines (GitLab functions) and managed the Airflow Kubernetes cluster.
- Managed security and networks on Google Cloud (firewall, virtual private cloud, load balancer).
Python, Airflow, BigQuery, Google Cloud, Data Vault