Python, SQL, Scala, Terraform, Apache Airflow, Apache Kafka, dbt, PySpark, Databricks, GCP, Azure, AWS, BigQuery, Snowflake, Delta Lake, Docker, Kubernetes, GitHub Actions, PostgreSQL, Neo4j, MongoDB, Cassandra, Power BI, Grafana, FastAPI, Streamlit, Dataiku, Claude Code, GitHub Copilot
👋 Hello there!

I'm Ganpat Patel, Data Engineer.

Designing scalable data systems that transform raw data into meaningful decisions. Based in Paris 🇫🇷

Data Engineer Associate🧱
Data Engineer Professional🧱
Azure Fundamentals☁️
Azure Data Fundamentals🗄️
Ganpat Patel
3.5+ Years Experience
📍 Paris, France
Open to relocation

01 — About

Building the backbone of data-driven decisions.

I am Ganpat Patel, a Data Engineer based in Paris. I design and build robust, scalable data platforms: real-time ingestion from diverse sources, workflows orchestrated with Airflow, distributed processing with PySpark and Databricks, and cloud deployments across GCP, Azure and AWS.

Recently at Deezer, I built a CI/CD migration framework that moved pipelines from Jenkins to GitHub Actions, auditing and redesigning existing workflows along the way. In parallel, I worked on a GCP-based monitoring solution to improve FinOps visibility and cloud cost tracking. Before that, at Mu Sigma, I led a team managing 40+ Azure Data Factory pipelines and production data assets, drove a data integrity initiative that recalibrated 150+ Hive tables to resolve historical data corruption, and developed a Neo4j graph analytics MVP to identify disease trends in pet healthcare.

3.5+ Years of Experience
4+ Certifications
170+ Production Pipelines Managed
2000+ Production Tables Maintained
150+ Data Assets Restored
3+ Monitoring Solutions Built

Spoken Languages

🇬🇧 English: B2 (IELTS)
🇫🇷 French: B1 (TCF)
🇮🇳 Hindi: C2 (Native)
🇮🇳 Rajasthani: C2 (Native)
🇮🇳 Kannada: A2 (Spoken only)

Beyond the Data
Interests & Passions

🗺️ Geography
🌍 Geopolitics
📜 History
✈️ Traveling
🏏 Cricket
🤯 Paradox Theories

02 — Skills

My Skills

Technical Stack

Python (Language)
SQL (Language)
Scala (Language)
Terraform (IaC)
Apache Airflow (Orchestration)
Apache Kafka (Streaming)
dbt (Transformation)
PySpark (Processing)
Databricks (Platform)
Delta Lake (Storage)
GCP (Cloud)
Azure (Cloud)
AWS (Cloud)
BigQuery (Warehouse)
Snowflake (Warehouse)
Docker (Infrastructure)
Kubernetes (Infrastructure)
GitHub Actions (DevOps)
PostgreSQL (Database)
Neo4j (Graph DB)
MongoDB (Database)
Cassandra (Database)
Power BI (Visualisation)
Grafana (Monitoring)
Dataiku (Platform)
FastAPI (API)
Streamlit (App)
Claude Code (Platform)
GitHub Copilot (Platform)

Soft Skills

Problem Solving (Execution)
Critical Thinking (Strategy)
Storytelling (Communication)
Collaboration (Teamwork)
Ideation (Creativity)
Adaptability (Execution)
Stakeholder Communication (Communication)
Ownership (Leadership)
Prioritization (Execution)
Leadership (Leadership)
Time Management (Execution)
Decision-Making (Strategy)
Attention to Detail (Execution)
Business Acumen (Strategy)

03 — Certifications

Verified Credentials


04 — Experience

Where I've Worked

Experience

Deezer S.A.

Data Engineer Intern

📍 Paris, Île-de-France, France

Sep 2025 – Feb 2026
Designed and maintained reliable real-time data pipelines feeding downstream systems for reporting and analytics.
Managed scalable GCP infrastructure using Terraform to keep deployments reproducible and consistent across environments.
Improved the robustness and monitoring of Airflow workflows under production constraints to maintain system reliability.
Contributed to the Syslog-to-Kafka migration through code refactoring, bug fixing, and hackathon collaboration.
Orchestration, SQL, Airflow, DevOps, GCP, Terraform, Python

Mu Sigma

Trainee Decision Scientist-3

📍 Bengaluru, Karnataka, India

Jan 2024 – Jul 2024
Led daily ETL processes and maintained the availability of 1500+ Hive assets consumed for reporting and insight generation.
Monitored and managed 40+ Azure Data Factory pipelines, resolving failures to keep critical workflows on track.
Coordinated with upstream source teams and business stakeholders to ensure timely delivery and data integrity.
Drove enhancements to existing pipeline structures and continuous ETL improvements for better operational efficiency.
Orchestration, PyCharm, SQL, Azure DevOps Server, DevOps, Microsoft Azure, Azure Data Factory, PySpark, Azure Data Lake, ServiceNow, Git, Python

Mu Sigma

Trainee Decision Scientist-2

📍 Bengaluru, Karnataka, India

Oct 2022 – Jan 2024
Supported data-driven decision-making across multiple engagements using analytical tools and engineering workflows.
Managed ETL refresh processes and solved downstream reporting discrepancies to ensure on-time data availability.
Recalibrated 150+ historical data assets to fix corruption caused by backfills without proper upsert handling.
Delivered a Neo4j graph analytics proof of concept and analytical reports that surfaced actionable client insights.
PyCharm, React, DevOps, Azure Databricks, Power BI, Azure Data Factory, Data Analysis, PySpark, Azure Data Lake, Neo4j, Git, Advanced Excel

Tata Consultancy Services

Assistant System Engineer

📍 Bengaluru, Karnataka, India

Aug 2021 – Sep 2022
Worked with a major European energy organization on IT Service Management using the ServiceNow platform.
Handled incidents, changes, requests, reports, knowledge, service catalog, flows, workflows, notifications, and scripting.
Configured lists, forms, filters, import sets, transform maps, plugins, and related development activities.
Strengthened platform expertise by earning the ServiceNow Certified System Administrator (CSA) certification during the role.
ServiceNow Administration, Scripting, IT Service Management, Service Desk, ServiceNow
Internship

Data Science Intern

Exposys Data Labs

📍 Bengaluru, Karnataka, India

Jul 2020 – Aug 2020

Focused on customer segmentation using K-means clustering to analyze customer behavior and build practical, business-facing ML solutions. The internship provided hands-on experience in data analysis, pattern discovery, and real-world machine learning application.

Churn Management, Data Science, Data Analysis, Python
Internship

Data Science Intern

Great Learning

📍 India

Jun 2020 – Aug 2020

Completed a 9-week program with 7 weeks of training and 2 weeks of project work, delivering 2 mini-projects and 1 major project. The experience strengthened practical problem-solving, teamwork, and deadline-driven delivery in data science.

Data Science, Data Cleaning, Exploratory Data Analysis, Python
Internship

Machine Learning Intern

Verzeo

📍 India

Jul 2019 – Aug 2019

Completed a 2-month machine learning internship supported by the Microsoft Technology Associate program, working on predictive models, NLP use cases, and deep learning tasks. The experience built a strong foundation in preprocessing, model selection, and evaluation.

Machine Learning, Natural Language Processing, Python
Education

Master's in Computer Science

Data Science & Analytics

EPITA

📍 Paris, France

Sep 2024 – Apr 2026

Completed an MSc in Computer Science with a specialization in Data Science and Analytics. The program built depth across data analysis, machine learning, deep learning, and big data systems through hands-on projects focused on large-scale datasets, predictive models, and production-ready data solutions.

Data Analysis, Machine Learning, Deep Learning, Big Data

MBA in Business Analytics

Executive Programme

Manipal Academy of Higher Education

📍 Manipal, India

Jun 2023 – Aug 2025

Completed an MBA in Business Analytics with structured training in business strategy and data-driven decision making. Built a strong foundation in analytics, big data, digital and web analytics, predictive modeling, and data visualization through industry-oriented case studies and practical projects.

Grade: 9.21
Python, Business Analytics, Data Science

Bachelor of Engineering

Computer Science & Engineering

Visvesvaraya Technological University

📍 Belagavi, India

Aug 2017 – Aug 2021

Completed a BE in Computer Science and Engineering with a strong foundation in algorithms, data structures, programming, software engineering, and database management. Projects and coursework strengthened problem-solving skills and prepared me for real-world software and data challenges.

Grade: 7.92 CGPA
Code Troopers, NSS, Eco Club, Class Committee
Algorithms, Data Structures, Software Engineering, Databases

05 — Projects

What I've Built

RECENT

Feb 2025 – Present

The Flight Detective

Streamlit app predicting flight prices with Airflow automating data ingestion and scheduled prediction jobs. Integrated Great Expectations for data validation and Grafana for real-time monitoring.

Streamlit, FastAPI, Airflow, PostgreSQL, Grafana, Great Expectations
RECENT

Apr 2025 – Present

Fake News Detection — Deep Learning & LLMs

Evaluating deep learning models (LSTM, GRU, TextCNN) for fake news classification, and using real-time web data with transformer-based language models (RoBERTa) to improve classification accuracy.

DL, LLMs, RoBERTa, LSTM, GRU, NLP

May 2025

Optimized Exhibition Opening

Optimised frameglass pairing and ordering using greedy algorithms with MinHash and Locality-Sensitive Hashing (LSH). Enhanced tag diversity using a disjoint-tags technique.

Greedy Algorithm, MinHash, LSH, Python, Optimisation

2022 – 2024

Azure ADF Pipeline Suite

40+ production pipelines managing 1500+ Hive data assets at Mu Sigma. Built monitoring, alerting and ETL discrepancy resolution — improved success rate by 6%.

Azure ADF, PySpark, SQL, Databricks, Python

2023

Neo4j Knowledge Graph POC

Graph analytics MVP modelling disease trends in pet healthcare for a Fortune 500 analytics client. Built using Neo4j and Cypher queries for relationship analysis.

Neo4j, Cypher, Python, Graph Analytics

2023

Hash Value Recalibration

2-phase data integrity project recalibrating 150+ Hive data tables to resolve historical data corruption. Coordinated with cross-functional stakeholders for validation.

ETL, Python, Hive, Azure

2021

TapShip

Platform connecting farmers to national markets — end-to-end from logistics data ingestion to delivery tracking. Built during university as a full-stack project.

FastAPI, PostgreSQL, JavaScript

2021

LaLiga Stats Engine

Statistical analysis pipeline scraping, transforming and visualising La Liga match data across multiple seasons using Python and SQL.

Scrapy, SQL, Python, Tableau

Let’s connect to build something meaningful.

Open to full-time Data Engineering opportunities across Europe. Based in Paris, with flexibility for remote work and relocation.

ganpat.patel.012@gmail.com
LinkedIn ↗
GitHub ↗
+33 7 45 36 97 27

© 2026 Ganpat Patel