Design and build a modern data platform on Azure for an e-commerce company, including the implementation of scalable data pipelines and a lakehouse architecture.
- Typical tasks:
- Design and implementation of ETL/ELT processes with Azure Data Factory/Databricks.
- Setup and management of data infrastructure (Data Lake, Synapse).
- Development of data models and ensuring data quality.
- Automation of deployments using infrastructure as code (IaC) with Terraform.
- Performance tuning and coordination with stakeholders.
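
As an illustration of the "ensuring data quality" task, here is a minimal sketch of a row-level quality gate in plain Python. It is only a stand-in: on this platform such checks would more likely run in PySpark on Databricks. The `QualityReport` class, the `check_quality` function, and the sample `orders` rows are hypothetical names and data chosen for the example, not part of the project.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    total: int            # number of rows inspected
    null_violations: int  # rows missing at least one required field
    duplicate_keys: int   # rows whose key was already seen earlier

    @property
    def passed(self) -> bool:
        # The batch passes only if no rule was violated.
        return self.null_violations == 0 and self.duplicate_keys == 0


def check_quality(rows, key, required):
    """Count rows with missing required fields and rows with a repeated key."""
    seen = set()
    null_violations = 0
    duplicate_keys = 0
    for row in rows:
        if any(row.get(col) is None for col in required):
            null_violations += 1
        value = row.get(key)
        if value in seen:
            duplicate_keys += 1
        else:
            seen.add(value)
    return QualityReport(len(rows), null_violations, duplicate_keys)


# Hypothetical sample of an "orders" table (illustrative data only).
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 19.99},
    {"order_id": 2, "customer_id": None, "amount": 5.00},
    {"order_id": 2, "customer_id": "c3", "amount": 7.50},
]

report = check_quality(orders, key="order_id", required=["customer_id", "amount"])
print(report, "passed:", report.passed)
```

A pipeline step could use `report.passed` to decide whether a batch is promoted to the next layer of the lake or quarantined for review.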
- Relevant technologies, tools & methods:
- Azure Data Factory, Synapse, Databricks, Data Lake Storage.
- Apache Spark, Delta Lake.
- Python (PySpark), SQL.
- Terraform, Azure DevOps.
- Typical KPIs & success metrics:
- Data latency, pipeline uptime (>99.9%), query performance, cost efficiency.
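
To make the uptime target concrete, the following sketch converts an availability percentage into a downtime budget. It assumes uptime is measured as wall-clock availability over a fixed reporting period, which is one possible definition (a run-based success rate is another). The function names and the 30-day period are assumptions made for the example.

```python
def allowed_downtime_minutes(uptime_target: float, period_days: int = 30) -> float:
    """Return the downtime budget in minutes for a given uptime target.

    uptime_target is a fraction, e.g. 0.999 for 99.9%.
    """
    if not 0.0 < uptime_target <= 1.0:
        raise ValueError("uptime_target must be in the interval (0, 1]")
    period_minutes = period_days * 24 * 60
    return period_minutes * (1.0 - uptime_target)


def meets_target(downtime_minutes: float, uptime_target: float, period_days: int = 30) -> bool:
    """Check whether the observed downtime stays within the budget."""
    return downtime_minutes <= allowed_downtime_minutes(uptime_target, period_days)


budget = allowed_downtime_minutes(0.999, period_days=30)
print(f"Downtime budget at 99.9% over 30 days: {budget:.1f} minutes")
print("30 minutes of downtime within budget:", meets_target(30, 0.999))
print("60 minutes of downtime within budget:", meets_target(60, 0.999))
```

Under these assumptions, 99.9% over 30 days leaves roughly 43 minutes of tolerated downtime, so a single long-running failed load can consume the whole monthly budget.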
- Key challenges & risks:
- High data volumes, GDPR compliance, avoiding data silos.
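
One common measure in support of GDPR compliance is to pseudonymize direct identifiers before they reach analytical layers. The sketch below uses a keyed hash (HMAC-SHA256) from the Python standard library. It is an illustration, not the project's actual method: the `pseudonymize` function and the demo key are hypothetical, and in practice the key would be held in a secret store rather than in code. Pseudonymized data generally still counts as personal data, so this reduces exposure but does not replace other controls.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a deterministic keyed hash of an identifier.

    The value is trimmed and lower-cased first so that trivially different
    spellings of the same identifier map to the same token.
    """
    normalized = value.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Demo key for illustration only; never hard-code a real key.
demo_key = b"demo-key-do-not-use-in-production"

token_a = pseudonymize("Alice@example.com", demo_key)
token_b = pseudonymize(" alice@example.com ", demo_key)
token_c = pseudonymize("bob@example.com", demo_key)

print("Same customer, same token:", token_a == token_b)
print("Different customer, different token:", token_a != token_c)
```

Because the mapping is deterministic for a given key, tables can still be joined on the token, which helps avoid creating separate silos for protected and unprotected data.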
- Deliverables:
- Production-ready data pipelines, structured data lake/warehouse, IaC scripts, technical documentation.