Position: Data Engineer
Overview of the Role
An established organisation undergoing a data transformation is seeking a skilled Data Engineer to join its growing data team. The role plays a key part in building and deploying modern data solutions on Azure Databricks, enabling faster, better-informed business decisions.
You'll work hands-on with Azure Databricks, Azure Data Factory, Delta Lake, and Power BI to design scalable data pipelines, implement efficient data models, and ensure high-quality data delivery. This is a great opportunity to shape the future of data within the organisation while working with advanced cloud technologies.
Key Responsibilities and Deliverables
Design, develop, and optimise end-to-end data pipelines (batch & streaming) using Azure Databricks, Spark, and Delta Lake.
Implement the Medallion Architecture (Bronze, Silver, and Gold layers) to structure raw, enriched, and curated data efficiently; an illustrative sketch follows this list.
Build scalable ETL/ELT processes with Azure Data Factory and PySpark.
Support data governance initiatives using tools like Azure Purview and Unity Catalog for metadata management, lineage, and access control.
Ensure consistency, accuracy, and reliability across data pipelines.
Collaborate with analysts to validate and refine datasets for reporting.
Apply DevOps and CI/CD best practices (Git, Azure DevOps) for automated testing and deployment.
Optimise Spark jobs, Delta Lake tables, and SQL queries for performance and cost-effectiveness.
Troubleshoot and proactively resolve data pipeline issues.
Partner with data architects, analysts, and business teams to deliver end-to-end data solutions.
Stay current with emerging data technologies (e.g., Kafka/Event Hubs for streaming, Knowledge Graphs).
Promote best practices in data engineering across the team.
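For candidates less familiar with the layered flow described above, here is a minimal, purely illustrative PySpark/Delta Lake sketch of a Medallion-style pipeline on Databricks. All table names, paths, and columns (orders_bronze, /mnt/landing/orders/, order_amount, and so on) are hypothetical placeholders, not part of the organisation's actual platform.

    # Hypothetical Medallion-style flow; names and paths are illustrative only.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

    # Bronze: land raw source data as-is, with ingestion metadata.
    raw = spark.read.json("/mnt/landing/orders/")  # hypothetical landing path
    (raw.withColumn("_ingested_at", F.current_timestamp())
        .write.format("delta").mode("append")
        .saveAsTable("orders_bronze"))

    # Silver: clean and conform the raw records.
    bronze = spark.read.table("orders_bronze")
    silver = (bronze
              .dropDuplicates(["order_id"])
              .filter(F.col("order_amount").isNotNull())
              .withColumn("order_date", F.to_date("order_timestamp")))
    silver.write.format("delta").mode("overwrite").saveAsTable("orders_silver")

    # Gold: curated, business-ready aggregates for reporting (e.g. Power BI).
    gold = (spark.read.table("orders_silver")
            .groupBy("order_date")
            .agg(F.sum("order_amount").alias("daily_revenue")))
    gold.write.format("delta").mode("overwrite").saveAsTable("orders_gold_daily_revenue")

In practice the same pattern extends to streaming ingestion and to the governance and data-quality responsibilities listed above.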
Skills and Experience
Hands-on experience with Azure Databricks, Delta Lake, Data Factory, and Synapse.
Strong understanding of Lakehouse architecture and the Medallion design pattern.
Proficient in Python, PySpark, and SQL, with advanced query optimisation skills.
Proven experience building scalable ETL pipelines and managing data transformations.
Familiarity with data quality frameworks and monitoring tools.
Experience working with Git, CI/CD pipelines, and in Agile environments.
Ability to write clean, maintainable, and well-documented code.
Exposure to Power BI or similar data visualisation tools.
Knowledge of IoT data pipelines is a plus.