Jan Krol
Data Expert
Experience
Data Expert
Manufacturing
Data Expert
Intralogistics
- Provided consulting and implementation of AWS infrastructure to support global process operations in Transport & Logistics
- Provisioned and operated servers, OS environments, and databases in AWS
- Identified and presented optimization opportunities in both commercial and technical terms
- Administered and maintained the provisioned systems
- Developed maintenance and monitoring concepts
- Advised development projects on system use, configuration, and optimization
- Consulted on architectures and operational concepts using AWS Cloud
- Trained internal employees on new AWS services and working methods
Services: AWS Glue, Redshift, EMR, SageMaker, Python
Data Expert
Logistics
- Developed and implemented a standardized big data architecture for group-wide platform services in the Transport & Logistics sector on Azure
- Automated solutions using Infrastructure as Code (Terraform, Ansible)
- Presented and discussed sub-project architectures on Azure
- Implemented real-time data streaming with Apache Kafka and monitoring solutions
- Advised on Azure platform strategy and reference architectures
- Developed mechanisms for proactive elimination of vulnerabilities in Azure and Kubernetes clusters
- Conceptualized container orchestration platforms with Kubernetes CI/CD
- Created user and authorization concepts according to group specifications
- Managed operational services within an agile team
Services: Azure Purview, Azure Synapse Analytics, Azure Data Factory, Azure Databricks, Terraform, GitLab Runner, Azure DevOps
Data Expert
E-Commerce
- Strategically developed and migrated analytics data pipelines into a Data Lakehouse architecture on AWS
- Enhanced the Big Data Lake environment and ensured stringent data quality and GDPR compliance
- Supported exploratory analysis and algorithm development by provisioning and preparing data (AWS Glue, Spark, Lambda)
- Developed ETL jobs and data pipelines to provide ready-to-consume data sources (AWS Glue, Redshift, Spark, PySpark)
- Conducted regression testing and quality checks in data pipelines and the data lake
- Implemented high-performance streaming data processing with Kinesis, Kafka, and Lambda
- Orchestrated and connected multiple data sources
- Automated deployments using DevOps best practices (CodeBuild, CodePipeline, GitHub Actions)
- Built infrastructure with IaC (AWS CDK)
- Monitored data quality, compliance, and costs
Services: AWS Glue, Kinesis, Kafka, Apache Spark, Data Catalog, S3, Athena, Redshift, Lambda, ECS, Step Functions
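The quality checks in these pipelines can be sketched as record-level cleaning logic. In a Glue/PySpark job this would normally run inside a DataFrame transformation or UDF; it is shown here as plain Python so the sketch stands alone, and all field names are hypothetical:

```python
from datetime import datetime, timezone
from typing import Optional

def clean_order_record(raw: dict) -> Optional[dict]:
    """Normalize one raw event before it lands in the lake.

    Records failing basic quality checks return None, mirroring a
    drop-and-log pattern. Field names are illustrative only.
    """
    # Reject records with missing required fields.
    if not all(raw.get(k) for k in ("order_id", "amount", "ts")):
        return None
    try:
        amount = round(float(raw["amount"]), 2)
        ts = datetime.fromtimestamp(int(raw["ts"]), tz=timezone.utc)
    except (TypeError, ValueError):
        return None
    return {
        "order_id": str(raw["order_id"]).strip(),
        "amount": amount,
        "event_date": ts.date().isoformat(),  # partition key in the lake
    }
```

Validating and normalizing at ingestion keeps the lake queryable with a stable schema and makes GDPR-relevant fields easy to track per partition.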
Data Expert
E-Commerce
- Guided internal e-commerce product teams in developing, implementing, and maintaining high-performance data processing and integration systems
- Migrated existing data services, pipelines, and assets to a new event-based serverless architecture
- Developed and executed Lambda functions and PySpark jobs
- Designed architecture and integration with Kafka for real-time processing and analysis of event data
- Implemented PySpark transformations, filtering, and aggregations
- Ensured efficient and reliable connection with Kafka, configured security settings, and integrated with other components
- Established extensive testing and monitoring mechanisms
- Delivered a high-performance, scalable event system enabling data-driven decision-making
Services: AWS Glue, Apache Spark, Data Catalog, S3, Athena, Redshift, Lambda, ECS, Step Functions
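A minimal sketch of the event-based serverless pattern above: a Lambda handler consuming a Kafka batch delivered by an event-source mapping. The `aws:kafka` event shape (records grouped per topic-partition, values base64-encoded) is AWS's documented format; the `order_created` filter is a hypothetical example:

```python
import base64
import json

def handler(event, context):
    """Decode Kafka records delivered by a Lambda event-source mapping.

    The mapping batches records per topic-partition under event["records"]
    and base64-encodes each record value.
    """
    processed = []
    for _partition, records in event.get("records", {}).items():
        for record in records:
            payload = json.loads(base64.b64decode(record["value"]))
            # Hypothetical filter: keep only the event type downstream
            # consumers care about.
            if payload.get("type") == "order_created":
                processed.append(payload)
    return {"batch_size": len(processed), "items": processed}
```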
Data Expert
Transport & Logistics
- Integrated logistics data streams with Event Hub and Kafka using PySpark Structured Streaming
- Designed and implemented a pipeline for capturing, processing, and forwarding data streams
- Utilized PySpark Structured Streaming for efficient real-time data processing
- Configured and initialized PySpark streaming jobs and defined necessary data structures
- Conducted comprehensive testing and monitoring to ensure smooth data transmission and high data quality
- Enabled robust and efficient integration of logistics data streams with Event Hubs
- Delivered real-time utilization of logistics data for analysis and further processing
Services: Azure Synapse Analytics, Purview Data Catalog, Apache Spark, Event Hub, Structured Streaming, GraphFrame, Azure Storage v2, Power BI
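Reading Event Hubs through its Kafka-compatible endpoint, as in the streaming integration above, comes down to the source options passed to Spark. A sketch of a helper that builds them (Event Hubs exposes Kafka on port 9093 with SASL_SSL/PLAIN, the literal username `$ConnectionString`, and the connection string as password); namespace and topic names are hypothetical:

```python
def eventhubs_kafka_options(namespace: str, connection_string: str, topic: str) -> dict:
    """Build options for spark.readStream.format("kafka") against the
    Kafka-compatible endpoint of an Azure Event Hubs namespace."""
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{connection_string}";'
    )
    return {
        "kafka.bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        "subscribe": topic,
        "startingOffsets": "latest",
    }

# In the streaming job (requires a SparkSession) the options feed the
# Kafka source roughly like this:
#   df = (spark.readStream.format("kafka")
#         .options(**eventhubs_kafka_options("my-ns", conn_str, "logistics-events"))
#         .load())
```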
Data Expert
Transport & Logistics
- Spearheaded development of a robust data strategy and governance framework to streamline and enhance data handling capabilities
- Constructed a sophisticated data management platform on Databricks
- Designed and implemented an efficient data hub ingestion platform
- Led the design and establishment of an organization-wide data strategy aligned with business goals
- Developed a comprehensive data governance framework ensuring data accuracy, privacy, and compliance
- Oversaw deployment and customization of the data management platform on Databricks
- Enhanced data processing, analysis, and reporting capabilities with Power BI
- Engineered a robust data hub with advanced ingestion pipelines based on AWS EventBridge
- Optimized data flow from diverse sources to centralized storage systems (Data Lake House on Azure)
- Collaborated with cross-functional teams to integrate the data management platform with existing IT infrastructure
- Conducted training sessions and workshops to foster a data-driven culture and enhance data literacy
Services: Azure Databricks, Databricks Data Catalog, AWS EventBridge, Kinesis, Event Hub, Structured Streaming, Apache Spark
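An EventBridge-based ingestion hub like the one above typically shapes incoming records into `PutEvents` entries. A sketch of that step, kept as pure functions so it is testable without AWS; the source name and bus name are hypothetical, and the 10-entries-per-call limit is EventBridge's documented cap:

```python
import json
from typing import List

def to_eventbridge_entries(records: List[dict], bus_name: str) -> List[dict]:
    """Shape raw ingestion records into PutEvents entries.

    EventBridge requires Detail to be a JSON string; Source and
    DetailType values here are illustrative.
    """
    return [
        {
            "Source": "datahub.ingest",  # hypothetical source name
            "DetailType": rec.get("kind", "record"),
            "Detail": json.dumps(rec),
            "EventBusName": bus_name,
        }
        for rec in records
    ]

def chunked(entries, size=10):
    """PutEvents accepts at most 10 entries per request."""
    for i in range(0, len(entries), size):
        yield entries[i:i + size]

# With boto3 the batches would be sent as:
#   client = boto3.client("events")
#   for batch in chunked(entries):
#       client.put_events(Entries=batch)
```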
Data Expert
Transport & Logistics
- Served as the technical lead managing a team of 3 offshore developers while implementing scalable and robust data solutions in Azure Databricks
- Introduced Delta Live Tables for schema and table management
- Implemented Databricks Asset Bundles following an Infrastructure as Code approach
- Designed and refined the medallion data architecture to optimize data processing workflows
- Collaborated closely with multiple business units to ensure data solutions met their specific requirements
- Established coding standards and best practices for the development team
- Conducted code reviews and provided technical guidance
- Facilitated knowledge transfer and technical upskilling sessions
- Developed scalable ETL pipelines in Azure Databricks
- Created optimized data storage solutions with future scalability in mind
- Established a complete IaC workflow for data platform components
- Integrated version control and CI/CD for Databricks Asset Bundles
- Automated deployment of table schemas, jobs, and notebooks
- Implemented environment promotion strategies (Dev/Test/Prod)
- Managed configuration for cross-environment consistency
Services: Azure Databricks, Delta Live Tables, Databricks Asset Bundles, Azure Data Factory, Delta Lake, Spark SQL, Azure Key Vault, Azure Storage, Power BI
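The Dev/Test/Prod promotion strategy above can be sketched as base settings merged with per-environment overrides, mirroring the `targets` blocks of a Databricks Asset Bundle (`databricks.yml`). Catalog names and worker counts here are hypothetical:

```python
BASE = {"catalog": "logistics", "cluster_workers": 2}

OVERRIDES = {
    "dev":  {"catalog": "logistics_dev", "cluster_workers": 1},
    "test": {"catalog": "logistics_test"},
    "prod": {"cluster_workers": 8},
}

def resolve_config(env: str) -> dict:
    """Merge base settings with per-environment overrides so all
    environments stay consistent except where explicitly overridden."""
    if env not in OVERRIDES:
        raise ValueError(f"unknown environment: {env}")
    return {**BASE, **OVERRIDES[env]}
```

Keeping one base plus small overrides is what makes cross-environment consistency manageable: a setting changed in one place propagates everywhere unless a target deliberately pins it.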
Industries Experience
Experienced in Transportation (4 years), Retail (2 years), and Manufacturing (1.5 years).
Business Areas Experience
Experienced in Business Intelligence (7.5 years), Information Technology (5.5 years), and Quality Assurance (0.5 years).
Summary
Big Data Specialist with a focus on Big Data, Cloud Architecture, and Data Management Platforms
Skills
Big Data Platform Specialist with focus on Amazon Web Services & Microsoft Azure
ETL processes/pipelines & data engineering
Architecture of data management platforms in enterprises
Build-up of data lakes & data lakehouses
Application migrations using cloud services
Consulting & implementation of automation concepts, especially DevOps
Integration of Active Directory, security concepts, and compliance requirements
Monitoring and logging
Confident in Python, SQL, TypeScript, Golang
Big Data Cloud Architecture (AWS & Microsoft Azure)
Data Engineering (Databricks, Synapse Analytics, Fabric, Apache Spark, AWS Glue, Athena, Redshift & EMR)
Infrastructure as Code (Terraform, Pulumi, AWS CDK, ARM)
Certifications & licenses
AWS Business Professional
AWS Certified Cloud Practitioner
AWS Certified Machine Learning – Specialty
AWS Certified Solutions Architect – Associate
AWS Technical Professional
Azure Solutions Architect Expert (AZ-300: Microsoft Azure Architect Technologies, AZ-301: Microsoft Azure Architect Design)
Databricks Certified Associate Developer for Apache Spark 3.0
HashiCorp Certified: Terraform Associate