Paul M.

Data Engineer

Warsaw, Poland

Experience

Mar 2023 - Nov 2025
2 years 9 months

Data Engineer

Luxoft

  • Built and deployed an end-to-end enterprise data integration platform (CloverDX ETL, Python, PostgreSQL, AWS) that ingests, validates, and structures raw analytical datasets, supporting AI-powered automation for financial operations and digital banking workflows.
  • Designed data extraction connectors that collect fragmented structured and semi-structured sources and align them to the unified schema definitions required by downstream analytics workflows.
  • Built automated data loading and distribution jobs targeting multi-region storage across S3, RDS, and Redshift, ensuring secure data availability for risk analysis, fraud detection models, credit scoring, and scalable financial reporting.
  • Worked closely with business product leaders to evaluate new integration paths and prototype rapid connectors for high-priority data partners.
  • Provided on-call operational support to troubleshoot ingestion failures, data latency bottlenecks, and corrupted financial files, performing root-cause analysis through transaction-level replay and controlled environment replication.
  • Maintained automated data quality profiling, validation rulesets, and error-handling flows, ensuring consistency and reducing manual reconciliation across systems.
  • Created internal technical documentation including data lineage, field definitions, reconciliation rules, financial lifecycle diagrams, and mapping specifications used across engineering, compliance, and support teams.
Oct 2021 - Feb 2023
1 year 5 months

Data Engineer

Unicage

  • Built a cloud-based ETL ingestion framework using Airflow, Python, Aurora PostgreSQL, and AWS Lambda to integrate multiple partner data providers into financial-grade web applications.
  • Developed custom SQL transformation scripts with field-level validation logic to handle malformed input and edge-case behavior from third-party interfaces.
  • Integrated data warehousing concepts including dimensional modeling and incremental loading patterns to support scalable insights tooling.
  • Collaborated with security teams to align data access flows with regulatory controls and auditing documentation.
  • Introduced automated regression data tests, enabling detection of mapping drift before deployment to production systems.
Apr 2019 - Oct 2021
2 years 7 months

Data Engineer

Biobot Analytics

  • Built large-scale COVID-19 public health data processing pipelines (Databricks, Apache Spark, Snowflake, AWS) that ingest real-time case-reporting feeds from hospitals, diagnostic labs, and national open-data programs, supporting public health intelligence platforms.
  • Integrated disparate raw datasets including vaccination progress tracking, ICU bed utilization, mortality curves, and population density metrics into curated warehouse models designed for advanced epidemiological and operational analysis.
  • Designed automated data validation rules and quality-scoring frameworks using anomaly detection and threshold-based alerting tied to pipeline health metrics.
  • Built operational observability dashboards in Grafana and Cloud Monitoring, visualizing pipeline latency, throughput, and schema-change impact to support proactive issue detection.
  • Provided rapid response support during emergency reporting intervals, verifying the correctness of published datasets prior to high-visibility distribution.
Feb 2018 - Mar 2019
1 year 2 months

Data Developer Intern

Amazon

  • Modernized legacy ETL workflows by migrating to modular, service-based pipelines, reducing operational maintenance and improving reliability across data systems.
  • Built automated ingestion frameworks for partner data feeds with cleansing and normalization, reducing processing time and improving data accuracy.
  • Partnered with security & compliance teams to integrate regulated access controls and audit mechanisms, ensuring alignment with enterprise governance and regulatory standards.

Summary

Cloud-focused Data Engineer with nearly eight years of hands-on experience designing and delivering high-reliability data processing systems, enterprise ETL pipelines, and distributed integration platforms across financial and AI-driven environments. Deep background in integrating complex data sources, optimizing large-scale pipelines, and ensuring data integrity for mission-critical applications. Collaborates closely with cross-functional teams, including analysts, architects, and business stakeholders, in fast-paced environments.

Languages

English
Advanced

Education

The University of Tokyo

B.S. · Computer Science · Japan
