Christian R.
Data Engineer
Experience
Data Engineer
Support and consulting services for an SCM application.
Technologies: Google Cloud, GKE, Kubernetes, SQL, Kafka, Kafka Connect, Kotlin, dbt, GitHub, Pekko.
- Supported the creation of a master data set for a supply chain management application.
- Designed and implemented ETL workflows.
- Analyzed data to improve data quality.
- Built a dashboard to monitor data quality.
- Set up CI/CD pipelines, monitoring, and alerting for operational ETL processes.
Data Engineer
Covestro AG
Support and consulting services as part of a data warehouse implementation.
Technologies: AWS Cloud, CloudFormation, SAP PLM, OpenSearch, Docker, Spring, Flyway, Java, SQL.
- Advised on data modeling, ETL workflows, and cloud architecture.
- Designed and implemented multiple ETL workflows.
- Connected the SAP PLM system for data extraction and preparation.
- Connected the MES system for data extraction and preparation.
- Transferred knowledge to internal staff.
Data Engineer
GfK SE
Support for migrating a data warehouse from an on-premises environment to the AWS Cloud.
Technologies: AWS Cloud, Airflow, Terraform, Python, Java, SQL, Cloudera, Hadoop, Glue, EMR, LakeFormation, GitLab.
- Designed and implemented a data warehouse using LakeFormation and Glue Catalog.
- Migrated existing Spark jobs to AWS Glue and AWS EMR.
- Implemented workflow management with Airflow.
- Designed and implemented multiple GitLab CI/CD pipelines.
- Trained internal staff and transferred knowledge.
Data Architect
RTL Group
Designed and implemented a cloud-based data warehouse for analyzing user data.
Technologies: AWS Cloud, Kinesis, Kubernetes, Istio, Docker, Kafka, Python, PySpark, Pandas, Avro, Kafka Streams, Terraform, Kustomize, PowerBI.
- Designed and implemented cloud infrastructure for building a data warehouse.
- Integrated Airflow as a workflow engine.
- Integrated Google's Spark on Kubernetes operator as the runtime for ETL processes.
- Built a team to implement ETL processes.
- Implemented multiple ETL processes with PySpark.
- Built an event streaming pipeline for real-time analytics.
- Prepared data for further analysis in PowerBI.
- Advised on overall data architecture.
DevOps Engineer
D. Swarovski KG
Design and implementation of infrastructure for sensor data processing; extension of an existing data science environment.
Technologies: AWS Cloud, Kubernetes, CloudFormation, Kafka, Kafka Connect, PySpark, Bamboo, Java, Python, Docker, Helm Charts, InfluxDB.
- Provided advice on design and tools for building a Kubernetes-based infrastructure for sensor data processing.
- Designed and built infrastructure on Kubernetes (Kafka cluster, Spark framework, Zookeeper, ZK manager).
- Designed and built an ETL pipeline for data ingestion.
- Designed and built a CI/CD pipeline with Bamboo and Kubernetes.
Data Engineer
Volkswagen AG
Design and implementation of a cloud-based data warehouse for analyzing vehicle data; evaluation of the data science environment.
Technologies: AWS Cloud, Lambda, IAM, Kubernetes, Kubeflow, Docker, Databricks, Terraform, Python, OpenSearch, LogStash, Kibana, Helm.
- Extended and deployed a prototype for mass data processing.
- Built and launched a CI/CD pipeline.
- Designed, implemented, and deployed a backend API, including a Helm chart.
- Designed and implemented project structure and release management.
- Evaluated Kubeflow and Databricks.
- Built ETL processes for data validation and ingestion.
- Implemented feature extraction from vehicle data.
Data Engineer
D. Swarovski KG
Design and implementation of a cloud-based data warehouse / data science environment.
Technologies: AWS Cloud, Kubernetes, Spark, R, NiFi, CloudFormation, Docker, Python, Jupyter Notebook.
- Designed a dynamically scalable data warehouse.
- Implemented infrastructure in CloudFormation (Infrastructure as Code).
- Implemented infrastructure components in Kubernetes.
- Modeled data warehouse and data storage.
- Built ETL processes for data ingestion.
Data Architect
aixigo AG
Supported the design and implementation of a microservice architecture.
Technologies: Microservices, Java, Docker, Kafka, LiquiBase, general IT architecture.
- Coordinated different teams on technology adoption.
- Helped design core components (de/serializer, data pipeline design, error handling, message handling, database design).
- Introduced Kafka as the central message bus for microservices.
- Introduced LiquiBase for database schema management.
- Provided business and technical support for a specific microservice.
Requirements Engineer
Open Grid Europe GmbH
Supported the evaluation of vendors in the Big Data field.
Technologies: Hortonworks, Cloudera, SAP Cloud, Apache NiFi, AWS Cloud, MS Azure.
- Recorded and documented technical and functional requirements for building and running an Apache Hadoop-based data warehouse.
- Collected quotes from various vendors and prepared the information for decision-making.
- Implemented a prototype for data ingestion.
Data Engineer
GfK SE
Designed and implemented a big data warehouse in the AWS Cloud.
Technologies: AWS Cloud, Spark, SparkR, Cloudera, Hadoop, Hive, Python, Jupyter Notebook, R, Bamboo, Terraform.
- Served as technical lead for the project.
- Designed the AWS Cloud infrastructure.
- Implemented data pipelines.
- Built the data warehouse and workflow management.
- Prepared data and managed processes.
Data Engineer
University Hospital Basel
Workshop on Big Data technologies – introduction and fundamentals.
Technologies: Hadoop, Spark, AWS Cloud, MapReduce, Hive, Pig, R.
- Conducted a 3-day workshop.
- Introduced the Big Data/Hadoop ecosystem.
- Led hands-on exercises on using Big Data in the AWS Cloud.
Data Engineer
Helix Leisure Pte Ltd
Architecture review, design, and implementation of a streaming layer.
Technologies: Hadoop, Spark, AWS Cloud, Scala, MapReduce, JCascalog, RedShift, CloudFormation.
- Reviewed and assessed existing architecture and data model.
- Conducted a workshop on data management and Lambda architecture.
- Designed and implemented the real-time layer with Spark.
- Developed the concept and implementation for integrating the real-time and batch layers.
Data Engineer
Otto GmbH & Co. KG
Supported building ETL pipelines for a Hadoop-based data warehouse.
Technologies: Hadoop, Hive, Spark, Redis, Kafka, Avro, Scala, HCatalog, Schedoscope.
- Planned and implemented a Hive export module.
- Implemented a Kafka and Redis export module as part of an open source project.
- Implemented an analysis algorithm to evaluate click streams.
DevOps Engineer
GfK SE
Design and development of a continuous deployment/delivery pipeline for a data-driven application in a cloud environment.
Technologies: AWS Cloud, Hadoop, Spark, Bamboo, Git, Terraform, Vagrant, InfluxDB.
- Planned and implemented a Big Data infrastructure in AWS Cloud.
- Planned and implemented a continuous deployment pipeline.
- Served as technical lead for an internal team.
Data Engineer
RadioOpt GmbH
Design and implementation of a data warehouse based on big data technologies – OLAP workload.
Technologies: Hadoop, Impala, Hive, ETL, AWS Cloud.
- Planned and implemented the cluster infrastructure.
- Evaluated different input formats for performance.
- Prepared and conducted load tests.
Data Engineer
Technicolor SA
Design and implementation of a Big Data system for batch and real-time data processing.
Technologies: Hadoop, Samza, Spark, Kafka, Java, ETL, AWS, CloudFormation.
- Planned and set up the deployment environment.
- Evaluated different technologies for data collection and processing.
- Served as technical lead of a team.
- Implemented a distributed, fault-tolerant, high-throughput messaging and analytics system for machine data (Lambda architecture).
Data Engineer
Ubisoft / BlueByte GmbH
Design and implementation of a Hadoop-based data warehouse for game analytics.
Technologies: Hadoop, Map/Reduce, Kafka, Hive, ETL, Java, Linux.
- Planned and set up the data warehouse.
- Evaluated different approaches to data collection.
- Selected suitable technologies.
- Served as technical lead and coordinated a distributed team (GER, CN, CAN).
- Implemented a distributed, fault-tolerant, high-throughput messaging system.
DevOps Engineer
Deutsche Telekom AG
Design and implementation of a Big Data infrastructure in virtualized environments.
Technologies: Hadoop, OpenStack, Opscode Chef, Java, Linux.
- Planned and set up a Big Data deployment infrastructure.
- Implemented the deployment process for on-demand Hadoop clusters in a virtualized environment.
- Prototyped various algorithms in the Map/Reduce framework.
DevOps Engineer
GfK SE
Design and implementation of a Big Data architecture for analyzing telecommunications data.
Technologies: Cloudera, Hadoop, Hive, Flume, Java, Spring, Puppet, Ubuntu Linux, AWS.
- Planned and set up the network (VPC).
- Planned and set up a Hadoop cluster (100 TB capacity).
- Set up deployment processes, including monitoring.
- Implemented a data ingestion framework storing about 300 GB of data per day.
Data Engineer
exactag GmbH
Design and implementation of a Hadoop cluster.
Technologies: Cloudera, Hadoop, Hive, Pig, Python, Java, Maven, Puppet, Debian Linux.
- Advised on and designed a Hadoop cluster.
- Selected suitable hardware.
- Set up a deployment process and rolled out the cluster.
- Ported existing statistical routines to the Map/Reduce framework.
Data Engineer
Etracker GmbH
Reimplementation of an analysis tool as a Map/Reduce application.
Technologies: Cloudera, Hadoop/HBase, Java, Maven, Ganglia, Chef, PHP, Debian Linux.
- Analyzed an existing implementation and integrated it into the Map/Reduce framework using the Hadoop Streaming API.
- Installed and configured a Hadoop cluster, including monitoring.
- Set up a deployment process.
Data Engineer, DevOps Engineer
LambdaNow.com / AltusInsight GmbH
Design and development of a web application (LambdaNow).
Technologies: Apache Hadoop, Python, Puppet, AWS, OpenStack, Git, RedHat Linux.
- Designed the application.
- Implemented the website and backend.
- Set up the deployment process and hosting environment.
- Set up a fully automated Apache Hadoop deployment process in the Amazon and OpenStack clouds.
Backend Developer
Aupeo GmbH
Integration of a payment provider into an existing backend.
Technologies: Ruby/Rails, OAuth, MySQL, Git, Debian Linux.
- Mapped data and matched text against existing data.
- Prepared, converted, and imported data into the database.
- Integrated the payment provider into the backend.
Backend Developer
OpenLimit SignCubes GmbH
Integration of a signature component into an email program (KMail).
Technologies: C++, Qt, KDE, Ubuntu Linux.
- Set up the debug environment.
- Integrated the signature component into KMail.
- Tested the implementation.
Backend Developer
Etracker GmbH
Implementation and refactoring of an analysis tool in C++.
Technologies: C++, MySQL C/C++ API, Doxygen, Hudson, Ubuntu/Debian Linux.
- Set up a build environment for C++ projects.
- Refactored the prototype.
- Adapted and extended the software for the production environment (logging, error handling, unit testing).
- Set up a deployment process.
- Set up a build server (continuous integration).
Freelance Data Engineer
Ingenieurbüro Christian Richter – Data, Cloud & Container
- Freelance Data Engineer with an interest in DevOps
- Involved in over 20 successful projects
Backend Developer
Ingenieurbüro Christian Richter
Design and development of a web crawler.
Technologies: C++, Fedora/RedHat Linux, Cassandra.
- Designed the application as a high-performance multithreaded server.
- Implemented it as a distributed application using asynchronous sockets (non-blocking I/O).
Data Engineer
MOG Inc.
Extension of an existing indexing framework.
Technologies: Ruby/Rails, MySQL.
- Adapted an indexing framework for music data to a changed database model.
- Converted existing data (approx. 100 GB).
DevOps Engineer
MOG Inc.
Design, setup and deployment of a hosting environment for a large website.
Technologies: Apache, Nginx, HAProxy, Mongrel, MySQL, MySQLProxy, BIND, DHCP, Cobbler, Puppet, RedHat Linux.
- Designed the hosting environment.
- Set up the infrastructure for server provisioning and configuration.
- Configured MySQL master-master replication with MySQLProxy.
- Installed server software, monitoring, and logging.
- Migrated the website from the hosting provider to colocation.
- Analyzed and optimized the system to improve overall performance.
Backend Developer
MOG Inc.
Porting an XML-RPC server from Ruby on Rails to C++.
Technologies: C++, XML-RPC, Ruby/Rails, XML Schema, MySQL C/C++ API, RedHat Linux.
- Analyzed performance issues.
- Implemented/ported the server according to the given protocol specification.
- Replaced the component and integrated it into the existing backend.
Data Engineer
MOG Inc.
Design and development of an infrastructure to integrate data from external providers.
Technologies: Ruby/Rails, MySQL, Bash, RedHat Linux.
- Built a music database with data from providers such as Allmusic, Muze (Macromedia), Rhapsody, and MediaNet.
- Prepared, converted, and imported data into the database.
- Mapped data across the different providers.
Backend Developer
MOG Inc.
Design and development of a spellchecker as a component of an indexing framework.
Technologies: C++, SOAP, CLucene, MySQL C/C++ API, Doxygen, RedHat Linux.
- Analyzed existing algorithms.
- Implemented an algorithm (Levenshtein distance and n-gram index) in C++.
- Integrated it into the existing system.
Software Developer and System Architect
MOG Inc. – Startup in the media/internet sector
- Responsible for the design, implementation, and deployment of the hosting environment
- Designed and implemented several software projects in C++
Backend Developer
MOG Inc.
Design and development of a collaborative filtering system as a distributed application.
Technologies: C++, MySQL C/C++ API, XML, XML Schema, Perl, Doxygen, RedHat Linux.
- Evaluated suitable algorithms.
- Implemented the system as a distributed application with database integration.
- Integrated it into the existing backend and frontend.
Backend Developer
Fraunhofer Institut
Implementation/porting of a method for extracting the main melody line.
Technologies: C++, Matlab, Mandrake Linux.
- Ported an existing Matlab algorithm to C++.
- Optimized the implementation for performance.
Research Associate
Fraunhofer IDMT – Research Institute in the Field of Audio/Video
- Conducted scientific research in search algorithms
- Participated in the development and implementation of algorithms
Data Analyst
Fraunhofer Institute
Scientific investigation of data structures to determine the nearest neighbor for a Music Information Retrieval System.
Technologies: Matlab, C++, Perl, Apache, CGI, Mandrake Linux.
- Evaluated several nearest neighbor algorithms for suitability based on the given specifications
- Designed and implemented an algorithm in C++ and integrated it into the existing system (as a dynamic C++ library)
- Conducted and analyzed a test series for evaluation
Data Analyst
ID Analytics Inc.
Collaborated on the development of algorithms for identity theft detection.
Technologies: Java, Octave, Perl, Tomcat, Oracle, Red Hat Linux, Solaris.
- Implemented a tool for graph visualization.
- Analyzed large datasets (~250 GB) for feature extraction using a Java processing framework
- Developed and implemented algorithms for identity theft detection (regression analysis)
ID Analytics Inc. – Startup in the Financial Services Sector
- Collaborated on software for identity theft detection
- Data preparation and data analysis
- Contributed to the development and implementation of algorithms
Fraunhofer Institute
Contributed to the design and implementation of a cross-platform GUI (Win32/Linux) for a Query by Humming system.
Technologies: C++, Qt, Win32, Linux.
Fraunhofer IDMT – Research Institute for Audio/Video
- Developed test environments to evaluate algorithms
- Designed and implemented GUIs to present algorithms at trade fairs
Fraunhofer Institute
Contributed to the scientific evaluation of similarity search algorithms for a query-by-humming system.
Technologies: Matlab, Bash, Linux.
- Implemented various algorithms in Matlab.
- Automated and parallelized the test environment using Bash scripts.
Jumatech – PCB Manufacturing Company
- Provided IT infrastructure
Summary
- Designing and implementing ETL workflows, data pipelines, and ML pipelines
- GDPR-compliant data management and processing
- Designing and building cloud-based data warehouse, data lake, and lakehouse implementations
- Data modeling and analysis, data conversion, and preparation
- Designing and building data-driven applications on cloud-native infrastructures
- Requirements analysis, business process analysis, risk analysis
Education
Technische Universität Ilmenau
Diploma · Theoretical Electrical Engineering · Ilmenau, Germany
Georg-Cantor-Gymnasium Halle (Saale)
High school diploma · Halle (Saale), Germany