Serge (Dr.) K.

Senior DevOps (external)

Munich, Germany

Experience

Apr 2022 - Present
3 years 8 months
Karlsruhe, Germany

Senior DevOps (external)

Atruvia AG

  • Development of Data Integration Hub platform (DIH) as part of a Data Governance project. DIH is the central architecture component for sharing data between tenants. It is based on data product descriptions (specifications), data catalogs and services representing shared data.
  • A typical workflow includes:
      • A data product description is injected via REST API or the Swagger UI.
      • The metadata is written to Kafka topics.
      • Kafka consumers read the metadata and perform actions such as creating metadata entries in Datahub, creating tables in Trino, creating predefined file structures on S3, and setting up policies. A minimal consumer sketch follows this list.
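As an illustration of the consumer side of this workflow, here is a minimal sketch in Python using the confluent-kafka client; the broker address, group id, topic, and field names are hypothetical placeholders, not the project's actual configuration:

```python
import json

from confluent_kafka import Consumer

# Hypothetical broker address, group id, and topic name.
consumer = Consumer({
    "bootstrap.servers": "kafka:9092",
    "group.id": "dih-metadata-consumer",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["data-product-descriptions"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        spec = json.loads(msg.value())
        # Downstream actions would go here: register the product in
        # Datahub, create the Trino table, lay out the S3 prefix, etc.
        print(f"Received data product spec: {spec.get('name')}")
finally:
    consumer.close()
```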
  • My key tasks:
      • Implement single sign-on in services based on JWT tokens (a validation sketch follows this list).
      • Develop REST APIs.
      • Build integration tools between software components (SelfService, Trino, S3, Datahub, Great Expectations, etc.).
      • Create data quality validation services.
      • Develop ETL pipelines.
      • Onboard new customers.
      • Build monitoring systems.
      • Troubleshoot and support services and customers.
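The JWT-based single sign-on mentioned above could be validated roughly as follows; this is a sketch assuming PyJWT and an RS256-signing identity provider, with the JWKS URL and audience made up for illustration:

```python
import jwt  # PyJWT
from jwt import PyJWKClient

# Placeholder issuer endpoint and audience, not the real configuration.
JWKS_URL = "https://sso.example.com/realms/dih/protocol/openid-connect/certs"
AUDIENCE = "dih-api"

def validate_token(token: str) -> dict:
    """Verify an RS256-signed JWT against the identity provider's JWKS."""
    signing_key = PyJWKClient(JWKS_URL).get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=AUDIENCE,
    )
```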
  • Development stack:
      • CI/CD: OpenShift (Kubernetes), Helm, Docker, Git, Tekton, ArgoCD.
      • Data catalogs and lineage: Datahub, OpenLineage; integrated with Spark and Pandas (implemented in Python).
      • SQL engines: Trino with the Starburst web UI, PostgreSQL, Hadoop, DB2, Delta Lake (a query sketch follows this list).
      • Data quality: Great Expectations.
      • REST APIs: Java, Swagger, Spring Boot.
      • Authentication: JWT, OAuth2, single sign-on.
      • Access policies: Apache Ranger.
      • Monitoring: Prometheus, Grafana.
  • Certification: AWS Certified Data Engineer - Associate.
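For the Trino engine listed above, queries can be issued from Python via the trino client's DB-API interface; host, catalog, schema, and table names below are placeholders:

```python
import trino

# Placeholder connection details.
conn = trino.dbapi.connect(
    host="trino.example.com",
    port=8080,
    user="dih-service",
    catalog="hive",
    schema="data_products",
)

cur = conn.cursor()
cur.execute("SELECT product_name, owner FROM products LIMIT 10")
for row in cur.fetchall():
    print(row)
```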
Oct 2021 - Apr 2023
1 year 7 months
Hamburg, Germany

Senior DevOps (external)

Otto GmbH & Co KG

  • Design and implementation of data-driven microservices for Google search engine optimization using AWS services. These services follow ETL patterns: a typical service takes data from a source (REST API, SQS, DynamoDB, etc.), transforms it (e.g., calculates changes in a list compared to previous days) and uploads results to a backend (S3, database).
  • Service I (MLOps): assessed OTTO pages by extracting keywords that describe page content and matching them against Google searches. Migrated data transformation, model training, retraining, and deployment from GCP to AWS; designed and implemented the workflows.
      • Used GitHub Actions for CI/CD pipelines.
      • Used Terraform to manage cloud resources (container creation, load balancing of model instances, etc.).
      • Implemented model validation and testing in Python.
      • Implemented model monitoring with Grafana.
  • Service S:
      • Handled millions of REST API calls per hour using AsyncIO (a sketch follows this list).
      • Parsed and filtered nested JSON data.
      • Stored results on S3.
  • Languages: Python, Java, TypeScript, Kotlin.
  • Monitoring: CloudWatch, Grafana, Tableau.
  • Databases: MongoDB, DynamoDB, PostgreSQL, Exasol.
  • Message processing: SNS, SQS.
  • Provisioning: Terraform, Pulumi, Serverless (Cloud Foundation).
  • Containers: Docker, ECR, ECS.
  • Unit tests: PyTest.
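A condensed sketch of the Service S pattern above: fetch many endpoints concurrently with AsyncIO and aiohttp, pick fields out of the nested JSON, and land the results on S3 with boto3. The URL, bucket, and field names are invented for illustration:

```python
import asyncio
import json

import aiohttp
import boto3

# Hypothetical endpoint and bucket names.
API_URL = "https://api.example.com/items/{item_id}"
BUCKET = "search-optimization-results"

async def fetch(session: aiohttp.ClientSession, item_id: str) -> dict:
    """Fetch one item and keep only the nested fields of interest."""
    async with session.get(API_URL.format(item_id=item_id)) as resp:
        resp.raise_for_status()
        payload = await resp.json()
        return {"id": item_id, "title": payload.get("page", {}).get("title")}

async def crawl(item_ids: list[str]) -> list[dict]:
    # Bound concurrency so millions of calls don't overwhelm the API.
    sem = asyncio.Semaphore(100)
    async with aiohttp.ClientSession() as session:
        async def bounded(item_id: str) -> dict:
            async with sem:
                return await fetch(session, item_id)
        return await asyncio.gather(*(bounded(i) for i in item_ids))

def store(results: list[dict]) -> None:
    # boto3 is synchronous; a real service would batch or offload this.
    boto3.client("s3").put_object(
        Bucket=BUCKET,
        Key="results/latest.json",
        Body=json.dumps(results),
    )

if __name__ == "__main__":
    store(asyncio.run(crawl(["1", "2", "3"])))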
Jul 2018 - Sep 2021
3 years 3 months
Cologne, Germany
Hybrid

Senior Big Data Consultant (external)

REWE Systems GmbH

  • Designed and implemented hybrid environments on Google Cloud Platform.
  • Provisioned GCP infrastructure with Terraform and later with Ansible.
  • Set up redundant connectivity and data encryption between GCP and on-premise systems.
  • Provisioned MapR and Spark environments on GCP.
  • Configured real-time data replication from on-premise tables to GCP.
  • Integrated with REWE services (Active Directory, DNS, Instana, etc.).
  • Developed REST APIs for machine learning models using Flask (a minimal example follows this list).
  • Implemented persistent storage based on MapR for Kubernetes clusters.
  • Operated MapR clusters: upgrades, scaling, troubleshooting services and applications.
  • Synchronized a Kafka cluster with MapR streams using Kafka Connect.
  • Designed and implemented ETL pipelines, synchronization and integration of MapR clusters with various data sources (e.g., DB2 and Teradata warehouses).
  • Onboarded new internal REWE customers to MapR platforms.
  • Advised management on technical topics and future developments in big data.
  • Proposed security solutions (e.g., constrained delegation on F5 or authentication for OpenTSDB) and conducted PoCs.
  • Developed solutions in data science projects:
      • Built market classification models.
      • Visualized data and predictions with Jupyter and Grafana.
      • Integrated with JIRA.
  • Provided 3rd-level support.
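As a minimal illustration of the Flask-based model serving mentioned above; the model artifact, route, and payload shape are hypothetical:

```python
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical serialized model artifact.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body like {"features": [1.0, 2.0, 3.0]}.
    features = request.get_json()["features"]
    prediction = model.predict([features])[0]
    return jsonify({"prediction": str(prediction)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```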
Sep 2016 - May 2018
1 year 9 months
Munich, Germany

Senior Big Data Architect

Allianz Technology SE

  • Managed large-scale, multi-tenant, secure and highly available Hadoop infrastructure supporting rapid data growth for a diverse customer base.
  • Pre-sales: onboarded new customers.
  • Provided architectural guidance, planned and estimated cluster capacity, and created roadmaps for Hadoop deployments.
  • Designed, implemented and maintained enterprise-level secure Hadoop environments (Kerberos, LDAP/AD, Apache Sentry, encryption in transit, encryption at rest).
  • Installed and configured multi-tenant Hadoop environments, applying updates, patches and version upgrades.
  • Created runbooks for troubleshooting, cluster recovery and routine maintenance.
  • Troubleshot Hadoop applications, components and infrastructure at large scale.
  • Provided 3rd-level support (DevOps) for business-critical applications and use cases.
  • Evaluated and recommended new tools and technologies to meet the needs of the Allianz Group.
  • Worked closely with infrastructure, network, database, application, business intelligence and data science teams.
  • Contributed to Fraud Detection projects, including machine learning.
  • Designed and set up a Microsoft R data science model training platform (Microsoft R Open) on Azure and on-premise for Fraud Detection using Docker and Terraform.
  • Contributed to Supply Chain Analytics projects (e.g., GraphServer for executing graph queries on data in HDFS).
  • Transformed internal team processes according to the Agile/Scrum framework.
  • Developed Kafka-based use cases:
  • ClickStream:
      • Producer: aggregates URLs clicked on web pages, received via REST API or from other sources (e.g., Oracle); sketched below.
      • Consumer: Flink job that, after pre-processing (sanity checks, extraction of time information), writes the data to HDFS as XML files.
      • Stack for ClickStream: Java, Kafka, Cloudera, SASL, TLS/SSL, Apache Sentry, YARN, Flink, Cassandra.
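A bare-bones version of such a clickstream producer, written here in Python with confluent-kafka rather than the Java used on the project; broker and topic names are placeholders, and the SASL/TLS settings a secured Cloudera cluster would require are omitted:

```python
import json
import time

from confluent_kafka import Producer

# Placeholder broker; a secured cluster would also need SASL/TLS config.
producer = Producer({"bootstrap.servers": "kafka:9092"})

def publish_click(url: str) -> None:
    """Serialize a click event and hand it to the producer's buffer."""
    event = {"url": url, "ts": time.time()}
    producer.produce("clickstream", value=json.dumps(event).encode("utf-8"))

publish_click("https://www.example.com/products/42")
producer.flush()  # Block until buffered events are delivered.
```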
  • Document classification:
      • Producer: custom producer that reads documents from a shared file system and writes them into Kafka.
      • Consumer: Spark Streaming job that, after pre-processing, sends documents to the UIMA platform for classification; the classified data is stored on HDFS for further batch processing (see the sketch below).
      • Stack for Document classification: Java, Kafka, Spark (Streaming), Cloudera, SASL, TLS/SSL, Apache Sentry, YARN, UIMA.
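The consumer side of this pipeline, sketched with PySpark's Structured Streaming API; the project itself used the DStream-based Spark Streaming API in Java, and the UIMA classification step is stubbed out here as a comment. Broker, topic, and paths are placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Requires the spark-sql-kafka connector on the classpath.
spark = SparkSession.builder.appName("doc-classification").getOrCreate()

# Read raw documents from a placeholder Kafka topic.
docs = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")
    .option("subscribe", "documents")
    .load()
    .select(col("value").cast("string").alias("document"))
)

# In the real pipeline, pre-processing and the call to the UIMA
# classification platform would happen between read and write.
query = (
    docs.writeStream.format("parquet")
    .option("path", "hdfs:///data/classified")
    .option("checkpointLocation", "hdfs:///checkpoints/doc-classification")
    .start()
)
query.awaitTermination()
```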
  • Graph database (PoC): managed graphs via a Kafka interface.
      • Producer: data from Twitter, news agency sites, etc.
      • Consumer: converted articles and messages into graph queries and executed them with Gremlin (sketched below).
      • Stack for Graph database (PoC): Java, Python, Kafka, Cassandra, Gremlin, KeyLines (for graph visualization; JavaScript), Google Cloud.
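A small sketch of executing such graph queries from Python with gremlinpython; the server endpoint and the queries themselves are invented for illustration:

```python
from gremlin_python.driver import client

# Placeholder Gremlin Server endpoint.
gremlin = client.Client("ws://graph.example.com:8182/gremlin", "g")

# Insert an article vertex, passing the title as a binding.
gremlin.submit(
    "g.addV('article').property('title', title)",
    {"title": "Example headline"},
).all().result()

# Count vertices to confirm the write.
count = gremlin.submit("g.V().count()").all().result()[0]
print(f"Vertices in graph: {count}")
gremlin.close()
```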
Jun 2014 - Jul 2016
2 years 2 months
Berlin, Germany

System Architect

The Unbelievable Machine Company GmbH

Sep 2012 - Jun 2014
1 year 10 months
Cologne, Germany

System Operations

Werkenntwen GmbH

Jan 2009 - Sep 2012
3 years 9 months
Wuppertal, Germany

Postdoc

Bergische Universität Wuppertal

Oct 2006 - Dec 2008
2 years 3 months
Aachen, Germany

Postdoc

Rheinisch-Westfälische Technische Hochschule

Languages

Russian
Native
German
Advanced
English
Advanced
French
Advanced

Education

Jan 2001 - Sep 2006

Université catholique de Louvain

PhD · Ottignies-Louvain-la-Neuve, Belgium

Sep 1998 - Jun 2000

Moscow Institute of Physics and Technology

High energy physics · Dolgoprudny, Russian Federation

Sep 1994 - Jun 1998

Moscow Institute of Physics and Technology

High energy physics · Dolgoprudny, Russian Federation

Certifications & licenses

AWS Certified Data Engineer - Associate

AWS
