Serge (Dr.) Kalinin
MLOps (machine learning operations)
Experience
Apr 2025 - Present
10 months · Cologne, Germany
MLOps (machine learning operations)
REWE Digital GmbH
- It is like a startup within REWE, where we have to build a new forecasting system on Google Cloud Platform from scratch. Although my role is officially called MLOps, my actual tasks also include developing data processing pipelines (data engineering) and data science tasks such as feature engineering and model training.
- GCP: Terraform (OpenTofu), Vertex AI (Kubeflow), Cloud Run, IAM, Google Cloud Storage, BigQuery, Artifact Registry
- Data engineering: Snowflake as the main data warehouse, Terraform, dbt for data model implementations
- CI/CD: GitLab. We have built a CI/CD pipeline that automates deployments of new releases all the way to the production environment
Jan 2025 - Apr 2025
4 months · United States
Senior Software Developer (freelancer)
StarStruck
- Consolidate visibility of companies or software products in marketplaces to compare footprints.
- Develop data models and troubleshoot scraper scripts.
- Assign development tasks to other developers.
- Text analysis and implementation of a recommendation system.
- AWS: DynamoDB, Lambda, S3, EC2, IAM, EventBridge.
- Scraping: Python, Selenium Chrome Webdriver.
- Text analysis: Amazon Comprehend.
- Recommendation system: XGBoost.
- MLOps: MLflow
Apr 2022 - Dec 2024
2 years 9 months · Karlsruhe, Germany
Senior Software Developer (external)
Atrvivia AG (via STI GmbH, Michael Page International)
- Developed Data Integration Hub platform (DIH) for a Data Governance project, central architecture for sharing data between tenants, based on data product descriptions, data catalogs and services representing Data Mesh and Data Vault 2.0.
- Workflow: inject data product description via REST API or Swagger UI; metadata written into Kafka topics; Kafka consumers create metadata in Datahub, tables in Trino, file structures on S3, policies, etc.
- Implemented single sign-on in services based on JWT tokens.
- Developed REST APIs and integration tools between SelfService, Trino, S3, Datahub, Great Expectations.
- Performed data synchronization between S3 and Azure Blob Storage.
- Developed ETL with PySpark on Azure Databricks and pipelines with pandas, PySpark, Airflow DAGs.
- Built data quality validation services and monitoring systems.
- Created accounting system based on CPU, storage and memory usage using star schema.
- Developed complex data models (up to 5000 columns from several hundred data sources).
- Troubleshot and supported services and customers; delivered according to Agile/Scrum.
- CI/CD: OpenShift (Kubernetes), Helm, Docker, Git, Tekton, ArgoCD, Terraform.
- Data catalogs and lineage: Datahub, OpenLineage; Python.
- SQL engines: Trino with Starburst Web UI, PostgreSQL, Hive, DB2, Delta Lake.
- Data quality: Great Expectations, Talend.
- Code quality: SonarQube.
- REST API: Java, Swagger UI, Spring Boot.
- Authentication and authorization: JWT, OAuth2, single sign-on, Apache Ranger.
- Monitoring: Prometheus, Grafana
Oct 2021 - Apr 2023
1 year 7 months · Hamburg, Germany
Senior Software Developer (external)
Otto GmbH & Co KG (via Soorce GmbH)
- Designed and implemented data-driven microservices for search engine optimization using AWS services; ETL patterns: ingest data from REST API, SQS, DynamoDB; transform; upload to S3 or database.
- Service I (MLOps): assessed OTTO pages by extracting keywords and matching with Google searches; migrated data transformation, model training/retraining, and deployment from GCP to AWS; designed workflows.
- Used GitHub Actions for CI/CD and Terraform for cloud resource management.
- Implemented model validations and testing with Python; model monitoring with Grafana.
- Service S: handled millions of REST API calls per hour using AsyncIO; parsed nested JSON; stored results on S3.
- Languages: Python, Java, TypeScript, Kotlin.
- Monitoring: CloudWatch, Grafana, Tableau.
- Databases: MongoDB, DynamoDB, PostgreSQL, Exasol.
- Message processing: SNS, SQS.
- Provisioning: Terraform, Pulumi, Serverless (CloudFormation).
- Containers: Docker, ECR, ECS.
- Unit tests: pytest; delivery according to Agile/Scrum
Jul 2018 - Sep 2021
3 years 3 months · Cologne, Germany · Hybrid
Senior Big Data Consultant (external)
REWE Systems GmbH (via STI GmbH)
- Conceptualized and implemented hybrid environments on Google Cloud Platform.
- Developed and optimized normalized data models and Spark ETL workflows using dbt.
- Created data warehouse models on Hadoop using Data Vault 2.0.
- Provisioned MapR and Spark (Databricks) environments on GCP with Terraform; set up real-time data replication from on-premises systems to GCP.
- Integrated with REWE services (Active Directory, DNS, Instana).
- Developed REST API for ML models using Flask.
- Implemented persistent storage based on MapR for Kubernetes cluster.
- Operated MapR clusters: upgrades, extensions, troubleshooting via Ansible and Jenkins.
- Synchronized Kafka cluster with MapR streams using Kafka Connect.
- Designed ETL pipelines, synchronization and integration of MapR with DB2 and Teradata.
- Onboarded new internal customers; consulted management on Big Data topics; proposed security solutions and PoCs.
- Developed market classification models; visualized data with Jupyter and Grafana; integrated with JIRA; delivered according to Agile/Scrum
Sep 2016 - May 2018
1 year 9 months · Munich, Germany
Senior Big Data Architect
Allianz Technology SE
- Managed large-scale, multi-tenant, secure, highly available Hadoop infrastructure; provided architectural guidance, capacity planning, roadmaps for deployments.
- Conducted pre-sales onboarding; worked with infrastructure, network, database, BI and data science teams.
- Designed, implemented and maintained enterprise-level security for Hadoop (Kerberos, LDAP/AD, Sentry, encryption in motion and at rest).
- Installed and configured multi-tenant Hadoop environments; managed updates, patches, upgrades; created run books for troubleshooting and maintenance.
- Provided 3rd-level DevOps support; evaluated new tools and technologies.
- Set up a Microsoft R Open data science model training platform for fraud detection on Azure and on-premises using Docker and Terraform.
- Developed Supply Chain Analytics GraphServer for HDFS graph queries.
- Transformed internal processes to Agile/Scrum.
- Developed Kafka-based use cases: ClickStream (Java, Kafka, Flink, Cassandra), document classification (Java, Kafka, Spark Streaming, UIMA, Impala), graph database PoC (Java, Python, Kafka, Cassandra, Gremlin, KeyLines)
Jun 2014 - Jul 2016
2 years 2 months · Berlin, Germany
System Architect
Web · The unbelievable Machine Operations Company GmbH
Sep 2012 - Jun 2014
1 year 10 months · Cologne, Germany
System Operations
Werkenntwen GmbH
Jan 2009 - Sep 2012
3 years 9 months · Wuppertal, Germany
Postdoc
Bergische Universität Wuppertal
Oct 2006 - Dec 2008
2 years 3 months · Aachen, Germany
Postdoc
Rheinisch-Westfälische Technische Hochschule Aachen
Skills
- IT: object-oriented programming, databases, administration, distributed computing, high-performance computing, high availability
- Profound knowledge of public clouds: AWS, Microsoft Azure, Google Cloud Platform
- Infrastructure as code: Terraform, Ansible, Puppet
- Expert in Hadoop stacks: Hortonworks, MapR, Cloudera, Spark, Kafka, Flume, Storm, Oozie, Sentry, Ranger, Knox, HBase, Sqoop, YARN, Mesos, Impala, Hive, ZooKeeper, Key Trustee
- Expert in network devices: load balancers (F5, A10), switches (Arista, Juniper, Force10), firewalls (Juniper, FortiGate, Palo Alto), routers (Brocade)
- Programming languages: Python, Java, SQL, Scala, C++, PHP, VBA, R, Ruby
- Expert in application-level security systems: web application firewall, XML gateway, HAProxy
- Expert in databases: MySQL, PostgreSQL, Objectivity, Oracle, Microsoft SQL Server
- Expert in Linux. Very good knowledge of Microsoft Windows
- Expert in mass-storage systems: EMC Isilon, Scality, Lustre/SFS, pNFS, Hadoop, dCache, NetApp
- Expert in virtualization and containerization: KVM, Xen, VMware, VirtualBox, Vagrant, Docker, Cloud Foundry, Kubernetes, Citrix
- SCMs: Git, SVN, Nexus, CVS
- Expert in monitoring: Grafana, Prometheus, Nagios, Cacti, Ganglia, Munin, Lemon
- Expert in statistics: Monte Carlo, statistical tests, parameter estimation, error estimation, classification, prediction, unfolding, fitting
- Expert in technical analyses
- Expert in multivariate methods: neural networks, support vector machines, genetic algorithms, gradient boosting
- Good knowledge of security standards such as PCI DSS
- LDAP authentication systems: Active Directory, OpenLDAP, ApacheDS
Languages
Russian · Native
German · Advanced
English · Advanced
French · Advanced
Education
Jan 2001 - Sep 2006
Université catholique de Louvain-la-Neuve
PhD Student · Ottignies-Louvain-la-Neuve, Belgium
Sep 1998 - Jun 2000
Moscow Institute of Physics and Technology
M.S., High energy physics · Dolgoprudny, Russian Federation
Sep 1994 - Jun 1998
Moscow Institute of Physics and Technology
B.S., High energy physics · Dolgoprudny, Russian Federation
Certifications & licenses
AWS Certified Data Engineer - Associate