AI-driven whole-portfolio optimization and buy-side research summarization.
Explored the potential of novel GPT language models in conjunction with graph databases, including fine-tuning of GPT prompts and a PoC on automated, AI-driven data linkage (an illustrative API sketch follows the technologies list below).
Details of the project are still confidential.
First product, relating to AI-assisted career guidance, released on April 26, 2023.
Technologies: Research, Jupyter Notebook, Python, Pandas, NumPy, Scikit-learn, FastAPI, Flask, Django, Java, Jena, Git, GitHub, CI/CD, Jira, TDD, DevOps, Terraform, Docker, Azure Cloud, Azure App, Azure OpenAI, DBT, API, sFTP, YARRRML, RMLMapper, GraphDB, OntoRefine, SQL, SPARQL, Graph Database, OWL, RDF, Ontologies, GPT-3, GPT-3.5-turbo, GPT-4 (including programmatic interaction with the APIs exposed by OpenAI and Azure).
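A minimal sketch of the kind of programmatic GPT interaction mentioned above, using the openai Python client; the model name, prompt, and function are illustrative assumptions, not the project's actual code.

```python
# Illustrative sketch of programmatic GPT interaction for research
# summarization. Model name and prompts are hypothetical examples.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_research(note: str) -> str:
    """Ask a GPT model for a short buy-side summary of a research note."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Summarize buy-side research notes in three bullet points."},
            {"role": "user", "content": note},
        ],
        temperature=0.2,
    )
    return response.choices[0].message.content
```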
Redesigned, rebuilt, and migrated the semantic engine supporting the metadata for a number of data sources from the incumbent third-party tool to an in-house replacement
The project replaced the existing implementation of the semantic data hub. It required designing a product able to handle the volume of metadata collected across multiple divisions, developing a proof of concept (PoC), confirming the PoC with the stakeholders, and delivering a fully fledged implementation suitable for productionization. The solution consisted of a set of extractors based on Meltano, plus custom API connectors and ingestors written in Python, to harvest the metadata from different sources. The metadata was staged in Postgres and cleansed using DBT transformations. The cleansed metadata was then mapped to the internal ontologies using RMLMapper, transformed into triples and N-Quads, and loaded into AllegroGraph.
Once in AllegroGraph, SPARQL queries were used to augment data across different graphs and to extract knowledge from the bulk of information (an illustrative query follows the technologies list below). The solution is designed to be deployed to AWS using a combination of native services (Airflow, S3, RDS Postgres, EKS) and containers (AllegroGraph, custom transformers, extractors, loaders, Meltano, DBT, RMLMapper). The final workload yielded datasets in excess of 50 million triples.
Technologies: Stakeholder engagement, YARRRML, RMLMapper, AllegroGraph, SQL, SPARQL, Graph Database, OWL, RDF, Ontologies, Protégé, Meltano, DBT, Postgres, Snowflake, Snowpipe, Matillion, Python, FastAPI, Flask, Django, Git, GitHub, GitHub Actions, CI/CD, Jira, TDD, DevOps, AWS, Terraform, Docker, Docker Compose, Airflow, RDS Postgres, API, sFTP.
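A minimal sketch of the kind of cross-graph SPARQL augmentation described above, run via SPARQLWrapper against an AllegroGraph repository endpoint; the endpoint URL, named graphs, and predicates are illustrative assumptions.

```python
# Illustrative cross-graph SPARQL query joining two named graphs.
# Endpoint, graph URIs, and predicates are hypothetical examples.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:10035/repositories/metadata-hub")
sparql.setQuery("""
    PREFIX ex: <http://example.org/meta#>
    SELECT ?dataset ?owner ?division
    WHERE {
      GRAPH <http://example.org/graphs/catalog>  { ?dataset ex:owner ?owner . }
      GRAPH <http://example.org/graphs/orgchart> { ?owner ex:division ?division . }
    }
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["dataset"]["value"], row["division"]["value"])
```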
Onboarded internal and external datasets to support Customer Service and Marketing
Projects Supported:
Technologies: Stakeholder engagement, Python, Pandas, SciPy, FastAPI, Flask, Django, Git, GitHub, Jenkins, Jira, ClickMe, CI/CD, TDD, DevOps, Terraform, Docker, Fivetran, BigQuery, Snowflake, Composer (Airflow), GCP, DBT, API, sFTP, Vertex AI.
Translated Data Science models (R, Jupyter Notebook, MATLAB) into production-ready applications in the Azure Cloud and on an on-prem Hadoop/Spark cluster (an illustrative PySpark pattern follows the technologies list below)
Project supported:
Technologies: Stakeholder engagement, Java (EE), Python, Pandas, NLTK, SciPy, NumPy, Hadoop, Hive, PySpark, FastAPI, Flask, Django, Git, GitHub, Jenkins, Jira, CI/CD, TDD, DevOps, Automated Testing, Load Testing, ETL, Pipelines, Data Preprocessing, Data Lake, Azure, AzureML, Kafka, Spark, SQL, PostgreSQL, Teradata, Refinitiv Point Connect, Bloomberg SAPI.
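A minimal sketch of one common pattern for this kind of translation, assuming a scikit-learn model trained in a notebook is reused inside a PySpark scoring job; the model file, column names, and paths are illustrative assumptions.

```python
# Illustrative pattern: wrap a notebook-trained model in a PySpark job.
# Model file, feature columns, and paths are hypothetical examples.
import joblib
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.appName("model-scoring").getOrCreate()
model = joblib.load("model.pkl")  # model trained offline in a notebook

@pandas_udf("double")
def score(feature_a: pd.Series, feature_b: pd.Series) -> pd.Series:
    """Apply the trained model to a batch of rows."""
    features = pd.concat([feature_a, feature_b], axis=1)
    return pd.Series(model.predict(features))

df = spark.read.parquet("/data/input")  # illustrative path
df.withColumn("prediction", score("feature_a", "feature_b")) \
  .write.parquet("/data/scored")        # illustrative path
```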
Delivered the core of the migration of Vodafone's Big Data platform to Google Cloud (Team of 15 – Fully Remote, UK, India)
The platform serves all European markets and handles several terabytes of data per day (with a rolling retention of about 2-3 petabytes).
Rebuilt the capabilities of the Core Data Engineering squad for the migration of the big data platform to Google Cloud after the impact of the IR35 reform, and delivered the migration under tight time and budget constraints with only minor delay despite the serious constraints posed by Covid-19 (an illustrative pipeline sketch follows the technologies list below).
Initial challenges: the team was impacted by IR35-related policy changes, and the project suffered loss of knowledge, delays, high technical debt, and missing documentation.
Benefits: the team was reinforced, technical debt was assessed and its impact mitigated, and a reduction in scope was agreed with the stakeholders to fit timelines and budget. The project was delivered with minor delay despite serious technical, budgetary, and environmental constraints. "Vodafone calls for transformative insights, Google Cloud answers" ([link])
Technologies: Stakeholder engagement, Java (EE), Scala, Python, PySpark, GitHub, Jenkins, Jira, CI/CD, TDD/BDD, DevOps, Test Automation, Load/Stress Testing, Cost Optimization, Google Cloud Platform (GCP) with multiple services including Dataflow (Apache Beam), Composer (Airflow), Dataproc, Cloud Storage, BigQuery, Bigtable, Spanner, Pub/Sub, internal microservice architecture based on Kubernetes, Docker, Terraform.
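A minimal sketch of the kind of Dataflow (Apache Beam) streaming pipeline involved in such a migration; the topic, table, schema, and parsing logic are illustrative assumptions, not the platform's actual jobs.

```python
# Illustrative streaming pipeline: Pub/Sub -> transform -> BigQuery.
# Topic, table, and schema are hypothetical examples.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadEvents" >> beam.io.ReadFromPubSub(topic="projects/demo/topics/events")
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            "demo:analytics.events",
            schema="user:STRING,event:STRING,ts:TIMESTAMP",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```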
Revamped the automated trade surveillance platform to meet the criteria set by the auditor (Team of 6 - co-located).
Initial challenges: a pending review from the regulator; a disconnect between stakeholders, compliance requirements, and developers; a legacy platform; a development team with a high attrition rate and the consequent loss of knowledge; partial documentation.
Benefits: passed the audit (a serious cost reduction) and provided meaningful alerts (a 67% spam reduction for downstream teams). The platform was consolidated and made extensible.
Asset Classes: FX spot/options, rates futures/bonds/swaps, repo, bespoke OTC.
Technologies: Stakeholder engagement, Java (EE), Python, Pandas, NLTK, SciPy, NumPy, PySpark, Dask, Bitbucket, Jenkins, Jira, CI/CD, TDD, DevOps, Risk Scenarios, Automated Testing, Load Testing.
Managed the start-up of the company from ground zero to the first viable product, with particular focus on e-commerce exposure and click-through-rate optimization.
Delivered "Project James". A reinforcement learning AI for direct marketing optimization.
News UK won a Google sponsored innovation grant aimed at delivering an advanced solution to real marketing problems. Attrition of the initial investigator created condition for reassigning the task. The intervention required assessment of the partially implemented project, baseline the approach, rebuild the reinforcement learning core using state of the art tools. Tune and deliver a production viable tool within the scheduled time frame.
Challenges: Time pressure for delivery. Partially implemented platform with partial documentation. Full research project with no previous case study to leverage for comparison.
Benefits: "JAMES has revolutionised churn further, and advisors informed by readers interests underpin an award winning contact centre" ([link]
Technologies: Python, Pandas, SciPy, NumPy, TensorFlow, Django, Flask, GitHub, Jenkins, Jira, GitOps, CI/CD, DevOps, Kubernetes, Docker, Terraform, Microservice Architecture.
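A minimal sketch of the kind of reinforcement-learning core such a marketing optimizer might use; this epsilon-greedy multi-armed bandit is an illustrative stand-in, not the project's actual algorithm, and the action names and reward signal are hypothetical.

```python
# Illustrative epsilon-greedy bandit for choosing a marketing action.
# Actions and rewards are hypothetical examples.
import random

class EpsilonGreedyBandit:
    def __init__(self, actions, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {a: 0 for a in actions}    # times each action was tried
        self.values = {a: 0.0 for a in actions}  # running mean reward

    def choose(self):
        """Explore with probability epsilon, otherwise exploit the best action."""
        if random.random() < self.epsilon:
            return random.choice(list(self.counts))
        return max(self.values, key=self.values.get)

    def update(self, action, reward):
        """Incrementally update the mean reward for the chosen action."""
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n

bandit = EpsilonGreedyBandit(["email", "push", "no_contact"])
action = bandit.choose()
bandit.update(action, reward=1.0)  # e.g. 1.0 if the customer converted
```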
Delivered the propensity model and API (Team of 5 - co-located).
The client wanted to improve the conversion rate on the digital platform and deliver a personalized user experience, so we piloted an online propensity model. The model follows each user of The Times Digital in real time and predicts the best opportunity for calls to action, e.g. subscriptions, cross-sell, up-sell.
Challenges: the model had to work at high throughput (1,000+ predictions/sec) and low latency (<250 ms maximum response time); a minimal serving sketch follows the technologies list below.
Benefits: it increased subscriptions and cross-sales by 5% and 9%, respectively, and piloted the deployment of high-throughput APIs in News UK's brand-new k8s cluster.
"Best Ever Growth for The Times & The Sunday Times Thanks to Usable Data Science" ([link])
Technologies: Stakeholder management, Python, Pandas, NLTK, SciPy, NumPy, API, Django, Nginx, Docker, Kubernetes (k8s), Terraform, Microservice Architecture, TensorFlow, GitHub, Jenkins, Jira, CI/CD, DevOps, New Relic.
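A minimal sketch of a low-latency prediction endpoint of the kind described; the project served Django behind Nginx, but Flask is used here for brevity, and the model file, feature names, and route are illustrative assumptions.

```python
# Illustrative low-latency propensity endpoint. The model is loaded once at
# startup so each request only pays for inference. Names are hypothetical.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("propensity.pkl")  # loaded once, not per request

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    features = [[payload["pages_read"], payload["days_active"]]]
    score = float(model.predict_proba(features)[0][1])
    return jsonify({"propensity": score})

if __name__ == "__main__":
    app.run()  # in production: gunicorn workers behind nginx on k8s
```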
Managed the delivery of the Cloud Logging and Monitoring Platform (Team of 20 across 3 sites).
In the framework of public cloud adoption, JPMC needed a standardized, large-scale logging and monitoring system to meet cyber-security requirements for all applications in the public cloud.
Davide joined the team after the PoC of the platform, reviewed the architecture and implementation, and then scaled the platform to handle 5 TB of data a day (approximately 5 billion messages, with peaks of 1.3 billion during the first hour of trading); an illustrative ingestion sketch follows the technologies list below.
Challenges: a very new project under hard constraints in terms of data protection, and thus limited availability of approved cloud services; very challenging requirements in terms of SLO/SLA, high availability, disaster recovery, and sustained recovery.
Benefits: the platform allowed an initial set of 5 mission-critical applications in the public cloud (AWS) to be monitored. It pioneered new technologies, produced a number of architectural patterns new to JPMC, and demonstrated its ability to scale up to a larger number of monitored applications at the push of a button.
Technologies: Leadership, AWS (API Gateway, Route 53, S3, DynamoDB, Kinesis, Elastic Beanstalk, Lambda, ELB, IAM, CloudWatch, CloudTrail, etc.), Boto, Terraform, Fluentd, Kafka, Kafka Streams (replaced by Kinesis after SOC 3), Kinesis Firehose, NiFi, Elasticsearch, Logstash, Kibana, Java (EE), Python, Bitbucket, Jenkins, Jira, CI/CD, TDD, BDD, DevOps, Hera (JPMC's Terraform-based API), Automated Testing, Load Testing, Microservice Architecture, Docker, Kubernetes (k8s), Datadog. L1 and L3 support during rollout and production, respectively.
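A minimal sketch of the kind of log-ingestion write path a Kinesis-backed platform like this implies, using boto3; the stream name and event shape are illustrative assumptions.

```python
# Illustrative batched write of log events to a Kinesis stream.
# Stream name and event shape are hypothetical examples.
import json
import boto3

kinesis = boto3.client("kinesis")

def ship_logs(events):
    """Send a batch of log events; partition by application for ordering."""
    records = [
        {"Data": json.dumps(e).encode("utf-8"), "PartitionKey": e["app"]}
        for e in events
    ]
    response = kinesis.put_records(StreamName="app-logs", Records=records)
    return response["FailedRecordCount"]  # caller retries failures

failed = ship_logs([{"app": "trading-ui", "level": "ERROR", "msg": "timeout"}])
```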
Set the basis for standardized regulatory reporting across all businesses (regulatory driven - Team of 4).
Due to regulatory change, the company was required to produce reporting aggregated across all lines of business (LoB). This required the standardization of thousands of terms used for reporting ('loan' has a different meaning in retail than in derivatives). We created controlled vocabularies, devised and automated the procedures for metadata management, and served the dictionaries and the reference data through REST APIs backed by a constellation of microservices (a small vocabulary sketch follows the technologies list below). We also promoted numerous educational interventions across the organization.
Challenges: high exposure to the regulators; a very large number of unlisted terms that needed attention; a serious need to mediate between high-ranking stakeholders (senior executives and managing directors).
Benefits: we mitigated the regulatory risk and provided tools to gain insight into the corporate dynamics.
Asset Classes: FX spot/options, rates futures/bonds/swaps, derivatives, OTC.
Technologies: Java (EE), Spring, Python, RDF, OWL, SPARQL, Semantic Web standards, Ontologies, Semantic Wiki, Knowledge Graphs, Graph Database, Neo4j, BigQuery (Blazegraph), ISO 20022, Bitbucket, Jenkins, Jira, CI/CD, TDD, BDD, DevOps, Docker, Microservices.
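A minimal sketch of the kind of controlled vocabulary described, modelled in SKOS with rdflib; the namespace and the two 'loan' senses are illustrative assumptions.

```python
# Illustrative SKOS controlled vocabulary distinguishing two senses of "loan".
# Namespace and concept URIs are hypothetical examples.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import SKOS

EX = Namespace("http://example.org/vocab/")
g = Graph()
g.bind("skos", SKOS)

retail_loan = EX["retail-loan"]
securities_loan = EX["securities-loan"]

g.add((retail_loan, SKOS.prefLabel, Literal("Loan", lang="en")))
g.add((retail_loan, SKOS.definition, Literal("Credit extended to a retail customer.")))
g.add((securities_loan, SKOS.prefLabel, Literal("Loan", lang="en")))
g.add((securities_loan, SKOS.definition, Literal("Temporary transfer of securities against collateral.")))
g.add((retail_loan, SKOS.related, securities_loan))

print(g.serialize(format="turtle"))
```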
Developed the meta-analytics of the Corporate and Investment Bank (CIB) division of the bank.
As part of the digital transformation initiative, JPMC aimed at labelling and scoring all data repositories and all software products owned by the line of business. We defined the data quality metrics and formal ontologies for the representation of logical data models (LDM), scanned the metadata of all databases to infer the physical data models (PDM), and linked the two through heuristics (an illustrative matching sketch follows the technologies list below). The results were manually refined by Information Architects.
Challenges: very broad collections of heterogeneous data; data quality was not always prime; some data stewards were only partially cooperative with the process.
Benefits: the semi-automated approach increased the productivity of the Information Architects by a factor of 4.7x.
Technologies: Java, Spring, Python, RDF, OWL, Semantic Web standards, Ontologies, Knowledge Graphs, Graph Database, BigQuery, ISO 11179, Bitbucket, Jenkins, Jira, CI/CD, TDD, DevOps.
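A minimal sketch of the kind of heuristic linkage described, matching physical column names scraped from database metadata against logical-model terms with difflib; the column names, terms, and threshold are illustrative assumptions.

```python
# Illustrative heuristic linkage of physical columns to logical model terms.
# Column names, terms, and the similarity threshold are hypothetical examples.
from difflib import SequenceMatcher

ldm_terms = ["customer name", "account balance", "trade date"]
pdm_columns = ["CUST_NM", "ACCT_BAL", "TRD_DT", "ROW_ID"]

def normalize(name: str) -> str:
    """Lower-case and expand underscores so abbreviations compare better."""
    return name.lower().replace("_", " ")

def best_match(column: str, terms: list[str], threshold: float = 0.5):
    """Return the most similar logical term, or None below the threshold."""
    scored = [(SequenceMatcher(None, normalize(column), t).ratio(), t) for t in terms]
    score, term = max(scored)
    return term if score >= threshold else None

for col in pdm_columns:
    print(col, "->", best_match(col, ldm_terms))
```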