David Silver
Researcher (AlphaGo Zero)
Experience
Researcher (AlphaGo Zero)
DeepMind
Developed a reinforcement learning algorithm that achieves superhuman proficiency in the game of Go starting from random play without human data, guidance, or domain knowledge beyond basic rules.
Won 100-0 against AlphaGo Lee, the version of AlphaGo that defeated Lee Sedol.
Replaced separate policy and value networks with a single neural network architecture consisting of many residual blocks of convolutional layers with batch normalization and rectifier non-linearities.
Implemented a simplified tree search that relies on a single neural network for position evaluation and move sampling without performing Monte-Carlo rollouts.
Incorporated lookahead search inside the training loop to achieve rapid improvement and stable learning.
Rediscovered fundamental elements of human Go knowledge, including joseki (corner sequences), fuseki (opening strategies), and life-and-death concepts from first principles.
Optimized the system to run on a single machine with 4 Tensor Processing Units (TPUs) in the Google Cloud.
Researcher (AlphaGo Master)
DeepMind
Defeated the strongest human professional players 60-0 in online games in January 2017.
Used the same neural network architecture and reinforcement learning algorithm as AlphaGo Zero.
Integrated handcrafted features and rollouts into the search algorithm.
Initialized training through supervised learning from human expert data.
Researcher (AlphaGo Lee)
DeepMind
Defeated Lee Sedol, the winner of 18 international titles, in March 2016.
Implemented a distributed system over multiple machines utilizing 48 Tensor Processing Units (TPUs) to evaluate neural networks during search.
Trained the value network using outcomes from fast games of self-play by AlphaGo.
Employed larger policy and value networks than earlier versions, featuring 12 convolutional layers.
Researcher (AlphaGo Fan)
DeepMind
Defeated the European champion Fan Hui in October 2015.
Developed and utilized two deep neural networks: a policy network to output move probabilities and a value network to output position evaluations.
Trained the policy network initially by supervised learning to predict human expert moves and subsequently refined it using policy-gradient reinforcement learning.
Trained the value network to predict the winner of games played by the policy network against itself.
Combined the neural networks with Monte-Carlo tree search (MCTS) to provide lookahead search.
Researcher
Google Inc.
Ioannis Antonoglou: Published extensive research in AI and machine learning, contributing to 36 publications and accumulating over 10,375 citations.
Yutian Chen: Published advanced research in machine learning, contributing to 24 publications and accumulating over 77,831 citations.
Researcher
Microsoft
Thore Graepel: Published high-impact research in the field of Artificial Intelligence, contributing to 206 publications with over 33,474 citations.