Active Learning for Billion Scale Data

This is a hybrid event. Please find the Teams link in the Abstract.

Bio: Talfan Evans is a research scientist at Google DeepMind. His work focuses on developing scalable data curation strategies for compute-efficient large-scale pretraining. He has an MEng from Keble College and did his PhD at UCL in Cognitive Neuroscience, where he worked on adapting message-passing algorithms from the autonomous driving literature to explain neural activity during spatial exploration. As a postdoc with Andrew Davison at Imperial, he worked on real-time computer vision systems before moving to DeepMind.

Blurb: Scaling laws for large foundation models tell us that continued additive improvements in performance require orders of magnitude more compute and data. In this talk, I’ll present work that paints a more optimistic picture: actively choosing which data to train on can shift these curves in our favour, producing significantly more performant models for the same compute budget.

Teams link: teams.microsoft.com/l/meetup-join/19%3ameeting_M2UyYTVlYzEtNTc5NS00Yjc1LWJjNDItODM0MjgyMjg1ODlj%40thread.v2/0?context=%7b%22Tid%22%3a%22cc95de1b-97f5-4f93-b4ba-fe68b852cf91%22%2c%22Oid%22%3a%22e44820d7-5edb-4030-9763-4c8cdc3aafd6%22%7d

Date: 7 November 2024, 17:00
Venue:
Wolfson College
Linton Road OX2 6UD

Details: Levett Room
Speaker: Dr Talfan Evans (DeepMind)
Organising department: Wolfson College
Organisers: Dr Csaba Botos, Dr Yi Yin (Wolfson College, University of Oxford)
Organiser contact email address: yi.yin@wolfson.ox.ac.uk
Part of: Oxford Cross-Disciplinary Machine Learning (OxfordXML) Research Cluster Seminar Series
Booking required?: Not required
Booking url: https://oxfordxml.github.io
Cost: Free (tea/coffee/cake will be provided)
Audience: Public
Editor: Yi Yin