Machine learning (ML) for molecules has recently achieved major successes, e.g. the accurate prediction of protein 3D structure (Jumper, 2021; Thornton, 2021), the discovery of novel antibiotics (Stokes, 2020; Das, 2021), and chemical synthesis planning (Segler, 2018). These successes make molecular machine learning one of the prime candidates to tackle the climate-, energy- and pandemic-related crises that we are facing. Nevertheless, major challenges remain, and substantial criticism has been voiced of current methods, which are based mostly on deep learning (Marcus, 2018). Deep learning (DL) methods are data hungry, have limited knowledge transfer capabilities, do not quickly adapt to changing tasks or distributions, insufficiently incorporate world and prior knowledge, and cannot inherently distinguish causation from correlation (Bengio, 2021; Chollet, 2019; Marcus, 2018; Schölkopf, 2019). Furthermore, current models are usually not composable, in the sense that sub-components or different modules cannot be combined in new ways. With these characteristics, the machine learning systems currently employed for molecules are instances of narrow artificial intelligence (AI) (Chollet, 2019; Hochreiter, 2022). The above-mentioned drawbacks hold in particular for molecular machine learning tasks such as activity and property prediction, generative modeling (Yang, 2017; Bender, 2021; Fan, 2022), chemical reactivity and synthesis (Segler, 2018; Seidl, 2022), molecular modeling (Bereau, 2013), and representation learning.

Therefore, this workshop focuses on exposing the current limitations of machine learning methods for molecules by critically assessing them, either theoretically or in applied and industrial settings. Methods contributed to this workshop can focus on architectures that are robust against domain shifts, such as new biotechnologies or types of molecules. They can also focus on quickly adapting to newly acquired data from potentially expensive biotechnologies, concretely few- and zero-shot learning methods. A further theme of the workshop is methods that lead to new levels of abstraction in molecule representations, enabling broader generalization capabilities. A potential step in this direction is machine learning methods that create relevant physical abstractions, e.g. for improving molecular dynamics simulations or force fields. Advancing machine learning for molecules also means that these new systems should be able to interact with humans and to transfer knowledge between humans and the system, which is covered by the workshop theme on interpretability and explainability methods. The workshop also includes considerations and methodologies that allow for modularity or compositionality of architectures for molecular machine learning.

The workshop will be hosted by the ELLIS unit Cambridge and ELLIS unit Linz as a side-event to NeurIPS 2022 and held in virtual mode via Zoom and GatherTown.


The workshop will be open to everyone, with no registration fee. Please register here!


| CET | Event | Speakers | Title |
|---|---|---|---|
| 09:00 | Opening remarks | Marwin Segler | |
| 09:00 | Invited talk | Koji Tsuda | Self-Learning Entropic Population Annealing for Interpretable Materials Design |
| 09:30 | Invited talk | Tao Qin | TBD |
| 10:00 | Contributed talk | Oliver Schilter | Using deep generative models for generating catalyst candidates for Suzuki cross coupling reactions |
| 10:15 | Contributed talk | Morgan Thomas | Re-evaluating sample efficiency in de novo molecule generation |
| 10:30 | Contributed talk | Puck Van Gerwen | Benchmarking families of descriptors for DFT-level predictions of chemical reaction properties |
| 10:45 | Contributed talk | Austin Tripp | Re-Evaluating Chemical Synthesis Planning Algorithms |
| 11:00 | Poster Session 1 | Poster discussion at GatherTown | |
| 12:00 | Invited talk | Christian Kramer | AI in Small Molecule Drug Design – A User Perspective |
| 12:30 | Invited talk | Andreas Bender | Modelling Molecules? Great! What to Consider to Really Impact Drug Discovery |
| 13:00 | Break | | |
| 13:30 | Invited talk | Francesca Grisoni | Molecular machine learning through the lens of activity cliffs |
| 14:00 | Invited talk | Janet Thornton | Protein Structure and Function Predictions |
| 14:30 | Contributed talk | Emma Svensson | Task-conditioned modeling of drug-target interactions |
| 14:45 | Contributed talk | Oscar Mendez-Lucio | MolE: a molecular foundation model for drug discovery |
| 15:00 | Contributed talk | Laurence Illing Midgley | Flow Annealed Importance Sampling Bootstrap |
| 15:15 | Contributed talk | Leo Klarner | GAUCHE: A Library for Gaussian Processes in Chemistry |
| 15:30 | Invited talk | Lucy Colwell | Biological sequence design with therapeutic applications |
| 16:00 | Invited talk | Payel Das | Advancing Molecular Design: Promise Evaluation of Foundation Models and Generative AI Talk |
| 16:00 | Stand-in talk | Jose M. H. Lobato | Meta-learning Adaptive Deep Kernel Gaussian Processes for Molecular Property Prediction |
| 16:30 | Panel discussion | Moderator: Marwin Segler. Panelists: Derek Lowe, Ola Engkvist, Philippe Schwaller, Nadine Schneider, Jose M. H. Lobato | |
| 17:00 | Poster Session 2 | Poster discussion at GatherTown | |
| 18:00 | Closing remarks | | |


Janet Thornton, EBI Hinxton, UK. She is a pioneer in structural bioinformatics and led the European Bioinformatics Institute for 14 years. Her work strongly influenced the understanding of protein 3D structures and protein function. One of her most impactful works is on the stereochemical quality of protein structures.
Tao Qin, Microsoft Research, China. He has contributed crucial work on adaptiveness, domain generalization, and robustness, with several applications to molecules.
Andreas Bender, University of Cambridge, UK. He is a chemoinformatics pioneer and has recently critically analyzed AI for drug discovery.
Lucy Colwell, Google Research, US. She focuses on gaining new insight and understanding from large bodies of data. In particular, she has worked on obtaining evolutionary constraints from a set of sequence homologs of a protein. Her work on generating, optimizing and annotating biological sequences has become widely known.
Koji Tsuda, University of Tokyo, Japan. He has contributed some of the early works in ML for molecules, such as graph kernels; recently he contributed hybrid methods for chemical synthesis and molecule optimization.
Francesca Grisoni, TU Eindhoven, Netherlands. She has done pioneering work on the use of generative deep learning methods for molecules and has also worked on exposing the limitations of such methods.
Payel Das, IBM NYC, US. She leads research projects related to AI for creativity and discovery, with inspirations from and applications in material science, chemistry, and biology.
Christian Kramer, Roche, Switzerland. He is head of the computer-aided drug design group at Roche and will provide insights into the industrial and real-world impact of deep learning methods in this area.

Accepted contributions (oral)

1. Task-conditioned modeling of drug-target interactions. Emma Svensson, Pieter-Jan Hoedt, Sepp Hochreiter, Günter Klambauer [PDF]
2. Benchmarking families of descriptors for DFT-level predictions of chemical reaction properties. Puck Van Gerwen, Malte Franke, Clemence Corminboeuf [PDF]
3. Re-Evaluating Chemical Synthesis Planning Algorithms. Austin Tripp, Krzysztof Maziarz, Sarah Lewis, Guoqing Liu, Marwin Segler [PDF]
4. Using deep generative models for generating catalyst candidates for Suzuki cross coupling reactions. Oliver Schilter, Alain C. Vaucher, Federico Zipoli, Philippe Schwaller, Teodoro Laino [PDF]
5. Flow Annealed Importance Sampling Bootstrap. Laurence Illing Midgley, Vincent Stimper, Gregor N. C. Simm, Bernhard Schölkopf, José Miguel Hernández-Lobato [PDF]
6. MolE: a molecular foundation model for drug discovery. Oscar Mendez-Lucio, Christos A Nicolaou, Berton Earnshaw [PDF]
7. GAUCHE: A Library for Gaussian Processes in Chemistry. Ryan-Rhys Griffiths, Leo Klarner, Henry Moss, Aditya Ravuri, Sang T. Truong, Bojana Ranković, Arian Rokkum Jamasb, Yuanqi Du, Julius Schwartz, Austin Tripp, Gregory Kell, Anthony Bourached, Alex Chan, Jacob Moss, Chengzhi Guo, Alpha Lee, Philippe Schwaller, Jian Tang [PDF]
8. Re-evaluating sample efficiency in de novo molecule generation. Morgan Thomas, Noel O’Boyle, Andreas Bender, Chris de Graaf [PDF]

Accepted contributions (poster)

9. A substructure-aware loss for feature attribution in drug discovery. Jose Jimenez-Luna, Kenza Amara, Raquel Rodriguez-Perez [PDF]
10. Directional Variational Transformers for continuous molecular embedding. Tushar Gadhiya, Falak Shah, Nisarg Vyas, Vahe Gharakhanyan, Julia H. Yang, Alexander Holiday [PDF]
11. Is Neural Chemical Reaction Prediction the Modern Alchemy? Andrea Valenti, Davide Bacciu, Antonio Vergari [PDF]
12. Data-driven Reaction Template Fingerprints. Anubhab Chakraborty, Amol Thakkar, Alain C. Vaucher, Teodoro Laino [PDF]
13. Differential top-k learning for template-based single-step retrosynthesis. Andres M Bran, Philippe Schwaller [PDF]
14. Context enrichment yields expressive molecular representations for few-shot drug discovery. Johannes Schimunek, Philipp Seidl, Lukas Friedrich, Daniel Kuhn, Friedrich Rippmann, Sepp Hochreiter, Günter Klambauer [PDF]
15. Dynamic Molecular Graph-based Implementation for Biophysical Properties Prediction. Carter Knutson, Gihan Uthpala Panapitiya, Rohith Anand Varikoti, Neeraj Kumar [PDF]
16. Are VAEs Bad at Reconstructing Molecular Graphs? Hagen Münkler, Hubert Misztela, Michal Pikusa, Marwin Segler, Nadine Schneider, Krzysztof Maziarz [PDF]
17. Bayesian optimisation for additive screening and yield improvements in chemical reactions - beyond one-hot encodings. Bojana Ranković, Ryan-Rhys Griffiths, Henry Moss, Philippe Schwaller [PDF]
18. Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization. Wenhao Gao, Tianfan Fu, Jimeng Sun, Connor W. Coley [PDF]
19. Digitization of Chemical Reactions Schemes. Mark Martori Lopez, Daniel Probst, Amol Thakkar, Teodoro Laino [PDF]
20. Probing Graph Representations of Molecules. Mohammad Sadegh Akhondzadeh, Vijay Lingam, Aleksandar Bojchevski [PDF]
21. Multi-Objective Drug Optimization by Markov Chain Monte Carlo. Agustin Kruel, Andrew D. McNaughton, Neeraj Kumar [PDF]
22. Is GPT-3 all you need for machine learning for chemistry? Kevin Maik Jablonka, Philippe Schwaller, Berend Smit [PDF]

Important dates

Call for papers

We are calling for papers advancing or critically assessing molecular machine learning. Topics include (but are not limited to):

Please submit your contributions on OpenReview by Oct 18 2022 12:00AM AoE (abstract registration deadline) and Oct 20 2022 12:00AM AoE (paper deadline). Submissions should be in PDF format and follow the NeurIPS template, with a maximum of 4 pages (not including references and appendices). Please anonymize your paper, since the review process is dual-anonymous. To submit the camera-ready paper, please use this updated style file.

Organizing Committee and Contact

Chairs: Jennifer Wei, Nadine Schneider, Günter Klambauer, Marwin Segler, and Jose Miguel Hernandez Lobato

Program Committee

Akshat Kumar Nigam, Alain C. Vaucher, Alexandros Kalousis, Andrea Volkamer, Andreas Mayr, Bharath Ramsundar, Bowen Jing, Brooks Paige, Cheng-Hao Liu, Daniel Stauso Wigh, Danilo Numeroso, Davide Bacciu, Fergus Imrie, Floriane Montanari, Gregor N. C. Simm, Hehuan Ma, Hiroshi Kajino, Hongyu Shen, Johannes Kirchmair, Kangway V Chuang, Ke Yu, Kobi Felton, Kristof T Schütt, Lagnajit Pattanaik, Lei Xie, Michele Ceriotti, Miguel Garcia Ortegon, Morgan Thomas, Nathan C. Frey, Ola Engkvist, Omar Rivasplata, Patricia Adriana Suriana, Rocío Mercado, Ryan-Rhys Griffiths, Simon Axelrod, Soha Hassoun, Sowmya Ramaswamy Krishnan, Teodoro Laino


Bender, A., & Cortés-Ciriano, I. (2021). Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 1: ways to make an impact, and why we are not there yet. Drug Discovery Today, 26(2), 511-524.
Bengio, Y., LeCun, Y., & Hinton, G. (2021). Deep learning for AI. Communications of the ACM, 64(7), 58-65.
Bereau, T., Kramer, C., & Meuwly, M. (2013). Leveraging symmetries of static atomic multipole electrostatics in molecular dynamics simulations. Journal of Chemical Theory and Computation, 9(12), 5450-5459.
Chen, H., Engkvist, O., Wang, Y., … & Blaschke, T. (2018). The rise of deep learning in drug discovery. Drug Discovery Today, 23(6), 1241-1250.
Chollet, F. (2019). On the measure of intelligence. arXiv preprint arXiv:1911.01547.
Das, P., Sercu, T., Wadhawan, K., Padhi, I., Gehrmann, S., Cipcigan, F., … & Mojsilovic, A. (2021). Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations. Nature Biomedical Engineering, 5(6), 613-623.
Fan, Y., Xia, Y., Zhu, J., Wu, L., Xie, S., & Qin, T. (2022). Back translation for molecule generation. Bioinformatics, 38(5), 1244-1251.
Hochreiter, S. (2022). Toward a broad AI. Communications of the ACM, 65(4), 56-57.
Jumper, J., Evans, R., Pritzel, A., … & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589.
Klambauer, G. (2021). Moving beyond narrow AIs in Drug Discovery – a perspective. ELLIS ML4Molecules workshop, Dec 13, 2021. Virtual.
Marcus, G. (2018). Deep learning: A critical appraisal. arXiv preprint arXiv:1801.00631.
Schölkopf, B. (2019). Causality for machine learning. arXiv preprint arXiv:1911.10500.
Segler, M. H., Preuss, M., & Waller, M. P. (2018). Planning chemical syntheses with deep neural networks and symbolic AI. Nature, 555(7698), 604-610.
Seidl, P., Renz, P., Dyubankova, N., Neves, P., … & Klambauer, G. (2022). Improving Few- and Zero-Shot Reaction Template Prediction Using Modern Hopfield Networks. Journal of Chemical Information and Modeling.
Stokes, J. M., Yang, K., Swanson, K., Jin, W., … & Collins, J. J. (2020). A deep learning approach to antibiotic discovery. Cell, 180(4), 688-702.
Thornton, J. M., … & Borkakoti, N. (2021). AlphaFold heralds a data-driven revolution in biology and medicine. Nature Medicine, 27(10), 1666-1669.
Yang, X., Zhang, J., Yoshizoe, K., Terayama, K., & Tsuda, K. (2017). ChemTS: an efficient python library for de novo molecular generation. Science and Technology of Advanced Materials, 18(1), 972-976.