Critical assessment of molecular machine learning workshop [ML4Molecules]

About

Machine learning (ML) for molecules has recently achieved major successes, e.g. the accurate prediction of protein 3D structure (Jumper, 2021; Thornton, 2021), the discovery of novel antibiotics (Stokes, 2020; Das, 2021), and chemical synthesis planning (Segler, 2018). These successes make molecular machine learning one of the prime candidates to tackle the climate-, energy- and pandemic-related crises that we are facing. Nevertheless, major challenges remain, and substantial critique has been voiced of current methods, which are based mostly on deep learning (Marcus, 2018). Deep learning (DL) methods are data hungry, have limited knowledge transfer capabilities, do not quickly adapt to changing tasks or distributions, insufficiently incorporate world and prior knowledge, and cannot inherently distinguish causation from correlation (Bengio, 2021; Chollet, 2019; Marcus, 2018; Schölkopf, 2019). Furthermore, current models are usually not composable in the sense that sub-components or different modules can be combined in new ways. With these characteristics, the machine learning systems currently employed for molecules are narrow artificial intelligence (AI) systems (Chollet, 2019; Hochreiter, 2022). The above-mentioned drawbacks hold in particular for molecular machine learning tasks such as activity and property prediction, generative modeling (Yang, 2017; Bender, 2021; Fan, 2022), chemical reactivity and synthesis (Segler, 2018; Seidl, 2022), molecular modeling (Bereau, 2013), and representation learning.

Therefore, this workshop focuses on exposing the current limitations of machine learning methods for molecules by critically assessing them, either theoretically or in applied and industrial settings. Methods contributed to this workshop can focus on architectures that are robust against domain shifts, such as new biotechnologies or types of molecules. They can also focus on quickly adapting to data newly acquired with potentially expensive biotechnologies, concretely few- and zero-shot learning methods. A further theme of the workshop is methods that lead to new levels of abstraction in molecule representations, such that broader generalization capabilities are enabled. A potential step in this direction is machine learning methods that create relevant physical abstractions, e.g. for improving molecular dynamics simulations or force fields. Advancing machine learning for molecules also means that these new systems should be able to interact with humans and exchange knowledge with them, which is covered by the workshop theme on interpretability and explainability methods. The workshop also includes considerations and methodologies that allow for modularity or compositionality of architectures for molecular machine learning.

The workshop will be hosted by the ELLIS unit Cambridge and the ELLIS unit Linz as a side event to NeurIPS 2022 and held virtually via Zoom and GatherTown.

Registration

The workshop is open to everyone, with no registration fee. Please register here!

Schedule

CET	Event	Speakers	Title
09:00	Opening remarks	Marwin Segler
09:00	Invited talk	Koji Tsuda	Self-Learning Entropic Population Annealing for Interpretable Materials Design
09:30	Invited talk	Tao Qin	TBD
10:00	Contributed talk	Oliver Schilter	Using deep generative models for generating catalyst candidates for Suzuki cross coupling reactions
10:15	Contributed talk	Morgan Thomas	Re-evaluating sample efficiency in de novo molecule generation
10:30	Contributed talk	Puck Van Gerwen	Benchmarking families of descriptors for DFT-level predictions of chemical reaction properties
10:45	Contributed talk	Austin Tripp	Re-Evaluating Chemical Synthesis Planning Algorithms
11:00	Poster Session 1	Poster discussion at GatherTown
12:00	Invited talk	Christian Kramer	AI in Small Molecule Drug Design – A User Perspective
12:30	Invited talk	Andreas Bender	Modelling Molecules? Great! What to Consider to Really Impact Drug Discovery
13:00	Break
13:30	Invited talk	Francesca Grisoni	Molecular machine learning through the lens of activity cliffs
14:00	Invited talk	Janet Thornton	Protein Structure and Function Predictions
14:30	Contributed talk	Emma Svensson	Task-conditioned modeling of drug-target interactions
14:45	Contributed talk	Oscar Mendez-Lucio	MolE: a molecular foundation model for drug discovery
15:00	Contributed talk	Laurence Illing Midgley	Flow Annealed Importance Sampling Bootstrap
15:15	Contributed talk	Leo Klarner	GAUCHE: A Library for Gaussian Processes in Chemistry
15:30	Invited talk	Lucy Colwell	Biological sequence design with therapeutic applications
16:00	~~Invited talk~~	~~Payel Das~~	~~Advancing Molecular Design: Promise Evaluation of Foundation Models and Generative AI~~
16:00	Stand-in talk	Jose M. H. Lobato	Meta-learning Adaptive Deep Kernel Gaussian Processes for Molecular Property Prediction
16:30	Panel discussion	Moderator: Marwin Segler	Panelists: Derek Lowe, Ola Engkvist, Philippe Schwaller, Nadine Schneider, Jose M. H. Lobato
17:00	Poster Session 2	Poster discussion at GatherTown
18:00	Closing remarks

Speakers

	Janet Thornton, EBI Hinxton, UK. She is a pioneer in structural bioinformatics and led the European Bioinformatics Institute for 14 years. Her work strongly influenced the understanding of protein 3D structures and protein function. One of her most impactful works is on the stereochemical quality of protein structures.
	Tao Qin, Microsoft Research, China. He has contributed crucial work on adaptiveness, domain generalization, and robustness, with several applications to molecules.
	Andreas Bender, University of Cambridge, UK. He is a chemoinformatics pioneer and has recently critically analyzed AI for drug discovery.
	Lucy Colwell, Google Research, US. She focuses on gaining new insight and understanding from large bodies of data. In particular, she has worked on obtaining evolutionary constraints from sets of sequence homologs of a protein. Her work on generating, optimizing and annotating biological sequences has become widely known.
	Koji Tsuda, University of Tokyo, Japan. He has contributed some of the early works in ML for molecules, such as graph kernels; recently he contributed hybrid methods for chemical synthesis and molecule optimization.
	Francesca Grisoni, TU Eindhoven, Netherlands. She provided some pioneering work on the use of generative Deep Learning methods for molecules and she has also worked on exposing limitations of such methods.
	Payel Das, IBM NYC, US. She leads research projects related to AI for creativity and discovery, with inspirations from and applications in material science, chemistry, and biology.
	Christian Kramer, Roche, Switzerland. He is head of the computer-aided drug design group at Roche and will provide insights into the industry and real-world impact of Deep Learning methods in this area.

Accepted contributions (oral)

1	Task-conditioned modeling of drug-target interactions	Emma Svensson, Pieter-Jan Hoedt, Sepp Hochreiter, Günter Klambauer	[PDF]
2	Benchmarking families of descriptors for DFT-level predictions of chemical reaction properties	Puck Van Gerwen, Malte Franke, Clemence Corminboeuf	[PDF]
3	Re-Evaluating Chemical Synthesis Planning Algorithms	Austin Tripp, Krzysztof Maziarz, Sarah Lewis, Guoqing Liu, Marwin Segler	[PDF]
4	Using deep generative models for generating catalyst candidates for Suzuki cross coupling reactions	Oliver Schilter, Alain C. Vaucher, Federico Zipoli, Philippe Schwaller, Teodoro Laino	[PDF]
5	Flow Annealed Importance Sampling Bootstrap	Laurence Illing Midgley, Vincent Stimper, Gregor N. C. Simm, Bernhard Schölkopf, José Miguel Hernández-Lobato	[PDF]
6	MolE: a molecular foundation model for drug discovery	Oscar Mendez-Lucio, Christos A Nicolaou, Berton Earnshaw	[PDF]
7	GAUCHE: A Library for Gaussian Processes in Chemistry	Ryan-Rhys Griffiths, Leo Klarner, Henry Moss, Aditya Ravuri, Sang T. Truong, Bojana Ranković, Arian Rokkum Jamasb, Yuanqi Du, Julius Schwartz, Austin Tripp, Gregory Kell, Anthony Bourached, Alex Chan, Jacob Moss, Chengzhi Guo, Alpha Lee, Philippe Schwaller, Jian Tang	[PDF]
8	Re-evaluating sample efficiency in de novo molecule generation	Morgan Thomas, Noel O’Boyle, Andreas Bender, Chris de Graaf	[PDF]

Accepted contributions (poster)

9	A substructure-aware loss for feature attribution in drug discovery	Jose Jimenez-Luna, Kenza Amara, Raquel Rodriguez-Perez	[PDF]
10	Directional Variational Transformers for continuous molecular embedding	Tushar Gadhiya, Falak Shah, Nisarg Vyas, Vahe Gharakhanyan, Julia H. Yang, Alexander Holiday	[PDF]
11	Is Neural Chemical Reaction Prediction the Modern Alchemy?	Andrea Valenti, Davide Bacciu, Antonio Vergari	[PDF]
12	Data-driven Reaction Template Fingerprints	Anubhab Chakraborty, Amol Thakkar, Alain C. Vaucher, Teodoro Laino	[PDF]
13	Differential top-k learning for template-based single-step retrosynthesis	Andres M Bran, Philippe Schwaller	[PDF]
14	Context enrichment yields expressive molecular representations for few-shot drug discovery	Johannes Schimunek, Philipp Seidl, Lukas Friedrich, Daniel Kuhn, Friedrich Rippmann, Sepp Hochreiter, Günter Klambauer	[PDF]
15	Dynamic Molecular Graph-based Implementation for Biophysical Properties Prediction	Carter Knutson, Gihan Uthpala Panapitiya, Rohith Anand Varikoti, Neeraj Kumar	[PDF]
16	Are VAEs Bad at Reconstructing Molecular Graphs?	Hagen Münkler, Hubert Misztela, Michal Pikusa, Marwin Segler, Nadine Schneider, Krzysztof Maziarz	[PDF]
17	Bayesian optimisation for additive screening and yield improvements in chemical reactions - beyond one-hot encodings	Bojana Ranković, Ryan-Rhys Griffiths, Henry Moss, Philippe Schwaller	[PDF]
18	Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization	Wenhao Gao, Tianfan Fu, Jimeng Sun, Connor W. Coley	[PDF]
19	Digitization of Chemical Reactions Schemes	Mark Martori Lopez, Daniel Probst, Amol Thakkar, Teodoro Laino	[PDF]
20	Probing Graph Representations of Molecules	Mohammad Sadegh Akhondzadeh, Vijay Lingam, Aleksandar Bojchevski	[PDF]
21	Multi-Objective Drug Optimization by Markov Chain Monte Carlo	Agustin Kruel, Andrew D. McNaughton, Neeraj Kumar	[PDF]
22	Is GPT-3 all you need for machine learning for chemistry?	Kevin Maik Jablonka, Philippe Schwaller, Berend Smit	[PDF]

Important dates

October 18, 2022: submission deadline
Mid November, 2022: author notification
November 28, 2022: workshop

Call for papers

We are calling for papers advancing or critically assessing molecular machine learning. Topics include (but are not limited to):

Critical assessment of molecular machine learning
Benchmarking machine learning methods for molecules and new data sets or tasks
Zero- and few-shot learning methods for molecules, data-efficient learning
Domain shifts and out-of-distribution handling
Abstraction and improved representations of molecules
Interpretability and explainability methods for molecules
Multi-modal learning and modular architectures
Causality and physics-based machine learning for molecules

Please submit your contributions on OpenReview by Oct 18 2022 12:00AM AoE (abstract registration deadline) and Oct 20 2022 12:00AM AoE (paper deadline). Submissions should be in PDF format and follow the NeurIPS template, with a maximum of 4 pages (not including references and appendices). Please anonymize your paper, since the review process is dual-anonymous. To submit the camera-ready paper, please use this updated style file.

Organizing Committee and Contact

Chairs: Jennifer Wei, Nadine Schneider, Günter Klambauer, Marwin Segler, and Jose Miguel Hernandez Lobato
Contact: ml4molecules@ml.jku.at

Program Committee

Akshat Kumar Nigam, Alain C. Vaucher, Alexandros Kalousis, Andrea Volkamer, Andreas Mayr, Bharath Ramsundar, Bowen Jing, Brooks Paige, Cheng-Hao Liu, Daniel Stauso Wigh, Danilo Numeroso, Davide Bacciu, Fergus Imrie, Floriane Montanari, Gregor N. C. Simm, Hehuan Ma, Hiroshi Kajino, Hongyu Shen, Johannes Kirchmair, Kangway V Chuang, Ke Yu, Kobi Felton, Kristof T Schütt, Lagnajit Pattanaik, Lei Xie, Michele Ceriotti, Miguel Garcia Ortegon, Morgan Thomas, Nathan C. Frey, Ola Engkvist, Omar Rivasplata, Patricia Adriana Suriana, Rocío Mercado, Ryan-Rhys Griffiths, Simon Axelrod, Soha Hassoun, Sowmya Ramaswamy Krishnan, Teodoro Laino

References

Bender, A., & Cortés-Ciriano, I. (2021). Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 1: ways to make an impact, and why we are not there yet. Drug discovery today, 26(2), 511-524.
Bengio, Y., LeCun, Y., & Hinton, G. (2021). Deep learning for AI. Communications of the ACM, 64(7), 58-65.
Bereau, T., Kramer, C., & Meuwly, M. (2013). Leveraging symmetries of static atomic multipole electrostatics in molecular dynamics simulations. Journal of Chemical Theory and Computation, 9(12), 5450-5459.
Chen, H., Engkvist, O., Wang, Y., … & Blaschke, T. (2018). The rise of deep learning in drug discovery. Drug discovery today, 23(6), 1241-1250.
Chollet, F. (2019). On the measure of intelligence. arXiv preprint arXiv:1911.01547.
Das, P., Sercu, T., Wadhawan, K., Padhi, I., Gehrmann, S., Cipcigan, F., … & Mojsilovic, A. (2021). Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations. Nature Biomedical Engineering, 5(6), 613-623.
Fan, Y., Xia, Y., Zhu, J., Wu, L., Xie, S., & Qin, T. (2022). Back translation for molecule generation. Bioinformatics, 38(5), 1244-1251.
Hochreiter, S. (2022). Toward a broad AI. Communications of the ACM, 65(4), 56-57.
Jumper, J., Evans, R., Pritzel, A., … & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589.
Segler, M. H., Preuss, M., & Waller, M. P. (2018). Planning chemical syntheses with deep neural networks and symbolic AI. Nature, 555(7698), 604-610.
Klambauer, G. (2021). Moving beyond narrow AIs in Drug Discovery – a perspective. ELLIS ML4Molecules workshop Dec 13, 2021. Virtual, https://moleculediscovery.github.io/workshop2021/.
Marcus, G. (2018). Deep learning: A critical appraisal. arXiv preprint arXiv:1801.00631.
Schölkopf, B. (2019). Causality for machine learning. arXiv preprint arXiv:1911.10500.
Seidl, P., Renz, P., Dyubankova, N., Neves, P., … & Klambauer, G. (2022). Improving Few- and Zero-Shot Reaction Template Prediction Using Modern Hopfield Networks. Journal of chemical information and modeling.
Stokes, J. M., Yang, K., Swanson, K., Jin, W., … & Collins, J. J. (2020). A deep learning approach to antibiotic discovery. Cell, 180(4), 688-702.
Thornton, J. M., … & Borkakoti, N. (2021). AlphaFold heralds a data-driven revolution in biology and medicine. Nature Medicine, 27(10), 1666-1669.
Yang, X., Zhang, J., Yoshizoe, K., Terayama, K., & Tsuda, K. (2017). ChemTS: an efficient python library for de novo molecular generation. Science and technology of advanced materials, 18(1), 972-976.