ML for molecules and materials in the era of LLMs [ML4Molecules]


IMPORTANT
2024-09-18: Registration for the workshop is now open. Please register and join us!

About

Machine learning models for molecules and materials are essential drivers in drug discovery, materials design, environmental science, and precision medicine. These models enable experts to interact meaningfully with molecular systems and to design them. This interaction allows for deeper insights into typically complex systems and facilitates the development of new drugs, personalized medical treatments, and innovative materials - e.g., advanced fuel cells in the field of energy conversion and storage or materials for environmental remediation in the field of environmental science. Recently, Large Language Models (LLMs) have demonstrated remarkable performance across various domains in machine learning, revolutionizing natural language processing tasks and showing notable capabilities in understanding and generating human-like text. LLMs have also had a profound impact on molecular machine learning because they allow for human-AI interaction, interpretability, zero- and few-shot adaptation, and in-context learning. However, the application of LLM architectures to molecular data presents significant challenges. Therefore, this workshop focuses on exploring the potential of ML methods with broader capabilities of the kind that LLMs incorporate. We welcome all work that might provide steps towards these broader capabilities, e.g., by tackling the areas of knowledge and interaction, adaptability and robustness, abstraction and reasoning, or efficiency.

Join us at our workshop, where experts from diverse fields, ranging from ML and LLMs to molecular sciences, will collaborate to explore the new potential of ML methods for molecules and materials in the era of LLMs.

Registration

The workshop will be open to everyone without a registration fee. You can register here!

Schedule

Fri, Dec 6th, 9:00 am - 6:00 pm CET; venue: Fritz Haber Institute of the Max Planck Society, Berlin; online via Zoom

CET	Event	Speakers	Title
08:00 - 09:00	Registration
09:00 - 09:30	Invited Talk	Rocio Mercado	Generative AI for Molecular Design: From Drugs to Sustainable Materials
09:30 - 10:00	Invited Talk	Daniel Probst	Of graphs, sets, and molecules
10:00 - 10:15	Contributed Talk	Nawaf Alampara	Task Alignment Outweighs Framework Choice in Scientific LLM Agents
10:15 - 10:30	Contributed Talk	Riccardo Tedoldi	WEISS: Wasserstein efficient sampling strategy for LLMs in drug design
10:30 - 11:00	Coffee Break
11:00 - 11:30	Invited Talk	Marwin Segler	Deep Learning for Molecules: The First Decade
11:30 - 11:45	Contributed Talk	Nikhil Branson	AntiDIF: Accurate and Diverse Antibody Specific Inverse Folding with Discrete Diffusion
11:45 - 12:00	Contributed Talk	Yujia Guo	Bridging Data-Driven and Expert Knowledge for Interpretable Evaluation of Synthetic Routes
12:00 - 12:30	Panel Discussion	Günter Klambauer, José Miguel Hernández-Lobato, TBD
12:30 - 13:30	Lunch
13:30 - 14:00	Invited Talk	Nadine Schneider	Applying AI/ML to Accelerate the DMTA Cycle in Drug Discovery
14:00 - 14:15	Contributed Talk	Julian Cremer	FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D ligand generation and affinity prediction
14:15 - 14:30	Contributed Talk	Rasmus Hannibal Tirsgaard	Semi-Supervised Learning for Molecular Graphs via Ensemble Consensus
14:30 - 14:45	Contributed Talk	Xuan Vu Nguyen	Synthelite: Chemist-aligned and feasibility-aware synthesis planning with LLMs
14:45 - 15:00	Contributed Talk	Marcel Hiltscher	Explaining What Matters: Faithfulness in Molecular Deep Learning
15:00 - 15:30	Coffee Break
15:30 - 16:00	ELLIS UnConference Welcome
16:00 - 16:30	Retreat of Program Fellows
16:00 - 18:00	Poster session
18:00 - 20:00	Reception

Keynote Speakers

Accepted contributions (poster)

ID	Title	Authors
1	DEQuify your force field: More efficient simulations using deep equilibrium models	Andreas Burger, Luca Thiede, Alan Aspuru-Guzik, Nandita Vijaykumar
2	ML Force Fields for Computational NMR Spectra of Dynamic Materials across Time-Scales	Lars Leon Schaaf, Benjamin J. Rhodes, Mary E. Zick, Suzi M. Pugh, Jordon S. Hilliard, Shivani Sharma, Casey R. Wade, Phillip J. Milner, Gabor Csanyi, Alexander C. Forse
3	DrugDiff - small molecule diffusion model with flexible guidance towards molecular properties	Marie Oestreich, Matthias Becker
4	Generative Model for Synthesizing Ionizable Lipids: A Monte Carlo Tree Search Approach	Jingyi Zhao, Yuxuan Ou, Austin Tripp, Morteza Rasoulianboroujeni, José Miguel Hernández-Lobato
5	In silico enzyme prediction and generation by fine tuning protein language models	Marco Nicolini, Emanuele Saitto, Ruben Jimenez, Emanuele Cavalleri, Aldo Galeano, Dario Malchiodi, Alberto Paccanaro, Peter N Robinson, Elena Casiraghi, Giorgio Valentini
6	A language model assistant for biocatalysis	Yves Gaetan Nana Teukam, Francesca Grisoni, Matteo Manica
7	Tango*: Constrained synthesis planning using chemically informed value functions	Daniel P Armstrong, Zlatko Jončev, Jeff Guo, Philippe Schwaller
8	A Generative Model for the Design of Synthesizable Ionizable Lipids	Yuxuan Ou, Jingyi Zhao, Austin Tripp, Morteza Rasoulianboroujeni, José Miguel Hernández-Lobato
9	The Jungle of Generative Drug Discovery: Traps, Treasures, and Ways Out	Rıza Özçelik, Francesca Grisoni
10	BNEM: A Boltzmann Sampler Based on Bootstrapped Noised Energy Matching	RuiKang OuYang, Bo Qiang, José Miguel Hernández-Lobato
11	Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models	Fengzhe Zhang, Jiajun He, Laurence Illing Midgley, Javier Antoran, José Miguel Hernández-Lobato
12	UPT++: Latent Point Set Neural Operators for Modeling System State Transitions	Andreas Fürst, Florian Sestak, Artur Petrov Toshev, Benedikt Alkin, Nikolaus A. Adams, Andreas Mayr, Günter Klambauer, Johannes Brandstetter
13	Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences	Niklas Schmidinger, Lisa Schneckenreiter, Philipp Seidl, Johannes Schimunek, Pieter-Jan Hoedt, Johannes Brandstetter, Andreas Mayr, Sohvi Luukkonen, Sepp Hochreiter, Günter Klambauer
14	Enhancing Molecular Property Prediction with GNNs via Architecture-Agnostic Graph Transformations	Zhifei Li, Gerrit Großmann, Verena Wolf
15	Fantastic SMILES Augmentation Strategies and Where to Find Them	Helena Brinkmann, Francesca Grisoni, Antoine Argante, Hugo ter Steege
16	Rectified Flow For Structure Based Drug Design	Daiheng Zhang, Chengyue Gong, Qiang Liu
17	Equivariant conditional diffusion model for exploring the chemical space around Vaska’s complex	François R J Cornet, Pratham Deshmukh, Bardi Benediktsson, Mikkel N. Schmidt, Arghya Bhowmik
18	Sample Efficient Goal-Directed Catalyst Design for the Morita-Baylis-Hillman Reaction	Sarina Kopf, Jeff Guo, Jaime Martin, Johannes Schoergenhumer, Cristina Nevado, Philippe Schwaller
19	Steering generative deep learning with task negation for de novo drug design	Sarah de Ruiter, Rıza Özçelik, Francesca Grisoni
20	MØDRCN – Open-Source Chemical Reservoir Computing Tool	Mehmet Aziz Yirik, Jakob Lykke Andersen, Rolf Fagerberg, Daniel Merkle
21	Enhancing Generalizability in Permeability Prediction for Macrocyclic Peptides	Anna Borisova, Rebecca Manuela Neeser, Philippe Schwaller
22	An exploration of dataset importance in single-step retrosynthesis prediction	Sara Tanovic, Fernanda Duarte
23	Towards transparent ML for kinase drug discovery: Insights from structural binding models	Joschka Groß, Michael Backenköhler, Paula Linh Kramer, Verena Wolf, Andrea Volkamer
24	Autonomous Atomistic Simulations with Hierarchical LLM Agents	Ziqi Wang, Hongshuo Huang, Hancheng Zhao, Changwen Xu, Jan Janssen, Venkatasubramanian Viswanathan
25	Protein Language Model-Driven Zoonotic Risk Assessment of SARS-CoV-2 Variants	Payel Das, Alessandra Toniato, Aurelie Lozano, Vijil Chenthamarakshan
26	RxnRule: Filtering single step reaction predictions based on reaction class incompatibilities	Zlatko Jončev, Daniel P Armstrong, Victor Sabanza Gil, Ziad El Malki, Philippe Schwaller
27	Physics-Aware Diffusion Models for Micro-structure Material Design	Jacob K Christopher, Stephen Baek, Ferdinando Fioretto

Important dates

November 1, 2024: Deadline for submission
Mid-November 2024: Author notification
December 6, 2024: Workshop

Call for papers

We are calling for papers exploring machine learning for molecules and materials in the era of LLMs. Topics include (but are not limited to):

Molecular Dynamics and Machine Learning
Generative Models for Molecule and Material Design
Molecular Property Prediction
Inverse Design and Optimization in Materials Science
Molecular Simulation Enhancement with ML
Representation Learning for Molecules and Materials
Machine learning methods for chemical reactions and synthesis
Uncertainty Quantification and Robustness in Molecular ML Models
Active Learning and Data-Efficient ML for Molecular Science
Language models in chemistry

Please submit your contributions on OpenReview by November 1, 2024, 11:59 PM UTC. Submissions should be in PDF format and follow the NeurIPS template, with a maximum of 5 pages (not including references and appendices). We also welcome extended abstract submissions (2 pages). Please anonymize your paper, since the review process is dual-anonymous. Completing the NeurIPS checklist for the paper is not compulsory.

Note that workshop papers often represent the current status of ongoing projects. They should not be considered final versions of record for a particular project. The workshop is non-archival.

Best Paper Award

The best paper award (1000€) is sponsored by AstraZeneca.

Two papers share the best paper award:

DEQuify your force field: More efficient simulations using deep equilibrium models. Andreas Burger et al.
Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences. Niklas Schmidinger et al.

Award committee: Andrea Volkamer, Philippe Schwaller, Cecilia Clementi

Organizing Committee and Contact

Chairs: Johannes Margraf, Francesca Grisoni, Günter Klambauer

Organizing committee: Alaa Bessadok, Sohvi Luukkonen, Johannes Schimunek, Karsten Reuter, Giulia Glorani, Francesca Grisoni, Johannes Margraf, Günter Klambauer

AstraZeneca has provided a sponsorship grant towards this independent Programme.

Contact: ml4molecules@ml.jku.at