Machine learning for molecules has recently achieved major successes that have had, and will continue to have, a tremendous impact on how research is done (Chen, 2018), for example the accurate prediction of protein 3D structure (Jumper, 2021; Thornton, 2021), the discovery of novel antibiotics (Stokes, 2020; Das, 2021), and chemical synthesis planning (Segler, 2018). Nevertheless, major challenges remain, and current methods, which are mostly based on deep learning, have drawn substantial criticism (Marcus, 2018). Deep learning methods are data hungry, have limited knowledge transfer capabilities, do not adapt quickly to changing tasks or distributions, insufficiently incorporate world and prior knowledge, and cannot inherently distinguish causation from correlation (Bengio, 2021; Chollet, 2019; Marcus, 2018; Schölkopf, 2019). Furthermore, current models are usually not composable, in the sense that sub-components or modules cannot be recombined in new ways. Given these characteristics, the machine learning systems currently employed for molecules are narrow AIs (Chollet, 2019; Hochreiter, 2022): each is focused on a specific application, such as predicting a particular property or generating a certain type of molecule. These drawbacks hold in particular for molecular machine learning, in areas such as activity and property prediction (Neves, 2018), generative modeling (Yang, 2017; Bender, 2021), chemical reactivity and synthesis (Segler, 2018; Seidl, 2022), and molecular modeling and representation learning (Klambauer, 2021).

Therefore, this workshop focuses on exposing the current limitations of machine learning methods for molecules by critically assessing them, either theoretically or in applied and industrial settings. Contributed methods can focus on architectures that are robust against domain shifts, such as new biotechnologies or new types of molecules. They can also focus on adapting quickly to newly acquired data from potentially expensive biotechnologies, concretely through few- and zero-shot learning. A further theme of the workshop is methods that lead to new levels of abstraction in molecule representations and thereby enable broader generalization. Physics-based machine learning methods that improve molecular dynamics simulations or force fields are a potential step in this direction. Advancing machine learning for molecules also means that new systems should be able to interact with humans and support knowledge transfer between human and system, which is covered by the workshop theme on interpretability and explainability. Finally, the workshop includes considerations and methodologies that allow for modularity or compositionality of architectures for molecular machine learning.

Confirmed speakers

Organizing Committee and Contact

Chairs: Jennifer Wei, Nadine Schneider, Günter Klambauer, Marwin Segler, and Jose M. H. Lobato
Contact:


Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., … & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589.
Thornton, J. M., Laskowski, R. A., & Borkakoti, N. (2021). AlphaFold heralds a data-driven revolution in biology and medicine. Nature Medicine, 27(10), 1666-1669.
Stokes, J. M., Yang, K., Swanson, K., Jin, W., Cubillos-Ruiz, A., Donghia, N. M., … & Collins, J. J. (2020). A deep learning approach to antibiotic discovery. Cell, 180(4), 688-702.
Das, P., Sercu, T., Wadhawan, K., et al. (2021). Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations. Nature Biomedical Engineering, 5, 613-623.
Segler, M. H., Preuss, M., & Waller, M. P. (2018). Planning chemical syntheses with deep neural networks and symbolic AI. Nature, 555(7698), 604-610.
Neves, B. J., Braga, R. C., Melo-Filho, C. C., Moreira-Filho, J. T., Muratov, E. N., & Andrade, C. H. (2018). QSAR-based virtual screening: advances and applications in drug discovery. Frontiers in pharmacology, 9, 1275.
Bender, A., & Cortés-Ciriano, I. (2021). Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 1: ways to make an impact, and why we are not there yet. Drug discovery today, 26(2), 511-524.
Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., & Blaschke, T. (2018). The rise of deep learning in drug discovery. Drug discovery today, 23(6), 1241-1250.
Marcus, G. (2018). Deep learning: A critical appraisal. arXiv preprint arXiv:1801.00631.
Bengio, Y., LeCun, Y., & Hinton, G. (2021). Deep learning for AI. Communications of the ACM, 64(7), 58-65.
Chollet, F. (2019). On the measure of intelligence. arXiv preprint arXiv:1911.01547.
Schölkopf, B. (2019). Causality for machine learning. arXiv preprint arXiv:1911.10500.
Hochreiter, S. (2022). Toward a broad AI. Communications of the ACM, 65(4), 56-57.
Seidl, P., Renz, P., Dyubankova, N., Neves, P., Verhoeven, J., Wegner, J. K., … & Klambauer, G. (2022). Improving Few- and Zero-Shot Reaction Template Prediction Using Modern Hopfield Networks. Journal of chemical information and modeling.
Yang, X., Zhang, J., Yoshizoe, K., Terayama, K., & Tsuda, K. (2017). ChemTS: an efficient python library for de novo molecular generation. Science and technology of advanced materials, 18(1), 972-976.
Klambauer, G. (2021). Moving beyond narrow AIs in Drug Discovery – a perspective. ELLIS ML4Molecules workshop, Dec 13, 2021. Virtual.