Over the past two decades, the interdisciplinary predicTeam has established a prediction platform at Bayer Pharma R&D with the goal of generating state-of-the-art machine learning models for a variety of pharmacokinetic and physicochemical endpoints in early drug discovery. These tools are accessible to all scientists within the company and can assist with the selection and design of novel leads as well as with lead optimization. The predicTeam provides an all-inclusive package covering the data pipeline from experiment to application in projects. In close interaction with experimentalists, we select endpoints for model building that are relevant for drug discovery. We implement and maintain the infrastructure to retrieve and prepare the data and make it accessible as a data lake. For each endpoint, after fully exploring the matrix of data, molecule representations and algorithms, we implement the best-performing and most stable models in our internal research platform Pix. A highly automated infrastructure allows us to retrain the models weekly to ensure that the novel chemical space of drug discovery projects is well embedded. We maintain close interaction with the user base (e.g. presentations, tutorials, Teams channel) to support optimal model use and to gather direct feedback, allowing for constant improvement. Finally, our Model Performance Report helps users assess the applicability of each model to their specific project molecules.

about 2 years ago

I was at the Bayer DS&AI Poster - predicTeam