Intelligence
91 AI tools for drug discovery, curated and rated by practitioners.
Google DeepMind
Predicts single-chain and multimer protein 3D structures from amino acid sequence using MSA-based deep learning. Set the modern benchmark on CASP14.
Steinegger Lab (Seoul National University)
Wraps AlphaFold2 with MMseqs2-based MSA generation, making AF2 runs 40-60x faster. Accessible via Google Colab or local install.
MIT Jameel Clinic
First fully open-source model achieving AlphaFold3-level accuracy for joint structure prediction of proteins, nucleic acids, and small molecules.
MIT + Recursion Pharmaceuticals
Extends Boltz-1 to jointly predict 3D complex structure AND protein-ligand binding affinity. Approaches FEP accuracy at 1,000x lower compute cost.
Chai Discovery
Multi-modal foundation model for joint structure prediction of proteins, small molecules, DNA, RNA, and glycosylations. Performs well in single-sequence mode.
ByteDance Research
Fully open-source PyTorch reproduction of AlphaFold3 architecture. Protenix-v1 (Feb 2026) reported to outperform AF3 across diverse benchmarks.
Meta AI (FAIR)
Single-sequence protein structure prediction using the ESM-2 protein language model (15B parameters). No MSA required — fast inference directly from sequence.
HeliXon Protein
Single-sequence structure prediction using a protein language model plus geometry-inspired transformer. First MSA-free method to approach AF2 accuracy.
Google DeepMind / Isomorphic Labs
Joint structure prediction of proteins, DNA, RNA, small molecules, ions, and covalent modifications in a single diffusion-based model.
OpenFold Consortium
Fully open-source AF3-architecture co-folding model. Full-stack release includes training data, weights, code, and evaluation scripts.
IntelliGen AI
Controllable open-source foundation model for biomolecular structure prediction. Claims to surpass AF3 on FoldBench for antibody-antigen co-folding.
MIT CSAIL (Corso et al.)
Diffusion-based generative model that treats docking as a generative problem over ligand poses. No pre-specified binding pocket needed.
Koes Lab (University of Pittsburgh)
AutoDock Vina-based docking engine augmented with a 3D CNN scoring function. Uses Vina for sampling, CNN for scoring and re-ranking.
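The sample-then-rescore pattern described here (a fast engine proposes poses, a more expensive function re-ranks them) is easy to sketch generically. A toy illustration with stand-in scoring functions, not gnina's actual code:

```python
def rerank_poses(poses, sampler_score, rescore, keep=10):
    """Two-stage protocol: keep the fast sampler's top poses,
    then re-rank them with a more expensive scoring function."""
    kept = sorted(poses, key=sampler_score)[:keep]
    return sorted(kept, key=rescore)

# Toy poses as (x, y) displacements from a "true" pose at the origin
poses = [(0.1, 0.2), (2.0, 0.0), (0.0, 0.05), (1.0, 1.0)]
fast = lambda p: abs(p[0]) + abs(p[1])               # crude sampler score
accurate = lambda p: (p[0] ** 2 + p[1] ** 2) ** 0.5  # rescorer stand-in
print(rerank_poses(poses, fast, accurate)[0])        # best pose after rescoring
```

The two-stage design pays off when the rescoring function is far more expensive than the sampler, as with a 3D CNN evaluated only on a handful of candidate poses.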
Morehead, Cheng Lab (University of Missouri)
Geometric flow matching model that maps apo protein structures to bound complexes for multiple ligands simultaneously. Outputs confidence scores and affinity estimates.
The Scripps Research Institute
Classical rigid receptor, flexible ligand docking using empirical and knowledge-based scoring. The most widely used open-source docking tool.
DP Technology
GPU-accelerated molecular docking achieving >2000x speedup over CPU Vina. Enables ultra-large virtual screening of billions of compounds.
Schrödinger
High-precision grid-based docking with hierarchical filtering. Glide WS (2025) explicitly models water molecules during docking.
Cambridge Crystallographic Data Centre (CCDC)
Genetic algorithm-based docking supporting full ligand flexibility and partial protein side-chain flexibility. Four scoring functions available.
Aspuru-Guzik Group
Active-learning framework using QSAR models to predict docking scores, enabling 50-100x acceleration of large-library virtual screening.
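The active-learning loop behind this kind of accelerated screening alternates between expensive docking on a small batch and cheap surrogate predictions on the remainder, discarding the predicted worst compounds each round. A toy sketch with a stand-in docking function and a 1-nearest-neighbour surrogate (illustrative only, not the framework's actual models):

```python
import random

def dock(mol):
    """Stand-in for an expensive docking call (toy score; lower is better)."""
    return (mol * 0.37) % 1.0  # deterministic pseudo-score

def surrogate_predict(docked, mol):
    """1-nearest-neighbour surrogate: score of the closest docked molecule."""
    nearest = min(docked, key=lambda m: abs(m - mol))
    return docked[nearest]

def active_screen(library, budget_per_round=50, rounds=3, keep_frac=0.2):
    random.seed(0)
    docked = {}
    pool = list(library)
    for _ in range(rounds):
        batch = random.sample(pool, min(budget_per_round, len(pool)))
        for mol in batch:
            docked[mol] = dock(mol)          # expensive step
            pool.remove(mol)
        preds = {m: surrogate_predict(docked, m) for m in pool}
        keep = sorted(pool, key=preds.get)[: int(len(pool) * keep_frac)]
        pool = keep                          # shrink the library each round
    return docked

hits = active_screen(range(2000))
print(len(hits))  # only a small fraction of the library was ever docked
```

The speedup comes from the shrinking pool: most of the library is only ever scored by the cheap surrogate.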
Greenstone Biosciences / Stanford
Predicts 41 ADMET endpoints using a Chemprop-RDKit GNN. Held the highest average rank on the TDC ADMET Leaderboard at time of publication.
SCBDD Group, Central South University
Comprehensive ADMET prediction platform covering 119 endpoints with uncertainty estimates. Trained on >400,000 curated entries.
Swiss Institute of Bioinformatics (SIB)
Predicts pharmacokinetics, drug-likeness, and medicinal chemistry friendliness. Known for the BOILED-Egg visualization and Bioavailability Radar.
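Drug-likeness filters of the kind reported here largely reduce to simple descriptor thresholds. A minimal sketch of Lipinski's rule of five (not SwissADME's actual implementation; descriptor values are assumed precomputed, e.g. by RDKit):

```python
def lipinski_violations(mw, logp, h_donors, h_acceptors):
    """Count rule-of-five violations from precomputed descriptors."""
    rules = [
        mw > 500,          # molecular weight over 500 Da
        logp > 5,          # octanol-water logP over 5
        h_donors > 5,      # more than 5 H-bond donors
        h_acceptors > 10,  # more than 10 H-bond acceptors
    ]
    return sum(rules)

def is_drug_like(mw, logp, h_donors, h_acceptors, max_violations=1):
    """Common convention: at most one violation is tolerated."""
    return lipinski_violations(mw, logp, h_donors, h_acceptors) <= max_violations

# Aspirin-like descriptors: MW 180.2, logP ~1.2, 1 donor, 4 acceptors
print(is_drug_like(180.2, 1.2, 1, 4))  # True
```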
BioSig Lab (University of Queensland)
Predicts 28 PK and toxicity properties using graph-based molecular signatures. Long-standing academic reference tool.
BioSig Lab (University of Queensland)
Deep learning successor to pkCSM. Predicts 73 endpoints using GNNs with molecular optimization and interpretability outputs.
Charité Berlin
Comprehensive toxicity prediction with 61 models across acute toxicity, organ toxicity, mutagenicity, carcinogenicity, and toxicological pathway activity.
MIT (Barzilay, Coley et al.)
Open-source D-MPNN library for molecular property prediction. The architecture underlying ADMET-AI and ADMETlab 3.0. Used in Halicin antibiotic discovery.
Simulations Plus
Commercial platform predicting 175+ ADMET properties with integrated PBPK (GastroPlus) and generative drug design (AIDD module).
Schrödinger
Predicts PK and physicochemical properties based on full 3D molecular structure. Part of the Schrödinger Drug Discovery Platform.
Baker Lab / IPD (University of Washington)
Diffusion-based generative model for de novo protein backbone design. Generates novel protein structures conditioned on binding targets, symmetry, or functional sites.
Baker Lab / IPD (University of Washington)
Successor to RFdiffusion using flow matching. Designs enzymes directly from active site geometry (theozyme) specifications.
Baker Lab / IPD (University of Washington)
Inverse folding model: generates amino acid sequences predicted to fold into a target 3D backbone structure. Standard component of all modern protein design pipelines.
Baker Lab / IPD (University of Washington)
Extension of ProteinMPNN that conditions sequence design on bound ligands, small molecules, metals, and nucleotides.
PocketFlow Team
Flow-based generative model that creates novel small molecule ligands for a target binding pocket. Generates hundreds of candidates in minutes.
AstraZeneca Molecular AI
RL + transformer platform for de novo small molecule design. Supports scaffold decoration, R-group replacement, linker design, and multi-parameter optimization.
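Multi-parameter optimization in platforms like this one collapses many per-objective scores into a single reward for the RL agent; a weighted geometric mean is one common aggregation. A sketch of that aggregation only (not REINVENT's actual scoring code):

```python
import math

def weighted_geometric_mean(scores, weights):
    """Aggregate per-objective scores in [0, 1] into one reward.
    A zero in any component zeroes the whole reward, which
    penalizes molecules that fail any single objective."""
    if any(s == 0 for s in scores):
        return 0.0
    total = sum(weights)
    log_sum = sum(w * math.log(s) for s, w in zip(scores, weights))
    return math.exp(log_sum / total)

# Hypothetical components: drug-likeness, predicted potency, synthesizability
scores = [0.8, 0.6, 0.9]
weights = [1.0, 2.0, 1.0]
print(round(weighted_geometric_mean(scores, weights), 3))
```

Unlike a weighted arithmetic mean, the geometric mean cannot be gamed by maximizing one objective while zeroing another.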
Valence Labs
GPT-style model trained on SAFE (fragment-based) molecular representation. Enables fragment-constrained design including scaffold decoration and linker generation.
Insilico Medicine
Commercial generative chemistry platform using an ensemble of deep learning architectures. Part of the Pharma.AI platform. Has molecules in Phase I/II clinical trials.
Generate:Biomedicines
Programmable generative model for protein and protein complex design using diffusion with conditioning on geometry, symmetry, and functional annotations.
EvolutionaryScale
Multimodal protein language model that simultaneously reasons over sequence, structure, and function. Can generate novel proteins by prompting with partial information.
Luo et al. (NeurIPS 2022)
Diffusion-based generative model that jointly designs antibody CDR sequences and 3D structures conditioned on antigen structure.
Baker Lab / IPD (University of Washington)
RFdiffusion fine-tuned for de novo antibody design. Generates VHHs, scFvs, and full antibodies targeting user-specified epitopes. Experimentally validated with cryo-EM.
OPIG (Oxford)
Antibody-specific inverse folding model fine-tuned from ESM-IF1. Designs sequences predicted to maintain structural fold given an antibody backbone.
OPIG (Oxford)
Suite for predicting 3D structures of antibodies (ABodyBuilder), nanobodies (NanoBodyBuilder2), and TCRs (TCRBuilder2) from sequence.
Johns Hopkins / Profluent Bio
Fast deep learning model for antibody structure prediction from sequence alone. Processes paired heavy/light chain inputs.
OPIG (Oxford)
Antibody-specific language model for sequence restoration, per-residue scoring, and embedding. Reduces germline bias from original AbLang.
Meta AI (FAIR)
Structure-conditioned inverse folding model: given a protein backbone, predicts sequences likely to fold into it. General-purpose (not antibody-specific).
EPFL / MIT (Pacesa, Ovchinnikov, Correia)
One-shot automated pipeline for de novo protein binder design. Backpropagates through AlphaFold2 to hallucinate binders. 10-100% experimental success rates.
OPIG (Oxford)
Sequence annotation tool for numbering antibody and TCR variable domains according to standard schemes (IMGT, Kabat, Chothia, AHo).
CERTH
AI-driven prediction of antibody-antigen binding sites (paratopes and epitopes) from structure.
GROMACS Consortium (KTH, Max Planck, et al.)
High-performance all-atom and coarse-grained MD engine. GROMACS 2026 added native NNP/MM support for hybrid ML-classical simulations.
Stanford / OpenMM Community
Python-first MD framework with native ML potential API (openmm-ml). Wraps MACE, NequIP, AceFF, and other ML force fields directly.
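Under the hood, MD engines like OpenMM and GROMACS advance positions and velocities with symplectic integrators. An illustrative velocity Verlet step on a 1-D harmonic oscillator (a toy system in plain Python, not OpenMM's API):

```python
def velocity_verlet(x, v, force, dt, mass=1.0, steps=1000):
    """Integrate Newton's equations with the velocity Verlet scheme."""
    f = force(x)
    for _ in range(steps):
        x += v * dt + 0.5 * (f / mass) * dt * dt  # position update
        f_new = force(x)                           # force at new position
        v += 0.5 * (f + f_new) / mass * dt         # velocity half-steps
        f = f_new
    return x, v

# Harmonic oscillator F = -k x with k = 1, so the period is 2*pi
k = 1.0
x, v = velocity_verlet(1.0, 0.0, lambda x: -k * x, dt=0.01, steps=628)
print(round(x, 2), round(v, 2))  # near (1.0, 0.0) after ~one period
```

The same update rule, applied per atom with forces from a classical or ML potential, is what an MD engine executes billions of times per run.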
AMBER Consortium (UCSF et al.)
MD suite with best-in-class GPU acceleration (pmemd.cuda) and strong force field ecosystem. Now includes NNP integration via DeePMD-GNN.
D.E. Shaw Research / Schrödinger
GPU-accelerated MD engine integrated with Schrödinger's platform. FEP+ is the industry gold standard for relative binding free energy predictions in lead optimization.
Open Free Energy Consortium (15+ pharma companies)
Open-source RBFE framework. 2025 benchmark across 1,700+ ligands from 15 pharma companies showed out-of-the-box accuracy approaching FEP+.
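Relative binding free energy tools like FEP+ and OpenFE never compute absolute binding energies; they close a thermodynamic cycle by alchemically transforming ligand A into B in both the bound and solvated states. The final bookkeeping is one subtraction (illustrative numbers, not benchmark data):

```python
def ddg_bind(dg_complex_a_to_b, dg_solvent_a_to_b):
    """Relative binding free energy from a thermodynamic cycle:
    ddG(A->B) = dG(A->B in complex) - dG(A->B in solvent)."""
    return dg_complex_a_to_b - dg_solvent_a_to_b

# Hypothetical alchemical transformation results, in kcal/mol
ddg = ddg_bind(-3.2, -1.7)
print(round(ddg, 2))  # -1.5: ligand B predicted to bind ~1.5 kcal/mol tighter
```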
Cambridge (Csányi Lab)
Equivariant ML force field for organic molecules covering H, C, N, O, F, P, S, Cl, Br, I (~90% of drug-like space). Near-DFT accuracy for torsion profiles.
EBI / Genentech / GSK / MSD / Pfizer / Sanofi / Wellcome Sanger
Integrates 23+ public data sources to systematically score and rank target-disease associations. Provides target prioritization based on clinical precedence and tractability.
IMIM / DisGeNET (commercial entity)
Comprehensive gene-disease and variant-disease association database: more than 2M gene-disease, 4M variant-disease, and 20M disease-disease associations. Integrates curated repositories, GWAS, animal models, and NLP-extracted evidence.
EMBL / SIB / CPR
Functional protein-protein association networks across 12,535 organisms. v12.5 added a regulatory network layer capturing directionality via LLM-parsed literature.
EMBL-EBI
Manually curated bioactivity database. 5.4M+ bioactivity measurements for 1M+ compounds against 5,200+ protein targets from peer-reviewed literature.
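ChEMBL normalizes heterogeneous potency measurements (IC50, Ki, EC50) onto a common pChEMBL scale, the negative log10 of the molar activity. The conversion from the nM values the database commonly stores is a one-liner:

```python
import math

def pchembl(value_nm):
    """pChEMBL-style value: -log10 of activity in molar units.
    Activities reported in nM are converted to M first."""
    molar = value_nm * 1e-9
    return -math.log10(molar)

print(pchembl(100.0))  # a 100 nM IC50 corresponds to pChEMBL 7.0
```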
OMx Personal Health Analytics
Gold standard drug knowledge resource. 4,563 FDA-approved drugs, 6,231 investigational drugs, 1.4M drug-drug interactions, comprehensive target annotations.
NCBI / NIH
AI-powered literature resource providing automated NER and relation annotations across ~36M PubMed abstracts and ~6M PMC full-text articles. Updated weekly.
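PubTator distributes its annotations in a simple plain-text layout: pipe-delimited title and abstract lines followed by tab-delimited entity spans. A minimal parser sketch, assuming that layout (field order: pmid, start, end, mention, type, concept ID):

```python
def parse_pubtator(text):
    """Parse one document in PubTator format: pipe-delimited title/abstract
    lines followed by tab-delimited entity annotations."""
    doc = {"title": "", "abstract": "", "annotations": []}
    for line in text.strip().splitlines():
        if "|t|" in line:
            doc["title"] = line.split("|t|", 1)[1]
        elif "|a|" in line:
            doc["abstract"] = line.split("|a|", 1)[1]
        elif "\t" in line:
            pmid, start, end, mention, etype, cid = line.split("\t")[:6]
            doc["annotations"].append(
                {"span": (int(start), int(end)), "mention": mention,
                 "type": etype, "id": cid})
    return doc

sample = (
    "123|t|BRCA1 mutations in breast cancer\n"
    "123|a|We studied BRCA1 variants.\n"
    "123\t0\t5\tBRCA1\tGene\t672\n"
)
doc = parse_pubtator(sample)
print(doc["annotations"][0]["mention"])  # BRCA1
```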
Gyori Lab, Harvard Medical School
Automated knowledge assembly system that draws on NLP reading systems and curated databases, standardizes the extracted causal statements, and assembles them into executable mechanistic models.
Microsoft Research
BERT encoder pre-trained exclusively on PubMed abstracts with domain-specific vocabulary. Baseline for biomedical NER, relation extraction, and classification.
Monarch Consortium (EMBL-EBI et al.)
Integrates and cross-species aligns phenotype-gene-disease data from 33 sources. Enables phenotype-driven gene discovery and cross-species model comparison.
Zitnik Lab, Harvard Medical School
Precision Medicine Knowledge Graph integrating 20 sources across 10 biological scales. 17,080 diseases with 4M+ relationships including drug indication and off-label use edges.
KNU LCBC (Kyungpook National University)
Single-step retrosynthesis prediction using fragment-based tokenization of atomic environments and a Transformer architecture. Mimics chemical reasoning by learning changes in atom environments between products and reactants.
AstraZeneca Molecular AI
Multi-step retrosynthetic planning tool using Monte Carlo tree search guided by neural network policies. Recursively breaks down target molecules into purchasable precursors. Production-used at AstraZeneca.
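AiZynthFinder guides its search with neural policies and MCTS; the underlying recursion (break a target into precursors until everything is purchasable) can be sketched with a toy template table. Hypothetical molecule names, and plain depth-first search instead of MCTS:

```python
# Toy single-step "retrosynthesis templates": product -> candidate precursor sets
TEMPLATES = {
    "target": [("intermediate_A", "reagent_1"), ("intermediate_B",)],
    "intermediate_A": [("buyable_1", "buyable_2")],
    "intermediate_B": [("hard_to_make",)],
}
PURCHASABLE = {"reagent_1", "buyable_1", "buyable_2"}

def find_route(mol, depth=0, max_depth=5):
    """Depth-first search for a route ending in purchasable precursors."""
    if mol in PURCHASABLE:
        return []                       # nothing left to make
    if depth >= max_depth or mol not in TEMPLATES:
        return None                     # dead end
    for precursors in TEMPLATES[mol]:
        sub_routes = [find_route(p, depth + 1, max_depth) for p in precursors]
        if all(r is not None for r in sub_routes):
            steps = [(mol, precursors)]
            for r in sub_routes:
                steps.extend(r)
            return steps
    return None

route = find_route("target")
print(route)  # two steps ending in purchasable building blocks
```

Real planners differ in two key ways: templates are proposed by a trained policy network rather than looked up, and MCTS balances exploring alternative disconnections against deepening promising ones.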
DeepGraphLearning (Mila / Université de Montréal)
PyTorch-based ML platform for drug discovery covering graph neural networks, geometric deep learning, knowledge graphs, generative models, and retrosynthesis. Provides unified API for property prediction, generation, and synthesis planning.
Huang et al. (Harvard / MIT)
Deep learning toolkit for drug-target interaction (DTI) prediction, compound property prediction, protein-protein interaction prediction, and drug-drug interaction prediction. Supports 15+ encoding methods and 5+ model architectures.
Bo Wang Lab (University of Toronto)
Foundation model for single-cell multi-omics built on generative pre-training of ~33M cells. Fine-tunes to SOTA on cell type annotation, multi-batch integration, perturbation prediction, and gene network inference.
Teichmann Lab (Wellcome Sanger Institute)
Automated cell type annotation tool for scRNA-seq data using logistic regression models trained on curated cross-tissue immune cell atlases. Provides a growing encyclopedia of pre-trained cell type models.
Meta AI (FAIR)
State-of-the-art protein language model (up to 15B parameters) trained on 250M protein sequences. Provides rich per-residue and per-sequence embeddings used across structure prediction, function annotation, and variant effect scoring.
Salesforce Research
Autoregressive protein language model (up to 6.4B parameters) for controllable protein sequence generation. Generates functional proteins conditioned on protein family or function tags.
Microsoft Research
Discrete diffusion framework for controllable protein generation in sequence space. Combines evolutionary-scale data with diffusion model conditioning for generating diverse, structurally plausible proteins.
Insilico Medicine
AI-driven target identification and biomarker discovery platform. Processes omics data, text mining, and knowledge graphs to prioritize novel therapeutic targets. Core component of Insilico's Pharma.AI suite alongside Chemistry42 and inClinico.
Zitnik Lab, Harvard Medical School
Coordinated initiative providing AI-ready datasets, curated benchmarks, and leaderboards across therapeutic modalities and discovery stages. Covers 22 tasks across single-instance, multi-instance, and generation problems.
OpenFold Consortium (Columbia, NVIDIA, SandboxAQ et al.)
Trainable, memory-efficient, GPU-friendly PyTorch reproduction of AlphaFold2. Includes full training code and data, enabling retraining and fine-tuning on custom datasets. Demonstrated AF2 reproducibility from scratch.
Baker Lab / IPD (University of Washington)
Extension of RFdiffusion that jointly diffuses over protein backbone AND small molecule ligands, enabling de novo design of proteins that bind specific small molecules like heme or digoxigenin.
Baker Lab / IPD (University of Washington)
Third-generation diffusion model for protein design unifying backbone generation, sequence design, and all-atom refinement. Open-sourced Dec 2025 with full training code.
Qiao et al. (Caltech / NVIDIA)
Multi-scale deep generative model for state-specific protein-ligand complex structure prediction. Predicts both protein conformational change and ligand binding pose simultaneously from sequence.
Bryant et al. (FU Berlin / Noé Lab)
Unified molecular model predicting protein-ligand complex structures directly from sequence information. Combines MSA-based protein features with ligand graph representations.
Schneuing et al. (Cambridge / Microsoft / VantAI)
Equivariant diffusion model for structure-based drug design that generates novel 3D molecules directly inside protein binding pockets. Published in Nature Computational Science 2024.
DP Technology / DeepModeling
Universal 3D molecular pretraining framework for property prediction, conformation generation, and docking. Uni-Mol2 (2024) scales to 1.1B parameters trained on 800M conformations.
Seo & Kim (KAIST)
Deep learning-guided pharmacophore modeling for ultra-large-scale virtual screening. Derives protein-based pharmacophore models automatically and scores compounds at extreme throughput.
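A pharmacophore model is essentially a set of typed feature points with pairwise distance constraints, so matching a conformer reduces to distance checks within tolerance. A minimal geometric sketch (illustrative only, not PharmacoNet's algorithm, which also scores partial matches at high throughput):

```python
import itertools
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def matches_pharmacophore(features, model, tol=1.0):
    """features/model map feature name -> (x, y, z). The conformer matches
    if every pairwise feature distance is within `tol` of the model's."""
    for f1, f2 in itertools.combinations(model, 2):
        if f1 not in features or f2 not in features:
            return False
        if abs(dist(features[f1], features[f2]) - dist(model[f1], model[f2])) > tol:
            return False
    return True

model = {"donor": (0, 0, 0), "acceptor": (3.0, 0, 0), "aromatic": (0, 4.0, 0)}
conformer = {"donor": (1, 1, 0), "acceptor": (4.1, 1, 0), "aromatic": (1, 4.8, 0)}
print(matches_pharmacophore(conformer, model))  # True: all distances close
```

Using pairwise distances rather than absolute coordinates makes the check invariant to rigid rotation and translation of the conformer.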
NVIDIA
Molecular generation model using Mutual Information Machine with a Perceiver encoder. Maps molecules into a smooth latent space enabling controlled interpolation and optimization.
NVIDIA
End-to-end AI platform for drug discovery providing GPU-accelerated NIMs (NVIDIA Inference Microservices) spanning protein structure, molecular generation, docking, and property prediction. Includes ESMFold, DiffDock, MolMIM, and 25+ healthcare NIMs.
Baidu / PaddlePaddle
PaddlePaddle-based reproduction of AlphaFold3 for biomolecular structure prediction covering proteins, nucleic acids, small molecules, and ions. Open-sourced Aug 2024 with web server.
StoneWise AI Drug Design
Pocket-based 3D molecule generation combining language model token prediction with geometric deep learning for 3D coordinate generation. Published in Nature Machine Intelligence 2024.
Peng et al. (Peking University)
Efficient 3D molecular generation conditioned on protein binding pockets using equivariant graph neural networks with autoregressive atom placement.
Theodoris Lab (Harvard / MIT)
Transformer-based foundation model pre-trained on ~30M single-cell transcriptomes. Learns context-dependent gene network dynamics and transfers to diverse downstream tasks including disease modeling and therapeutic target prioritization.
Cambridge (Csányi Lab)
Universal foundation model for atomistic simulations covering 89 elements. Pre-trained on the Materials Project dataset, generalizes across organic molecules, inorganic crystals, and interfaces without fine-tuning.
Profluent Bio
First open-source AI-generated CRISPR-Cas9 gene editor. Protein language model-designed Cas9 variant with comparable editing efficiency to SpCas9 but novel sequence. Demonstrates LLM-driven protein engineering at scale.