Science gaps are the critical spaces where AI capabilities have matured enough to address a real, unmet biotech need — but no one has fully closed the loop yet. These are the whitespace opportunities where the next generation of AI-native biotech companies will be built.
Each gap is scored on the unmet need, the current state of play, what recent AI advances changed, and the companies to watch. Updated with real-time intelligence.
1
AI Protein Degraders for Neurodegeneration
Unmet Need
Tau and alpha-synuclein aggregates drive Alzheimer's and Parkinson's diseases, yet no approved therapy can selectively degrade these proteins. Traditional small molecules cannot effectively target these 'undruggable' intrinsically disordered proteins, and current immunotherapies show limited efficacy at clearing intracellular aggregates.
Current State
PROTAC protein degraders targeting tau and alpha-synuclein are in preclinical development. Arvinas is conducting preclinical studies on CNS-penetrant PROTACs for neurodegeneration. Dual PROTAC degraders targeting both tau and alpha-synuclein have been reported in Journal of Medicinal Chemistry (2024). Researchers at Harvard/Broad Institute published optimized tau degraders using iPSC-derived neuronal models in 2025–2026. Kymera Therapeutics, Nurix Therapeutics, Amphista Therapeutics, and Plexium are all advancing TPD platforms with neuroscience potential.
What Changed
AlphaFold and related structure-prediction tools now enable rational design of degraders against conformationally flexible targets. Generative chemistry models can explore vast chemical space for CNS-penetrant bifunctional molecules. iPSC-derived neuronal models enable rapid screening of degrader candidates in disease-relevant human cells.
AI Enablers
AlphaFold for target structure prediction; Generative molecular design for CNS-penetrant PROTACs; Graph neural networks for E3 ligase-substrate modeling; iPSC-based phenotypic screening with computer vision; Molecular dynamics simulations for ternary complex prediction
Opportunity Size
$15–25B+ (Alzheimer's and Parkinson's combined addressable market for disease-modifying therapies)
2
AI-Designed Antibodies for Refractory Autoimmune Disease
Unmet Need
Patients with refractory lupus (SLE) and inflammatory bowel disease (IBD) often fail existing biologics. Multi-target approaches are needed but designing bispecific antibodies with optimal pharmacology is extremely complex using traditional methods. Up to 40% of IBD patients are primary non-responders to anti-TNF therapy.
Current State
Earendil Labs signed a $1.8B deal with Sanofi (April 2025) for AI-discovered bispecific antibodies HXN-1002 (α4β7/TL1A) and HXN-1003 (TL1A/IL-23) targeting refractory IBD. Seismic Therapeutic is a clinical-stage ML immunology company building AI-designed biologics for autoimmune diseases. Fate Therapeutics presented data at ACR 2025 on iPSC-derived CAR-T (FT819) for lupus showing durable immune remodeling.
What Changed
Large language models for protein sequences (ESM-2, ProtGPT2) enable de novo antibody design. High-throughput experimental platforms combined with AI now allow rapid iteration on bispecific formats. The success of Dupixent and emerging TL1A antibodies validated multi-cytokine targeting in autoimmune disease.
AI Enablers
Protein language models (ESM-2, ProtGPT2) for sequence design; Generative antibody design platforms; ML-driven affinity maturation; Computational bispecific format optimization; AI-guided immunogenicity de-risking
Companies to Watch
Earendil Labs, Seismic Therapeutic, Absci, Generate Biomedicines, BigHat Biosciences, Nabla Bio
3
Computational Retrosynthesis for Natural Product Total Synthesis
Unmet Need
Natural products remain a premier source of drug leads, but their structural complexity makes total synthesis extremely challenging—often requiring 20–40 steps with low overall yields. Many bioactive natural products cannot be produced at scale, limiting clinical development of promising compounds.
Current State
DeepRetro (Nature Scientific Reports, Feb 2026) uses iterative LLM reasoning for retrosynthetic pathway discovery. Chemical.AI's ChemAIRS platform offers AI-driven retrosynthesis with synthesizability assessment and process chemistry optimization. Iktos' Spaya platform provides AI-driven retrosynthesis integrated with robotic synthesis. MilliporeSigma's Synthia uses rule-based AI for retrosynthetic analysis. Academic advances include transformer-based single-step retrosynthesis models achieving >60% top-1 accuracy.
What Changed
LLM-based reasoning (DeepRetro) now handles multi-step retrosynthetic planning that was previously intractable for AI. Integration of retrosynthesis AI with automated robotic synthesis platforms closes the loop from design to execution. Reaction databases have grown 10x in the past five years, enabling better training of predictive models.
AI Enablers
Large language models for retrosynthetic reasoning; Transformer-based reaction prediction; Reinforcement learning for route optimization; Graph neural networks for reaction outcome prediction; Automated synthesis platform integration
Opportunity Size
$5–10B (natural product-derived drug development and process chemistry optimization)
4
AI-Optimized ADC Linker Design
Unmet Need
Antibody-drug conjugates (ADCs) are transforming oncology, but linker instability causes off-target toxicity and limits therapeutic windows. Current linker design relies heavily on trial-and-error, with most ADCs using only a handful of validated linker chemistries. Optimizing the linker-payload-antibody combination space is combinatorially explosive.
Current State
The ADC market exceeded $10B in 2025 with blockbusters like Enhertu and Padcev. A Frontiers review (June 2025) highlighted growing sophistication of AI-driven ADC design for antibody-linker-payload optimization. Startups Endeavor BioMedicines, Araris Biotech, and MBrace are pioneering linker and payload innovations. Exelixis published advanced site-specific conjugation and linker chemistry work (2025). Sutro Biopharma uses cell-free protein synthesis with non-natural amino acids for precise conjugation.
What Changed
ML models can now predict linker stability, DAR (drug-to-antibody ratio) distribution, and pharmacokinetics from molecular structure. AlphaFold-based antibody modeling enables computational selection of optimal conjugation sites. High-throughput ADC screening platforms generate the training data needed for predictive models.
AI Enablers
Graph neural networks for linker stability prediction; Molecular dynamics for conjugation site optimization; Generative chemistry for novel linker scaffolds; QSAR models for ADC pharmacokinetics; AlphaFold-based conjugation site selection
Opportunity Size
$20B+ (ADC market projected to exceed $30B by 2030)
5
Foundation Models for Single-Cell Drug Response Prediction
Unmet Need
Drug response varies dramatically across cell types and patient populations. Current bulk-level assays mask critical heterogeneity. Predicting how individual cell populations respond to perturbations could transform precision oncology and enable rational combination therapy design, but the biological complexity has been intractable.
Current State
scGPT (Nature Methods, 2024) established the foundation model paradigm for single-cell multi-omics. The CRISP framework (2025) predicts perturbation responses in unseen cell types using foundation models and transfer learning. The 2025 Virtual Cell Challenge tested perturbation-response prediction on unseen cell types via open competition. Recursion Pharmaceuticals is building a 'Virtual Cell' platform integrating phenomics and multi-omics data, and hit a $7M Sanofi milestone in 2025. Nature published a comprehensive review of single-cell foundation models in October 2025.
What Changed
Transformer architectures adapted to gene expression data enable cross-cell-type generalization. Massive single-cell atlases (Human Cell Atlas, CellxGene) provide pre-training data at scale. Integration with drug molecular representations enables joint cell-drug modeling. The Recursion-Exscientia merger created the largest biology+chemistry AI platform.
AI Enablers
Single-cell foundation models (scGPT, Geneformer, scBERT); Graph neural networks for cell-drug interaction; Transfer learning across cell types; Virtual cell simulation platforms; Multi-modal transformers for omics integration
Opportunity Size
$8–15B (precision oncology and drug response biomarker market)
6
AI-Guided Radiopharmaceutical Targeting
Unmet Need
Radiopharmaceutical theranostics are revolutionizing oncology (Lu-177 PSMA for prostate cancer), but target identification for new tumor types is slow, dosimetry is imprecise, and patient selection relies on crude imaging biomarkers. Expanding beyond prostate cancer requires new targeting vectors and better response prediction.
Current State
The radiopharmaceutical theranostics market hit ~$5B in 2025 with 11.5% annual growth. Big Pharma went on a roughly $9B acquisition spree: BMS acquired RayzeBio ($4.1B), AstraZeneca bought Fusion Pharmaceuticals ($2.4B), Eli Lilly acquired Point Biopharma ($1.4B), and Novartis bought Mariana Oncology ($1B), for a combined $8.9B across the four deals. AI is being applied to PET/SPECT image reconstruction, target identification via AlphaFold, and patient stratification through radiomics. Companies like Convergent Therapeutics, Aktis Oncology, ITM Isotope Technologies, and Telix Pharmaceuticals are advancing next-gen platforms.
What Changed
AlphaFold enables structure-guided design of novel targeting peptides and small molecules for radioisotope conjugation. Deep learning radiomics can predict therapeutic response from baseline PET/SPECT imaging. GAN-based image reconstruction enables low-dose diagnostic imaging, expanding patient screening. The roughly $9B M&A wave validated the space and attracted massive R&D investment.
AI Enablers
AlphaFold for targeting vector design; Deep learning radiomics for response prediction; GAN-based low-dose PET image reconstruction; ML-driven dosimetry optimization; Multi-modal data fusion for patient stratification
Opportunity Size
$15B+ (radiopharmaceutical theranostics market projected by 2031)
7
Generative Chemistry for CNS-Penetrant Molecules
Unmet Need
Over 98% of small molecules and nearly 100% of biologics fail to cross the blood-brain barrier (BBB). CNS disorders represent ~$80B in unmet therapeutic need, yet the attrition rate for CNS drug candidates is the highest of any therapeutic area. Designing molecules that balance BBB penetration, target engagement, and safety is extraordinarily difficult.
Current State
Insilico Medicine signed a $66M deal with Hygtia (Jan 2026) for an AI-designed brain-penetrant NLRP3 inhibitor discovered via Chemistry42. 1910 Genetics published CANDID-CNS (Dec 2025), achieving 87% AUPRC on beyond-Rule-of-5 BBB penetration prediction vs. 56% for Pfizer's CNS MPO score. BORAZON offers a generative AI platform specifically for BBB-penetrant molecule design. Apertura Gene Therapy licensed BBB-penetrant AAV capsids to multiple partners (Aug 2025).
What Changed
Generative models can now co-optimize BBB penetration, P-glycoprotein efflux, metabolic stability, and target affinity simultaneously. New AI models like CANDID-CNS dramatically outperform traditional medicinal chemistry rules. The success of Insilico's rentosertib (AI-designed, positive Phase IIa data) proved generative chemistry works in the clinic.
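To make "co-optimize" concrete: one common way to compare candidates across several objectives at once is to keep only the Pareto-optimal ones, meaning those that no other candidate matches or beats on every property. The sketch below is illustrative only. The molecule names and scores are hypothetical placeholders (assumed to be pre-scaled so that higher is better), not outputs of Chemistry42, CANDID-CNS, or any other tool named here, and real generative platforms may use different multi-objective strategies.

```python
from typing import Dict, List

Scores = Dict[str, float]

# Hypothetical candidates. Every score is an assumed, pre-scaled value in
# [0, 1] where higher is better (e.g. "low_pgp_efflux" = 1 - efflux risk).
candidates: Dict[str, Scores] = {
    "mol_A": {"bbb_penetration": 0.90, "low_pgp_efflux": 0.40,
              "metabolic_stability": 0.70, "target_affinity": 0.60},
    "mol_B": {"bbb_penetration": 0.70, "low_pgp_efflux": 0.80,
              "metabolic_stability": 0.60, "target_affinity": 0.80},
    "mol_C": {"bbb_penetration": 0.60, "low_pgp_efflux": 0.30,
              "metabolic_stability": 0.50, "target_affinity": 0.50},
    "mol_D": {"bbb_penetration": 0.80, "low_pgp_efflux": 0.80,
              "metabolic_stability": 0.60, "target_affinity": 0.90},
}


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every property
    and strictly better on at least one."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(pool: Dict[str, Scores]) -> List[str]:
    """Names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, scores in pool.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in pool.items()
            if other != name
        )
    )


print(pareto_front(candidates))
```

With these placeholder values, mol_B is dropped because mol_D matches or beats it on every property, while mol_A survives despite weak efflux because nothing else beats its BBB penetration. That is the trade-off a multi-objective search has to expose rather than hide behind a single score.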
AI Enablers
Multi-objective generative molecular design; BBB penetration prediction models (CANDID-CNS); P-glycoprotein efflux prediction; Molecular dynamics for membrane permeability; Reinforcement learning for property co-optimization
Opportunity Size
$12–20B (CNS therapeutic market for disease-modifying treatments)
8
AI Clinical Trial Design for Rare Diseases
Unmet Need
Over 7,000 rare diseases affect 300M+ people globally, yet 95% have no approved treatment. Traditional clinical trials require large patient cohorts that simply don't exist for rare diseases. Adaptive trial designs are needed but are complex to model and execute. Patient recruitment alone can take years.
Current State
Unlearn.AI uses digital twins of clinical trial participants to reduce required enrollment by up to 35%. Medidata (Dassault Systèmes) offers AI-powered adaptive trial design and synthetic control arms. Harvard's TxGNN model can predict drug repurposing candidates for over 17,000 diseases including rare conditions. Nature published a comprehensive perspective on AI, LLMs, adaptive designs, and digital twins for clinical trials (Nov 2025). The AI in clinical trials market is growing rapidly with IQVIA, Saama, and NVIDIA as key platform players.
What Changed
Digital twin technology enables synthetic control arms, dramatically reducing required patient numbers. Bayesian adaptive designs powered by AI allow continuous learning and real-time protocol adjustment. NLP/LLM models can mine electronic health records to identify undiagnosed rare disease patients for recruitment. FDA has signaled increasing acceptance of AI-augmented trial designs and real-world evidence.
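The "continuous learning" in a Bayesian adaptive design can be illustrated with the simplest possible case: a binary response endpoint, a Beta prior on the response rate, and an interim rule that stops early when the posterior is decisive. This is a minimal sketch under stated assumptions, not the method of any vendor named above. The uniform prior, the 30% target response rate, and the 0.95 and 0.05 stopping thresholds are all hypothetical, and a real trial would pre-specify and simulate such rules with regulators.

```python
import random
from typing import Tuple


def prob_rate_exceeds(successes: int, failures: int, threshold: float,
                      prior_a: float = 1.0, prior_b: float = 1.0,
                      draws: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(response rate > threshold | data).

    With a Beta(prior_a, prior_b) prior and binomial data, the posterior
    is Beta(prior_a + successes, prior_b + failures).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    a, b = prior_a + successes, prior_b + failures
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(draws))
    return hits / draws


def interim_decision(successes: int, failures: int,
                     threshold: float = 0.30,      # hypothetical target rate
                     stop_efficacy: float = 0.95,  # hypothetical cut-off
                     stop_futility: float = 0.05   # hypothetical cut-off
                     ) -> Tuple[str, float]:
    """Apply the stopping rule at one interim look."""
    p = prob_rate_exceeds(successes, failures, threshold)
    if p >= stop_efficacy:
        return "stop for efficacy", p
    if p <= stop_futility:
        return "stop for futility", p
    return "continue enrolling", p


# Three hypothetical interim looks, each after 12 patients.
for responders in (8, 4, 0):
    decision, _ = interim_decision(responders, 12 - responders)
    print(f"{responders}/12 responders: {decision}")
```

The point for rare diseases is that a decision can be reached after a dozen patients when the signal is strong in either direction, instead of enrolling to a fixed sample size that the patient population may not support.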
AI Enablers
Digital twin generation for synthetic control arms; Bayesian adaptive trial optimization; NLP for patient identification from EHRs; Causal AI for endpoint selection; LLMs for protocol design and regulatory writing
Opportunity Size
$10B+ (rare disease therapeutics and clinical trial optimization)
9
Computational Prediction of Drug-Drug Interactions for Combination Therapies
Unmet Need
Combination therapies are the standard of care in oncology, infectious disease, and increasingly autoimmune disease, yet predicting drug-drug interactions (DDIs) remains largely empirical. Unexpected DDIs cause ~195,000 hospitalizations annually in the US alone. The combinatorial explosion of multi-drug regimens makes experimental testing of all interactions impossible.
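The scale of that combinatorial explosion follows from simple counting: the number of unordered regimens of size k drawn from n drugs is the binomial coefficient C(n, k). The sketch below only illustrates the arithmetic. The formulary sizes are hypothetical round numbers, and the count ignores dose, schedule, and administration order, all of which would enlarge the space further.

```python
from math import comb


def regimen_count(n_drugs: int, regimen_size: int) -> int:
    """Number of distinct unordered drug combinations of a given size."""
    return comb(n_drugs, regimen_size)


# Hypothetical formulary sizes, chosen only to show how fast the count grows.
for n in (100, 1000):
    pairs = regimen_count(n, 2)
    triples = regimen_count(n, 3)
    print(f"{n} drugs: {pairs:,} pairs, {triples:,} triples")
```

Going from 100 to 1,000 drugs multiplies the number of pairs by about 100 and the number of triples by about 1,000, which is why exhaustive experimental testing does not scale and predictive models are attractive.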
Current State
A Frontiers in Pharmacology review (July 2025) highlighted integration of NLP with knowledge graphs for automatic identification of overlooked DDIs. Reinforcement learning methods simulate DDI scenarios and adjust models based on predicted outcomes. Recursion Pharmaceuticals' phenomics platform can screen compound combinations at massive scale. Graph neural networks and transformer models are achieving >90% accuracy on benchmark DDI prediction datasets. The field is moving toward multi-modal approaches combining molecular structure, genomics, and clinical data.
What Changed
Knowledge graph approaches now integrate drug structure, target biology, metabolic pathways, and clinical outcomes in unified models. Foundation models pre-trained on massive biomedical corpora enable better transfer learning for DDI prediction. Real-world data from EHRs provides clinical DDI evidence at scale for model validation. Physics-informed neural networks combine mechanistic pharmacokinetic models with data-driven approaches.
AI Enablers
Knowledge graphs for multi-scale DDI modeling; Graph neural networks for molecular interaction prediction; NLP for literature and EHR mining; Reinforcement learning for combination optimization; Physics-informed neural networks for PK/PD modeling
Companies to Watch
Recursion Pharmaceuticals, Tempus AI, BenevolentAI, Standigm, Certara, Simulations Plus
Opportunity Size
$6–12B (combination therapy optimization and DDI safety market)
10
AI for Predicting Immunogenicity of Biologic Therapeutics
Unmet Need
Anti-drug antibodies (ADAs) affect up to 90% of patients on certain biologics, reducing efficacy and causing adverse reactions. Immunogenicity is one of the top reasons biologic candidates fail in clinical development. Current prediction tools are unreliable, and immunogenicity testing in preclinical species poorly translates to human outcomes.
Current State
ALP AI (Luxembourg startup, 2026) focuses specifically on AI-driven immunogenicity prediction and ADA risk reduction for antibody development. The SITA model (Site-specific Immunogenicity for Therapeutic Antibodies) uses transfer learning and 3D structural descriptors for residue-level B-cell immunogenicity prediction. The PEGS Boston Summit 2026 features a dedicated session on predicting immunogenicity with AI/ML tools, including IGMotifFinder for early identification. EVQLV uses computational multi-parameter optimization including immunogenicity for de novo antibody design. The AI in antibody discovery market is projected to reach $3B by 2034 at 22.9% CAGR.
What Changed
Protein language models now capture complex sequence-immunogenicity relationships beyond simple T-cell epitope prediction. 3D structure-aware models (enabled by AlphaFold) predict B-cell epitopes and aggregation-prone regions that drive immunogenicity. Large clinical immunogenicity datasets are becoming available, enabling supervised learning approaches. Multi-parameter optimization platforms can now co-optimize efficacy, stability, and immunogenicity simultaneously.
AI Enablers
Protein language models for epitope prediction; AlphaFold-based 3D immunogenicity modeling; Transfer learning from clinical ADA datasets; Multi-parameter antibody optimization; Aggregation propensity prediction models
Companies to Watch
ALP AI, Seismic Therapeutic, EVQLV, Mabsilico, Absci, BigHat Biosciences
Opportunity Size
$4–8B (biologic development de-risking and biosimilar optimization)