Authors
Jennifer M Umbles Hayes, Emmanuel O Olawode, Anietie Andy, Edmund Essah Ameyaw
Published in
Journal of cheminformatics. Apr 03, 2026. Epub Apr 03, 2026.
Abstract
Automated interpretation of Markush structures, which are widely used in pharmaceutical patents to claim large families of related compounds, remains challenging due to non-machine-readable structure images, variable R-groups, dependency rules, scaffold diversity, and heterogeneous claim language. Specific challenges include attachment points and stereochemistry, nested or conditional dependencies, and inconsistent drafting conventions that hinder faithful enumeration. Early rule-based cheminformatics systems parsed claims and mapped Markush representations into searchable formats, but struggled with nested dependencies, cross-references, and multimodal (text + image) descriptions. More recently, artificial intelligence (AI) methods have been introduced, including language-based, vision-based, and multimodal or hybrid tools. Language-based tools increasingly use large language models (LLMs) and natural language processing (NLP) to extract variable definitions, constraints, and dependency graphs from claim text; vision-based systems translate structure depictions into machine-readable formats (e.g., SMILES, CXSMILES); and multimodal or hybrid pipelines align both for end-to-end interpretation. Emerging datasets support these efforts, though licensing, control of family-wise leakage, and use of standardized splits remain inconsistent. This narrative review synthesizes tools, datasets, and evaluation practices for AI-assisted Markush interpretation, identifies persistent failure modes, and maps open legal questions (sufficiency, enablement, enforceability). We outline priorities for the field: transparent benchmarks with family-aware splits, interpretable constraint handling, and workflows aligned with U.S. Patent Office practice. Near-term use is decision support, not legal advice.
PMID: 41933423
Bibliographic data and abstract were imported from PubMed on 04 Apr 2026.