Implementation Plan: ChemLib
Overview
The project is divided into 6 phases, each building on the previous. Each phase produces a working, testable increment.
Phase 1: Foundation (Database + Models + Config)
Goal: Set up the project skeleton, database, ORM models, and Alembic migrations.
Tasks
- Project scaffolding
    - Create directory structure as defined in `CLAUDE.md`
    - Initialize `pyproject.toml` with dependencies
    - Create `requirements.txt`
    - Initialize git repository
- Configuration
    - `chemlib/config.py`: Pydantic settings with DATABASE_URL, constants
    - `.env.example` with default SQLite configuration
- SQLAlchemy models
    - `chemlib/models/base.py`: `Base`, `TimestampMixin`
    - `chemlib/models/compound.py`: `Compound`, `Fragment`, `CompoundFragment`
    - `chemlib/models/assembly.py`: `AssembledMolecule`, `AssemblyStep`
    - `chemlib/models/structure.py`: `Conformer`
    - `chemlib/models/reaction.py`: `ReactionTemplate`
    - All relationships and indexes as specified in `DATABASE_DESIGN.md`
- Alembic setup
    - `alembic init alembic`
    - Configure `alembic/env.py` for async SQLAlchemy
    - Set `render_as_batch=True` for SQLite compatibility
    - Generate and apply initial migration
- Database session management
    - `chemlib/db/session.py`: async engine, session factory, `get_db()` dependency
- DB service layer
    - `chemlib/db/service.py`: `CRUDBase` generic + specialized services
    - `CompoundDBService`, `FragmentDBService`, `AssemblyDBService`, `ConformerDBService`
- Tests
    - `tests/conftest.py`: async test fixtures, in-memory SQLite
    - `tests/test_models/test_compound.py`: model creation, relationships
    - `tests/test_db/test_service.py`: CRUD operations
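
The configuration task above could start from something like the following sketch. It uses a stdlib dataclass as a stand-in for the planned Pydantic settings class, so it is not the intended implementation. The default URL, the `MAX_CONFORMERS` variable, and the `load_settings()` helper are assumptions made for illustration; only `DATABASE_URL` comes from the plan.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed default: a local SQLite file accessed through the aiosqlite driver.
DEFAULT_DATABASE_URL = "sqlite+aiosqlite:///./chemlib.db"


@dataclass(frozen=True)
class Settings:
    """Stdlib stand-in for the Pydantic settings class in chemlib/config.py."""

    database_url: str = DEFAULT_DATABASE_URL
    # Hypothetical constant; the plan only says "constants".
    max_conformers: int = 50


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from an environment mapping, falling back to defaults."""
    source = os.environ if env is None else env
    return Settings(
        database_url=source.get("DATABASE_URL", DEFAULT_DATABASE_URL),
        max_conformers=int(source.get("MAX_CONFORMERS", "50")),
    )
```

Passing the environment in as a mapping keeps the loader easy to test without touching `os.environ`.
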
Deliverable
- Running database with all tables created via Alembic
- CRUD operations verified by tests
- No API or UI yet
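
To make the "generic base plus specialized services" idea concrete, here is a rough sketch of a CRUD round trip against in-memory SQLite. It deliberately swaps the planned async SQLAlchemy layer for the stdlib `sqlite3` module so that it runs without dependencies. The `compound` table and its `smiles` and `name` columns are assumptions, since `DATABASE_DESIGN.md` is not reproduced here.

```python
import sqlite3
from typing import Any, Dict, Optional, Tuple


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a minimal, hypothetical compound table."""
    conn.execute(
        "CREATE TABLE compound ("
        "id INTEGER PRIMARY KEY, smiles TEXT NOT NULL, name TEXT)"
    )


class CRUDBase:
    """Generic CRUD over one table; subclasses set `table` and `columns`."""

    table: str = ""
    columns: Tuple[str, ...] = ()

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.row_factory = sqlite3.Row

    def create(self, **values: Any) -> int:
        # Table and column names are class constants, never user input.
        cols = [c for c in self.columns if c in values]
        marks = ", ".join("?" for _ in cols)
        cur = self.conn.execute(
            f"INSERT INTO {self.table} ({', '.join(cols)}) VALUES ({marks})",
            [values[c] for c in cols],
        )
        return int(cur.lastrowid)

    def get(self, row_id: int) -> Optional[Dict[str, Any]]:
        row = self.conn.execute(
            f"SELECT * FROM {self.table} WHERE id = ?", (row_id,)
        ).fetchone()
        return dict(row) if row else None

    def update(self, row_id: int, **values: Any) -> bool:
        cols = [c for c in self.columns if c in values]
        if not cols:
            return False
        sets = ", ".join(f"{c} = ?" for c in cols)
        cur = self.conn.execute(
            f"UPDATE {self.table} SET {sets} WHERE id = ?",
            [*(values[c] for c in cols), row_id],
        )
        return cur.rowcount == 1

    def delete(self, row_id: int) -> bool:
        cur = self.conn.execute(
            f"DELETE FROM {self.table} WHERE id = ?", (row_id,)
        )
        return cur.rowcount == 1


class CompoundDBService(CRUDBase):
    table = "compound"
    columns = ("smiles", "name")
```

A test in `tests/test_db/test_service.py` might then open `sqlite3.connect(":memory:")`, call `create_schema()`, and walk one record through create, get, update, and delete.
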
Phase 2: Chemistry Engine
Goal: Implement all RDKit-based chemistry utilities as standalone, testable modules.
Tasks
- Representations (`chemlib/chemistry/representations.py`)
    - SMILES ↔ Mol ↔ canonical SMILES
    - InChI / InChIKey generation
    - MOL block generation (2D and 3D)
    - SDF parsing
    - Molecular formula
- Descriptors (`chemlib/chemistry/descriptors.py`)
    - `compute_properties()`: MW, LogP, TPSA, HBD, HBA, rotatable bonds, rings, QED
    - Individual property functions
- Fingerprints (`chemlib/chemistry/fingerprints.py`)
    - Morgan fingerprint generation (ECFP4)
    - Serialization/deserialization for DB storage
    - Tanimoto similarity (single and bulk)
- Fragmentation (`chemlib/chemistry/fragmentation.py`)
    - BRICS decomposition
    - Attachment point parsing
    - Compatibility rules (`BRICS_COMPATIBLE` dict)
- Assembly (`chemlib/chemistry/assembly.py`)
    - Fragment joining via BRICS rules
    - Molecule validation
    - Dummy atom cleanup
    - Combinatorial BRICS build
- Conformers (`chemlib/chemistry/conformers.py`)
    - Conformer generation (ETKDGv3)
    - MMFF94 / UFF minimization
    - Lowest energy selection
    - MOL block extraction per conformer
- Filters (`chemlib/chemistry/filters.py`)
    - Lipinski Rule of Five
    - Veber rules
    - PAINS filter
    - QED score
    - SA Score (with `vendor/sascorer.py` setup)
    - Full drug-likeness report
- Tests
    - `tests/test_chemistry/test_representations.py`: known molecule SMILES ↔ conversions
    - `tests/test_chemistry/test_descriptors.py`: property values against known molecules
    - `tests/test_chemistry/test_fragmentation.py`: BRICS output for known molecules
    - `tests/test_chemistry/test_assembly.py`: joining known fragment pairs
    - `tests/test_chemistry/test_conformers.py`: conformer generation, minimization
    - `tests/test_chemistry/test_filters.py`: Lipinski pass/fail for known molecules
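
The "Tanimoto similarity (single and bulk)" item reduces to a ratio of shared on-bits to total on-bits. The sketch below shows that arithmetic on plain Python sets of bit indices. It is a dependency-free illustration, not the planned module: in ChemLib the fingerprints would presumably be RDKit bit vectors compared with RDKit's own similarity functions. The function names and the zero score for two empty fingerprints are choices made here.

```python
from typing import Iterable, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def bulk_tanimoto(
    query: Set[int],
    library: Iterable[Tuple[str, Set[int]]],
    threshold: float = 0.0,
) -> List[Tuple[str, float]]:
    """Score every library entry against the query, best match first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

For example, `tanimoto({1, 2, 3}, {2, 3, 4})` shares two bits out of four distinct bits and evaluates to `0.5`.
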
Deliverable
- Complete chemistry utility library
- All chemistry functions tested independently of DB
- Can run: `python -c "from chemlib.chemistry import ..."`
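
Of the filters listed for this phase, the Lipinski and Veber rules are simple threshold checks once the descriptors are available. The sketch below applies the commonly cited cutoffs to precomputed values. The `Properties` dataclass is a hypothetical container, the "at most one violation" default is a common convention rather than something the plan specifies, and the real `filters.py` may be organized differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Properties:
    """Precomputed descriptors (hypothetical shape of compute_properties() output)."""

    mw: float             # molecular weight, Da
    logp: float
    hbd: int              # hydrogen-bond donors
    hba: int              # hydrogen-bond acceptors
    rotatable_bonds: int
    tpsa: float           # topological polar surface area, square angstroms


def lipinski_violations(p: Properties) -> List[str]:
    """Return the Rule of Five criteria that the molecule violates."""
    checks = [
        ("MW > 500", p.mw > 500),
        ("LogP > 5", p.logp > 5),
        ("HBD > 5", p.hbd > 5),
        ("HBA > 10", p.hba > 10),
    ]
    return [name for name, violated in checks if violated]


def passes_lipinski(p: Properties, max_violations: int = 1) -> bool:
    """Pass when the number of violations stays within the allowed count."""
    return len(lipinski_violations(p)) <= max_violations


def passes_veber(p: Properties) -> bool:
    """Veber rules: at most 10 rotatable bonds and TPSA of at most 140."""
    return p.rotatable_bonds <= 10 and p.tpsa <= 140
```

Returning the list of violated criteria, rather than only a boolean, makes it easier to feed the same result into the full drug-likeness report.
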
Phase 3: Services + API (Compounds & Fragments)
Goal: Build the service layer and API endpoints for compound and fragment management.
Tasks
- FastAPI app setup
    - `chemlib/main.py`: app factory, router registration, exception handlers
    - `chemlib/api/deps.py`: shared dependencies (`get_db`, `get_service`)
- Pydantic schemas
    - `chemlib/schemas/compound.py`: `CompoundCreate`, `CompoundResponse`, `CompoundFilter`
    - `chemlib/schemas/fragment.py`: `FragmentResponse`, `DecompositionResponse`
- Compound service (`chemlib/services/compound_service.py`)
    - `import_from_smiles()`: parse, validate, compute properties, store
    - `import_from_sdf()`: batch import
    - `search_similar()`: fingerprint similarity search
    - `search_substructure()`: SMARTS substructure search
- Fragment service (`chemlib/services/fragment_service.py`)
    - `decompose_compound()`: BRICS decompose, store fragments
    - `get_compatible()`: find fragments with matching attachment points
- Compound API routes (`chemlib/api/compounds.py`)
    - All endpoints from `API_DESIGN.md` compounds section
- Fragment API routes (`chemlib/api/fragments.py`)
    - All endpoints from `API_DESIGN.md` fragments section
- Error handling
    - Custom exception classes
    - FastAPI exception handlers mapping to HTTP status codes
- Tests
    - `tests/test_api/test_compounds.py`: endpoint tests with `httpx.AsyncClient`
    - `tests/test_api/test_fragments.py`: decomposition and compatibility tests
    - `tests/test_services/test_compound_service.py`: service integration tests
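
One way `get_compatible()` could work is to read the numeric labels on the dummy atoms in each fragment SMILES (for example `[1*]` or `[16*]`) and look them up in a compatibility table. The sketch below does this with a regular expression instead of RDKit. The pairs listed are only a small illustrative subset intended to mirror a few BRICS pairings and should be checked against RDKit's BRICS definitions. The helper names are hypothetical.

```python
import re
from typing import Dict, List, Set

# Illustrative subset of label pairings (assumption: verify against RDKit's BRICS rules).
_PAIRS = [(1, 3), (1, 5), (3, 4), (4, 5), (8, 16)]

# Build a symmetric lookup: label -> labels it may bond to.
BRICS_COMPATIBLE: Dict[int, Set[int]] = {}
for left, right in _PAIRS:
    BRICS_COMPATIBLE.setdefault(left, set()).add(right)
    BRICS_COMPATIBLE.setdefault(right, set()).add(left)

# Matches labelled dummy atoms such as [1*] or [16*]; atom-mapped forms are not handled.
_DUMMY = re.compile(r"\[(\d+)\*\]")


def attachment_labels(smiles: str) -> List[int]:
    """Return the labels of all dummy atoms in a fragment SMILES, in order."""
    return [int(label) for label in _DUMMY.findall(smiles)]


def compatible_fragments(fragment: str, library: List[str]) -> List[str]:
    """Library fragments with at least one label that can bond to `fragment`."""
    wanted: Set[int] = set()
    for label in attachment_labels(fragment):
        wanted |= BRICS_COMPATIBLE.get(label, set())
    return [s for s in library if wanted.intersection(attachment_labels(s))]
```

With this table, a fragment carrying `[1*]` would match library entries carrying `[3*]` or `[5*]` and skip one that only carries `[16*]`.
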
Deliverable
- Working API: can create compounds, decompose into fragments, search
- `http://localhost:8000/docs` shows all endpoints
- All endpoints tested
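
The error-handling task in this phase pairs custom exceptions with HTTP status codes. A framework-independent sketch of that mapping follows. The exception names, the chosen status codes, and the `to_http()` helper are assumptions. In the real app the same lookup would presumably sit inside handlers registered with FastAPI.

```python
from http import HTTPStatus
from typing import Dict, Tuple, Type


class ChemLibError(Exception):
    """Base class for domain errors (hypothetical name)."""


class InvalidSmilesError(ChemLibError):
    """Input SMILES could not be parsed."""


class CompoundNotFoundError(ChemLibError):
    """No compound with the requested identifier."""


class DuplicateCompoundError(ChemLibError):
    """The compound is already stored."""


STATUS_BY_ERROR: Dict[Type[Exception], HTTPStatus] = {
    InvalidSmilesError: HTTPStatus.BAD_REQUEST,
    CompoundNotFoundError: HTTPStatus.NOT_FOUND,
    DuplicateCompoundError: HTTPStatus.CONFLICT,
}


def to_http(exc: Exception) -> Tuple[int, Dict[str, str]]:
    """Translate an exception into (status code, JSON-ready body)."""
    # Walk the class hierarchy so subclasses inherit their parent's status.
    for cls in type(exc).__mro__:
        status = STATUS_BY_ERROR.get(cls)
        if status is not None:
            return int(status), {"detail": str(exc)}
    # Unknown errors get a generic message so internals are not leaked.
    return int(HTTPStatus.INTERNAL_SERVER_ERROR), {"detail": "Internal server error"}
```

Keeping the mapping in one dictionary means a new exception class needs only one extra entry, and service code never has to import anything HTTP-related.
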
Phase 4: Assembly System
Goal: Implement the molecule assembly pipeline — the core innovation of the system.
Tasks
- Assembly Pydantic schemas
    - `chemlib/schemas/assembly.py`: `AssemblyCreate`, `AddFragmentRequest`, `AssemblyResponse`, `FinalizeRequest`
- Assembly service (`chemlib/services/assembly_service.py`)
    - `start_assembly()`: create from initial fragment
    - `add_fragment()`: join fragment, validate, record step
    - `get_available_attachment_points()`: what's open on current molecule
    - `finalize()`: clean molecule, compute properties, score
- Assembly API routes (`chemlib/api/assembly.py`)
    - All assembly endpoints from `API_DESIGN.md`
- Scoring service (`chemlib/services/scoring_service.py`)
    - `score_molecule()`: full drug-likeness report
    - `evaluate_smiles()`: score without storing
- Scoring API routes (`chemlib/api/scoring.py`)
    - Scoring endpoints from `API_DESIGN.md`
- Tests
    - `tests/test_api/test_assembly.py`: full assembly workflow tests
    - `tests/test_services/test_assembly_service.py`: service tests
    - `tests/test_api/test_scoring.py`: scoring endpoint tests
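
The start, add, and finalize lifecycle of the assembly service can be sketched as a small state holder. This covers only the bookkeeping (ordered steps, no changes after finalizing) and leaves out the actual BRICS joining, validation, and scoring. All class and field names here are hypothetical and may not match the models in `chemlib/models/assembly.py`.

```python
from dataclasses import dataclass, field
from typing import List


class AssemblyError(Exception):
    """Raised when an operation is not valid in the current assembly state."""


@dataclass(frozen=True)
class StepRecord:
    step_number: int       # 1-based position in the assembly
    fragment_id: int
    attachment_label: int  # label of the attachment point that was used


@dataclass
class AssemblySession:
    initial_fragment_id: int
    steps: List[StepRecord] = field(default_factory=list)
    finalized: bool = False

    def add_fragment(self, fragment_id: int, attachment_label: int) -> StepRecord:
        """Record one joining step; the chemistry itself is out of scope here."""
        if self.finalized:
            raise AssemblyError("assembly is already finalized")
        step = StepRecord(len(self.steps) + 1, fragment_id, attachment_label)
        self.steps.append(step)
        return step

    def finalize(self) -> List[int]:
        """Lock the session and return fragment ids in assembly order."""
        if self.finalized:
            raise AssemblyError("assembly is already finalized")
        self.finalized = True
        return [self.initial_fragment_id, *(s.fragment_id for s in self.steps)]
```

Recording each step separately is what would later allow the step history to be shown or replayed.
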
Deliverable
- End-to-end assembly workflow via API
- Can build a molecule from fragments, score it, get drug-likeness report
- All assembly and scoring endpoints tested
Phase 5: 3D Visualization & Energy Minimization
Goal: Add conformer generation, energy minimization, and the 3D viewer.
Tasks
- Conformer service (`chemlib/services/conformer_service.py`)
    - `generate_and_minimize()`: full pipeline (embed → minimize → store)
    - `get_viewer_data()`: MOL block for 3Dmol.js
    - `get_conformer_list()`: all conformers with energies
- Visualization service (`chemlib/services/viz_service.py`)
    - `get_2d_svg()`: SVG depiction
    - `get_3d_mol_block()`: 3D coordinates for viewer
- Visualization API routes (`chemlib/api/visualization.py`)
    - All viz endpoints from `API_DESIGN.md`
- 3Dmol.js integration
    - `chemlib/static/js/viewer.js`: `MolViewer` class
    - `chemlib/static/js/conformer_browser.js`: `ConformerBrowser` class
    - Viewer template page with controls
- Tests
    - `tests/test_services/test_conformer_service.py`: generation and minimization
    - `tests/test_api/test_visualization.py`: endpoint tests
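
After minimization, `get_conformer_list()` and the lowest-energy selection mostly come down to sorting records by energy. The sketch below assumes each conformer has already been reduced to an id, an energy, and a convergence flag. That record shape, the kcal/mol unit, and the function names are assumptions rather than part of the plan.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ConformerResult:
    conformer_id: int
    energy: float      # assumed to be in kcal/mol
    converged: bool    # whether minimization reached convergence


def lowest_energy(
    results: Sequence[ConformerResult], require_converged: bool = True
) -> Optional[ConformerResult]:
    """Pick the lowest-energy conformer, optionally ignoring unconverged ones."""
    pool = [r for r in results if r.converged or not require_converged]
    return min(pool, key=lambda r: r.energy) if pool else None


def relative_energies(results: Sequence[ConformerResult]) -> List[Tuple[int, float]]:
    """List (conformer_id, energy above the minimum), lowest first."""
    if not results:
        return []
    floor = min(r.energy for r in results)
    ordered = sorted(results, key=lambda r: r.energy)
    return [(r.conformer_id, r.energy - floor) for r in ordered]
```

Reporting energies relative to the minimum is often more useful in a conformer browser than the absolute force-field values.
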
Deliverable
- 3D viewer working in browser
- Conformer generation and energy minimization via API
- Can rotate, zoom, and browse conformers
Phase 6: UI & Integration
Goal: Build the web UI and wire everything together.
Tasks
- Base template (`chemlib/templates/base.html`)
    - Navigation bar, footer, CDN imports (Bootstrap, 3Dmol.js)
    - Common CSS and JS
- Dashboard (`chemlib/templates/index.html`)
    - Summary stats, quick actions
- Compound browser (`chemlib/templates/compound_browser.html`)
    - Filterable table, 2D depictions, pagination
- Compound detail (`chemlib/templates/compound_detail.html`)
    - Properties card, scorecard, fragment list, 3D viewer link
- Fragment browser (`chemlib/templates/fragment_browser.html`)
    - Grid view with attachment point badges
- Assembly workspace (`chemlib/templates/assembly_workspace.html`)
    - Split-pane layout: current molecule + available fragments
    - Step-by-step assembly with live preview
    - Finalize button
- 3D viewer page (`chemlib/templates/viewer_3d.html`)
    - Full-page 3Dmol.js canvas with controls sidebar
- Scoring report (`chemlib/templates/scoring_report.html`)
    - Visual scorecard with gauges and indicators
- UI page routes (in `chemlib/main.py`)
    - GET routes that serve templates
- Seed script (`scripts/seed_fragments.py`)
    - Populate DB with a starter set of common fragments
    - Include 20-30 diverse, drug-like building blocks
- Integration testing
    - Full workflow: import compound → decompose → assemble → score → view 3D
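
For the compound browser's pagination, the page routes will need to turn a requested page number into a query offset and a few template flags. A minimal sketch of that arithmetic follows. The `Page` type and `make_page()` helper are hypothetical, and clamping out-of-range requests to the nearest valid page is one possible policy among several.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    number: int        # 1-based page number, already clamped to a valid range
    size: int          # items per page
    total_items: int

    @property
    def total_pages(self) -> int:
        # An empty result set still renders as a single (empty) page.
        return max(1, math.ceil(self.total_items / self.size))

    @property
    def offset(self) -> int:
        """Row offset to use in the database query."""
        return (self.number - 1) * self.size

    @property
    def has_next(self) -> bool:
        return self.number < self.total_pages

    @property
    def has_prev(self) -> bool:
        return self.number > 1


def make_page(requested: int, size: int, total_items: int) -> Page:
    """Clamp the requested page into range instead of raising an error."""
    if size < 1:
        raise ValueError("size must be at least 1")
    last = max(1, math.ceil(total_items / size))
    return Page(number=min(max(requested, 1), last), size=size, total_items=total_items)
```

With 60 compounds and 25 per page, `make_page(3, 25, 60)` yields an offset of 50 and no next page.
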
Deliverable
- Complete working web application
- All pages functional and connected
- Starter fragment library seeded
- Full user workflow testable end-to-end
Phase Summary
| Phase | Name | Key Output | Depends On |
|---|---|---|---|
| 1 | Foundation | DB + Models + CRUD | — |
| 2 | Chemistry Engine | RDKit utilities | — |
| 3 | Services + API (Compounds) | REST API for compounds/fragments | 1, 2 |
| 4 | Assembly System | Fragment joining + scoring | 3 |
| 5 | 3D Visualization | Viewer + conformers | 3, 4 |
| 6 | UI & Integration | Web interface | 3, 4, 5 |
Note: Phases 1 and 2 can be developed in parallel as they have no mutual dependencies.
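
The dependency column of the table can be checked mechanically. The sketch below feeds it to the standard library's `graphlib.TopologicalSorter` (Python 3.9+) and groups phases into "waves" that could run in parallel. The `schedule_waves()` helper is illustrative and not part of the project.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Phase number -> phases it depends on, copied from the summary table.
PHASE_DEPENDENCIES: Dict[int, Set[int]] = {
    1: set(),
    2: set(),
    3: {1, 2},
    4: {3},
    5: {3, 4},
    6: {3, 4, 5},
}


def schedule_waves(dependencies: Dict[int, Set[int]]) -> List[List[int]]:
    """Group phases into waves; phases in the same wave have no mutual dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the table is cyclic
    waves: List[List[int]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the table above this produces `[[1, 2], [3], [4], [5], [6]]`: only the first two phases can overlap, and everything after that is sequential.
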
Definition of Done (per phase)
- All code written and follows project structure
- All tests pass
- Alembic migrations apply cleanly
- No linting errors
- Documentation updated if interfaces changed
- Verified against design documents