data-entry-app/DATABASE-DESIGN.MD
2026-05-31 20:19:44 +12:00


Database Design

Purpose

This app uses a relational database to support five main concerns:

  1. Raw material pricing and unit conversion.
  2. Mix definitions and mix costing.
  3. Product-level formulas and product costing.
  4. Mix calculator session history.
  5. Access control for both internal users and client users.

The backend defines the schema as SQLAlchemy models in backend/app/models. Tables are created automatically at startup, and lightweight migration/patch logic lives in backend/app/db/migrations.py.

Design Principles

  • tenant_id is the tenancy boundary for most business tables.
  • Reference/master data is stored separately from transactional/session data.
  • Product costing is built from raw materials -> formulas -> products -> outputs.
  • The mix calculator now prefers product-specific ingredient formulas over the shared mix master.
  • The database is designed to run on both SQLite locally and Postgres in production.

High-Level Domains

1. Raw Materials

These tables store ingredients and their price history.

  • raw_materials

    • One row per ingredient/raw material.
    • Stores name, supplier, unit of measure, kg_per_unit, status, and notes.
    • Example: Hulled Oats, White French Millet, Pano.
  • raw_material_price_versions

    • One-to-many from raw_materials.
    • Stores market_value, waste_percentage, effective_date, and status.
    • Lets the system keep historical prices instead of overwriting one current value.

Relationship:

  • raw_materials.id -> raw_material_price_versions.raw_material_id
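The one-to-many price history can be sketched with plain sqlite3 rather than the project's SQLAlchemy models. This is a minimal illustration, not the actual schema: the column types, the 'active' status value, the sample prices, and the price_as_of helper are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_materials (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    kg_per_unit REAL NOT NULL
);
CREATE TABLE raw_material_price_versions (
    id INTEGER PRIMARY KEY,
    raw_material_id INTEGER NOT NULL REFERENCES raw_materials(id),
    market_value REAL NOT NULL,
    waste_percentage REAL NOT NULL DEFAULT 0,
    effective_date TEXT NOT NULL,          -- ISO date, so text order equals date order
    status TEXT NOT NULL DEFAULT 'active'  -- assumed status value
);
""")
conn.execute("INSERT INTO raw_materials (id, name, kg_per_unit) VALUES (1, 'Hulled Oats', 25.0)")
conn.executemany(
    "INSERT INTO raw_material_price_versions "
    "(raw_material_id, market_value, waste_percentage, effective_date) "
    "VALUES (?, ?, ?, ?)",
    [(1, 30.0, 2.0, "2025-01-01"), (1, 32.5, 2.0, "2025-07-01")],  # sample data only
)


def price_as_of(conn, raw_material_id, as_of):
    """Return (market_value, waste_percentage) for the latest active
    version effective on or before as_of, or None if there is none."""
    return conn.execute(
        "SELECT market_value, waste_percentage "
        "FROM raw_material_price_versions "
        "WHERE raw_material_id = ? AND status = 'active' AND effective_date <= ? "
        "ORDER BY effective_date DESC LIMIT 1",
        (raw_material_id, as_of),
    ).fetchone()
```

Because older rows are never overwritten, a costing run dated before 2025-07-01 would still resolve to the earlier price in this sketch.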

2. Mix Master

These tables store shared mix definitions.

  • mixes

    • One row per named mix.
    • Stores client name, mix name, version, status, and notes.
    • This is the shared mix/master-recipe layer.
  • mix_ingredients

    • One-to-many from mixes.
    • One row per raw material inside a mix.
    • Stores quantity_kg for that mix.

Relationships:

  • mixes.id -> mix_ingredients.mix_id
  • raw_materials.id -> mix_ingredients.raw_material_id

Important note:

  • These tables are still used by the mix master pages and as a fallback.
  • They are no longer the primary source for mix calculator formulas when product-specific formulas exist.

3. Products

These tables describe saleable products and their formula rows.

  • products

    • One row per sellable product/SKU.
    • Stores client name, product name, optional item_id, packaging/unit info, margins, and linked mix.
    • mix_id links the product to the shared mix master entry.
  • product_ingredients

    • One-to-many from products.
    • One row per raw material required for that product's formula.
    • Stores quantity_kg, sort_order, and optional notes.
    • This is now the key table for the mix calculator.

Relationships:

  • products.mix_id -> mixes.id
  • products.id -> product_ingredients.product_id
  • raw_materials.id -> product_ingredients.raw_material_id

Why both mix_ingredients and product_ingredients exist:

  • mix_ingredients represents a shared recipe.
  • product_ingredients represents the actual formula used for a specific product.
  • Multiple products can point at the same mix name but still require product-specific formula rows.
  • This solves the workbook case where product labels like Budgie Mix 20kg map to a formula/mix name like Hunter - Budgie Mix.

4. Costing Assumptions

These tables hold non-ingredient costs used in product costing.

  • process_cost_rules

    • Holds grading, bagging, and cracking costs by process_name.
  • packaging_cost_rules

    • Holds bag cost by sale_type, unit_of_measure, and own_bag.
  • freight_cost_rules

    • Holds freight cost by sale_type and unit_of_measure.

These tables are read during product cost calculation after ingredient cost has been resolved.

5. Scenarios and Stored Outputs

  • scenarios

    • Named pricing/costing scenarios.
    • Stores overrides as JSON.
  • costing_results

    • One-to-many from scenarios.
    • Stores calculated output per product for a scenario.
    • Includes prices, warnings, and calculation details as JSON.

Relationships:

  • scenarios.id -> costing_results.scenario_id
  • products.id -> costing_results.product_id

6. Mix Calculator Sessions

These tables store saved calculator runs.

  • mix_calculator_sessions

    • Header row for a calculator run.
    • Stores product, mix, batch size, total bags, total kg, prepared by, and timestamps.
  • mix_calculator_session_lines

    • One-to-many from mix_calculator_sessions.
    • Snapshot of the scaled ingredient rows shown to the user at save time.
    • Stores required_kg, mix_percentage, unit, and display name.

Relationships:

  • mix_calculator_sessions.id -> mix_calculator_session_lines.session_id
  • products.id -> mix_calculator_sessions.product_id
  • mixes.id -> mix_calculator_sessions.mix_id

Important note:

  • Session lines are denormalized snapshots.
  • They are intentionally stored separately so historical saved runs do not change if product formulas are updated later.
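A rough sketch of how a formula might be scaled into snapshot lines at save time. The SessionLine class, the scale_formula helper, and the rounding precision are hypothetical; only the field names required_kg, mix_percentage, unit, and display name come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionLine:
    """Immutable copy of one scaled ingredient row (hypothetical shape)."""
    display_name: str
    required_kg: float
    mix_percentage: float
    unit: str = "kg"


def scale_formula(formula, total_kg):
    """Scale (display_name, quantity_kg) rows to a run of total_kg."""
    base_total = sum(qty for _, qty in formula)
    if base_total <= 0:
        raise ValueError("formula has no positive quantities")
    lines = []
    for name, qty in formula:
        share = qty / base_total
        lines.append(
            SessionLine(
                display_name=name,
                required_kg=round(share * total_kg, 3),
                mix_percentage=round(share * 100, 2),
            )
        )
    return lines


# Sample quantities are illustrative, not taken from the workbook.
formula = [("Hulled Oats", 10.0), ("White French Millet", 30.0)]
snapshot = scale_formula(formula, total_kg=200.0)

# Editing the source formula later does not touch the saved snapshot.
formula[0] = ("Hulled Oats", 99.0)
```

The snapshot holds copied values rather than references to formula rows, which is the property that keeps historical runs stable.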

7. Client Access / Tenant Administration

These tables manage customer-facing users and feature/module access.

  • client_accounts

    • One row per client/tenant account.
  • client_users

    • One-to-many from client_accounts.
    • Customer-side users tied to a client account.
  • client_feature_access

    • One-to-many from client_accounts.
    • Feature flags per client account.
  • client_user_module_permissions

    • One-to-many from client_users.
    • Module-level access levels per client user.
  • client_access_audit_events

    • One-to-many from client_accounts.
    • Audit log for client-access changes.

Relationships:

  • client_accounts.id -> client_users.client_account_id
  • client_accounts.id -> client_feature_access.client_account_id
  • client_accounts.id -> client_access_audit_events.client_account_id
  • client_users.id -> client_user_module_permissions.client_user_id

8. Internal Access Control

These tables are for internal staff login and permissions.

  • users

    • Internal users.
    • Stores per-user password_hash, role link, and active flag.
  • roles

    • Named roles like Admin, Operations, Full Access.
  • permissions

    • Atomic permission keys like view_mix_calculator.
  • role_permissions

    • Many-to-many join table between roles and permissions.

Relationships:

  • roles.id -> users.role_id
  • roles.id <-> permissions.id through role_permissions
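The role-based lookup implied by these relationships could be queried roughly as follows. This uses plain sqlite3 for illustration; the username, is_active, and permission_key column names, the edit_raw_materials key, and the has_permission helper are assumptions, not names confirmed by the models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE permissions (id INTEGER PRIMARY KEY, permission_key TEXT NOT NULL UNIQUE);
CREATE TABLE role_permissions (
    role_id INTEGER NOT NULL REFERENCES roles(id),
    permission_id INTEGER NOT NULL REFERENCES permissions(id),
    PRIMARY KEY (role_id, permission_id)
);
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL,
    role_id INTEGER NOT NULL REFERENCES roles(id),
    is_active INTEGER NOT NULL DEFAULT 1
);
INSERT INTO roles VALUES (1, 'Admin'), (2, 'Operations');
INSERT INTO permissions VALUES (1, 'view_mix_calculator'), (2, 'edit_raw_materials');
INSERT INTO role_permissions VALUES (1, 1), (1, 2), (2, 1);
INSERT INTO users VALUES
    (1, 'alice', 'hash-placeholder', 1, 1),
    (2, 'bob',   'hash-placeholder', 2, 1),
    (3, 'carol', 'hash-placeholder', 1, 0);
""")


def has_permission(conn, user_id, permission_key):
    """True if the user is active and their role grants the permission."""
    row = conn.execute(
        "SELECT 1 FROM users u "
        "JOIN role_permissions rp ON rp.role_id = u.role_id "
        "JOIN permissions p ON p.id = rp.permission_id "
        "WHERE u.id = ? AND u.is_active = 1 AND p.permission_key = ? "
        "LIMIT 1",
        (user_id, permission_key),
    ).fetchone()
    return row is not None
```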

Core Costing Flow

Raw Material Cost

The system calculates ingredient cost from:

  • market_value
  • waste_percentage
  • kg_per_unit

This produces:

  • loss cost
  • adjusted cost per unit
  • cost per kg
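The document does not spell out the arithmetic, so the following is only one plausible reading: waste is treated as a percentage surcharge on market_value, and the adjusted unit cost is divided by kg_per_unit. The function name and these formulas are assumptions and should be checked against the costing code before being relied on.

```python
def raw_material_cost(market_value, waste_percentage, kg_per_unit):
    """Derive loss cost, adjusted cost per unit, and cost per kg.

    Assumed formulas (not confirmed by the source):
      loss_cost              = market_value * waste_percentage / 100
      adjusted_cost_per_unit = market_value + loss_cost
      cost_per_kg            = adjusted_cost_per_unit / kg_per_unit
    """
    if kg_per_unit <= 0:
        raise ValueError("kg_per_unit must be positive")
    loss_cost = market_value * waste_percentage / 100.0
    adjusted_cost_per_unit = market_value + loss_cost
    return {
        "loss_cost": loss_cost,
        "adjusted_cost_per_unit": adjusted_cost_per_unit,
        "cost_per_kg": adjusted_cost_per_unit / kg_per_unit,
    }


# Illustrative numbers: a 25 kg unit priced at 50.00 with 2% waste.
example = raw_material_cost(market_value=50.0, waste_percentage=2.0, kg_per_unit=25.0)
```

An alternative convention divides by (1 - waste) instead of adding a surcharge; the two give slightly different results, which is why the assumption is stated explicitly here.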

Mix Cost

There are now two formula sources:

  1. Preferred: product_ingredients
  2. Fallback: mix_ingredients

For mix calculator and product costing:

  • if a product has rows in product_ingredients, use them
  • otherwise use the linked shared mix from mix_ingredients
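The preference rule above can be expressed as a small lookup. This sketch uses in-memory dictionaries in place of database queries; the resolve_formula name, the argument shapes, and the sample quantities are illustrative assumptions.

```python
def resolve_formula(product_id, mix_id, product_ingredients, mix_ingredients):
    """Return (source_table, rows) for a product.

    product_ingredients: dict of product_id -> list of (raw_material_id, quantity_kg)
    mix_ingredients:     dict of mix_id     -> list of (raw_material_id, quantity_kg)
    """
    rows = product_ingredients.get(product_id)
    if rows:  # product-specific formula exists and is non-empty
        return "product_ingredients", rows
    # Fall back to the shared mix linked through products.mix_id.
    return "mix_ingredients", mix_ingredients.get(mix_id, [])


# Sample data only: product 10 has its own formula, product 11 does not.
product_rows = {10: [(1, 12.0), (2, 8.0)]}
mix_rows = {5: [(1, 10.0), (2, 10.0)]}

preferred = resolve_formula(10, 5, product_rows, mix_rows)
fallback = resolve_formula(11, 5, product_rows, mix_rows)
```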

Product Cost

Product cost is built from:

  1. ingredient formula cost
  2. process costs
  3. packaging cost
  4. freight cost
  5. optional distributor / wholesale margin
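As a hedged sketch of how these components might combine: the example below assumes all costs are additive per sellable unit and that the optional margin is a percentage markup on total cost. Neither assumption is confirmed by this document, and product_cost is a hypothetical helper.

```python
def product_cost(ingredient_cost, process_costs, packaging_cost, freight_cost,
                 margin_pct=None):
    """Combine cost components for one sellable unit.

    process_costs: dict of process_name -> cost, e.g. grading, bagging, cracking.
    margin_pct:    optional markup on total cost (assumed interpretation).
    """
    total_cost = (
        ingredient_cost
        + sum(process_costs.values())
        + packaging_cost
        + freight_cost
    )
    if margin_pct is None:
        price = total_cost
    else:
        price = total_cost * (1 + margin_pct / 100.0)
    return {"total_cost": total_cost, "price": price}


# Illustrative numbers only.
result = product_cost(
    ingredient_cost=40.0,
    process_costs={"grading": 1.5, "bagging": 2.0},
    packaging_cost=1.0,
    freight_cost=3.5,
    margin_pct=25.0,
)
```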

Workbook Import Design

The seed/import logic is in backend/app/seed.py.

There are now two workbook roles:

  • Legacy costing workbook:
    • C- Raw Products Costs
    • M - All
    • Product Cost - Price
  • Product-formula workbook:
    • input_data/1.xlsx
    • sheet mix_quantites_per_client_per_pr

Current Import Behaviour

  • Raw materials are seeded from the legacy costing workbook.
  • Shared mixes are seeded from the legacy costing workbook.
  • Products are seeded from the legacy costing workbook.
  • Product-specific formulas are seeded from mix_quantites_per_client_per_pr.

Formula Matching Rule

Workbook formula rows are attached to products using:

  1. (client_name, product.name) if it matches directly.
  2. (client_name, product.mix.name) if the workbook row uses the mix/formula name instead of the sellable product label.

This is important for cases like:

  • workbook formula: HunterBird / Hunter - Budgie Mix
  • product row: HunterBird / Budgie Mix 20kg

Several product SKUs can inherit the same formula through the linked mix name.
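The two-step matching rule can be sketched as an ordered key lookup. The match_formula function and the dictionary layout are assumptions for illustration, the ingredient quantities are made up, and the real logic in backend/app/seed.py may differ.

```python
def match_formula(workbook_formulas, client_name, product_name, mix_name):
    """Find workbook formula rows for a product.

    workbook_formulas: dict keyed by (client_name, formula_name).
    Tries the product name first, then the linked mix name.
    """
    for key in ((client_name, product_name), (client_name, mix_name)):
        if key in workbook_formulas:
            return workbook_formulas[key]
    return None


# The workbook labels the formula by mix name, not by sellable product label.
workbook_formulas = {
    ("HunterBird", "Hunter - Budgie Mix"): [
        ("White French Millet", 12.0),  # quantities are illustrative
        ("Hulled Oats", 8.0),
    ],
}

rows = match_formula(
    workbook_formulas,
    client_name="HunterBird",
    product_name="Budgie Mix 20kg",
    mix_name="Hunter - Budgie Mix",
)
```

Here the direct (client_name, product.name) key misses, and the rows are found through the mix name on the second attempt.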

Tenancy

Most business tables include tenant_id.

This includes:

  • raw materials
  • price versions
  • mixes
  • mix ingredients
  • product ingredients
  • products
  • scenarios
  • costing results
  • mix calculator sessions and lines
  • client-access tables
  • assumption tables

Startup migration logic backfills tenant_id where possible by deriving it from related parent tables.
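A backfill of this kind is typically a correlated UPDATE from the parent table. The sketch below shows the idea for mix_ingredients using sqlite3; the integer tenant_id type, the sample rows, and the backfill_tenant_id helper are assumptions, not the contents of backend/app/db/migrations.py.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mixes (id INTEGER PRIMARY KEY, tenant_id INTEGER, name TEXT NOT NULL);
CREATE TABLE mix_ingredients (
    id INTEGER PRIMARY KEY,
    mix_id INTEGER NOT NULL REFERENCES mixes(id),
    tenant_id INTEGER,
    quantity_kg REAL NOT NULL
);
INSERT INTO mixes VALUES (1, 7, 'Hunter - Budgie Mix'), (2, NULL, 'Untenanted Mix');
INSERT INTO mix_ingredients VALUES
    (1, 1, NULL, 12.0),  -- can be derived from mix 1
    (2, 1, 7,    8.0),   -- already set, left alone
    (3, 2, NULL, 5.0);   -- parent has no tenant, stays NULL
""")


def backfill_tenant_id(conn):
    """Copy tenant_id from the parent mix where the child row lacks one.

    Returns the number of rows updated; running it again is a no-op.
    """
    cur = conn.execute("""
        UPDATE mix_ingredients
        SET tenant_id = (
            SELECT m.tenant_id FROM mixes m WHERE m.id = mix_ingredients.mix_id
        )
        WHERE tenant_id IS NULL
          AND mix_id IN (SELECT id FROM mixes WHERE tenant_id IS NOT NULL)
    """)
    conn.commit()
    return cur.rowcount
```

The statement sticks to a correlated subquery, which both SQLite and Postgres accept, and the WHERE clause keeps it safe to run on every startup.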

Visibility Rules

The products.visible flag is used to hide client/product rows from normal UI paths.

Startup migration logic also auto-hides products for a configured list of client names in backend/app/db/migrations.py.

This means:

  • rows can exist in the database
  • but not be offered in normal mix calculator/product selection flows
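The hide-then-filter behaviour could look roughly like this. The client name in the hidden list, the sample products, and both helper functions are hypothetical; only the products.visible flag and the idea of a configured client list come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    client_name TEXT NOT NULL,
    name TEXT NOT NULL,
    visible INTEGER NOT NULL DEFAULT 1
);
INSERT INTO products (client_name, name) VALUES
    ('HunterBird', 'Budgie Mix 20kg'),
    ('Internal Test Client', 'Sample Mix 5kg');
""")

HIDDEN_CLIENT_NAMES = ["Internal Test Client"]  # hypothetical configured list


def auto_hide_products(conn, client_names):
    """Set visible = 0 for products of the given clients; return rows changed."""
    if not client_names:
        return 0
    placeholders = ", ".join("?" for _ in client_names)
    cur = conn.execute(
        f"UPDATE products SET visible = 0 "
        f"WHERE visible = 1 AND client_name IN ({placeholders})",
        list(client_names),
    )
    conn.commit()
    return cur.rowcount


def selectable_products(conn):
    """Products offered in normal selection flows: visible rows only."""
    return [r[0] for r in conn.execute(
        "SELECT name FROM products WHERE visible = 1 ORDER BY name"
    )]
```

Hidden rows stay in the table, so existing sessions and costing results that reference them remain valid.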

Transaction vs Reference Data

Reference/master data:

  • raw_materials
  • raw_material_price_versions
  • mixes
  • mix_ingredients
  • products
  • product_ingredients
  • process_cost_rules
  • packaging_cost_rules
  • freight_cost_rules
  • access-control tables

Transactional/snapshot data:

  • mix_calculator_sessions
  • mix_calculator_session_lines
  • scenarios
  • costing_results
  • client_access_audit_events

Important Constraints

  • mix_ingredients is unique on (mix_id, raw_material_id).
  • product_ingredients is unique on (product_id, raw_material_id).
  • client_users is unique on (client_account_id, email).
  • client_feature_access is unique on (client_account_id, feature_key).
  • client_user_module_permissions is unique on (client_user_id, module_key).
  • mix_calculator_sessions is unique on (tenant_id, session_number).

These constraints prevent duplicate ingredient rows, access rows, and session numbers within the same parent scope.
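To illustrate how a composite unique constraint behaves, here is a minimal sqlite3 sketch for product_ingredients. The column types and the add_ingredient helper are assumptions; the real constraint is declared on the SQLAlchemy model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE product_ingredients (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,
    raw_material_id INTEGER NOT NULL,
    quantity_kg REAL NOT NULL,
    UNIQUE (product_id, raw_material_id)
)
""")


def add_ingredient(conn, product_id, raw_material_id, quantity_kg):
    """Insert one formula row; return False if the pair already exists."""
    try:
        conn.execute(
            "INSERT INTO product_ingredients "
            "(product_id, raw_material_id, quantity_kg) VALUES (?, ?, ?)",
            (product_id, raw_material_id, quantity_kg),
        )
        return True
    except sqlite3.IntegrityError:
        # The database rejected a second row for the same
        # (product_id, raw_material_id) pair.
        return False


first = add_ingredient(conn, 10, 1, 12.0)
duplicate = add_ingredient(conn, 10, 1, 5.0)       # same product, same material
other_product = add_ingredient(conn, 11, 1, 12.0)  # same material, different product
```

The same raw material can appear in many products, but only once per product.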

Known Tradeoffs

  • RawMaterial.name is globally unique, not tenant-scoped. That is simple for now, but stricter than a multi-tenant design usually wants.
  • Product.mix_id is still required even though product-specific formulas now exist. That is useful for compatibility and navigation, but it means a product currently has both a shared mix link and potentially its own formula rows.
  • Some calculation outputs are denormalized into session/result tables for stability and history.
  • Migration logic is startup-driven and pragmatic rather than using a full migration framework like Alembic.

Use this as the working model of the schema:

  • raw_materials = ingredients
  • raw_material_price_versions = ingredient pricing history
  • mixes = shared recipe labels
  • mix_ingredients = shared recipe lines
  • products = saleable SKUs
  • product_ingredients = actual formula for a SKU
  • mix_calculator_sessions + lines = saved production calculations
  • scenarios + costing_results = stored what-if pricing outputs
  • client_* tables = client account access
  • users / roles / permissions = internal staff access

Files To Read Alongside This Document