Aequalis - Healthcare Price Comparison

User Name

User

Account Details

Username—

Email—

Role—

Organization—

Two-Factor—

Member Since—

Last Login—

Account & Settings

Full-Text Search

Search across all healthcare services with intelligent matching and filtering

Advanced Filters

Index:

Billing Code Type:

Billing Class:

Sort By:

Min Price ($):

Max Price ($):

Rate Estimator

Estimates derived from current Transparency in Coverage and Hospital Price Transparency files

Enter Procedure Information

Billing Code *

CPT, HCPCS, or DRG code

Location (Optional)

City within the selected state

Advanced Options

Minimum Quality Score

Filter low-quality data (0-100, 0=all data)

Network (Optional)

Filter by network type

Exclude ghost rates

Cost Estimator Guide

Cost Metrics

Median Rate: The middle value - half of providers charge more, half charge less. Most reliable estimate.
Mean Rate: Average of all rates. Can be skewed by outliers.
10th Percentile: Low-cost benchmark - 90% of providers charge more than this.
90th Percentile: High-cost benchmark - only 10% of providers charge more.

Confidence Score

80-100% (High): Large sample, consistent rates, high-quality data. Very reliable.
60-79% (Medium): Moderate sample, some variance. Generally reliable.
0-59% (Low): Small sample or high variance. Use with caution.
Formula: 50% sample size + 30% quality + 20% consistency

Value Score (Recommendations)

Formula: 60% Quality Score + 40% Affordability Score
Quality (0-100): Based on data plausibility and reliability metrics
Affordability (0-100): Inverse normalized rate (lower cost = higher score)
Higher is better: 90+ is excellent, 70-89 is good, 50-69 is fair

Provider Table Columns

Name: Provider organization or individual name (from NPPES registry)
Classification: Provider type (e.g., Medical Supply Company, DME Supplier)
Location: City and state where provider operates
Rate: Negotiated rate for this service
Percentile: Where this rate ranks vs all providers (lower is cheaper)
Quality: Data quality score (0-100). 50 = no data available.

Recommendation Badges

BEST VALUE: Highest overall value score (quality + affordability)
HIGH QUALITY: Quality score above 70
LOW COST: Rate in bottom 25% (25th percentile or lower)

Search Filters

Billing Code: CPT, HCPCS, or DRG code (required)
Min Quality: Filter out providers below this quality threshold
Exclude Ghost Rates: Remove potentially inaccurate $0.01-$1 rates
Network: Filter by insurance network type (PPO, HMO, etc.)

Tip: Click any column header to sort the provider table. Click the NPI link to view provider details in the NPPES registry.

Service Name

Billing Code Info

Estimated Cost Range

Min Max

MEDIAN COST

AVERAGE COST

TYPICAL RANGE

$0 - $0

25th - 75th percentile

Sample Size: 0 rates from 0 providers

Data Quality: 0/100

Locations: 0 cities

Provider Comparison (Top 20)

About Out-of-Network Services

Each card below represents a healthcare service or procedure (identified by billing codes like CPT, HCPCS, or DRG). These cards show the allowed amounts - the maximum amount an insurance plan will pay for out-of-network services.

What you'll see: Service name, billing code, average/min/max allowed amounts, billed charges, billing class (professional/institutional), provider count, and reporting entity information. Click any card to see detailed pricing records.

About In-Network Negotiated Rates (Click to expand)

Each card below represents a healthcare service or procedure with pre-negotiated rates between insurance plans and provider networks. These are the actual contracted prices that in-network providers have agreed to accept.

Multi-Rate Intelligence Features:

Rate Type Badges: Each service displays its negotiation type - Negotiated (fee-for-service), Bundle (bundled payment), Capitation (per-member), Derived (calculated), or Fee Schedule (standardized)
Expiration Tracking: Rates are automatically filtered to hide expired contracts by default. Services show expiration dates with color-coded warnings: red for expired, orange for expiring within 30 days, gray for valid
Rate Comparison: Services with multiple rates display an expandable section showing all rate variations, including each rate's type, amount, provider count, expiration status, and summary statistics (range, average, active count)
Advanced Filtering: Filter by specific rate types or toggle expired rates visibility. Use quality filters to show only rates from verified providers with NPPES taxonomy data
Quality Indicators: Badges indicate provider verification levels. Ghost rates (unverified providers) can be excluded for higher data quality

CMS Transparency in Coverage Schema:

Service Details: Billing code, code type, code version, service name, description
Negotiation Arrangement: Fee-for-service (FFS), bundled payments, or capitation
Rate Information: Negotiated rates (dollar amounts or percentages), billing class, expiration dates
Service Modifiers: Place of service codes, billing code modifiers, additional information
Provider Networks: Tax Identification Numbers (TINs), business names, network names, NPI lists
Special Cases: Severity of illness (for DRG codes), bundled codes, covered services (for capitation)

Note: Plan-level information (which specific insurance plans use these rates) is stored in separate Table of Contents files. Each data file can be associated with multiple insurance plans (plan name, plan ID, issuer name, market type) as defined in the CMS reporting structure.

Advanced Options

Filter by Location

State:

City:

Zip Code:

Filter by Insurance Network

Network Name:

Filter rates by specific insurance network (e.g., PPO, HMO, EPO)

Data Quality Filters

Show only high-quality rates (verified providers) (slower query) Hide expired rates

Hospital Price Transparency

Compare negotiated rates across 76 BCBS Michigan in-network hospitals

About Hospital Data: This data shows hospitals with BCBS Michigan contracts, extracted from Transparency in Coverage files.

Currently showing 76 hospitals with 100% having negotiated rates

Filter Hospitals

Hospital Type

Has Rates

Sort By

Per Page

Geographic Rate Analysis

Explore how healthcare service rates vary by location across Michigan and beyond

About Geographic Analysis: This tool visualizes provider locations and rate variations across different cities and regions.

Loading provider location data...

Select Service to Analyze

Service Code Type

Billing Code

State Filter

Interactive Rate Map

Tip: Click clusters to zoom in and see individual locations. At zoom level 12+, all markers are shown separately.

Data Analytics & Visualization

Explore pricing trends, build dashboards, and analyze healthcare data with Kibana

Getting Started: Kibana is loading below. If you see a blank screen, it may take 60-90 seconds to fully initialize. You can also open Kibana directly

Platform Dashboard

API

Checking...

PostgreSQL

Checking...

Elasticsearch

Checking...

Operations

Idle

Databases

—

Records

—

Users

—

Files Processed

—

Notifications & Alerts

Server Containers

Server Resources

User Activity

API Usage

User Management

Total Users

Admins

Active

User	Email	Organization	Role	2FA	Status	Last Login	Actions
Loading users...

Pending Registrations 0

Loading pending registrations...

Password Reset Requests 0

Users who proved their identity with 2FA and asked an admin to reset their password. Resolving generates a one-time temporary password the user must change on next login.

Loading password reset requests...

Role & Permission Management

Manage user roles and their permissions. Roles determine what features and data users can access.

Loading roles...

Available Permissions

Organization Management

Loading organizations...

New Customer Setup

Organization

License

Databases

Invite

Create the organization that will hold users, databases, and licensing.

Organization Name *

Slug * (URL-safe identifier)

Contact Name

Contact Email

Description

Choose a license tier. Limits can be adjusted later.

Tier

Max Users

Max Databases

Max API Keys

Expires (optional)

Notes

Grant access to payer databases. You can add more later.

Loading databases...

Send invite(s) to initial users. They'll receive a token to register.

Role

✓

Customer Created!

License Management

Available Tiers

Loading tiers...

Organization Licenses

System Health

Branding & Theme

Customize the platform appearance including logo, brand name, and color theme.

Brand Identity

Platform Name

Brand Icon (Emoji or URL)

Enter an emoji () or image URL

Tagline

Color Theme

Primary Color

Secondary Color

Accent Color

Quick Presets

Live Preview

Aequalis

Healthcare Price Comparison

Custom CSS Styles

Warning: Changes affect ALL users. Always test with Preview first before saving.

CSS Editor

Example Snippets & Quick Reference

Common Selectors

.header - Main navigation bar

.tab-content - Tab content areas

.search-box - Search input boxes

table th - Table headers

table td - Table cells

.toast - Toast notifications

Dark Header

.header {
  background: #1a1a2e !important;
}

Extra Rounded Cards

[style*="border-radius: 12px"] {
  border-radius: 20px !important;
}

Custom Font

body {
  font-family: 'Inter', sans-serif !important;
}

Larger Table Text

table {
  font-size: 1.1em !important;
}
table td, table th {
  padding: 16px !important;
}

Pink Accent Buttons

button[style*="background: #10b981"] {
  background: #ec4899 !important;
}

Hide Admin Tab

/* Hide for non-admins */
#admin-panel { display: none; }
.tab-button[data-tab="admin"] {
  display: none;
}

Click on a payer card to expand and see database details. Each payer has a raw database (for ETL processing) and a production database (for user queries).

Loading databases...

Database Operations

These operations affect the active database only. Reset File Tracking clears download history. Reset Database deletes all data.

Docker Containers

Loading container status...

Data Processing Monitor

Click "Start Monitoring" to track data processing in real-time

Database Statistics

Database Architecture

Database Overview

Total Tables

Total Records

50+

Indexes

Fully Enriched

Core Tables

Table	Records	Purpose
in_network_services	Loading...	Medical services/procedures with quality scores
provider_groups	Loading...	Provider organizations (enriched with NPPES)
negotiated_rate_provider_groups	Loading...	Junction table: Service Provider links
nppes_providers	Loading...	National provider lookup (taxonomy enrichment)
negotiated_prices	Loading...	Actual dollar amounts for services

Performance Note: With many provider groups, every query MUST use WHERE filters and LIMIT clauses to prevent memory exhaustion!

Complete Table List (19 Tables)

in_network_services

negotiated_rates

negotiated_prices

provider_groups

negotiated_rate_provider_groups

nppes_providers

reporting_entities

plans

allowed_amounts

allowed_amount_providers

provider_facilities

providers

services

tax_identifiers

service_facility_link

enrichment_log

file_processing_log

rate_pattern_flags

npi_registry_cache

Entity Relationship Diagram

Complete visual representation of all table relationships

┌─────────────────────────────────────────────────────────────────────────────────────┐
│                           INSURANCE PAYER HIERARCHY                                 │
└─────────────────────────────────────────────────────────────────────────────────────┘

    ┌──────────────────────────┐
    │  reporting_entities      │  Insurance Company (BCBS Michigan)
    │  ──────────────────────  │
    │  • id (PK)               │
    │  • name                  │
    │  • entity_type           │
    └────────────┬─────────────┘
                 │ 1:N
                 
    ┌──────────────────────────┐
    │  plans                   │  Insurance Plans (HMO, PPO, etc.)
    │  ──────────────────────  │
    │  • id (PK)               │
    │  • reporting_entity_id   │ (FK)
    │  • plan_name             │
    └────────────┬─────────────┘
                 │ 1:N
                 

┌─────────────────────────────────────────────────────────────────────────────────────┐
│                      CORE SERVICE & PRICING DATA                                    │
└─────────────────────────────────────────────────────────────────────────────────────┘

    ┌─────────────────────────────────────────────────┐
    │  in_network_services                            │  Medical Services
    │  ───────────────────────────────────────────    │
    │  • id (PK)                              418,652 │
    │  • billing_code (CPT/HCPCS)                     │
    │  • name                                         │
    │  • plausibility_score                          │
    │  • plausibility_flags                          │
    └───────────────────────┬─────────────────────────┘
                            │ 1:N
                            
    ┌─────────────────────────────────────────────────┐
    │  negotiated_rates                       418,652 │
    └───────────────┬──────────────────┬──────────────┘
                    │ 1:N              │ 1:N
                                      
    ┌──────────────────────────┐    ┌────────────────────────────────────┐
    │  negotiated_prices       │    │  negotiated_rate_provider_groups   │
    │  • negotiated_rate_value │   │                          34,011,942│
    └──────────────────────────┘    └───────────────┬────────────────────┘
                                                    │ N:1
                                                    
┌─────────────────────────────────────────────────────────────────────────────────────┐
│                          PROVIDER INFORMATION                                       │
└─────────────────────────────────────────────────────────────────────────────────────┘

                    ┌────────────────────────────────────────────┐
                    │  provider_groups                 34,011,942│  Providers
                    │  • tin_value, npi_list                     │
                    │  • primary_taxonomy_code                  │
                    │  • primary_specialization                 │
                    │  • city, state, zip_code                   │
                    └────────────────────────────────────────────┘
                                        
                                        │ Enrichment (JOIN on NPI)
                                        │
                    ┌────────────────────────────────────────────┐
                    │  nppes_providers                 9,236,343 │
                    │  • npi (PK)                                │
                    │  • provider_taxonomy_code                  │
                    │  • provider_specialization                 │
                    └────────────────────────────────────────────┘

Key Insight: The junction table negotiated_rate_provider_groups with 34M records connects services to providers, creating a many-to-many relationship. This is why queries must be carefully optimized!

Entity Relationships

Insurance Hierarchy (1:N:N)

reporting_entities (BCBS Michigan)
     1:N
plans (HMO, PPO, Medicare Advantage)
     1:N
in_network_services (418,652 services)

Service Pricing (1:N:N)

in_network_services (e.g., "MRI Brain - CPT 70543")
     1:N
negotiated_rates (rate containers)
     1:N                     1:N
negotiated_prices      negotiated_rate_provider_groups (34M links!)
    ($ amounts)               N:1
                        provider_groups (34M organizations)

Provider Enrichment (N:1 Lookup)

provider_groups.npi_list [NPI1, NPI2, ...]
     JOIN
nppes_providers.npi (9.2M records)
     COPY TAXONOMY FIELDS
provider_groups:
    • primary_taxonomy_code
    • primary_taxonomy_name
    • primary_specialization
    • primary_classification
    • taxonomy_enriched = true

Foreign Key Constraints:

CASCADE DELETE: Deleting a service auto-deletes all its rates, prices, and provider links
RESTRICT: Cannot delete reporting entities/plans if services exist
REFERENTIAL INTEGRITY: All relationships enforced at database level

Detailed Foreign Key Behavior

CASCADE DELETE CHAIN:

DELETE FROM in_network_services WHERE id = 123
   CASCADE
DELETE FROM negotiated_rates WHERE service_id = 123
   CASCADE                            CASCADE
DELETE FROM negotiated_prices    DELETE FROM negotiated_rate_provider_groups
WHERE negotiated_rate_id IN ...  WHERE negotiated_rate_id IN ...

Result: One service deletion removes all associated rates, prices, and provider associations

RESTRICT CONSTRAINTS:

in_network_services.reporting_entity_id reporting_entities.id (ON DELETE RESTRICT)
in_network_services.plan_id plans.id (ON DELETE RESTRICT)

Cannot delete insurance plans or reporting entities while services reference them - prevents orphaned data

Visual Relationship Tree Example:

BCBS Michigan (Reporting Entity)
└── Plan A - PPO Network
    └── Service: MRI Brain (CPT 70543)
        ├── Negotiated Rate #1
        │   ├── Price: $500 (in-network)
        │   └── Provider Group #1
        │       ├── Name: Henry Ford Hospital
        │       ├── Location: Detroit, MI
        │       ├── Taxonomy: 282N00000X (Hospital)
        │       └── Plausibility Score: 85/100 
        └── Negotiated Rate #2
            ├── Price: $750 (in-network)
            └── Provider Group #2
                ├── Name: Michigan Medicine
                ├── Location: Ann Arbor, MI
                ├── Taxonomy: 282N00000X (Hospital)
                └── Plausibility Score: 92/100

Index Strategy

in_network_services (418K records)

idx_in_network_services_billing_code (btree) Fast CPT/HCPCS lookup
idx_in_network_services_plausibility (btree) Quality filtering
idx_in_network_services_description_gin (GIN) Full-text search

provider_groups (34M records!) CRITICAL

Heavily indexed for performance on massive dataset:

idx_provider_groups_tin_value (btree) Provider lookup by Tax ID
idx_provider_groups_city (btree) Geographic queries
idx_provider_groups_state (btree) State filtering
idx_provider_groups_taxonomy_code (btree) Specialty filtering
idx_provider_groups_hospital_taxonomy (partial WHERE taxonomy ~~ '282%') Hospital-specific

negotiated_prices (418K records)

idx_negotiated_prices_rate_value (btree) Price filtering/sorting
idx_negotiated_prices_negotiated_type (btree) Rate type filtering

Critical Rule: With 34M records in provider_groups, EVERY query MUST use an indexed WHERE clause or the database will run out of memory!

Data Pipeline Flow

Phase 1: Data Ingestion

                                        [CMS Transparency Files]  src/parsers/cms_parser.py
                                    
                                         Extract JSON Data

                                        ├─ reporting_entities (Insurance company)

                                        ├─ plans (Insurance plans)

                                        ├─ in_network_services (418K services)

                                        ├─ negotiated_rates (Rate containers)

                                        ├─ negotiated_prices (Dollar amounts)

                                        └─ provider_groups (34M records created)

Phase 2: NPPES Enrichment

                                        [NPPES Data File - 10.29GB CSV]  src/scripts/load_nppes_data.py
                                    
                                         Load 9,236,343 providers into nppes_providers table

                                         src/scripts/enrich_provider_groups.py

                                         Match provider_groups.npi_list  nppes_providers.npi

                                        ├─ Copy primary_taxonomy_code

                                        ├─ Copy primary_taxonomy_name

                                        ├─ Copy primary_specialization

                                        ├─ Copy primary_classification

                                        └─ Set taxonomy_enriched = true 
                                    
                                        Result: 34M provider groups enriched with specialty data

Phase 3: Quality Scoring

                                        src/scripts/score_plausibility.py + taxonomy_validator.py
                                    
                                        For each service (418K):

                                        ├─ Get provider taxonomies

                                        ├─ Get service category (CPT code range)

                                        ├─ Check specialty match

                                        └─ Calculate plausibility_score (0-100)
                                    
                                        Distribution: 37.54% Good | 22.64% Fair | 39.58% Questionable

Phase 4: Elasticsearch Indexing

                                        src/elasticsearch/indexer.py
                                    
                                        Aggregate by billing_code:

                                        ├─ MIN(rate), MAX(rate), AVG(rate)

                                        ├─ COUNT(providers)

                                        └─ Nested provider/pricing arrays
                                    
                                        Result: 3,869 unique billing codes indexed  Kibana dashboards

Performance Best Practices

DO: Always Filter by Indexed Columns

SELECT * FROM provider_groups
WHERE state = 'MI'              --  Uses idx_provider_groups_state
  AND city = 'DETROIT'          --  Uses idx_provider_groups_city
  AND taxonomy_enriched = true  --  Uses idx_provider_groups_enriched
LIMIT 1000;                     --  Prevents memory exhaustion

DO: Use CTEs for Complex Queries

WITH filtered_services AS (
  SELECT id FROM in_network_services 
  WHERE billing_code IN ('99212', '70543')
    AND plausibility_score >= 75
)
SELECT pg.tin_value, AVG(np.negotiated_rate_value)
FROM provider_groups pg
  JOIN ... ON ...
  JOIN filtered_services fs ON ...
WHERE pg.state = 'MI'
GROUP BY pg.tin_value
LIMIT 100;

DON'T: Full Table Scans on 34M Records

--  BAD: Will crash with out of memory!
SELECT * FROM provider_groups;

--  BAD: No WHERE filter on 34M rows
SELECT AVG(negotiated_rate_value) 
FROM negotiated_rate_provider_groups;

DON'T: Complex Aggregations Without Filters

--  BAD: GROUP BY on 34M records without WHERE
SELECT provider_group_id, COUNT(*), AVG(rate)
FROM negotiated_rate_provider_groups
GROUP BY provider_group_id;  -- Database runs out of memory!

DO: Use EXISTS Instead of IN

--  GOOD: EXISTS stops at first match (faster)
SELECT s.billing_code, s.name
FROM in_network_services s
WHERE EXISTS (
  SELECT 1 FROM negotiated_rate_provider_groups nrpg
    JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
  WHERE nrpg.negotiated_rate_id = (
    SELECT id FROM negotiated_rates WHERE service_id = s.id
  )
  AND pg.state = 'MI'
  AND pg.city = 'Detroit'
);

DON'T: Use IN with Large Subqueries

--  BAD: IN evaluates full subquery (slower)
SELECT s.billing_code, s.name
FROM in_network_services s
WHERE s.id IN (
  SELECT service_id FROM negotiated_rates nr
    JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
    JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
  WHERE pg.state = 'MI'  -- Still processes all 34M rows!
);

DO: Use Proper Index Hints

--  GOOD: Sargable query (uses index)
SELECT * FROM provider_groups
WHERE primary_taxonomy_code = '207R00000X'  -- Exact match
  AND state = 'MI';

--  GOOD: Prefix LIKE (uses index)
SELECT * FROM provider_groups
WHERE primary_taxonomy_code LIKE '282%'  -- Hospital codes
  AND city = 'Detroit';

DON'T: Use Non-Sargable Queries

--  BAD: Leading wildcard can't use index
SELECT * FROM provider_groups
WHERE primary_taxonomy_code LIKE '%00000X';  -- Full scan!

--  BAD: Function on indexed column
SELECT * FROM provider_groups
WHERE LOWER(state) = 'mi';  -- Index not used!

--  GOOD: Store data in consistent case
SELECT * FROM provider_groups
WHERE state = 'MI';  -- Uses index

Advanced: Materialized Views

For frequently-run aggregations on 34M records, consider materialized views:

-- Create materialized view (refreshed periodically)
CREATE MATERIALIZED VIEW provider_state_summary AS
SELECT 
  state,
  COUNT(*) as provider_count,
  COUNT(*) FILTER (WHERE taxonomy_enriched = true) as enriched_count,
  COUNT(DISTINCT primary_taxonomy_code) as unique_specialties
FROM provider_groups
GROUP BY state;

CREATE INDEX ON provider_state_summary(state);

-- Query is instant (no 34M row scan!)
SELECT * FROM provider_state_summary WHERE state = 'MI';

Golden Rules for 34M Record Queries

Always use WHERE clauses with indexed columns (state, city, taxonomy_code)
Always use LIMIT to cap result sets
Filter early with CTEs before complex JOINs
Use EXISTS instead of IN for subqueries (stops at first match)
Avoid LIKE '%text' (leading wildcards can't use indexes)
Don't use functions on indexed columns in WHERE clauses
Consider materialized views for repeated complex aggregations
Test with EXPLAIN ANALYZE to verify index usage

Quality Scoring System

Plausibility Score (0-100)

Validates whether provider specialty matches service type

Score Ranges

90-100: Excellent

Perfect specialty match (e.g., Cardiologist doing heart surgery)

75-89: Good

Compatible provider-service match

50-74: Fair

Neutral/generic codes (DME, equipment)

25-49: Questionable

Specialty mismatch (e.g., Dentist doing orthopedic surgery)

Actual Distribution (418,652 services)

0.24%

Excellent (1,024)

37.54%

Good (157,144)

22.64%

Fair (94,796)

39.58%

Questionable (165,688)

Validation Logic

                            CPT Code Ranges:

                            • Surgery: 10000-69999

                            • Radiology: 70000-79999

                            • Lab: 80000-89999

                            • Medicine: 90000-99999

                            Scoring Algorithm:

                            1. Base score: 50 (neutral)

                            2. If provider classification matches service: +25

                            3. If specialty matches (e.g., Imaging for MRI): +25

                            4. Special rules (hospitals can do anything): 85

                            5. Max score: 100

Why This Matters: Quality scores help identify potentially questionable provider-service combinations, improving data reliability for price comparisons and network design.

Real-World Query Examples

1 Find Cheapest Provider for Service

SELECT 
  COALESCE(npp.provider_organization_name, npp.provider_name) as provider,
  pg.city, pg.state,
  pg.primary_specialization,
  np.negotiated_rate_value as rate,
  ins.plausibility_score
FROM in_network_services ins
  JOIN negotiated_rates nr ON ins.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
  JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
  LEFT JOIN nppes_providers npp ON SPLIT_PART(pg.npi_list, ',', 1) = npp.npi
WHERE ins.billing_code = '70543'  -- MRI Brain
  AND pg.state = 'MI'
  AND ins.plausibility_score >= 75  -- Only quality matches
ORDER BY np.negotiated_rate_value ASC
LIMIT 10;

Indexes Used: billing_code, state, plausibility_score

2 Provider Cost Efficiency Score

WITH market_avg AS (
  SELECT AVG(np.negotiated_rate_value) as avg_rate
  FROM in_network_services ins
    JOIN negotiated_rates nr ON ins.id = nr.service_id
    JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  WHERE ins.billing_code = '99212'
)
SELECT 
  pg.tin_value,
  COALESCE(npp.provider_organization_name, 'Unknown') as provider_name,
  pg.city,
  AVG(np.negotiated_rate_value) as provider_avg,
  ma.avg_rate as market_avg,
  ROUND((AVG(np.negotiated_rate_value) - ma.avg_rate) / ma.avg_rate * 100, 1) as pct_vs_market
FROM provider_groups pg
  LEFT JOIN nppes_providers npp ON SPLIT_PART(pg.npi_list, ',', 1) = npp.npi
  JOIN negotiated_rate_provider_groups nrpg ON pg.id = nrpg.provider_group_id
  JOIN negotiated_rates nr ON nrpg.negotiated_rate_id = nr.id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN in_network_services ins ON nr.service_id = ins.id
  CROSS JOIN market_avg ma
WHERE ins.billing_code = '99212'
  AND pg.state = 'MI'
GROUP BY pg.tin_value, npp.provider_organization_name, pg.city, ma.avg_rate
ORDER BY pct_vs_market ASC
LIMIT 20;

3 Geographic Price Variation

SELECT 
  pg.city,
  pg.state,
  COUNT(DISTINCT pg.id) as provider_count,
  MIN(np.negotiated_rate_value) as min_rate,
  AVG(np.negotiated_rate_value) as avg_rate,
  MAX(np.negotiated_rate_value) as max_rate,
  MAX(np.negotiated_rate_value) - MIN(np.negotiated_rate_value) as price_spread
FROM in_network_services ins
  JOIN negotiated_rates nr ON ins.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
  JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
WHERE ins.billing_code = '70543'
  AND pg.state = 'MI'
  AND pg.city IS NOT NULL
GROUP BY pg.city, pg.state
HAVING COUNT(DISTINCT pg.id) >= 3
ORDER BY price_spread DESC;

Pro Tip: These queries all use indexed columns (billing_code, state, city) and include LIMIT clauses to ensure good performance even with 34M records!

Key Design Decisions

Why Junction Table?

Without Junction Table (Bad):

negotiated_rates.provider_list = "[group1, group2, ...]"

Can't query efficiently
Can't enforce foreign keys
Can't index properly

With Junction Table (Good):

negotiated_rate_provider_groups:
  rate_id | provider_group_id
  --------|------------------
  1       | 12345
  1       | 67890

Fast lookups in both directions
Referential integrity maintained
Proper indexing on both FKs

Why Separate negotiated_prices?

Reason: One rate can have multiple price variations

                                    Rate ID 1 (CPT 99212 - Office Visit):

                                    ├─ Price 1: $75 (professional, office)

                                    ├─ Price 2: $120 (facility, hospital)

                                    └─ Price 3: $90 (professional, telehealth)

Supports: Different billing classes, settings, and modifiers for the same service

Why Denormalize Taxonomy?

Alternative (Normalized):

provider_groups.taxonomy_id  taxonomy_reference_table.id

Current Design (Denormalized):

provider_groups:
  • primary_taxonomy_code
  • primary_taxonomy_name
  • primary_specialization
  • primary_classification

Advantages:

Faster queries (no extra JOIN)
Direct filtering on specialization
Better index performance
Simpler application logic

Trade-off:

Data duplication (acceptable for analytics)
Read performance >> Storage cost

Design Philosophy

All design decisions optimize for read performance and query simplicity in an analytics workload. With 34M+ provider records, every decision prioritizes fast filtering and aggregation over storage efficiency.

Data Volume Summary

Comprehensive overview of record counts across all core tables.

Table Name	Record Count	Description	Index Strategy
provider_groups	Loading...	Healthcare organizations (TIN-based)	13+ indexes (tin, taxonomy, city, state, zip)
negotiated_rate_provider_groups	Loading...	Junction table (rates providers)	2 FK indexes (rate_id, provider_group_id)
nppes_providers	Loading...	NPPES registry (NPI-based)	NPI primary key + taxonomy indexes
in_network_services	Loading...	CPT/HCPCS services with quality scores	billing_code, plausibility, description_gin
negotiated_rates	Loading...	Rate containers (1:1 with services)	service_id FK index
negotiated_prices	Loading...	Actual dollar amounts	negotiated_rate_id FK index
reporting_entities	Loading...	Insurance companies (BCBS Michigan)	Primary key only
plans	Loading...	Insurance plans (PPO, HMO, etc.)	reporting_entity_id FK index
TOTAL	Loading...	Total records across all core tables

Performance Implications

provider_groups: REQUIRES indexed WHERE clauses or database runs out of memory
Junction table: Every provider price query touches this - indexes are critical
NPPES providers: NPI lookups must use primary key index
Total records: Full table scans are impossible - always filter first

Enrichment Status Tracking

The enrichment process populates provider taxonomy data from NPPES registry. This system tracks enrichment progress and status.

enrichment_log Table

Purpose: Tracks each enrichment batch run with timestamps and statistics

CREATE TABLE enrichment_log (
    id SERIAL PRIMARY KEY,
    batch_id VARCHAR(100) UNIQUE NOT NULL,
    records_processed INTEGER,
    records_enriched INTEGER,
    records_failed INTEGER,
    status VARCHAR(50),
    started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    completed_at TIMESTAMP,
    error_message TEXT
);

Provider Enrichment Flags

provider_groups table includes:

taxonomy_enriched (BOOLEAN) - Indicates if NPPES data was successfully applied
enriched_at (TIMESTAMP) - When enrichment occurred
enrichment_source (VARCHAR) - Source system (usually 'nppes')

Query enrichment status:

SELECT 
    COUNT(*) AS total_providers,
    COUNT(*) FILTER (WHERE taxonomy_enriched = true) AS enriched,
    COUNT(*) FILTER (WHERE taxonomy_enriched = false) AS pending,
    ROUND(100.0 * COUNT(*) FILTER (WHERE taxonomy_enriched = true) / COUNT(*), 2) AS pct_enriched
FROM provider_groups;

Enrichment Process Flow

1. LOAD NPPES DATA
   ├─ Download 10.29GB NPPES CSV
   ├─ Parse 9,236,343 provider records
   └─ Load into nppes_providers table

2. MATCH NPIs
   ├─ Extract NPI list from provider_groups.npi_list (JSON array)
   ├─ JOIN nppes_providers ON npi
   └─ Match primary NPI (first in array)

3. COPY TAXONOMY FIELDS
   ├─ primary_taxonomy_code (e.g., "207R00000X")
   ├─ primary_taxonomy_name (e.g., "Internal Medicine")
   ├─ primary_specialization (e.g., "Internal Medicine")
   ├─ primary_classification (e.g., "Allopathic & Osteopathic Physicians")
   └─ Set taxonomy_enriched = true

4. LOG RESULTS
   └─ Insert batch record into enrichment_log

Current Enrichment Status

34,011,942

Total Providers

9,236,343

NPPES Records

~100%

Enrichment Rate

Special Provider Fields

Hospital-Specific Optimizations

Challenge: Hospitals represent only ~0.5% of providers but handle 25%+ of high-value services

Hospital Taxonomy Codes (282%):

282N00000X - General Acute Care Hospital
282E00000X - Long Term Care Hospital
282J00000X - Religious Nonmedical Health Care Institution
282NC0060X - Critical Access Hospital
...and 20+ other 282% variants

Special Indexes:

-- Partial index for hospitals only
CREATE INDEX idx_provider_groups_hospital_taxonomy 
ON provider_groups(primary_taxonomy_code) 
WHERE primary_taxonomy_code LIKE '282%';

-- Fast hospital-only queries
SELECT * FROM provider_groups 
WHERE primary_taxonomy_code LIKE '282%' 
  AND city = 'Detroit'
  AND state = 'MI';

Performance: Uses partial index for hospital queries, reducing index size by 99.5%

Geographic Data Structure

Location fields in provider_groups:

Address Fields:

address_line1
address_line2
city Indexed
state Indexed
zip_code Indexed

NPPES Fields:

provider_city
provider_state
provider_zip
(From nppes_providers)

Common Geographic Queries:

-- Find all providers in Detroit
SELECT * FROM provider_groups WHERE city = 'Detroit';

-- Find all Michigan hospitals
SELECT * FROM provider_groups 
WHERE state = 'MI' 
  AND primary_taxonomy_code LIKE '282%';

-- Find providers in specific zip code
SELECT * FROM provider_groups WHERE zip_code = '48201';

Index Performance:

City/State/Zip indexes enable sub-second queries on 34M records. Without indexes, geographic queries would take 5-10 minutes or fail with out-of-memory errors.

Field Design Rationale

Partial Hospital Index: Optimizes for high-value queries without bloating main indexes
Denormalized Geography: Faster queries than normalizing to separate location table
Multiple Zip Formats: Supports both 5-digit and ZIP+4 formats
City/State Indexes: Enable fast regional provider searches

Schema Evolution Timeline

The database schema evolved through 4 major phases from initial load to production-ready analytics platform.

Phase 1: Initial Data Load

Parsed CMS transparency files
Created base tables: reporting_entities, plans, in_network_services
Loaded 418,652 services from BCBS Michigan
Basic indexes on primary keys only

Result: Basic data structure, slow queries, no enrichment

Phase 2: NPPES Enrichment

Downloaded 10.29GB NPPES CSV (9.2M providers)
Created nppes_providers table with NPI primary key
Added taxonomy fields to provider_groups
Enriched 34M provider groups with taxonomy data
Added enrichment_log tracking table

Result: Providers now have specialization data, enabling specialty-based queries

Phase 3: Quality Scoring

Added plausibility_score (0-100) to in_network_services
Added plausibility_flags (JSON) for validation warnings
Implemented taxonomy_validator.py logic
Scored all 418,652 services for provider-service appropriateness
Created plausibility index for fast filtering

Result: 37.54% Good scores, 39.58% Questionable - quality insights enabled

Phase 4: Elasticsearch Integration

Indexed 3,869 unique billing codes to Elasticsearch
Enabled fuzzy search and full-text queries
Added description_gin GIN index for PostgreSQL full-text
Hybrid search: Elasticsearch for discovery, PostgreSQL for analytics
Added 13+ performance indexes to provider_groups

Result: Production-ready platform with sub-second search and analytics on 77M+ records

Evolution Metrics

4 phases

Major Iterations

77M+

Final Record Count

50+

Indexes Created

Conceptual Mind Map

High-level visualization of the complete database architecture showing all major components and relationships.

                                    HEALTHCARE TRANSPARENCY DATABASE
                                                    │
                        ┌───────────────────────┼───────────────────────┐
                        │                       │                       │
                ┌─── INSURANCE ────┐  ┌──── SERVICES ────┐  ┌──── PROVIDERS ────┐
                │                  │  │                  │  │                   │
                │ reporting_       │  │ in_network_      │  │ provider_groups   │
                │   entities (1)   │  │   services       │  │   (34M records!)  │
                │        │         │  │   (418K)         │  │        │          │
                │                 │  │        │         │  │        │          │
                │   plans (~50)    │  │        │         │  │  ┌─────┴─────┐   │
                └──────────────────┘  │                 │  │  │           │   │
                                      │ negotiated_      │  │                │
                                      │   rates          │  │ Enriched   npi   │
                                      │   (418K)         │  │ Taxonomy  _list  │
                                      │        │         │  │  Fields   (JSON) │
                                      │        ├─────┐   │  │  │           │   │
                                      │                │  │                │
                                      │ negotiated  rate │  │ primary_  nppes_ │
                                      │   _prices  _pvdr │  │ taxonomy  pvdrs  │
                                      │   (418K)  _grps  │  │ _code    (9.2M)  │
                                      │           (34M!) │  │                  │
                                      └──────────────────┘  └──────────────────┘
                                                    │
                        ┌───────────────────────┼───────────────────────┐
                        │                       │                       │
                ┌─── ENRICHMENT ───┐  ┌──── QUALITY ─────┐  ┌───── SEARCH ─────┐
                │                  │  │                  │  │                  │
                │ enrichment_log   │  │ plausibility_    │  │ Elasticsearch    │
                │ • batch_id       │  │   score (0-100)  │  │ • 3,869 codes    │
                │ • records_       │  │ • flags (JSON)   │  │ • Fuzzy search   │
                │   processed      │  │ • validated_at   │  │ • Full-text      │
                │ • records_       │  │                  │  │                  │
                │   enriched       │  │ Distribution:    │  │ Hybrid Query:    │
                │ • status         │  │ • 37.54% Good    │  │ 1. ES discovery  │
                │                  │  │ • 22.64% Fair    │  │ 2. PG analytics  │
                │ NPPES Pipeline:  │  │ • 39.58% Quest.  │  │ 3. Combine       │
                │ └─ Match NPI    │  │                  │  │                  │
                │ └─ Copy Tax.    │  │ Taxonomy Logic:  │  │ GIN Index:       │
                │ └─ Set Flag     │  │ CPT  Taxonomy   │  │ description_gin  │
                └──────────────────┘  └──────────────────┘  └──────────────────┘
                                                    │
                        ┌───────────────────────┼───────────────────────┐
                        │                       │                       │
                ┌──── INDEXES ─────┐  ┌─── GEOGRAPHY ────┐  ┌──── ANALYTICS ───┐
                │                  │  │                  │  │                  │
                │ 50+ total        │  │ city, state, zip │  │ Price Comp.:     │
                │                  │  │ indexes          │  │ • Geographic     │
                │ Critical:        │  │                  │  │ • Provider       │
                │ • tin_value      │  │ Regional Search: │  │ • Specialty      │
                │ • taxonomy_code  │  │ └─ Fast city    │  │                  │
                │ • billing_code   │  │     queries      │  │ Quality Filter:  │
                │ • plausibility   │  │ └─ State agg.   │  │ • Score ranges   │
                │                  │  │ └─ Zip lookup   │  │ • Flag analysis  │
                │ Partial:         │  │                  │  │                  │
                │ • hospital_tax.  │  │ Hospital-spec:   │  │ Cost Analysis:   │
                │   (282% only)    │  │ └─ 282% codes   │  │ • Min/max/avg    │
                │                  │  │     + city       │  │ • Percentiles    │
                └──────────────────┘  └──────────────────┘  └──────────────────┘

KEY DESIGN PRINCIPLES:
├─ Scale First: Every decision optimized for 34M+ provider queries
├─ Denormalize: Taxonomy fields copied for performance (no JOIN needed)
├─ Index Everything: 50+ indexes enable sub-second queries on massive tables
├─ Junction Table: Many-to-many relationships properly normalized
├─ Quality Scores: Built-in validation for service-provider appropriateness
└─ Hybrid Search: Elasticsearch for discovery, PostgreSQL for analytics

Architecture Highlights

3 Main Entities

Insurance, Services, Providers - all interconnected

3 Enhancement Layers

Enrichment, Quality, Search - stacked capabilities

3 Performance Systems

Indexes, Geography, Analytics - speed optimized

ETL Pipeline Guide

Reference guide for running the ETL data loading pipeline. Commands are designed to run on the AWS Lightsail server via SSH.

Basic Commands

# Run full ETL pipeline (all files)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py
# Run with record limit (for testing)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --limit-records 10000
# Run with file limit
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --limit 5

State Filter

Filter data by provider state using NPPES enrichment. Only providers in the specified state(s) will be loaded.

# Filter to Michigan providers only
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --state-filter MI
# Filter to multiple states
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --state-filter MI,OH,IN

Billing Code Filter

Filter data by billing codes to load only specific procedures or services. Supports exact matching on billing_code and/or billing_code_type.

# Filter to a single billing code
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-codes 99213
# Filter to multiple billing codes
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-codes 99213,99214,99215
# Filter by billing code type (CPT, HCPCS, etc.)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-code-types CPT
# Combine filters
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-codes 99213,99214 --billing-code-types CPT

Tip: Billing code filter dramatically reduces data size by only loading matching services and their associated provider groups.

Combined Filters

Filters can be combined for precise data loading. Example: load only specific CPT codes for Michigan providers.

# Office visits (99213-99215) in Michigan only
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py \
  --state-filter MI \
  --billing-codes 99213,99214,99215
# All CPT codes in Michigan and Ohio, limit to first 5 files
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py \
  --state-filter MI,OH \
  --billing-code-types CPT \
  --limit 5

Performance Options

# Adjust batch size (default: 50000)
--batch-size 25000
# Provider group chunk size (default: 100000)
--provider-group-chunk-size 50000
# Enable parallel processing (multiple files)
--parallel --workers 4
# Sample percentage (for testing)
--sample-percent 10

Storage Optimization

Reduce disk usage when processing large files. The --stream-gz flag reads compressed files directly without decompressing to disk (saves 10x storage). The --delete-after-process flag removes files after loading.

# Read .gz files directly (no decompression to disk)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --stream-gz
# Delete files after processing
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --delete-after-process
# Combine both for minimal storage usage
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --stream-gz --delete-after-process

Tip: Using --stream-gz avoids 10x disk expansion (e.g., 5GB 50GB). Slightly slower but essential for limited storage.

Batch Download & Process Mode (Recommended)

The most efficient way to load large datasets. Downloads N files at a time, processes them with ETL, deletes them, then downloads the next batch. Minimizes disk usage by never having all files on disk at once.

# Batch mode: download 100 files, 5 at a time
docker compose -f docker-compose.production.yml exec api python src/scraper/download_and_process.py \
  --batch-mode --limit 100 --batch-size 5 \
  --file-types in_network --db-id bcbs_michigan
# With state and billing code filters
docker compose -f docker-compose.production.yml exec api python src/scraper/download_and_process.py \
  --batch-mode --limit 50 --batch-size 3 \
  --file-types in_network --db-id bcbs_michigan \
  --state-filter MI --billing-codes 99213,99214,99215
# Large-scale load for new payer
docker compose -f docker-compose.production.yml exec api python src/scraper/download_and_process.py \
  --batch-mode --limit 500 --batch-size 10 \
  --file-types in_network --db-id uhc_michigan \
  --stream-gz --etl-batch-size 10000

Old method: Download 100 files (2TB) Decompress Load
Peak disk: ~2TB

Batch mode: Download 5 files Load Delete Repeat
Peak disk: ~100GB

Post-ETL Steps

After loading data, run these commands to complete the pipeline:

# 1. Reindex Elasticsearch (with NPPES enrichment)
docker compose -f docker-compose.production.yml exec api python reindex_with_nppes.py
# 2. Check database status
docker compose -f docker-compose.production.yml exec postgres psql -U healthcare_admin -d healthcare_bcbs_michigan_production -c "SELECT COUNT(*) FROM in_network_services;"
# 3. View processing logs
docker compose -f docker-compose.production.yml logs -f --tail=100 api

CLI Arguments Quick Reference

Argument	Description	Example
`--billing-codes`	Comma-separated list of billing codes to filter	`99213,99214,99215`
`--billing-code-types`	Comma-separated list of code types (CPT, HCPCS, etc.)	`CPT,HCPCS`
`--state-filter`	Comma-separated list of state abbreviations	`MI,OH,IN`
`--limit`	Max number of files to process	`5`
`--limit-records`	Max records per file	`10000`
`--batch-size`	Records per batch commit (default: 50000)	`25000`
`--parallel`	Enable parallel file processing	`--parallel --workers 4`
`--sample-percent`	Random sample percentage (for testing)	`10`
`--stream-gz`	Read .gz files directly without decompression	(flag only)
`--delete-after-process`	Delete files after loading to save storage	(flag only)

Batch Download Mode Arguments

These arguments are for the download_and_process.py script with --batch-mode:

Argument	Description	Example
`--batch-mode`	Enable batch download/process/delete loop	(flag only)
`--batch-size`	Files per batch (default: 5)	`10`
`--limit`	Total files to process	`100`
`--file-types`	in_network, allowed_amount, or both	`in_network`
`--db-id`	Database ID for file organization	`bcbs_michigan`
`--stream-gz`	Read .gz directly (default: true)	(flag only)
`--etl-batch-size`	Records per ETL batch	`10000`
`--state-filter`	Filter providers by state	`MI`
`--billing-codes`	Filter by billing codes	`99213,99214`

Elasticsearch Management

Loading Elasticsearch stats...

Facility Consolidation

Loading facility stats...

Consolidated Facilities

Facility Type:

Min NPIs:

Request New Database

Configure the data pipeline you need and submit a request. A platform administrator will review and process it.

Data Source (Entity)

Request Title

Data Filters

State Filter

Provider state(s)

Billing Codes

Specific codes only

Code Types

CPT, HCPCS, DRG, etc.

File Types

File Limit

Max files to process

Browse Available Files (Optional)

0 selected

	File / Description	Type	Plan
Select an entity to browse files...

Showing 0 files

Page 1

Additional Notes (Optional)

My Data Requests

Data Requests

Quick Tips & Tricks

State Filter: Use MI to load only Michigan providers, or MI,OH,IN for multiple states
Billing Code Filter: Use 99213,99214,99215 to load only specific procedures
Combined Filters: State + billing code filters work together to drastically reduce data size
Testing: Set "Max Records Per File" to 1000 to quickly test your filters
Sequential Process: Select entity Download files Configure filters Load data
Monitor Progress: Use the live progress panel to track loading in real-time
Pipeline Dashboard: The dashboard below queries the server directly — it shows live progress for downloads, ETL loads, and enrichment even after page refresh or cache clear
Resume after crash: If ETL stops (OOM/crash), the dashboard shows which files failed — click "Resume Load" to continue from where you left off with the same filters

Data Management

Download the latest data from your selected reporting entity and load it into the database.

Select Data Source

Choose which insurance company's data to download:

Insurance Company:

Download Data

selected entity

Downloading will delete existing files in data/raw/. Database keeps old data until you "Load Data".

Advanced Options

Plan Filter:

File Type:

Download Limit:

Concurrent:

Batch Mode

Keep Old Files

ETL Filters (applied during batch load)

State Filter

Billing Codes

Code Types

ETL Batch Size

Records per DB commit (default: 50,000)

Lower = less RAM usage
Higher = faster loading

Stream from .gz (skip decompression, saves disk space) Parallel processing (process files in batch concurrently)

Leave filters empty to load all data. Batch mode: Download Load with filters Delete Repeat

Available Files from TOC

0 selected

	File / Description	Type	Plan
Click "Search" to load available files...

Showing 0 files

Page 1

Ready to download

2 Create Database for Custom Entity

Set up a dedicated PostgreSQL database for this new entity.

ℹ New Entity Detected:
A dedicated database will be created to store this entity's data separately.

Database Name:

Use lowercase letters, numbers, and underscores only

Payer Name (Optional):

Ready to create database

Note: After creating the database, proceed to download and load data as normal.

Load Into Database

Parse downloaded files and load data into the database.

Data Filters

Advanced

Ready to load data

Downloaded Files Ready to Load

Files in data/raw/

0 selected

	File Name	Type	Size	Status
Loading file list...

0 files ready

0 MB total

Pipeline Dashboard LIVE

Download Idle

No active download

ETL Load Idle

No active load

Enrichment Idle

No active enrichment

Completed

Failed

Processing

Total

Select All

	File Name	Status	Records	Size	Duration	Actions
Click "Refresh" to load file processing log

Enrich to Production

Copy raw data to production database and enrich with NPPES taxonomy data.

Source Database

Pipeline Options

Ready to enrich

Pipeline Steps:

Create production database (if needed)
Copy all tables using pg_dump streaming
Enrich providers with NPPES taxonomy
Update address/location fields

Enrichment Progress

Initialize

Copy Data

Enrich

Complete

0/7

Tables Copied

Providers Enriched

00:00

Elapsed Time

Current Operation:

Waiting to start...

Available Databases

Data Operations

Clear Application Cache

Clear any cached data to ensure fresh results

Reset Operation Status

Reset stuck data operations (download/load)

Interactive API Documentation

Swagger UI with purple gradient theme. All 140+ endpoints documented with examples and schemas.

Platform Reference

51-chapter developer reference covering architecture, schema, ETL, auth, and deployment.

Quick Actions

Server Status

Uptime

—

Updates

—

Reboot

Not needed

Resource Usage

Host Updates

Containers

Host Commands (SSH)

These commands must be run via SSH on the host. The API container cannot modify the host OS directly.

Update Package Lists

sudo apt-get update

Apply All Updates

sudo apt-get upgrade -y

Apply Security Updates Only

sudo unattended-upgrade --dry-run

Reboot Server

sudo reboot

Maintenance Mode

Current Status

Normal Operation

Schedule Maintenance

Banner Message

Duration (hours)

Maintenance Mode

Current Status

Normal Operation

Schedule Maintenance

Banner Message

Duration (hours)

API Usage

Audited Actions

—

Avg / Day

—

Active Orgs

—

Active Users

—

Peak Hour

—

Active Keys

—

Daily Request Volume

Requests by Hour (UTC)

Usage by Organization

Top Users

Top Actions

API Key Usage

Metric Legend (click to expand)

News & Updates

Loading articles…

About the Table of Contents

The Table of Contents (TOC) file is the index that maps insurance plans to their machine-readable price transparency files. According to CMS requirements, this file shows which plans share the same pricing data and where to download each file.

What you'll see:

Reporting Plans: Insurance plan names, IDs (HIOS or EIN), market types (individual/group), and issuers
In-Network Files: URLs to negotiated rate files used by each plan
Allowed Amount Files: URLs to out-of-network pricing files for each plan
File Sharing: Multiple plans often share the same pricing file to reduce redundancy

Note: This is the "index" referenced in the CMS Price Transparency Guide. View CMS TOC Schema on GitHub

Reporting Plans

Loading Table of Contents...

Raw SQL Search

Execute SQL queries directly against the active database with safety guardrails

Active Database: Loading...

Run on all selected payers

SQL Query Editor

Tip: Use LIMIT to restrict results. Read-only queries recommended. 0 characters

Example Queries

Top 10 Services by Provider Count

-- Uses materialized view for instant results
SELECT * FROM service_provider_counts 
ORDER BY provider_count DESC 
LIMIT 10;

Provider Groups by City/State

SELECT 
  city, state,
  COUNT(*) as group_count
FROM provider_groups
WHERE city IS NOT NULL
GROUP BY city, state
ORDER BY group_count DESC
LIMIT 20;

Price Stats for CPT Code

SELECT s.billing_code, s.name,
  MIN(np.negotiated_rate_value) as min_rate,
  MAX(np.negotiated_rate_value) as max_rate,
  ROUND(AVG(np.negotiated_rate_value)::numeric, 2) as avg_rate
FROM in_network_services s
  JOIN negotiated_rates nr ON s.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
WHERE s.billing_code = '99213'
GROUP BY s.billing_code, s.name;

Top Providers by Services Offered

-- Uses materialized view for instant results
SELECT * FROM provider_service_counts 
ORDER BY services_offered DESC 
LIMIT 20;

Search Services by Name

SELECT billing_code, billing_code_type, 
  name, description
FROM in_network_services
WHERE name ILIKE '%mri%' 
   OR description ILIKE '%mri%'
LIMIT 25;

Service Prices with Locations

SELECT s.billing_code, s.name,
  np.negotiated_rate_value as price,
  pg.city, pg.state
FROM in_network_services s
  JOIN negotiated_rates nr ON s.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
  JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
WHERE s.billing_code = '70553'
LIMIT 25;

Query Results

Execute a query to see results here

Payer Price Comparison

Compare healthcare pricing across your selected insurance payers side-by-side

Comparing:

Tip: Enter a specific billing code like 99213 for detailed comparison, or search by service name.

Compare Prices Across Payers

Enter a billing code or service name above to compare pricing data across your selected insurance payers.

Select at least 2 payers using the header selector to enable comparison.

AI Assistant

Ask questions about healthcare pricing — the assistant can search and analyze the data you have access to.

Hi! Ask me about negotiated rates, cost estimates, providers, or how payers compare. I can only see the databases your account is allowed to access.

Bring Your Own Key (optional)

By default this uses the platform's hosted model (subject to your monthly quota). Add your own provider key for unlimited, private inference billed to your account.

Healthcare price transparency, made usable.

Sign in to Aequalis

User Name

Account Details

Account & Settings

Full-Text Search

Rate Estimator

Enter Procedure Information

Cost Estimator Guide

Cost Metrics

Confidence Score

Value Score (Recommendations)

Provider Table Columns

Recommendation Badges

Search Filters

Service Name

Estimated Cost Range

Recommended Providers Best Value

Provider Comparison (Top 20)

Error

About Out-of-Network Services

Multi-Rate Intelligence Features:

CMS Transparency in Coverage Schema:

Filter by Location

Filter by Insurance Network

Data Quality Filters

Hospital Price Transparency

Filter Hospitals

Geographic Rate Analysis

Select Service to Analyze

Map Legend

Rate Variations by Location

Data Analytics & Visualization

Platform Dashboard

Notifications & Alerts

Server Containers

Server Resources

User Activity

API Usage

User Management Refresh

Invite a User to your organization

Pending Registrations 0 Refresh

Password Reset Requests 0 Refresh

Role & Permission Management Refresh New Role

Available Permissions Show/Hide

Organization Management

Create Organization

Organization Details

Delete Organization

Create License

Edit Tier

Create New Tier

New Customer Setup

Customer Created!

Invite Tokens (share with users)

License Management

Available Tiers

Organization Licenses

System Health Refresh

Branding & Theme Save Settings

Brand Identity

Color Theme

Live Preview

Custom CSS Styles

CSS Editor

Example Snippets & Quick Reference

Common Selectors

Database Cleanup Requests

Orphaned PostgreSQL Databases Scan

Drop Database

Database Operations

Docker Containers Refresh

Data Processing Monitor Start Monitoring Refresh

Database Statistics Refresh

Database Architecture

Database Overview

Core Tables

Complete Table List (19 Tables)

Entity Relationship Diagram

Entity Relationships

User Management

Pending Registrations 0

Password Reset Requests 0

Role & Permission Management

Available Permissions

System Health

Branding & Theme

Orphaned PostgreSQL Databases

Docker Containers

Data Processing Monitor

Database Statistics

Facility Consolidation