Aequalis

Loading payer data...

Scheduled Maintenance
AEQUALIS

Healthcare price transparency, made usable.

Negotiated rates, out-of-network amounts, and Medicare Advantage intelligence — across every contract and payer, with an AI assistant that answers your questions against the data directly.

The Aequalis Platform

Sign in to Aequalis

Sign in with your account to continue.

Need an account? Create one  ·  More options

U

User Name

User

Account Details

Username—
Email—
Role—
Organization—
Two-Factor—
Member Since—
Last Login—

Account & Settings

Full-Text Search

Search across all healthcare services with intelligent matching and filtering

Advanced Filters

Rate Estimator

Estimates derived from current Transparency in Coverage and Hospital Price Transparency files

Enter Procedure Information

CPT, HCPCS, or DRG code
City within the selected state
Advanced Options
Filter low-quality data (0-100, 0=all data)
Filter by network type

Cost Estimator Guide

Cost Metrics
  • Median Rate: The middle value - half of providers charge more, half charge less. Most reliable estimate.
  • Mean Rate: Average of all rates. Can be skewed by outliers.
  • 10th Percentile: Low-cost benchmark - 90% of providers charge more than this.
  • 90th Percentile: High-cost benchmark - only 10% of providers charge more.
Confidence Score
  • 80-100% (High): Large sample, consistent rates, high-quality data. Very reliable.
  • 60-79% (Medium): Moderate sample, some variance. Generally reliable.
  • 0-59% (Low): Small sample or high variance. Use with caution.
  • Formula: 50% sample size + 30% quality + 20% consistency
Value Score (Recommendations)
  • Formula: 60% Quality Score + 40% Affordability Score
  • Quality (0-100): Based on data plausibility and reliability metrics
  • Affordability (0-100): Inverse normalized rate (lower cost = higher score)
  • Higher is better: 90+ is excellent, 70-89 is good, 50-69 is fair
Provider Table Columns
  • Name: Provider organization or individual name (from NPPES registry)
  • Classification: Provider type (e.g., Medical Supply Company, DME Supplier)
  • Location: City and state where provider operates
  • Rate: Negotiated rate for this service
  • Percentile: Where this rate ranks vs all providers (lower is cheaper)
  • Quality: Data quality score (0-100). 50 = no data available.
Recommendation Badges
  • BEST VALUE: Highest overall value score (quality + affordability)
  • HIGH QUALITY: Quality score above 70
  • LOW COST: Rate in bottom 25% (25th percentile or lower)
Search Filters
  • Billing Code: CPT, HCPCS, or DRG code (required)
  • Min Quality: Filter out providers below this quality threshold
  • Exclude Ghost Rates: Remove potentially inaccurate $0.01-$1 rates
  • Network: Filter by insurance network type (PPO, HMO, etc.)
Tip: Click any column header to sort the provider table. Click the NPI link to view provider details in the NPPES registry.
Analyzing market data...

Service Name

Billing Code Info

Estimated Cost Range

Min Max
MEDIAN COST
$0
AVERAGE COST
$0
TYPICAL RANGE
$0 - $0
25th - 75th percentile
Sample Size: 0 rates from 0 providers
Data Quality: 0/100
Locations: 0 cities

Recommended Providers Best Value

Based on quality scores and competitive rates, we recommend these providers for the best value:

Provider Comparison (Top 20)

Error

About Out-of-Network Services

Each card below represents a healthcare service or procedure (identified by billing codes like CPT, HCPCS, or DRG). These cards show the allowed amounts - the maximum amount an insurance plan will pay for out-of-network services.

What you'll see: Service name, billing code, average/min/max allowed amounts, billed charges, billing class (professional/institutional), provider count, and reporting entity information. Click any card to see detailed pricing records.

About In-Network Negotiated Rates (Click to expand)

Each card below represents a healthcare service or procedure with pre-negotiated rates between insurance plans and provider networks. These are the actual contracted prices that in-network providers have agreed to accept.

Multi-Rate Intelligence Features:

  • Rate Type Badges: Each service displays its negotiation type - Negotiated (fee-for-service), Bundle (bundled payment), Capitation (per-member), Derived (calculated), or Fee Schedule (standardized)
  • Expiration Tracking: Rates are automatically filtered to hide expired contracts by default. Services show expiration dates with color-coded warnings: red for expired, orange for expiring within 30 days, gray for valid
  • Rate Comparison: Services with multiple rates display an expandable section showing all rate variations, including each rate's type, amount, provider count, expiration status, and summary statistics (range, average, active count)
  • Advanced Filtering: Filter by specific rate types or toggle expired rates visibility. Use quality filters to show only rates from verified providers with NPPES taxonomy data
  • Quality Indicators: Badges indicate provider verification levels. Ghost rates (unverified providers) can be excluded for higher data quality

CMS Transparency in Coverage Schema:

  • Service Details: Billing code, code type, code version, service name, description
  • Negotiation Arrangement: Fee-for-service (FFS), bundled payments, or capitation
  • Rate Information: Negotiated rates (dollar amounts or percentages), billing class, expiration dates
  • Service Modifiers: Place of service codes, billing code modifiers, additional information
  • Provider Networks: Tax Identification Numbers (TINs), business names, network names, NPI lists
  • Special Cases: Severity of illness (for DRG codes), bundled codes, covered services (for capitation)

Note: Plan-level information (which specific insurance plans use these rates) is stored in separate Table of Contents files. Each data file can be associated with multiple insurance plans (plan name, plan ID, issuer name, market type) as defined in the CMS reporting structure.

Advanced Options

Filter by Location

Filter by Insurance Network

Filter rates by specific insurance network (e.g., PPO, HMO, EPO)

Data Quality Filters

Hospital Price Transparency

Compare negotiated rates across 76 BCBS Michigan in-network hospitals

About Hospital Data: This data shows hospitals with BCBS Michigan contracts, extracted from Transparency in Coverage files.

Currently showing 76 hospitals with 100% having negotiated rates

Filter Hospitals

Loading hospitals...
No hospitals found matching your filters

Geographic Rate Analysis

Explore how healthcare service rates vary by location across Michigan and beyond

About Geographic Analysis: This tool visualizes provider locations and rate variations across different cities and regions.

Loading provider location data...

Select Service to Analyze

Interactive Rate Map
Tip: Click clusters to zoom in and see individual locations. At zoom level 12+, all markers are shown separately.

Map Legend

Low Rate (Below Avg)
Average Rate
High Rate (Above Avg)

Rate Variations by Location

Data Analytics & Visualization

Explore pricing trends, build dashboards, and analyze healthcare data with Kibana

Getting Started: Kibana is loading below. If you see a blank screen, it may take 60-90 seconds to fully initialize. You can also open Kibana directly

Admin Panel

System management & operations

Platform Dashboard

API
Checking...
PostgreSQL
Checking...
Elasticsearch
Checking...
Operations
Idle
Databases
—
Loading...
Records
—
Loading...
Users
—
Loading...
Files Processed
—
Loading...

Notifications & Alerts

Loading...

Server Containers

Loading...

Server Resources

Loading...

User Activity

Loading...

API Usage

Loading...

User Management

-
Total Users
-
Admins
-
Active

Invite a User to your organization

Generate an invite token and share it with the new user. They paste it during registration to join your organization (no admin approval needed). Invites expire after 7 days.

User Email Organization Role 2FA Status Last Login Actions
Loading users...

Pending Registrations 0

Loading pending registrations...

Password Reset Requests 0

Users who proved their identity with 2FA and asked an admin to reset their password. Resolving generates a one-time temporary password the user must change on next login.

Loading password reset requests...

Role & Permission Management

Manage user roles and their permissions. Roles determine what features and data users can access.

Loading roles...

Available Permissions

Loading permissions...

Organization Management

Loading organizations...

Create Organization

Organization Details

Loading...

Delete Organization

This action is irreversible. This will permanently delete the organization, all its users, their sessions, API keys, database grants, invites, and licenses.

To confirm, type below:

Create License

Edit Tier

Create New Tier

New Customer Setup

Organization
License
Databases
Invite

Create the organization that will hold users, databases, and licensing.

Choose a license tier. Limits can be adjusted later.

Grant access to payer databases. You can add more later.

Loading databases...

Send invite(s) to initial users. They'll receive a token to register.

✓

Customer Created!

Invite Tokens (share with users)

License Management

Available Tiers

Loading tiers...

Organization Licenses

Loading...

System Health

Loading...

Branding & Theme

Customize the platform appearance including logo, brand name, and color theme.

Brand Identity

Enter an emoji () or image URL

Color Theme

Live Preview

Aequalis
Healthcare Price Comparison

Custom CSS Styles

Warning: Changes affect ALL users. Always test with Preview first before saving.

CSS Editor

Example Snippets & Quick Reference

Common Selectors
.header - Main navigation bar
.tab-content - Tab content areas
.search-box - Search input boxes
table th - Table headers
table td - Table cells
.toast - Toast notifications
Dark Header
.header {
  background: #1a1a2e !important;
}
Extra Rounded Cards
[style*="border-radius: 12px"] {
  border-radius: 20px !important;
}
Custom Font
body {
  font-family: 'Inter', sans-serif !important;
}
Larger Table Text
table {
  font-size: 1.1em !important;
}
table td, table th {
  padding: 16px !important;
}
Pink Accent Buttons
button[style*="background: #10b981"] {
  background: #ec4899 !important;
}
Hide Admin Tab
/* Hide for non-admins */
#admin-panel { display: none; }
.tab-button[data-tab="admin"] {
  display: none;
}

Click on a payer card to expand and see database details. Each payer has a raw database (for ETL processing) and a production database (for user queries).

Loading databases...

Database Cleanup Requests

Orphaned PostgreSQL Databases

Databases on the PostgreSQL server that are not tracked in database_config.json. These may be leftovers from removed payers.

Click Scan to check for orphaned databases.

Drop Database

This action is irreversible. This will permanently drop the PostgreSQL database and all tables, records, and indexes within it. This data cannot be recovered.

To confirm, type below:

Database Operations

These operations affect the active database only. Reset File Tracking clears download history. Reset Database deletes all data.

Docker Containers

Loading container status...

Data Processing Monitor

Click "Start Monitoring" to track data processing in real-time

Database Statistics

Loading...

Database Architecture

Database Overview

19
Total Tables
Loading...
Total Records
50+
Indexes
Fully Enriched

Core Tables

Table Records Purpose
in_network_services Loading... Medical services/procedures with quality scores
provider_groups Loading... Provider organizations (enriched with NPPES)
negotiated_rate_provider_groups Loading... Junction table: Service Provider links
nppes_providers Loading... National provider lookup (taxonomy enrichment)
negotiated_prices Loading... Actual dollar amounts for services
Performance Note: With many provider groups, every query MUST use WHERE filters and LIMIT clauses to prevent memory exhaustion!

Complete Table List (19 Tables)

in_network_services
negotiated_rates
negotiated_prices
provider_groups
negotiated_rate_provider_groups
nppes_providers
reporting_entities
plans
allowed_amounts
allowed_amount_providers
provider_facilities
providers
services
tax_identifiers
service_facility_link
enrichment_log
file_processing_log
rate_pattern_flags
npi_registry_cache

Entity Relationship Diagram

Complete visual representation of all table relationships

┌─────────────────────────────────────────────────────────────────────────────────────┐ │ INSURANCE PAYER HIERARCHY │ └─────────────────────────────────────────────────────────────────────────────────────┘ ┌──────────────────────────┐ │ reporting_entities │ Insurance Company (BCBS Michigan) │ ────────────────────── │ │ • id (PK) │ │ • name │ │ • entity_type │ └────────────┬─────────────┘ │ 1:N ┌──────────────────────────┐ │ plans │ Insurance Plans (HMO, PPO, etc.) │ ────────────────────── │ │ • id (PK) │ │ • reporting_entity_id │ (FK) │ • plan_name │ └────────────┬─────────────┘ │ 1:N ┌─────────────────────────────────────────────────────────────────────────────────────┐ │ CORE SERVICE & PRICING DATA │ └─────────────────────────────────────────────────────────────────────────────────────┘ ┌─────────────────────────────────────────────────┐ │ in_network_services │ Medical Services │ ─────────────────────────────────────────── │ │ • id (PK) 418,652 │ │ • billing_code (CPT/HCPCS) │ │ • name │ │ • plausibility_score │ │ • plausibility_flags │ └───────────────────────┬─────────────────────────┘ │ 1:N ┌─────────────────────────────────────────────────┐ │ negotiated_rates 418,652 │ └───────────────┬──────────────────┬──────────────┘ │ 1:N │ 1:N ┌──────────────────────────┐ ┌────────────────────────────────────┐ │ negotiated_prices │ │ negotiated_rate_provider_groups │ │ • negotiated_rate_value │ │ 34,011,942│ └──────────────────────────┘ └───────────────┬────────────────────┘ │ N:1 ┌─────────────────────────────────────────────────────────────────────────────────────┐ │ PROVIDER INFORMATION │ └─────────────────────────────────────────────────────────────────────────────────────┘ ┌────────────────────────────────────────────┐ │ provider_groups 34,011,942│ Providers │ • tin_value, npi_list │ │ • primary_taxonomy_code │ │ • primary_specialization │ │ • city, state, zip_code │ └────────────────────────────────────────────┘ │ Enrichment (JOIN on NPI) │ ┌────────────────────────────────────────────┐ │ nppes_providers 9,236,343 │ │ • npi (PK) │ │ • provider_taxonomy_code │ │ • provider_specialization │ └────────────────────────────────────────────┘
Key Insight: The junction table negotiated_rate_provider_groups with 34M records connects services to providers, creating a many-to-many relationship. This is why queries must be carefully optimized!

Entity Relationships

Insurance Hierarchy (1:N:N)
reporting_entities (BCBS Michigan)
     1:N
plans (HMO, PPO, Medicare Advantage)
     1:N
in_network_services (418,652 services)
Service Pricing (1:N:N)
in_network_services (e.g., "MRI Brain - CPT 70543")
     1:N
negotiated_rates (rate containers)
     1:N                     1:N
negotiated_prices      negotiated_rate_provider_groups (34M links!)
    ($ amounts)               N:1
                        provider_groups (34M organizations)
Provider Enrichment (N:1 Lookup)
provider_groups.npi_list [NPI1, NPI2, ...]
     JOIN
nppes_providers.npi (9.2M records)
     COPY TAXONOMY FIELDS
provider_groups:
    • primary_taxonomy_code
    • primary_taxonomy_name
    • primary_specialization
    • primary_classification
    • taxonomy_enriched = true 
Foreign Key Constraints:
  • CASCADE DELETE: Deleting a service auto-deletes all its rates, prices, and provider links
  • RESTRICT: Cannot delete reporting entities/plans if services exist
  • REFERENTIAL INTEGRITY: All relationships enforced at database level
Detailed Foreign Key Behavior
CASCADE DELETE CHAIN:
DELETE FROM in_network_services WHERE id = 123
   CASCADE
DELETE FROM negotiated_rates WHERE service_id = 123
   CASCADE                            CASCADE
DELETE FROM negotiated_prices    DELETE FROM negotiated_rate_provider_groups
WHERE negotiated_rate_id IN ...  WHERE negotiated_rate_id IN ...

Result: One service deletion removes all associated rates, prices, and provider associations

RESTRICT CONSTRAINTS:
  • in_network_services.reporting_entity_id reporting_entities.id (ON DELETE RESTRICT)
  • in_network_services.plan_id plans.id (ON DELETE RESTRICT)

Cannot delete insurance plans or reporting entities while services reference them - prevents orphaned data

Visual Relationship Tree Example:
BCBS Michigan (Reporting Entity)
└── Plan A - PPO Network
    └── Service: MRI Brain (CPT 70543)
        ├── Negotiated Rate #1
        │   ├── Price: $500 (in-network)
        │   └── Provider Group #1
        │       ├── Name: Henry Ford Hospital
        │       ├── Location: Detroit, MI
        │       ├── Taxonomy: 282N00000X (Hospital)
        │       └── Plausibility Score: 85/100 
        └── Negotiated Rate #2
            ├── Price: $750 (in-network)
            └── Provider Group #2
                ├── Name: Michigan Medicine
                ├── Location: Ann Arbor, MI
                ├── Taxonomy: 282N00000X (Hospital)
                └── Plausibility Score: 92/100 

Index Strategy

in_network_services (418K records)
  • idx_in_network_services_billing_code (btree) Fast CPT/HCPCS lookup
  • idx_in_network_services_plausibility (btree) Quality filtering
  • idx_in_network_services_description_gin (GIN) Full-text search
provider_groups (34M records!) CRITICAL

Heavily indexed for performance on massive dataset:

  • idx_provider_groups_tin_value (btree) Provider lookup by Tax ID
  • idx_provider_groups_city (btree) Geographic queries
  • idx_provider_groups_state (btree) State filtering
  • idx_provider_groups_taxonomy_code (btree) Specialty filtering
  • idx_provider_groups_hospital_taxonomy (partial WHERE taxonomy ~~ '282%') Hospital-specific
negotiated_prices (418K records)
  • idx_negotiated_prices_rate_value (btree) Price filtering/sorting
  • idx_negotiated_prices_negotiated_type (btree) Rate type filtering
Critical Rule: With 34M records in provider_groups, EVERY query MUST use an indexed WHERE clause or the database will run out of memory!

Data Pipeline Flow

Phase 1: Data Ingestion
[CMS Transparency Files] src/parsers/cms_parser.py
Extract JSON Data
├─ reporting_entities (Insurance company)
├─ plans (Insurance plans)
├─ in_network_services (418K services)
├─ negotiated_rates (Rate containers)
├─ negotiated_prices (Dollar amounts)
└─ provider_groups (34M records created)
Phase 2: NPPES Enrichment
[NPPES Data File - 10.29GB CSV] src/scripts/load_nppes_data.py
Load 9,236,343 providers into nppes_providers table
src/scripts/enrich_provider_groups.py
Match provider_groups.npi_list nppes_providers.npi
├─ Copy primary_taxonomy_code
├─ Copy primary_taxonomy_name
├─ Copy primary_specialization
├─ Copy primary_classification
└─ Set taxonomy_enriched = true
Result: 34M provider groups enriched with specialty data
Phase 3: Quality Scoring
src/scripts/score_plausibility.py + taxonomy_validator.py
For each service (418K):
├─ Get provider taxonomies
├─ Get service category (CPT code range)
├─ Check specialty match
└─ Calculate plausibility_score (0-100)
Distribution: 37.54% Good | 22.64% Fair | 39.58% Questionable
Phase 4: Elasticsearch Indexing
src/elasticsearch/indexer.py
Aggregate by billing_code:
├─ MIN(rate), MAX(rate), AVG(rate)
├─ COUNT(providers)
└─ Nested provider/pricing arrays
Result: 3,869 unique billing codes indexed Kibana dashboards

Performance Best Practices

DO: Always Filter by Indexed Columns
SELECT * FROM provider_groups
WHERE state = 'MI'              --  Uses idx_provider_groups_state
  AND city = 'DETROIT'          --  Uses idx_provider_groups_city
  AND taxonomy_enriched = true  --  Uses idx_provider_groups_enriched
LIMIT 1000;                     --  Prevents memory exhaustion
DO: Use CTEs for Complex Queries
WITH filtered_services AS (
  SELECT id FROM in_network_services 
  WHERE billing_code IN ('99212', '70543')
    AND plausibility_score >= 75
)
SELECT pg.tin_value, AVG(np.negotiated_rate_value)
FROM provider_groups pg
  JOIN ... ON ...
  JOIN filtered_services fs ON ...
WHERE pg.state = 'MI'
GROUP BY pg.tin_value
LIMIT 100;
DON'T: Full Table Scans on 34M Records
--  BAD: Will crash with out of memory!
SELECT * FROM provider_groups;

--  BAD: No WHERE filter on 34M rows
SELECT AVG(negotiated_rate_value) 
FROM negotiated_rate_provider_groups;
DON'T: Complex Aggregations Without Filters
--  BAD: GROUP BY on 34M records without WHERE
SELECT provider_group_id, COUNT(*), AVG(rate)
FROM negotiated_rate_provider_groups
GROUP BY provider_group_id;  -- Database runs out of memory!
DO: Use EXISTS Instead of IN
--  GOOD: EXISTS stops at first match (faster)
SELECT s.billing_code, s.name
FROM in_network_services s
WHERE EXISTS (
  SELECT 1 FROM negotiated_rate_provider_groups nrpg
    JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
  WHERE nrpg.negotiated_rate_id = (
    SELECT id FROM negotiated_rates WHERE service_id = s.id
  )
  AND pg.state = 'MI'
  AND pg.city = 'Detroit'
);
DON'T: Use IN with Large Subqueries
--  BAD: IN evaluates full subquery (slower)
SELECT s.billing_code, s.name
FROM in_network_services s
WHERE s.id IN (
  SELECT service_id FROM negotiated_rates nr
    JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
    JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
  WHERE pg.state = 'MI'  -- Still processes all 34M rows!
);
DO: Use Proper Index Hints
--  GOOD: Sargable query (uses index)
SELECT * FROM provider_groups
WHERE primary_taxonomy_code = '207R00000X'  -- Exact match
  AND state = 'MI';

--  GOOD: Prefix LIKE (uses index)
SELECT * FROM provider_groups
WHERE primary_taxonomy_code LIKE '282%'  -- Hospital codes
  AND city = 'Detroit';
DON'T: Use Non-Sargable Queries
--  BAD: Leading wildcard can't use index
SELECT * FROM provider_groups
WHERE primary_taxonomy_code LIKE '%00000X';  -- Full scan!

--  BAD: Function on indexed column
SELECT * FROM provider_groups
WHERE LOWER(state) = 'mi';  -- Index not used!

--  GOOD: Store data in consistent case
SELECT * FROM provider_groups
WHERE state = 'MI';  -- Uses index
Advanced: Materialized Views

For frequently-run aggregations on 34M records, consider materialized views:

-- Create materialized view (refreshed periodically)
CREATE MATERIALIZED VIEW provider_state_summary AS
SELECT 
  state,
  COUNT(*) as provider_count,
  COUNT(*) FILTER (WHERE taxonomy_enriched = true) as enriched_count,
  COUNT(DISTINCT primary_taxonomy_code) as unique_specialties
FROM provider_groups
GROUP BY state;

CREATE INDEX ON provider_state_summary(state);

-- Query is instant (no 34M row scan!)
SELECT * FROM provider_state_summary WHERE state = 'MI';
Golden Rules for 34M Record Queries
  • Always use WHERE clauses with indexed columns (state, city, taxonomy_code)
  • Always use LIMIT to cap result sets
  • Filter early with CTEs before complex JOINs
  • Use EXISTS instead of IN for subqueries (stops at first match)
  • Avoid LIKE '%text' (leading wildcards can't use indexes)
  • Don't use functions on indexed columns in WHERE clauses
  • Consider materialized views for repeated complex aggregations
  • Test with EXPLAIN ANALYZE to verify index usage

Quality Scoring System

Plausibility Score (0-100)

Validates whether provider specialty matches service type

Score Ranges
90-100: Excellent

Perfect specialty match (e.g., Cardiologist doing heart surgery)

75-89: Good

Compatible provider-service match

50-74: Fair

Neutral/generic codes (DME, equipment)

25-49: Questionable

Specialty mismatch (e.g., Dentist doing orthopedic surgery)

Actual Distribution (418,652 services)
0.24%
Excellent (1,024)
37.54%
Good (157,144)
22.64%
Fair (94,796)
39.58%
Questionable (165,688)
Validation Logic
CPT Code Ranges:
• Surgery: 10000-69999
• Radiology: 70000-79999
• Lab: 80000-89999
• Medicine: 90000-99999

Scoring Algorithm:
1. Base score: 50 (neutral)
2. If provider classification matches service: +25
3. If specialty matches (e.g., Imaging for MRI): +25
4. Special rules (hospitals can do anything): 85
5. Max score: 100
Why This Matters: Quality scores help identify potentially questionable provider-service combinations, improving data reliability for price comparisons and network design.

Real-World Query Examples

1 Find Cheapest Provider for Service
SELECT 
  COALESCE(npp.provider_organization_name, npp.provider_name) as provider,
  pg.city, pg.state,
  pg.primary_specialization,
  np.negotiated_rate_value as rate,
  ins.plausibility_score
FROM in_network_services ins
  JOIN negotiated_rates nr ON ins.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
  JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
  LEFT JOIN nppes_providers npp ON SPLIT_PART(pg.npi_list, ',', 1) = npp.npi
WHERE ins.billing_code = '70543'  -- MRI Brain
  AND pg.state = 'MI'
  AND ins.plausibility_score >= 75  -- Only quality matches
ORDER BY np.negotiated_rate_value ASC
LIMIT 10;
Indexes Used: billing_code, state, plausibility_score
2 Provider Cost Efficiency Score
WITH market_avg AS (
  SELECT AVG(np.negotiated_rate_value) as avg_rate
  FROM in_network_services ins
    JOIN negotiated_rates nr ON ins.id = nr.service_id
    JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  WHERE ins.billing_code = '99212'
)
SELECT 
  pg.tin_value,
  COALESCE(npp.provider_organization_name, 'Unknown') as provider_name,
  pg.city,
  AVG(np.negotiated_rate_value) as provider_avg,
  ma.avg_rate as market_avg,
  ROUND((AVG(np.negotiated_rate_value) - ma.avg_rate) / ma.avg_rate * 100, 1) as pct_vs_market
FROM provider_groups pg
  LEFT JOIN nppes_providers npp ON SPLIT_PART(pg.npi_list, ',', 1) = npp.npi
  JOIN negotiated_rate_provider_groups nrpg ON pg.id = nrpg.provider_group_id
  JOIN negotiated_rates nr ON nrpg.negotiated_rate_id = nr.id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN in_network_services ins ON nr.service_id = ins.id
  CROSS JOIN market_avg ma
WHERE ins.billing_code = '99212'
  AND pg.state = 'MI'
GROUP BY pg.tin_value, npp.provider_organization_name, pg.city, ma.avg_rate
ORDER BY pct_vs_market ASC
LIMIT 20;
3 Geographic Price Variation
SELECT 
  pg.city,
  pg.state,
  COUNT(DISTINCT pg.id) as provider_count,
  MIN(np.negotiated_rate_value) as min_rate,
  AVG(np.negotiated_rate_value) as avg_rate,
  MAX(np.negotiated_rate_value) as max_rate,
  MAX(np.negotiated_rate_value) - MIN(np.negotiated_rate_value) as price_spread
FROM in_network_services ins
  JOIN negotiated_rates nr ON ins.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
  JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
WHERE ins.billing_code = '70543'
  AND pg.state = 'MI'
  AND pg.city IS NOT NULL
GROUP BY pg.city, pg.state
HAVING COUNT(DISTINCT pg.id) >= 3
ORDER BY price_spread DESC;
Pro Tip: These queries all use indexed columns (billing_code, state, city) and include LIMIT clauses to ensure good performance even with 34M records!

Key Design Decisions

Why Junction Table?
Without Junction Table (Bad):
negotiated_rates.provider_list = "[group1, group2, ...]"
  • Can't query efficiently
  • Can't enforce foreign keys
  • Can't index properly
With Junction Table (Good):
negotiated_rate_provider_groups:
  rate_id | provider_group_id
  --------|------------------
  1       | 12345
  1       | 67890
  • Fast lookups in both directions
  • Referential integrity maintained
  • Proper indexing on both FKs
Why Separate negotiated_prices?

Reason: One rate can have multiple price variations

Rate ID 1 (CPT 99212 - Office Visit):
├─ Price 1: $75 (professional, office)
├─ Price 2: $120 (facility, hospital)
└─ Price 3: $90 (professional, telehealth)

Supports: Different billing classes, settings, and modifiers for the same service

Why Denormalize Taxonomy?
Alternative (Normalized):
provider_groups.taxonomy_id  taxonomy_reference_table.id
Current Design (Denormalized):
provider_groups:
  • primary_taxonomy_code
  • primary_taxonomy_name
  • primary_specialization
  • primary_classification
Advantages:
  • Faster queries (no extra JOIN)
  • Direct filtering on specialization
  • Better index performance
  • Simpler application logic
Trade-off:
  • Data duplication (acceptable for analytics)
  • Read performance >> Storage cost
Design Philosophy

All design decisions optimize for read performance and query simplicity in an analytics workload. With 34M+ provider records, every decision prioritizes fast filtering and aggregation over storage efficiency.

Data Volume Summary

Comprehensive overview of record counts across all core tables.

Table Name Record Count Description Index Strategy
provider_groups Loading... Healthcare organizations (TIN-based) 13+ indexes (tin, taxonomy, city, state, zip)
negotiated_rate_provider_groups Loading... Junction table (rates providers) 2 FK indexes (rate_id, provider_group_id)
nppes_providers Loading... NPPES registry (NPI-based) NPI primary key + taxonomy indexes
in_network_services Loading... CPT/HCPCS services with quality scores billing_code, plausibility, description_gin
negotiated_rates Loading... Rate containers (1:1 with services) service_id FK index
negotiated_prices Loading... Actual dollar amounts negotiated_rate_id FK index
reporting_entities Loading... Insurance companies (BCBS Michigan) Primary key only
plans Loading... Insurance plans (PPO, HMO, etc.) reporting_entity_id FK index
TOTAL Loading... Total records across all core tables
Performance Implications
  • provider_groups: REQUIRES indexed WHERE clauses or database runs out of memory
  • Junction table: Every provider price query touches this - indexes are critical
  • NPPES providers: NPI lookups must use primary key index
  • Total records: Full table scans are impossible - always filter first

Enrichment Status Tracking

The enrichment process populates provider taxonomy data from NPPES registry. This system tracks enrichment progress and status.

enrichment_log Table

Purpose: Tracks each enrichment batch run with timestamps and statistics

CREATE TABLE enrichment_log (
    id SERIAL PRIMARY KEY,
    batch_id VARCHAR(100) UNIQUE NOT NULL,
    records_processed INTEGER,
    records_enriched INTEGER,
    records_failed INTEGER,
    status VARCHAR(50),
    started_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    completed_at TIMESTAMP,
    error_message TEXT
);
Provider Enrichment Flags

provider_groups table includes:

  • taxonomy_enriched (BOOLEAN) - Indicates if NPPES data was successfully applied
  • enriched_at (TIMESTAMP) - When enrichment occurred
  • enrichment_source (VARCHAR) - Source system (usually 'nppes')
Query enrichment status:
SELECT 
    COUNT(*) AS total_providers,
    COUNT(*) FILTER (WHERE taxonomy_enriched = true) AS enriched,
    COUNT(*) FILTER (WHERE taxonomy_enriched = false) AS pending,
    ROUND(100.0 * COUNT(*) FILTER (WHERE taxonomy_enriched = true) / COUNT(*), 2) AS pct_enriched
FROM provider_groups;
Enrichment Process Flow
1. LOAD NPPES DATA
   ├─ Download 10.29GB NPPES CSV
   ├─ Parse 9,236,343 provider records
   └─ Load into nppes_providers table

2. MATCH NPIs
   ├─ Extract NPI list from provider_groups.npi_list (JSON array)
   ├─ JOIN nppes_providers ON npi
   └─ Match primary NPI (first in array)

3. COPY TAXONOMY FIELDS
   ├─ primary_taxonomy_code (e.g., "207R00000X")
   ├─ primary_taxonomy_name (e.g., "Internal Medicine")
   ├─ primary_specialization (e.g., "Internal Medicine")
   ├─ primary_classification (e.g., "Allopathic & Osteopathic Physicians")
   └─ Set taxonomy_enriched = true

4. LOG RESULTS
   └─ Insert batch record into enrichment_log
Current Enrichment Status
34,011,942
Total Providers
9,236,343
NPPES Records
~100%
Enrichment Rate

Special Provider Fields

Hospital-Specific Optimizations

Challenge: Hospitals represent only ~0.5% of providers but handle 25%+ of high-value services

Hospital Taxonomy Codes (282%):
  • 282N00000X - General Acute Care Hospital
  • 282E00000X - Long Term Care Hospital
  • 282J00000X - Religious Nonmedical Health Care Institution
  • 282NC0060X - Critical Access Hospital
  • ...and 20+ other 282% variants
Special Indexes:
-- Partial index for hospitals only
CREATE INDEX idx_provider_groups_hospital_taxonomy 
ON provider_groups(primary_taxonomy_code) 
WHERE primary_taxonomy_code LIKE '282%';

-- Fast hospital-only queries
SELECT * FROM provider_groups 
WHERE primary_taxonomy_code LIKE '282%' 
  AND city = 'Detroit'
  AND state = 'MI';

Performance: Uses partial index for hospital queries, reducing index size by 99.5%

Geographic Data Structure

Location fields in provider_groups:

Address Fields:
  • address_line1
  • address_line2
  • city Indexed
  • state Indexed
  • zip_code Indexed
NPPES Fields:
  • provider_city
  • provider_state
  • provider_zip
  • (From nppes_providers)
Common Geographic Queries:
-- Find all providers in Detroit
SELECT * FROM provider_groups WHERE city = 'Detroit';

-- Find all Michigan hospitals
SELECT * FROM provider_groups 
WHERE state = 'MI' 
  AND primary_taxonomy_code LIKE '282%';

-- Find providers in specific zip code
SELECT * FROM provider_groups WHERE zip_code = '48201';
Index Performance:

City/State/Zip indexes enable sub-second queries on 34M records. Without indexes, geographic queries would take 5-10 minutes or fail with out-of-memory errors.

Field Design Rationale
  • Partial Hospital Index: Optimizes for high-value queries without bloating main indexes
  • Denormalized Geography: Faster queries than normalizing to separate location table
  • Multiple Zip Formats: Supports both 5-digit and ZIP+4 formats
  • City/State Indexes: Enable fast regional provider searches

Schema Evolution Timeline

The database schema evolved through 4 major phases from initial load to production-ready analytics platform.

Phase 1: Initial Data Load
  • Parsed CMS transparency files
  • Created base tables: reporting_entities, plans, in_network_services
  • Loaded 418,652 services from BCBS Michigan
  • Basic indexes on primary keys only
Result: Basic data structure, slow queries, no enrichment
Phase 2: NPPES Enrichment
  • Downloaded 10.29GB NPPES CSV (9.2M providers)
  • Created nppes_providers table with NPI primary key
  • Added taxonomy fields to provider_groups
  • Enriched 34M provider groups with taxonomy data
  • Added enrichment_log tracking table
Result: Providers now have specialization data, enabling specialty-based queries
Phase 3: Quality Scoring
  • Added plausibility_score (0-100) to in_network_services
  • Added plausibility_flags (JSON) for validation warnings
  • Implemented taxonomy_validator.py logic
  • Scored all 418,652 services for provider-service appropriateness
  • Created plausibility index for fast filtering
Result: 37.54% Good scores, 39.58% Questionable - quality insights enabled
Phase 4: Elasticsearch Integration
  • Indexed 3,869 unique billing codes to Elasticsearch
  • Enabled fuzzy search and full-text queries
  • Added description_gin GIN index for PostgreSQL full-text
  • Hybrid search: Elasticsearch for discovery, PostgreSQL for analytics
  • Added 13+ performance indexes to provider_groups
Result: Production-ready platform with sub-second search and analytics on 77M+ records
Evolution Metrics
4 phases
Major Iterations
77M+
Final Record Count
50+
Indexes Created

Conceptual Mind Map

High-level visualization of the complete database architecture showing all major components and relationships.

                                    HEALTHCARE TRANSPARENCY DATABASE
                                                    │
                        ┌───────────────────────┼───────────────────────┐
                        │                       │                       │
                ┌─── INSURANCE ────┐  ┌──── SERVICES ────┐  ┌──── PROVIDERS ────┐
                │                  │  │                  │  │                   │
                │ reporting_       │  │ in_network_      │  │ provider_groups   │
                │   entities (1)   │  │   services       │  │   (34M records!)  │
                │        │         │  │   (418K)         │  │        │          │
                │                 │  │        │         │  │        │          │
                │   plans (~50)    │  │        │         │  │  ┌─────┴─────┐   │
                └──────────────────┘  │                 │  │  │           │   │
                                      │ negotiated_      │  │                │
                                      │   rates          │  │ Enriched   npi   │
                                      │   (418K)         │  │ Taxonomy  _list  │
                                      │        │         │  │  Fields   (JSON) │
                                      │        ├─────┐   │  │  │           │   │
                                      │                │  │                │
                                      │ negotiated  rate │  │ primary_  nppes_ │
                                      │   _prices  _pvdr │  │ taxonomy  pvdrs  │
                                      │   (418K)  _grps  │  │ _code    (9.2M)  │
                                      │           (34M!) │  │                  │
                                      └──────────────────┘  └──────────────────┘
                                                    │
                        ┌───────────────────────┼───────────────────────┐
                        │                       │                       │
                ┌─── ENRICHMENT ───┐  ┌──── QUALITY ─────┐  ┌───── SEARCH ─────┐
                │                  │  │                  │  │                  │
                │ enrichment_log   │  │ plausibility_    │  │ Elasticsearch    │
                │ • batch_id       │  │   score (0-100)  │  │ • 3,869 codes    │
                │ • records_       │  │ • flags (JSON)   │  │ • Fuzzy search   │
                │   processed      │  │ • validated_at   │  │ • Full-text      │
                │ • records_       │  │                  │  │                  │
                │   enriched       │  │ Distribution:    │  │ Hybrid Query:    │
                │ • status         │  │ • 37.54% Good    │  │ 1. ES discovery  │
                │                  │  │ • 22.64% Fair    │  │ 2. PG analytics  │
                │ NPPES Pipeline:  │  │ • 39.58% Quest.  │  │ 3. Combine       │
                │ └─ Match NPI    │  │                  │  │                  │
                │ └─ Copy Tax.    │  │ Taxonomy Logic:  │  │ GIN Index:       │
                │ └─ Set Flag     │  │ CPT  Taxonomy   │  │ description_gin  │
                └──────────────────┘  └──────────────────┘  └──────────────────┘
                                                    │
                        ┌───────────────────────┼───────────────────────┐
                        │                       │                       │
                ┌──── INDEXES ─────┐  ┌─── GEOGRAPHY ────┐  ┌──── ANALYTICS ───┐
                │                  │  │                  │  │                  │
                │ 50+ total        │  │ city, state, zip │  │ Price Comp.:     │
                │                  │  │ indexes          │  │ • Geographic     │
                │ Critical:        │  │                  │  │ • Provider       │
                │ • tin_value      │  │ Regional Search: │  │ • Specialty      │
                │ • taxonomy_code  │  │ └─ Fast city    │  │                  │
                │ • billing_code   │  │     queries      │  │ Quality Filter:  │
                │ • plausibility   │  │ └─ State agg.   │  │ • Score ranges   │
                │                  │  │ └─ Zip lookup   │  │ • Flag analysis  │
                │ Partial:         │  │                  │  │                  │
                │ • hospital_tax.  │  │ Hospital-spec:   │  │ Cost Analysis:   │
                │   (282% only)    │  │ └─ 282% codes   │  │ • Min/max/avg    │
                │                  │  │     + city       │  │ • Percentiles    │
                └──────────────────┘  └──────────────────┘  └──────────────────┘

KEY DESIGN PRINCIPLES:
├─ Scale First: Every decision optimized for 34M+ provider queries
├─ Denormalize: Taxonomy fields copied for performance (no JOIN needed)
├─ Index Everything: 50+ indexes enable sub-second queries on massive tables
├─ Junction Table: Many-to-many relationships properly normalized
├─ Quality Scores: Built-in validation for service-provider appropriateness
└─ Hybrid Search: Elasticsearch for discovery, PostgreSQL for analytics
                            
Architecture Highlights
3 Main Entities
Insurance, Services, Providers - all interconnected
3 Enhancement Layers
Enrichment, Quality, Search - stacked capabilities
3 Performance Systems
Indexes, Geography, Analytics - speed optimized

ETL Pipeline Guide

Reference guide for running the ETL data loading pipeline. Commands are designed to run on the AWS Lightsail server via SSH.

Basic Commands
# Run full ETL pipeline (all files)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py
# Run with record limit (for testing)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --limit-records 10000
# Run with file limit
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --limit 5
State Filter

Filter data by provider state using NPPES enrichment. Only providers in the specified state(s) will be loaded.

# Filter to Michigan providers only
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --state-filter MI
# Filter to multiple states
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --state-filter MI,OH,IN
Billing Code Filter

Filter data by billing codes to load only specific procedures or services. Supports exact matching on billing_code and/or billing_code_type.

# Filter to a single billing code
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-codes 99213
# Filter to multiple billing codes
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-codes 99213,99214,99215
# Filter by billing code type (CPT, HCPCS, etc.)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-code-types CPT
# Combine filters
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --billing-codes 99213,99214 --billing-code-types CPT
Tip: Billing code filter dramatically reduces data size by only loading matching services and their associated provider groups.
Combined Filters

Filters can be combined for precise data loading. Example: load only specific CPT codes for Michigan providers.

# Office visits (99213-99215) in Michigan only
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py \ --state-filter MI \ --billing-codes 99213,99214,99215
# All CPT codes in Michigan and Ohio, limit to first 5 files
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py \ --state-filter MI,OH \ --billing-code-types CPT \ --limit 5
Performance Options
# Adjust batch size (default: 50000)
--batch-size 25000
# Provider group chunk size (default: 100000)
--provider-group-chunk-size 50000
# Enable parallel processing (multiple files)
--parallel --workers 4
# Sample percentage (for testing)
--sample-percent 10
Storage Optimization

Reduce disk usage when processing large files. The --stream-gz flag reads compressed files directly without decompressing to disk (saves 10x storage). The --delete-after-process flag removes files after loading.

# Read .gz files directly (no decompression to disk)
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --stream-gz
# Delete files after processing
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --delete-after-process
# Combine both for minimal storage usage
docker compose -f docker-compose.production.yml exec api python src/etl/load_data.py --stream-gz --delete-after-process
Tip: Using --stream-gz avoids 10x disk expansion (e.g., 5GB 50GB). Slightly slower but essential for limited storage.
Batch Download & Process Mode (Recommended)

The most efficient way to load large datasets. Downloads N files at a time, processes them with ETL, deletes them, then downloads the next batch. Minimizes disk usage by never having all files on disk at once.

# Batch mode: download 100 files, 5 at a time
docker compose -f docker-compose.production.yml exec api python src/scraper/download_and_process.py \ --batch-mode --limit 100 --batch-size 5 \ --file-types in_network --db-id bcbs_michigan
# With state and billing code filters
docker compose -f docker-compose.production.yml exec api python src/scraper/download_and_process.py \ --batch-mode --limit 50 --batch-size 3 \ --file-types in_network --db-id bcbs_michigan \ --state-filter MI --billing-codes 99213,99214,99215
# Large-scale load for new payer
docker compose -f docker-compose.production.yml exec api python src/scraper/download_and_process.py \ --batch-mode --limit 500 --batch-size 10 \ --file-types in_network --db-id uhc_michigan \ --stream-gz --etl-batch-size 10000
Old method: Download 100 files (2TB) Decompress Load
Peak disk: ~2TB
Batch mode: Download 5 files Load Delete Repeat
Peak disk: ~100GB
Post-ETL Steps

After loading data, run these commands to complete the pipeline:

# 1. Reindex Elasticsearch (with NPPES enrichment)
docker compose -f docker-compose.production.yml exec api python reindex_with_nppes.py
# 2. Check database status
docker compose -f docker-compose.production.yml exec postgres psql -U healthcare_admin -d healthcare_bcbs_michigan_production -c "SELECT COUNT(*) FROM in_network_services;"
# 3. View processing logs
docker compose -f docker-compose.production.yml logs -f --tail=100 api
CLI Arguments Quick Reference
Argument Description Example
--billing-codes Comma-separated list of billing codes to filter 99213,99214,99215
--billing-code-types Comma-separated list of code types (CPT, HCPCS, etc.) CPT,HCPCS
--state-filter Comma-separated list of state abbreviations MI,OH,IN
--limit Max number of files to process 5
--limit-records Max records per file 10000
--batch-size Records per batch commit (default: 50000) 25000
--parallel Enable parallel file processing --parallel --workers 4
--sample-percent Random sample percentage (for testing) 10
--stream-gz Read .gz files directly without decompression (flag only)
--delete-after-process Delete files after loading to save storage (flag only)
Batch Download Mode Arguments

These arguments are for the download_and_process.py script with --batch-mode:

Argument Description Example
--batch-mode Enable batch download/process/delete loop (flag only)
--batch-size Files per batch (default: 5) 10
--limit Total files to process 100
--file-types in_network, allowed_amount, or both in_network
--db-id Database ID for file organization bcbs_michigan
--stream-gz Read .gz directly (default: true) (flag only)
--etl-batch-size Records per ETL batch 10000
--state-filter Filter providers by state MI
--billing-codes Filter by billing codes 99213,99214

Elasticsearch Management

Loading Elasticsearch stats...
Status:

Facility Consolidation

Loading facility stats...

Consolidated Facilities

Request New Database

Configure the data pipeline you need and submit a request. A platform administrator will review and process it.

Enter a Table of Contents URL and click Load TOC to browse available files
Data Filters
Provider state(s)
Specific codes only
CPT, HCPCS, DRG, etc.
Max files to process
Browse Available Files (Optional)
0 selected
File / Description Type Plan
Select an entity to browse files...
Showing 0 files
Page 1

My Data Requests

Data Requests

Quick Tips & Tricks
  • State Filter: Use MI to load only Michigan providers, or MI,OH,IN for multiple states
  • Billing Code Filter: Use 99213,99214,99215 to load only specific procedures
  • Combined Filters: State + billing code filters work together to drastically reduce data size
  • Testing: Set "Max Records Per File" to 1000 to quickly test your filters
  • Sequential Process: Select entity Download files Configure filters Load data
  • Monitor Progress: Use the live progress panel to track loading in real-time
  • Pipeline Dashboard: The dashboard below queries the server directly — it shows live progress for downloads, ETL loads, and enrichment even after page refresh or cache clear
  • Resume after crash: If ETL stops (OOM/crash), the dashboard shows which files failed — click "Resume Load" to continue from where you left off with the same filters

Data Management

Download the latest data from your selected reporting entity and load it into the database.

Select Data Source

Choose which insurance company's data to download:

Remove this custom data source

Discovered TOC Files:

Download Data

selected entity
Downloading will delete existing files in data/raw/. Database keeps old data until you "Load Data".
Advanced Options
files/batch
ETL Filters (applied during batch load)
Records per DB commit (default: 50,000)
Lower = less RAM usage
Higher = faster loading
(1-4, lower = less RAM)
Leave filters empty to load all data. Batch mode: Download Load with filters Delete Repeat
Available Files from TOC
0 selected
File / Description Type Plan
Click "Search" to load available files...
Showing 0 files
Page 1
Ready to download
Download Progress 0%
Batch 1 of 1
Files: 0 / 0
Speed: -- files/min
Elapsed: 0:00
ETA: --
Waiting for files...
Downloaded: 0 MB

2 Create Database for Custom Entity

Set up a dedicated PostgreSQL database for this new entity.

ℹ New Entity Detected:
A dedicated database will be created to store this entity's data separately.
Use lowercase letters, numbers, and underscores only
Ready to create database
Note: After creating the database, proceed to download and load data as normal.

Load Into Database

Parse downloaded files and load data into the database.

Data Filters
Provider state filter
Specific codes only
CPT, HCPCS, DRG, etc.
For testing
Advanced
Active Filters:
Ready to load data

Downloaded Files Ready to Load

Files in data/raw/
0 selected
File Name Type Size Status
Loading file list...
0 files ready
0 MB total

Live Progress

00:00:00
0/0
Files
0
Records
0
Filtered Out
Waiting...

Pipeline Dashboard LIVE

Download Idle
No active download
0:00 0%
ETL Load Idle
No active load
0:00 0%
Enrichment Idle
No active enrichment
0:00 0%
-
Completed
-
Failed
-
Processing
-
Total
File Name Status Records Size Duration Actions
Click "Refresh" to load file processing log
0 selected
Error Details

Enrich to Production

Copy raw data to production database and enrich with NPPES taxonomy data.

Pipeline Options
Limit to N provider groups for testing
Ready to enrich
Pipeline Steps:
  1. Create production database (if needed)
  2. Copy all tables using pg_dump streaming
  3. Enrich providers with NPPES taxonomy
  4. Update address/location fields

Enrichment Progress

1
Initialize
2
Copy Data
3
Enrich
Complete
0%
0/7
Tables Copied
0
Providers Enriched
00:00
Elapsed Time
Current Operation:
Waiting to start...
Available Databases
Loading...

Data Operations

Clear Application Cache

Clear any cached data to ensure fresh results

Reset Operation Status

Reset stuck data operations (download/load)

Interactive API Documentation

Swagger UI with purple gradient theme. All 140+ endpoints documented with examples and schemas.

Platform Reference

51-chapter developer reference covering architecture, schema, ETL, auth, and deployment.

Quick Actions

Server Status

OS
Loading...
Uptime
—
Updates
—
Reboot
Not needed

Resource Usage

Loading...

Host Updates

Loading...

Containers

Loading...

Host Commands (SSH)

These commands must be run via SSH on the host. The API container cannot modify the host OS directly.

Update Package Lists
sudo apt-get update
Apply All Updates
sudo apt-get upgrade -y
Apply Security Updates Only
sudo unattended-upgrade --dry-run
Reboot Server
sudo reboot

Maintenance Mode

Current Status

Normal Operation
Message:
Started:
Ends:
Set by:

Schedule Maintenance

Maintenance Mode

Current Status

Normal Operation
Message:
Started:
Ends:
Set by:

Schedule Maintenance

API Usage

Audited Actions
—
Avg / Day
—
Active Orgs
—
Active Users
—
Peak Hour
—
Active Keys
—

Daily Request Volume

Loading...

Requests by Hour (UTC)

Loading...

Usage by Organization

Loading...

Top Users

Loading...

Top Actions

Loading...

API Key Usage

Loading...

Metric Legend (click to expand)

Audited Actions — number of audited API actions (logins, queries, exports, etc.) in the selected period.
Avg / Day — total audited actions divided by the number of days in the period.
Active Orgs — organizations with at least one recorded action in the period.
Active Users — distinct users who performed at least one action (top 10 shown).
Peak Hour — the hour of day (UTC) with the most cumulative actions across the period.
Active Keys — number of API keys used at least once during the selected period.
Daily Request Volume — bar chart showing the number of audit log entries per calendar day.
Requests by Hour — shows which hours of the day are busiest, aggregated across all days in the period.
Usage by Organization — breakdown of actions per tenant organization, with percentage share.
Top Users — the 10 most active users by action count.
Top Actions — the 10 most common action types (e.g. login, logout, query) across all users.
API Key Usage — individual API keys ranked by lifetime usage count, with owner and last-used info.

Export API Usage Report

News & Updates

Loading articles…

New Article

Documentation

Loading…

About the Table of Contents

The Table of Contents (TOC) file is the index that maps insurance plans to their machine-readable price transparency files. According to CMS requirements, this file shows which plans share the same pricing data and where to download each file.

What you'll see:

  • Reporting Plans: Insurance plan names, IDs (HIOS or EIN), market types (individual/group), and issuers
  • In-Network Files: URLs to negotiated rate files used by each plan
  • Allowed Amount Files: URLs to out-of-network pricing files for each plan
  • File Sharing: Multiple plans often share the same pricing file to reduce redundancy

Note: This is the "index" referenced in the CMS Price Transparency Guide. View CMS TOC Schema on GitHub

Reporting Plans

Loading...

Loading Table of Contents...

Downloaded Data Sources

Raw SQL Search

Execute SQL queries directly against the active database with safety guardrails

Active Database: Loading...
Query will run on:

SQL Query Editor

Tip: Use LIMIT to restrict results. Read-only queries recommended. 0 characters

Example Queries

Top 10 Services by Provider Count

-- Uses materialized view for instant results
SELECT * FROM service_provider_counts 
ORDER BY provider_count DESC 
LIMIT 10;

Provider Groups by City/State

SELECT 
  city, state,
  COUNT(*) as group_count
FROM provider_groups
WHERE city IS NOT NULL
GROUP BY city, state
ORDER BY group_count DESC
LIMIT 20;

Price Stats for CPT Code

SELECT s.billing_code, s.name,
  MIN(np.negotiated_rate_value) as min_rate,
  MAX(np.negotiated_rate_value) as max_rate,
  ROUND(AVG(np.negotiated_rate_value)::numeric, 2) as avg_rate
FROM in_network_services s
  JOIN negotiated_rates nr ON s.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
WHERE s.billing_code = '99213'
GROUP BY s.billing_code, s.name;

Top Providers by Services Offered

-- Uses materialized view for instant results
SELECT * FROM provider_service_counts 
ORDER BY services_offered DESC 
LIMIT 20;

Search Services by Name

SELECT billing_code, billing_code_type, 
  name, description
FROM in_network_services
WHERE name ILIKE '%mri%' 
   OR description ILIKE '%mri%'
LIMIT 25;

Service Prices with Locations

SELECT s.billing_code, s.name,
  np.negotiated_rate_value as price,
  pg.city, pg.state
FROM in_network_services s
  JOIN negotiated_rates nr ON s.id = nr.service_id
  JOIN negotiated_prices np ON nr.id = np.negotiated_rate_id
  JOIN negotiated_rate_provider_groups nrpg ON nr.id = nrpg.negotiated_rate_id
  JOIN provider_groups pg ON nrpg.provider_group_id = pg.id
WHERE s.billing_code = '70553'
LIMIT 25;

Query Results

Execute a query to see results here

Payer Price Comparison

Compare healthcare pricing across your selected insurance payers side-by-side

Comparing:
Tip: Enter a specific billing code like 99213 for detailed comparison, or search by service name.

Provider Rate Comparison

Compare Prices Across Payers

Enter a billing code or service name above to compare pricing data across your selected insurance payers.

Select at least 2 payers using the header selector to enable comparison.

AI Assistant

Ask questions about healthcare pricing — the assistant can search and analyze the data you have access to.

Hi! Ask me about negotiated rates, cost estimates, providers, or how payers compare. I can only see the databases your account is allowed to access.
Bring Your Own Key (optional)

By default this uses the platform's hosted model (subject to your monthly quota). Add your own provider key for unlimited, private inference billed to your account.

Service Details


                

                

Select Insurance Payers

Choose one or more payer databases to compare pricing data

0 selected

No payer databases configured yet.

Contact an administrator to add payer data.

Welcome

Sign in to access all features

Forgot your password?
If you received an invite token from your organization admin, paste it here.

Request a password reset from your administrator. To protect your account, you must verify your identity with your two-factor authentication (2FA) code. An admin will then issue you a temporary password.

Back to Sign In

Enter the 6-digit code from your authenticator app

Use backup code instead
Cancel and return to login

Enter one of your backup codes

Use authenticator app instead

Two-Factor Authentication

Secure your account with 2FA

2FA is Not Enabled

Add an extra layer of security to your account by requiring a verification code in addition to your password.

2FA is Enabled

Your account is protected with two-factor authentication.

Backup codes remaining: 10

Step 1: Scan QR Code

Scan this QR code with your authenticator app (Google Authenticator, Authy, etc.)

QR Code

Can't scan? Enter this code manually:

Step 2: Enter Verification Code

Enter the 6-digit code from your authenticator app to confirm setup:

Cancel setup
Important: Save These Backup Codes

Store these codes in a safe place. Each code can only be used once to access your account if you lose your authenticator.

Enter a verification code to disable two-factor authentication.

Cancel

Enter a verification code from your authenticator app to generate new backup codes.

Cancel

API Keys

Create keys for programmatic access to the Aequalis API

Copy your API key now

This key will only be shown once. Store it securely.

Create New Key

Your API Keys

Loading...

Usage

Include your API key in the X-API-Key header:

curl -H "X-API-Key: ak_your_key_here" -H "X-Database-Id: bcbs_michigan" https://your-server/api/search?q=MRI

AI MCP Bridge

Connect Aequalis to your AI coding assistant — no technical setup needed

Pick the app you want to use Aequalis from. We'll create a secure key and give you a one-click install (or copy-paste setup) for that app.

1. Choose your app

2. Payer database (optional)

Copy your key now

Your bridge key is shown once. The one-click install below already includes it, so for that path you don't need to copy anything.