Server data from the Official MCP Registry
Production MCP server for Postgres, Oracle, Snowflake, BigQuery, Redshift, DuckDB, MotherDuck.
Production MCP server for Postgres, Oracle, Snowflake, BigQuery, Redshift, DuckDB, MotherDuck.
Valid MCP server (1 strong, 1 medium validity signals). 2 known CVEs in dependencies (0 critical, 2 high severity) Package registry verified. Imported from the Official MCP Registry. Trust signals: 3 highly-trusted packages.
8 files analyzed · 3 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Set these up before or after installing:
Environment variable: MCP_TRANSPORT
Environment variable: WAREHOUSE_TYPE
Environment variable: MCP_API_KEYS
Environment variable: DUCKDB_PATH
Environment variable: MOTHERDUCK_TOKEN
Environment variable: PG_HOST
Environment variable: PG_DATABASE
Environment variable: PG_USER
Environment variable: PG_PASSWORD
Environment variable: ORACLE_USER
Environment variable: ORACLE_PASSWORD
Environment variable: ORACLE_CONNECT_STRING
Environment variable: SNOWFLAKE_ACCOUNT
Environment variable: SNOWFLAKE_USER
Environment variable: SNOWFLAKE_PRIVATE_KEY_PATH
Environment variable: BIGQUERY_PROJECT
Environment variable: GOOGLE_APPLICATION_CREDENTIALS
Environment variable: GUARDRAIL_PII_MASK
Environment variable: MCP_RATE_LIMIT_RPM
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-kalehdoo-warehouse-mcp": {
"env": {
"PG_HOST": "your-pg-host-here",
"PG_USER": "your-pg-user-here",
"DUCKDB_PATH": "your-duckdb-path-here",
"ORACLE_USER": "your-oracle-user-here",
"PG_DATABASE": "your-pg-database-here",
"PG_PASSWORD": "your-pg-password-here",
"MCP_API_KEYS": "your-mcp-api-keys-here",
"MCP_TRANSPORT": "your-mcp-transport-here",
"SNOWFLAKE_USER": "your-snowflake-user-here",
"WAREHOUSE_TYPE": "your-warehouse-type-here",
"ORACLE_PASSWORD": "your-oracle-password-here",
"BIGQUERY_PROJECT": "your-bigquery-project-here",
"MOTHERDUCK_TOKEN": "your-motherduck-token-here",
"SNOWFLAKE_ACCOUNT": "your-snowflake-account-here",
"GUARDRAIL_PII_MASK": "your-guardrail-pii-mask-here",
"MCP_RATE_LIMIT_RPM": "your-mcp-rate-limit-rpm-here",
"ORACLE_CONNECT_STRING": "your-oracle-connect-string-here",
"SNOWFLAKE_PRIVATE_KEY_PATH": "your-snowflake-private-key-path-here",
"GOOGLE_APPLICATION_CREDENTIALS": "your-google-application-credentials-here"
},
"args": [
"-y",
"warehouse-mcp",
"start"
],
"command": "npx"
}
}
}From the project's GitHub README.
Production MCP (Model Context Protocol) server for your data warehouse. Read-only enforcement, five-tier role-based access, optional warehouse-role impersonation (for native RLS / CLS), JSONL audit log, optional output PII masking, optional semantic-metadata layer (glossary + table docs) with per-session toggle. Self-host the Docker image, install via npx, or wait for the upcoming managed cloud variant.
Status: v0.4.x — production-ready for the v1 warehouse list. See CHANGELOG for what shipped in each release.
Warehouse MCP in action using Cursor
| Warehouse | Adapter docs |
|---|---|
| Postgres 12+ | docs/adapters/postgres.md |
| Oracle 12c+ (Thin mode, no Instant Client) | docs/adapters/oracle.md |
| Amazon Redshift (cluster + Serverless) | docs/adapters/redshift.md |
| Snowflake (key-pair auth) | docs/adapters/snowflake.md |
| Google BigQuery | docs/adapters/bigquery.md |
DuckDB (local file or :memory:) | docs/adapters/duckdb.md |
MotherDuck (cloud-hosted DuckDB, md: paths) | docs/adapters/duckdb.md |
Databricks SQL is a fast-follow.
| Tool | Purpose |
|---|---|
query | Execute a SELECT (validator-enforced read-only, dialect-aware) |
list_schemas, list_tables, describe_table | Browse the catalog |
find_columns | Search column names across the warehouse with a LIKE pattern |
get_foreign_keys | Discover declared FK relationships for safe joins |
get_view_definition | Read the SQL body of a view (where business logic lives) |
sample_table, count_rows | Peek at data, check size before scanning |
column_stats, top_values | Profile a single column |
time_series | Bucket by hour/day/week/month/quarter/year — dialect-correct everywhere |
search_value | Find a literal across a table's text columns |
glossary_lookup | Read business-glossary terms from the semantic layer (in-memory, no warehouse I/O) |
schema_lookup | Read schema-level docs — purpose, owner, refresh, table list |
table_lookup | Read full semantic doc for one table — description + column metadata |
| Role | Tools allowed |
|---|---|
semantic_only | Only the three semantic-lookup tools — zero warehouse access; pair with semantic=on for a docs-viewer persona |
metadata_only | Catalog discovery + semantic lookups — never reads row data |
reader_restricted | Aggregates / samples / time series — no arbitrary SELECT |
reader | Adds query and search_value (the general analyst tier) |
admin | Everything; future write tools when ENABLE_WRITE_TOOLS ships |
Per-key role assigned via MCP_API_KEYS=key:role[:set_role=warehouse_role][:semantic=on|off]. The optional set_role= directive issues SET ROLE on Postgres/Redshift so the warehouse's own RLS / CLS / masking policies enforce per-key access — no policy duplication in MCP. The optional semantic=on|off directive overrides SEMANTIC_DEFAULT for that one key, controlling whether the session sees the warehouse://semantic/* resources (see docs/semantic-metadata.md).
For deployments with multiple existing DB roles (finance, hr, payroll, etc.) and many human users, see docs/multi-role-deployment.md — walks through mapping ~10 DB roles to MCP keys, the recommended <area> / <area>_restricted pattern, and when to graduate from static keys to OIDC.
{ "mcpServers": { "warehouse-admin-semanticon": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer admin-key-change-me" ] }, "warehouse-admin-semanticoff": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer admin-key-nosemantic" ] }, "warehouse-reader-semanticon": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer reader-key-change-me" ] }, "warehouse-reader-semanticoff": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer reader-key-nosemantic" ] }, "warehouse-reader-restricted-semanticon": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer restricted-key-change-me" ] }, "warehouse-reader-restricted-semanticoff": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer restricted-key-nosemantic" ] }, "warehouse-metadata-only-semanticon": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer metadata-only-key-change-me" ] }, "warehouse-metadata-only-semanticoff": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer metadata-only-key-nosemantic" ] }, "warehouse-docs-viewer-only": { "command": "npx", "args": [ "-y", "mcp-remote@latest", "http://localhost:3001/mcp", "--header", "Authorization: Bearer docs_viewer_key_change_me" ] } } }
git clone https://github.com/kalehdoo/warehouse-mcp.git
cd warehouse-mcp
docker compose up
# server on http://localhost:3001, seeded ecommerce data in Postgres
The same image bundles every adapter; pick one with WAREHOUSE_TYPE plus the matching credentials. For credentials, prefer --env-file (or your secrets manager) over inline -e flags so passwords don't end up in shell history.
# Postgres (REDSHIFT_* env vars for Redshift; same driver under the hood)
docker run -d -p 3001:3001 \
-e WAREHOUSE_TYPE=postgres \
-e PG_HOST=db -e PG_DATABASE=analytics -e PG_USER=mcp_reader -e PG_PASSWORD=... \
-e MCP_API_KEYS="$(openssl rand -hex 24):reader" \
ghcr.io/kalehdoo/warehouse-mcp:latest
# Oracle (Thin mode, no Instant Client)
docker run -d -p 3001:3001 \
-e WAREHOUSE_TYPE=oracle \
-e ORACLE_USER=MCP_READER -e ORACLE_PASSWORD=... \
-e ORACLE_CONNECT_STRING="db.host:1521/SERVICE" \
ghcr.io/kalehdoo/warehouse-mcp:latest
# Snowflake (key-pair, mount the .p8)
docker run -d -p 3001:3001 \
-e WAREHOUSE_TYPE=snowflake \
-e SNOWFLAKE_ACCOUNT=xy12345.us-east-1 -e SNOWFLAKE_USER=MCP_READER \
-e SNOWFLAKE_PRIVATE_KEY_PATH=/keys/snowflake.p8 \
-e SNOWFLAKE_WAREHOUSE=COMPUTE_WH -e SNOWFLAKE_DATABASE=ANALYTICS \
-v /opt/keys:/keys:ro \
ghcr.io/kalehdoo/warehouse-mcp:latest
# BigQuery (mount the service-account JSON)
docker run -d -p 3001:3001 \
-e WAREHOUSE_TYPE=bigquery \
-e GOOGLE_APPLICATION_CREDENTIALS=/keys/bq-sa.json \
-e BIGQUERY_PROJECT=my-gcp-project \
-v /opt/keys:/keys:ro \
ghcr.io/kalehdoo/warehouse-mcp:latest
# DuckDB (file or in-memory)
docker run -d -p 3001:3001 \
-e WAREHOUSE_TYPE=duckdb -e DUCKDB_PATH=:memory: \
ghcr.io/kalehdoo/warehouse-mcp:latest
# MotherDuck (hosted DuckDB)
docker run -d -p 3001:3001 \
-e WAREHOUSE_TYPE=duckdb -e DUCKDB_PATH=md:my_database \
-e MOTHERDUCK_TOKEN=... \
ghcr.io/kalehdoo/warehouse-mcp:latest
npx warehouse-mcp@latest init # interactive setup; writes .env, prints Claude Desktop snippet
npx warehouse-mcp doctor # verify the connection without booting the server
npx warehouse-mcp start # bind the MCP server to MCP_SERVER_PORT (default 3001)
Then point an AI client at it. Drop-in configs:
For a step-by-step walkthrough from "I have a warehouse" to "Claude is querying it", see the onboarding guide. For common errors, see troubleshooting.
nvm use # Node 20
npm install
npm test # unit tests, sub-second
npm run lint
cp .env.example .env
For testcontainers integration tests against real Postgres: npm run test:integration (Docker required).
Teach the AI agent what your warehouse means, not just what's in it. Drop YAML files describing your business glossary and table semantics into a directory, point SEMANTIC_DIR at it, and the agent gets two access channels into the same in-memory index:
warehouse://semantic/* — what tool-aware clients (Cursor, MCP Inspector) read proactively before issuing queries.glossary_lookup, schema_lookup, table_lookup — same data via the tool channel, for clients (Claude Desktop) whose UI is tool-centric and may suppress resource-only servers. Pure in-memory Map.gets — no warehouse I/O, no audit cost beyond a normal tool log line.Format follows dbt's schema.yml v2 with one extension (meta.schema: per model). Customers using dbt can point SEMANTIC_DIR at their existing models/ directory and reuse most of what they have. See docs/semantic-metadata.md and the starter docs/semantic-templates/.
Whether a session sees the semantic layer is independent of whether the YAMLs are loaded — SEMANTIC_DIR controls loading; SEMANTIC_DEFAULT=on|off and the per-key semantic=on|off option (or include_semantic JWT claim) control per-session visibility. The YAMLs are always loaded and validated at boot so you can flip a kill-switch without restarting. Three precedence layers, highest first:
include_semantic: true|false (OIDC).semantic=on|off in MCP_API_KEYS.SEMANTIC_DEFAULT=on|off (default on).Each tool-call audit row records the resolved include_semantic value so you can correlate query quality with semantic exposure after the fact.
Off by default. Set OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318/v1/traces in your env and the server will emit per-tool-call spans. Resource attributes: service.name=warehouse-mcp, service.version=<package.json version>. Span attributes: warehouse.tenant, warehouse.role. Works with any OTLP/HTTP backend (Grafana Tempo, Honeycomb, Datadog APM, New Relic, SigNoz).
Off by default. Set GUARDRAIL_PII_MASK=on and the server masks emails, SSNs, formatted phones, IPv4 addresses, and Luhn-validated credit cards in result rows. Mask level depends on the caller's role: admin sees raw, reader sees partial (a***@example.com), reader_restricted sees full redaction tags.
Read the architecture doc for the request flow — what files execute when an MCP client makes a call, how the guardrail pipeline composes around the tool handler, and the recipes for adding a new tool / guardrail / adapter. Single-page orientation for operators, security reviewers, and contributors.
Read the threat model before deploying. It covers the OWASP Top 10 mapping, what the codebase mitigates, and what is left to your deployment (TLS, secrets management, network isolation, cost guardrails). Report vulnerabilities per SECURITY.md.
Issues and PRs welcome. Start with CONTRIBUTING.md — it covers the dev workflow, the adapter contract, and how to add a new warehouse. By participating you agree to the Code of Conduct.
Apache-2.0
Be the first to review this server!
by Modelcontextprotocol · Developer Tools
Web content fetching and conversion for efficient LLM usage
by Toleno · Developer Tools
Toleno Network MCP Server — Manage your Toleno mining account with Claude AI using natural language.
by mcp-marketplace · Developer Tools
Create, build, and publish Python MCP servers to PyPI — conversationally.