MCP Marketplace
MCP Marketplace © 2026. All rights reserved.

Spark SQL MCP Server

by Aidancorrell
Developer Tools · Moderate 6.9 · MCP Registry · Local
Free

Server data from the Official MCP Registry

Query Spark SQL clusters via Thrift/HiveServer2. Works with Spark, EMR, Hive, Impala.

Security Report

6.9 · Moderate Risk

Valid MCP server (1 strong, 5 medium validity signals). 1 code issue detected. 3 known CVEs in dependencies (0 critical, 3 high severity). Package registry verified. Imported from the Official MCP Registry. 1 finding downgraded by scanner intelligence.

12 files analyzed · 5 issues found

Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.

Permissions Required

This plugin requests these system permissions. Most are normal for its category.

env_vars

Check that this permission is expected for this type of plugin.

file_system

Check that this permission is expected for this type of plugin.

Shell Command Execution

Runs commands on your machine. Be cautious — only use if you trust this plugin.

What You'll Need

Set these up before or after installing:

Hostname of the Spark Thrift Server (Optional)

Environment variable: SPARK_HOST

Port of the Spark Thrift Server, default: 10000 (Optional)

Environment variable: SPARK_PORT

Default database to use (Optional)

Environment variable: SPARK_DATABASE

Authentication method: NONE, LDAP, KERBEROS, CUSTOM, or NOSASL (Optional)

Environment variable: SPARK_AUTH

Username for LDAP authentication (Optional)

Environment variable: SPARK_USERNAME

Password for LDAP authentication (Required)

Environment variable: SPARK_PASSWORD

Kerberos service name, default: hive (Optional)

Environment variable: SPARK_KERBEROS_SERVICE_NAME

How to Install

Add this to your MCP configuration file:

{
  "mcpServers": {
    "io-github-aidancorrell-spark-sql-mcp-server": {
      "env": {
        "SPARK_AUTH": "your-spark-auth-here",
        "SPARK_HOST": "your-spark-host-here",
        "SPARK_PORT": "your-spark-port-here",
        "SPARK_DATABASE": "your-spark-database-here",
        "SPARK_PASSWORD": "your-spark-password-here",
        "SPARK_USERNAME": "your-spark-username-here",
        "SPARK_KERBEROS_SERVICE_NAME": "your-spark-kerberos-service-name-here"
      },
      "args": [
        "spark-sql-mcp-server"
      ],
      "command": "uvx"
    }
  }
}

Documentation

From the project's GitHub README.

Spark SQL MCP Server

An MCP server that enables AI assistants to query Spark SQL clusters via the Thrift/HiveServer2 protocol.

Works with any HiveServer2-compatible system: Apache Spark, AWS EMR, Hive, Impala, Presto.

Features

  • Query Spark SQL — Execute read-only SQL queries against your Spark cluster
  • Schema Discovery — List databases, tables, and describe table structures
  • Multiple Auth Methods — NONE, LDAP, NOSASL, CUSTOM, and Kerberos authentication
  • EMR Compatible — Works with AWS EMR clusters out of the box
  • Read-Only Enforcement — Only SELECT, SHOW, DESCRIBE, EXPLAIN, and WITH statements are allowed
  • Safety Defaults — Automatic LIMIT clause on unbounded queries, sanitized error messages
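The "automatic LIMIT clause on unbounded queries" default can be pictured with a minimal sketch. This is not the project's code: the function name `apply_default_limit` and the `DEFAULT_ROW_LIMIT` value of 1000 are assumptions made for illustration, and the regex only looks for a trailing LIMIT, so it is deliberately naive.

```python
import re

# Hypothetical default; the real server's limit value is not stated in the README.
DEFAULT_ROW_LIMIT = 1000

# Matches a LIMIT <n> clause at the very end of the statement.
_TRAILING_LIMIT = re.compile(r"\blimit\s+\d+\s*$", re.IGNORECASE)


def apply_default_limit(sql: str, limit: int = DEFAULT_ROW_LIMIT) -> str:
    """Append LIMIT to a SELECT/WITH query that does not already end with one."""
    stripped = sql.strip().rstrip(";").rstrip()
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    if first_word not in {"SELECT", "WITH"}:
        return stripped  # SHOW / DESCRIBE / EXPLAIN are left untouched
    if _TRAILING_LIMIT.search(stripped):
        return stripped  # the caller already bounded the result set
    return f"{stripped} LIMIT {limit}"
```

For example, `apply_default_limit("SELECT * FROM t")` yields `SELECT * FROM t LIMIT 1000`, while a query that already ends in `LIMIT 5` is returned unchanged apart from the trailing semicolon.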

Installation

pip install spark-sql-mcp-server

Or run directly with uvx:

uvx spark-sql-mcp-server

Quick Start

1. Set Environment Variables

export SPARK_HOST="your-emr-master-node.amazonaws.com"
export SPARK_PORT="10000"        # default
export SPARK_DATABASE="default"  # default
export SPARK_AUTH="NONE"         # NONE | LDAP | KERBEROS | CUSTOM | NOSASL
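To show how these variables and their documented defaults fit together, here is a small sketch that reads them into a dictionary. The function name `read_spark_env` and the `localhost` fallback for `SPARK_HOST` are assumptions; the port, database, and auth defaults come from the comments above.

```python
import os
from typing import Mapping, Optional

VALID_AUTH = {"NONE", "LDAP", "KERBEROS", "CUSTOM", "NOSASL"}


def read_spark_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the basic connection settings, applying the documented defaults."""
    env = os.environ if env is None else env
    auth = env.get("SPARK_AUTH", "NONE").upper()
    if auth not in VALID_AUTH:
        raise ValueError(f"Unsupported SPARK_AUTH: {auth!r}")
    return {
        # Falling back to localhost is an assumption, not documented behavior.
        "host": env.get("SPARK_HOST", "localhost"),
        "port": int(env.get("SPARK_PORT", "10000")),
        "database": env.get("SPARK_DATABASE", "default"),
        "auth": auth,
    }
```

Passing a plain dict instead of relying on `os.environ` makes the function easy to exercise in a test.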

2. Add to Claude Code

Global (all projects): add to the top-level mcpServers in ~/.claude.json:

{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "your-emr-master-node.amazonaws.com",
        "SPARK_PORT": "10000",
        "SPARK_AUTH": "NONE"
      }
    }
  }
}

Project-level: add to .mcp.json at the root of your repo:

{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "your-emr-master-node.amazonaws.com",
        "SPARK_PORT": "10000",
        "SPARK_AUTH": "NONE"
      }
    }
  }
}

3. Add to Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "your-emr-master-node.amazonaws.com",
        "SPARK_PORT": "10000"
      }
    }
  }
}
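If you would rather not edit the JSON by hand, a short script can merge the entry into an existing config file without disturbing other keys. This is a convenience sketch, not part of the project; `add_mcp_server` is a hypothetical helper, and you supply the path to your own config file.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Insert or replace one server entry under "mcpServers" and save the file."""
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# The same entry shown in the config above.
SPARK_SQL_ENTRY = {
    "command": "uvx",
    "args": ["spark-sql-mcp-server"],
    "env": {
        "SPARK_HOST": "your-emr-master-node.amazonaws.com",
        "SPARK_PORT": "10000",
    },
}
```

Calling `add_mcp_server(Path("claude_desktop_config.json"), "spark-sql", SPARK_SQL_ENTRY)` leaves any other servers already in the file in place.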

4. Query

Ask Claude things like:

  • "What databases are available in our Spark cluster?"
  • "Show me the schema of the sales.transactions table"
  • "Query the top 10 customers by revenue from the analytics database"

Available Tools

Tool            Description
list_databases  List all available databases
list_tables     List tables in a database
describe_table  Get table schema (columns, types)
execute_query   Run read-only SQL queries with formatted results
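MCP clients invoke these tools with a JSON-RPC `tools/call` request. The sketch below builds such a message to show its shape. The argument name `query` is an assumption, since the README does not list each tool's input schema; a client would normally discover it through `tools/list`.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# "query" is a hypothetical argument name used only for illustration.
example = tool_call_request(1, "execute_query", {"query": "SHOW DATABASES"})
```

In practice Claude Code or Claude Desktop constructs these messages for you; the sketch only illustrates what travels over the wire.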

Authentication

No Auth (default)

export SPARK_AUTH="NONE"

LDAP

export SPARK_AUTH="LDAP"
export SPARK_USERNAME="your-username"
export SPARK_PASSWORD="your-password"

Kerberos

export SPARK_AUTH="KERBEROS"
export SPARK_KERBEROS_SERVICE_NAME="hive"  # default
# Ensure you have a valid Kerberos ticket (kinit)
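The three modes above differ only in which extra settings they need. A rough sketch of that branching follows. The function `connection_kwargs` is hypothetical, and its key names are chosen to resemble the parameters of a HiveServer2 client such as PyHive; whether the project uses that client is an assumption. CUSTOM and NOSASL are passed through without extra handling here.

```python
from typing import Mapping


def connection_kwargs(env: Mapping[str, str]) -> dict:
    """Build connection settings, adding only what the chosen auth mode needs."""
    auth = env.get("SPARK_AUTH", "NONE").upper()
    kwargs = {
        "host": env["SPARK_HOST"],
        "port": int(env.get("SPARK_PORT", "10000")),
        "database": env.get("SPARK_DATABASE", "default"),
        "auth": auth,
    }
    if auth == "LDAP":
        missing = [k for k in ("SPARK_USERNAME", "SPARK_PASSWORD") if not env.get(k)]
        if missing:
            raise ValueError(f"LDAP auth requires: {', '.join(missing)}")
        kwargs["username"] = env["SPARK_USERNAME"]
        kwargs["password"] = env["SPARK_PASSWORD"]
    elif auth == "KERBEROS":
        # The Kerberos ticket itself comes from kinit, not from this config.
        kwargs["kerberos_service_name"] = env.get("SPARK_KERBEROS_SERVICE_NAME", "hive")
    return kwargs
```

Failing early on a missing LDAP username or password gives a clearer message than a handshake error from the server.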

AWS EMR Setup

  1. Security Group — Allow inbound traffic on port 10000 from your IP
  2. SSH Tunnel (recommended):
    ssh -i your-key.pem -L 10000:localhost:10000 hadoop@your-emr-master
    
  3. Set SPARK_HOST=localhost
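Before pointing the MCP server at the tunnel, it can help to confirm that something is listening on the local port. The helper below is a generic TCP check, not part of the project, and `port_is_open` is a hypothetical name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the tunnel running, `port_is_open("localhost", 10000)` should return True. Note that this only shows the local end of the tunnel is accepting connections; it does not prove the Thrift server on the EMR master node is healthy.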

Development

git clone https://github.com/aidancorrell/spark-sql-mcp-server.git
cd spark-sql-mcp-server
pip install -e ".[dev]"
pytest
ruff check .

Local Testing with Docker

A Docker Compose setup provides a local Spark Thrift Server with sample data for integration testing.

# Start the Spark Thrift Server (the subshell keeps you in the repo root)
(cd docker && docker compose up -d)

# Wait for it to be ready (takes ~30s on first start)
docker logs -f spark-thrift-server  # look for "Sample data loaded."

# Run integration tests from the repo root
pytest -m integration -v

# Tear down
(cd docker && docker compose down -v)

The local server comes with sample tables: default.employees, default.orders, and test_db.metrics.

Unit tests run by default with pytest (integration tests are skipped unless -m integration is specified).

Using the local server with Claude Code

With the Docker Spark server running, add it to your MCP config to test the server interactively.

Global: add this entry under the top-level mcpServers in ~/.claude.json:

{
  "spark-sql": {
    "command": "uvx",
    "args": ["spark-sql-mcp-server"],
    "env": {
      "SPARK_HOST": "localhost",
      "SPARK_PORT": "10000",
      "SPARK_AUTH": "NONE"
    }
  }
}

Project-level: add to .mcp.json at the root of your repo:

{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "localhost",
        "SPARK_PORT": "10000",
        "SPARK_AUTH": "NONE"
      }
    }
  }
}

Then start a new Claude Code session and ask it to query the sample data.

Security

Read-Only Enforcement

The execute_query tool only allows read-only SQL statements. Queries must start with one of: SELECT, SHOW, DESCRIBE, DESC, EXPLAIN, or WITH. All other statement types (DROP, INSERT, DELETE, CREATE, ALTER, SET, ADD JAR, etc.) are rejected before reaching the Spark cluster.
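A prefix check of this kind can be sketched in a few lines. This is an illustration of the rule as described, not the project's implementation; `is_read_only` is a hypothetical name, and the sketch does not handle leading SQL comments or multiple statements in one string, which a real validator would need to consider.

```python
import re

# The statement prefixes listed above.
ALLOWED_PREFIXES = {"SELECT", "SHOW", "DESCRIBE", "DESC", "EXPLAIN", "WITH"}

# Captures the first keyword, ignoring leading whitespace.
_FIRST_KEYWORD = re.compile(r"\s*([A-Za-z]+)\b")


def is_read_only(sql: str) -> bool:
    """Return True if the statement begins with an allowed read-only keyword."""
    match = _FIRST_KEYWORD.match(sql)
    return bool(match) and match.group(1).upper() in ALLOWED_PREFIXES
```

An allow-list is the safer design here: any statement type not explicitly named, such as `SET` or `ADD JAR`, is rejected by default.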

Error Sanitization

Database errors are sanitized before being returned to the MCP client. Internal details such as server hostnames, file paths, and stack traces are not exposed. Connection failures report only the target host/port and error type.
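One simple way to approximate this behavior is to keep only the first line of an error (which drops any stack trace) and redact anything that looks like a file path. The sketch below is an assumption about how such sanitization could work, not the project's code; `sanitize_error` is a hypothetical name, and hostname redaction is left out because a generic hostname pattern would also match names like `db.table`.

```python
import re

# Two or more slash-separated segments, e.g. /var/lib/spark/warehouse.
_PATH = re.compile(r"(?:/[\w.\-]+){2,}")


def sanitize_error(message: str, max_len: int = 200) -> str:
    """Keep the first line of an error, redact file paths, and cap the length."""
    lines = message.strip().splitlines()
    if not lines:
        return "Unknown error"
    first_line = lines[0].strip()
    return _PATH.sub("<path>", first_line)[:max_len]
```

For instance, a message whose second line starts a Java stack trace is reduced to its first line, and `/var/lib/spark/warehouse/t` becomes `<path>`.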

Credential Handling

  • Passwords are never included in log output or error messages
  • The SparkConfig object masks passwords in its string representation
  • SPARK_PASSWORD is marked as a secret in the MCP registry schema
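Masking a password in a config object's string representation can be done by overriding `__repr__`. The class below is a hypothetical stand-in named `MaskedSparkConfig`; its fields are assumptions and it is not the project's actual SparkConfig.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(repr=False)
class MaskedSparkConfig:
    """Illustrative config object whose repr never reveals the password."""

    host: str
    port: int = 10000
    username: Optional[str] = None
    password: Optional[str] = None

    def __repr__(self) -> str:
        # Show only whether a password is set, never its value.
        masked = "***" if self.password else None
        return (
            f"MaskedSparkConfig(host={self.host!r}, port={self.port!r}, "
            f"username={self.username!r}, password={masked!r})"
        )
```

Because `str()` and f-strings fall back to `__repr__` when no `__str__` is defined, logging the object by accident prints `'***'` instead of the secret.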

Known Limitations

  • No TLS/SSL support — Thrift connections are unencrypted. For production use with LDAP auth, use an SSH tunnel to protect credentials in transit.
  • No query timeout — Long-running queries are not automatically cancelled. Rely on Spark cluster-level timeout configuration.
  • No per-user access control — All queries execute with the privileges of the configured Spark user. Use HiveServer2 authorization (Ranger, Sentry) to restrict access at the database level.
  • Auth mode defaults to NONE — Appropriate for local development but not for production. Set SPARK_AUTH to LDAP or KERBEROS for authenticated environments.
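Two of these limitations can be caught by a quick preflight check on the environment before starting the server. The sketch below is illustrative only; `deployment_warnings` is a hypothetical helper, and treating localhost as "tunneled or local" is an assumption.

```python
from typing import List, Mapping

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def deployment_warnings(env: Mapping[str, str]) -> List[str]:
    """Flag settings that the limitations above call out as risky."""
    warnings: List[str] = []
    host = env.get("SPARK_HOST", "")
    auth = env.get("SPARK_AUTH", "NONE").upper()
    is_local = host in LOCAL_HOSTS  # assumed to mean local dev or an SSH tunnel
    if auth == "LDAP" and not is_local:
        warnings.append(
            "LDAP credentials would cross the network unencrypted; use an SSH tunnel."
        )
    if auth == "NONE" and not is_local:
        warnings.append(
            "SPARK_AUTH is NONE for a remote host; set LDAP or KERBEROS for production."
        )
    return warnings
```

An empty list means neither condition was detected; it does not mean the deployment is secure, since query timeouts and per-user access control still depend on cluster-side configuration.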

License

MIT


Details

Published February 24, 2026
Version 0.1.2
0 installs
Local Plugin
