Registry
Navil's MCP package registry crawler discovers MCP servers from public package registries and feeds them into the security scanning pipeline. Know what's out there before your agents install it.
Overview
The registry crawler automatically discovers MCP server packages across multiple sources, building a comprehensive catalog that Navil uses for threat intelligence and security scanning. This powers the community blocklist and supply chain detection features.
Supported Sources
The crawler pulls from three public sources:
- awesome-mcp-servers: Community-curated GitHub README with links to MCP servers. Parsed via regex pattern matching on Markdown links.
- npm: npm registry keyword search for packages tagged mcp-server. Includes auto-generated config examples for each package.
- PyPI: Python Package Index search via the XML-RPC API, with automatic fallback to the Simple API if XML-RPC is unavailable.
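As a rough illustration of the regex-based parsing described for awesome-mcp-servers, the sketch below extracts `[name](url)` links from README text. The pattern, the `extract_links` helper, and the sample README lines are assumptions made for illustration; the crawler's actual regex may differ.

```python
import re

# Matches inline Markdown links of the form [text](http(s)://...).
# Assumed pattern for illustration only.
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def extract_links(markdown: str) -> list:
    """Return (name, url) pairs for every inline link in the text."""
    return [(m.group(1), m.group(2)) for m in LINK_RE.finditer(markdown)]


# Hypothetical README excerpt (not real entries).
sample = """
- [example-server](https://github.com/example/example-server) - Example MCP server
- [other-server](https://github.com/example/other-server) - Another example
"""

links = extract_links(sample)
for name, url in links:
    print(f"{name}: {url}")
```

A pattern like this only sees inline links; reference-style links or bare URLs in the README would need separate handling.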
Crawl Result Format
Each discovered MCP server is returned as a structured entry with source provenance:
{
  "server_name": "@modelcontextprotocol/server-filesystem",
  "source": "npm",
  "url": "https://www.npmjs.com/package/@modelcontextprotocol/server-filesystem",
  "description": "MCP server for filesystem operations",
  "config_example": {
    "mcpServers": {
      "@modelcontextprotocol/server-filesystem": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem"]
      }
    }
  }
}
Rate Limiting & Resilience
The crawler is designed to be a responsible citizen of public APIs:
- Per-domain rate limiting — Token bucket limiter (default: 2 requests/second per domain)
- Exponential back-off — Automatic retry with exponential delays on HTTP 429 and 5xx responses
- Concurrency control — Global semaphore limits concurrent requests (default: 10)
- Graceful fallbacks — If one source fails, the others continue. PyPI XML-RPC automatically falls back to the Simple API.
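The mechanisms listed above can be sketched in a few lines of Python. This is a minimal illustration, not Navil's implementation: the class and function names, the 0.5 s base delay, and the 30 s cap are all assumptions. Only the defaults of 2 requests/second and 10 concurrent requests come from the text.

```python
import asyncio
import time
from typing import Optional


class TokenBucket:
    """Token bucket that refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float = 2.0, capacity: float = 2.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic()

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Take one token if available; return False otherwise."""
        now = time.monotonic() if now is None else now
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def should_retry(status: int) -> bool:
    """Retry on HTTP 429 and on any 5xx response."""
    return status == 429 or 500 <= status < 600


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential delay for a 0-based retry attempt, capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


async def fetch_all(fetch, urls, max_concurrent: int = 10):
    """Run `fetch(url)` for every URL with at most `max_concurrent` in flight."""
    sem = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        async with sem:
            return await fetch(url)

    return await asyncio.gather(*(guarded(u) for u in urls))
```

In a per-domain setup, one `TokenBucket` would be kept per hostname (for example in a dict keyed by domain), while the semaphore stays global across all domains.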
Security Scanning
Crawled packages feed into Navil's security scanning pipeline. Use the navil scan command to analyze any MCP configuration for vulnerabilities:
# Scan your MCP config for known vulnerabilities
navil scan mcp_config.json
# Output in SARIF format for CI/CD integration
navil scan mcp_config.json --format sarif --output results.sarif
The scanner checks for hardcoded credentials, insecure protocols, known malicious patterns, and packages flagged by the community threat network.
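For CI/CD integration, a SARIF file can be summarized with a few lines of Python. The sketch below assumes the output follows the standard SARIF 2.1.0 layout (`runs[].results[]` with `level`, `ruleId`, and `message.text`); the `count_by_level` helper, the rule IDs, and the messages in the sample are made up for illustration and are not taken from Navil's real output.

```python
import json
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per severity level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Results without an explicit level are counted as "warning" here.
            counts[result.get("level", "warning")] += 1
    return counts


# Made-up sample in SARIF 2.1.0 shape; rule IDs and messages are illustrative.
sample = json.loads("""
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "example-scanner"}},
      "results": [
        {"ruleId": "EXAMPLE-001", "level": "error",
         "message": {"text": "Hardcoded credential in config"}},
        {"ruleId": "EXAMPLE-002",
         "message": {"text": "Server uses http:// instead of https://"}}
      ]
    }
  ]
}
""")

counts = count_by_level(sample)
print(dict(counts))
```

In a pipeline, the same function could be applied to `json.load(open("results.sarif"))`, failing the build when `counts["error"]` is greater than zero.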
Programmatic Usage
The crawler can be used programmatically in Python:
import asyncio

from navil.crawler.registry_crawler import crawl_registries

async def main():
    # Crawl all sources (crawl_registries is async, so it must be awaited)
    results = await crawl_registries()

    # Limit results
    results = await crawl_registries(limit=100)

    # Each result has: server_name, source, url, description
    for r in results:
        print(f"{r.source}: {r.server_name} - {r.url}")

asyncio.run(main())
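Once results are in hand, a common next step is grouping them by source. The sketch below uses a stand-in `CrawlResult` dataclass carrying only the four documented fields, so that it runs without Navil installed; the real result type, the `group_by_source` helper, and the PyPI sample entry are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CrawlResult:
    """Stand-in for a crawl result (hypothetical; documented fields only)."""
    server_name: str
    source: str
    url: str
    description: str


def group_by_source(results) -> dict:
    """Map each source name to the list of server names found there."""
    grouped = defaultdict(list)
    for r in results:
        grouped[r.source].append(r.server_name)
    return dict(grouped)


sample_results = [
    CrawlResult(
        "@modelcontextprotocol/server-filesystem",
        "npm",
        "https://www.npmjs.com/package/@modelcontextprotocol/server-filesystem",
        "MCP server for filesystem operations",
    ),
    # Hypothetical PyPI entry, for illustration only.
    CrawlResult("example-mcp", "pypi", "https://pypi.org/project/example-mcp/", "Example"),
]

by_source = group_by_source(sample_results)
for source, names in by_source.items():
    print(f"{source}: {len(names)} server(s)")
```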