The Model Context Protocol (MCP) is quickly becoming the standard interface for connecting AI assistants to external tools and data sources. At Wiv, we build workflow automation infrastructure for operations teams, so we wanted our users to invoke Wiv workflows directly from AI assistants like Cursor, VS Code Copilot, Claude, and ChatGPT, without copy-pasting IDs or breaking their flow.

This post walks through how we built and deployed a hosted MCP server for the Wiv platform on AWS App Runner: full OAuth 2.1, dual transport (Streamable HTTP and SSE), and the design decisions we made along the way.

Why MCP, and Why Hosted?

MCP defines a simple protocol: an AI client connects to a server that exposes tools, i.e. callable functions with JSON schemas. The client discovers tools, calls them with arguments, and receives structured responses.
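Concretely, the wire format is JSON-RPC 2.0. A rough sketch of the discovery and call messages (the tool name `wiv_list_workflows` is one of the tools described later in this post; the response body here is illustrative):

```python
import json

# Discovery: ask the server what tools it exposes.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Invocation: call one tool by name with JSON arguments.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "wiv_list_workflows", "arguments": {}},
}

# A typical structured response: a list of content blocks the client renders.
call_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {"content": [{"type": "text", "text": "[...workflows...]"}]},
}

print(json.dumps(call_request))
```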

Most early MCP servers are local: they run as a subprocess on the developer’s machine over stdio. That works fine for personal tooling, but breaks down for a SaaS product:

  • Every user must install and configure the server themselves
  • Local processes can’t share authentication context with your cloud
  • You can’t push updates without users reinstalling
  • Local servers don’t support multi-tenant auth

A hosted server solves all of this. One deployment, zero client-side installation, full control over auth. Users paste a URL or click a link.

We support two HTTP transports:

  • Streamable HTTP (/mcp): the primary transport. Stateless request/response over a single endpoint. No persistent connections, faster tool discovery, and friendly to load balancers and auto-scaling.
  • SSE (/sse): the legacy transport for backward compatibility. A persistent Server-Sent Events connection with a separate POST endpoint for messages.

Streamable HTTP is the default for all new clients. It reduces round-trips for tool discovery from multiple SSE handshakes to a single HTTP request, which matters when your server runs behind AWS App Runner with TLS termination at the edge.
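Under this transport, tool discovery is one POST to `/mcp`. A stdlib sketch of what a client sends (the request is constructed but not sent here; the bearer token is a placeholder):

```python
import json
import urllib.request

# One round-trip: discovery is a single JSON-RPC POST over Streamable HTTP.
body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}).encode()
req = urllib.request.Request(
    "https://mcp.wiv.ai/mcp",
    data=body,
    headers={
        "Content-Type": "application/json",
        "Accept": "application/json",
        "Authorization": "Bearer <access-token>",  # placeholder credential
    },
    method="POST",
)
print(req.get_method(), req.full_url)
```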

Infrastructure: AWS App Runner

We evaluated Lambda (cold-start latency is bad for SSE), ECS Fargate (operational overhead), and EC2 (too much to manage). App Runner won for simplicity: give it a container image, and it handles scaling, HTTPS, and health checks automatically.

CDK Stack

The core stack builds a Docker image from the server source, pushes it to ECR, and creates an App Runner service:

```python
image_asset = ecr_assets.DockerImageAsset(
    self, "McpServerImage",
    directory=src_dir,
    platform=ecr_assets.Platform.LINUX_AMD64,
)

service = apprunner.Service(
    self, "McpService",
    source=apprunner.Source.from_asset(
        image_configuration=apprunner.ImageConfiguration(port=8080),
        asset=image_asset,
    ),
    service_name=f"wiv-mcp-server-{env_name}",
    cpu=apprunner.Cpu.ONE_VCPU,
    memory=apprunner.Memory.TWO_GB,
    auto_deployments_enabled=False,
)
```

App Runner provisions HTTPS automatically on the default .awsapprunner.com domain. We associate a custom domain (mcp.wiv.ai) via the console. All secrets are injected as environment variables at deploy time — nothing is baked into the image.

💡  We set auto_deployments_enabled=False so deployments are intentional, not triggered by every ECR push. We deploy explicitly via CI on a tagged release.

Streamable HTTP and App Runner Scaling

SSE holds a persistent connection for the entire session: as long as Cursor is open, the connection stays alive and the App Runner instance can’t scale to zero. Streamable HTTP makes short-lived requests only when tools are discovered or called. Between requests, there’s no connection at all, and App Runner can scale down freely.

Server Architecture

The server is a single Python file, wiv_api_server.py, built on the MCP SDK, Starlette, and httpx. It acts as both an MCP server and its own OAuth 2.1 Authorization Server, proxying authentication to PropelAuth.

Building a Hosted MCP Server on AWS App Runner

HTTP Endpoints

| Endpoint | Purpose |
| --- | --- |
| /mcp | Primary MCP endpoint (Streamable HTTP) |
| /sse | Legacy MCP endpoint (SSE) |
| /connect | OAuth sign-in landing page + client setup guide |
| /authorize | OAuth authorization (redirects to PropelAuth) |
| /oauth/callback | Receives auth code, exchanges for API key |
| /token | Token endpoint (exchanges auth code for access token) |
| /register | Dynamic Client Registration (RFC 7591) |
| /.well-known/oauth-authorization-server | OAuth AS discovery metadata |

Dual Transport Layer

The MCP Python SDK supports both transports natively. SSE uses SseServerTransport with a persistent GET connection and a /messages/ POST endpoint. Streamable HTTP uses StreamableHTTPSessionManager in stateless mode: each request is independent, with no session state.

```python
from mcp.server.streamable_http_manager import StreamableHTTPSessionManager

_session_manager = StreamableHTTPSessionManager(
    app=server,
    event_store=None,
    json_response=True,
    stateless=True,
)
```

One tricky problem: Starlette’s Mount directive for /mcp caused a 307 Temporary Redirect to /mcp/, which silently dropped the POST method and Authorization header. The fix was to handle /mcp requests directly in an ASGI middleware wrapper, bypassing Starlette’s routing for that path entirely:

```python
async def api_key_asgi_wrapper(scope, receive, send):
    path = scope.get("path", "")
    if _session_manager and path in ("/mcp", "/mcp/"):
        resolved = await _resolve_token_from_scope(scope)
        if not resolved:
            resp = JSONResponse({"error": "unauthorized"}, status_code=401,
                                headers={"WWW-Authenticate": "Bearer"})
            await resp(scope, receive, send)
            return
        token = _connection_api_key.set(resolved)
        try:
            await _session_manager.handle_request(scope, receive, send)
        finally:
            _connection_api_key.reset(token)
        return
    # Standard Starlette routing for everything else
    await sse_app(scope, receive, send)
```

Context Variables for Per-Connection Auth

Both transports share the same challenge: how do you pass per-connection auth (an API key extracted from HTTP headers) into a tool handler that has no direct access to the request?

The answer is Python’s contextvars.ContextVar:

```python
_connection_api_key: ContextVar[Optional[str]] = ContextVar(
    "connection_api_key", default=None
)
```

For SSE, the key is set when the connection opens and held for its lifetime. For Streamable HTTP, it’s set per-request and reset afterward. Context variables are async-safe: each task tree gets its own context, so there’s no leakage between concurrent connections.

Authentication: Acting as Your Own OAuth 2.1 AS

This was the most complex part of the build — and it evolved significantly from the first approach.

The Problem

Wiv uses PropelAuth for identity. PropelAuth supports OAuth 2.1 with PKCE, but its flow returns opaque tokens (96-character hex strings), not JWTs. Wiv’s backend validates requests using propelauth_py, which expects either JWTs (for user sessions) or API keys (for programmatic access). Opaque tokens don’t fit either path.

The initial plan was to have MCP clients authenticate directly with PropelAuth. But modern clients like Cursor expect the MCP server itself to be the OAuth Authorization Server — they discover it via /.well-known/oauth-authorization-server at the server’s own URL, then call /register, /authorize, and /token there.

The Solution: Wiv MCP as Its Own OAuth AS

We implemented the full OAuth 2.1 AS surface on the MCP server, proxying actual user authentication to PropelAuth behind the scenes. The high-level flow:

  • Client registers via POST /register and receives a client_id + client_secret
  • Client sends GET /authorize with a PKCE code_challenge
  • Server redirects the user to PropelAuth for sign-in (with its own internal PKCE chain)
  • PropelAuth calls back to /oauth/callback with an auth code
  • Server exchanges the code for an opaque token, introspects it to get the user identity, then creates a Wiv API key via PropelAuth’s end-user API key endpoint
  • Server redirects the client back with a short-lived auth code
  • Client exchanges the code at POST /token and receives the Wiv API key as the access token

From the client’s perspective, it completed a standard OAuth 2.1 PKCE flow. The PropelAuth proxying is entirely invisible.

Dual PKCE Chains

The server manages two independent PKCE chains simultaneously:

  • Client → Wiv MCP: The client sends a code_challenge with /authorize, and proves knowledge of the original code_verifier at /token time.
  • Wiv MCP → PropelAuth: The server generates its own code_verifier / code_challenge pair for the PropelAuth leg. This is fully internal and invisible to the client.
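Both chains use the same RFC 7636 mechanics. A minimal sketch of one verifier/challenge pair and the check the AS performs at /token time (helper names are illustrative):

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple:
    """Generate a code_verifier and its S256 code_challenge (RFC 7636)."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

def verify(verifier: str, challenge: str) -> bool:
    """At /token: recompute the S256 challenge and compare."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode() == challenge

verifier, challenge = make_pkce_pair()
print(verify(verifier, challenge))
```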

The client’s code_challenge travels through PropelAuth’s callback in a base64-encoded state parameter, so it’s still available when the server issues its own auth code.
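A sketch of that state round-trip, assuming a simple base64-encoded JSON payload (field names and the redirect URI are illustrative, not Wiv’s actual schema; in production the state should also be signed or otherwise tamper-protected):

```python
import base64
import json

def pack_state(client_challenge: str, redirect_uri: str) -> str:
    # Carry the client's PKCE challenge through PropelAuth's callback.
    payload = {"cc": client_challenge, "ru": redirect_uri}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()

def unpack_state(state: str) -> dict:
    # At /oauth/callback: recover the client's challenge from the state.
    return json.loads(base64.urlsafe_b64decode(state.encode()))

state = pack_state("client-challenge-abc", "https://client.example/callback")
print(unpack_state(state)["cc"])
```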

Token Pre-Caching

When /token issues an API key, it immediately pre-caches it in an in-memory map. This means the first /mcp request after authentication resolves the key instantly — no round-trip to PropelAuth’s introspect endpoint:

```python
api_key = entry["api_key"]
token_hash = hashlib.sha256(api_key.encode()).hexdigest()[:32]
_token_to_api_key[token_hash] = {
    "api_key": api_key,
    "expires_at": time.time() + TOKEN_CACHE_TTL,
}
```

Multi-Org Handling

Wiv supports multi-tenant organizations, and a user may belong to multiple orgs. Organization selection is handled through the IdP integration, so the MCP server itself doesn’t need an org picker.

MCP Tools: Wrapping the Wiv API

The server exposes 26 tools, each mapping to a Wiv REST API endpoint. Tool definitions are declarative: a name, description, and JSON Schema for inputs. A routing table dispatches calls to the correct API request:

```python
tool_mapping = {
    "wiv_list_workflows":       ("GET",  "/workflows",                   None,            None),
    "wiv_start_execution":      ("POST", "/workflows/{workflow_id}/run", "trigger_event", None),
    "wiv_get_execution_status": ("GET",  "/monitoring/workflows/{workflow_id}/executions/{execution_id}/status", None, None),
    "wiv_search_cases":         ("POST", "/v2/cases/search",             "all",           None),
    # ... 22 more
}
```

Path parameters like {workflow_id} are substituted from the tool arguments. Query parameters are extracted for GET requests; the body is built based on whether the tool passes all arguments or a specific field.
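The substitution and dispatch step can be sketched with stdlib string formatting (the mapping rows come from the table above; the dispatch logic is simplified and the helper name is illustrative):

```python
import string

tool_mapping = {
    "wiv_list_workflows":  ("GET",  "/workflows", None, None),
    "wiv_start_execution": ("POST", "/workflows/{workflow_id}/run", "trigger_event", None),
}

def build_request(tool: str, arguments: dict):
    method, path_template, body_field, _ = tool_mapping[tool]
    # Find {workflow_id}-style placeholders and fill them from the arguments.
    path_params = {name for _, name, _, _ in string.Formatter().parse(path_template) if name}
    path = path_template.format(**{k: arguments[k] for k in path_params})
    leftovers = {k: v for k, v in arguments.items() if k not in path_params}
    if method == "GET":
        return method, path, leftovers, None  # leftovers become query params
    # POST: send everything, or just the named field, as the body.
    body = leftovers if body_field == "all" else {body_field: leftovers.get(body_field)}
    return method, path, None, body

print(build_request("wiv_start_execution",
                    {"workflow_id": "wf-1", "trigger_event": {"source": "mcp"}}))
```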

Tool categories:

  • Workflow management: list, get, update workflows and folders
  • Execution control: start, stop, and monitor executions
  • Cases: search, filter, summarize, and query (including MSP/multi-tenant variants)
  • AI workflow generation: generate workflows from natural language prompts
  • Spaces: list and inspect Wiv spaces and their resources

Every tool call resolves the API key from two sources in priority order: an explicit _wiv_api_key argument in the call, or the context variable set at connection time. If the resolved key looks like a JWT (three dot-separated segments), it’s sent as a Bearer token. Otherwise it goes as X-API-Key, making the server compatible with both auth styles.
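The shape check described above fits in a few lines (a sketch; the function name is illustrative):

```python
def auth_headers(credential: str) -> dict:
    # A JWT has three non-empty dot-separated segments
    # (header.payload.signature); anything else is treated as an API key.
    if credential.count(".") == 2 and all(credential.split(".")):
        return {"Authorization": f"Bearer {credential}"}
    return {"X-API-Key": credential}

print(auth_headers("aaa.bbb.ccc"))
print(auth_headers("opaque-key-123"))
```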

The /connect Page: Multi-Client Onboarding

Not all MCP clients support native OAuth. The /connect page serves as both the OAuth callback landing and a universal setup guide. After sign-in, it renders collapsible cards for seven clients:

| Client | Auth Methods | Transport |
| --- | --- | --- |
| Cursor | OAuth (native) or API Key | Streamable HTTP |
| Claude Desktop | OAuth (native) or API Key via mcp-remote | Streamable HTTP |
| Claude.ai | OAuth (native) | Streamable HTTP |
| VS Code | API Key | Streamable HTTP |
| ChatGPT | OAuth (native) | Streamable HTTP |
| Gemini CLI | API Key via mcp-remote | Streamable HTTP |
| Windsurf | API Key | Streamable HTTP |

Each card shows a badge indicating OAuth, API Key, or both, and provides ready-to-paste JSON configuration. For OAuth-native clients, the config is minimal:

```json
{
  "mcpServers": {
    "wiv": {
      "url": "https://mcp.wiv.ai/mcp"
    }
  }
}
```

Cursor and VS Code also get one-click deeplinks that auto-install the MCP server. The deeplink encodes a base64 config object:

```javascript
var config = { url: mcpUrl };
var configB64 = btoa(JSON.stringify(config));
var cursorLink = 'cursor://anysphere.cursor-deeplink/mcp/install?name=wiv&config='
    + encodeURIComponent(configB64);
```

💡  For the API key flow, the key is passed only as a URL fragment (#api_key=…), which browsers never send to the server. It lives only in client-side JavaScript and is embedded in the headers config, never in a query string or path parameter.
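A quick stdlib illustration of why the fragment is safe from server logs: splitting the URL the way a browser does shows that only the path and query form the HTTP request, while the fragment stays client-side.

```python
from urllib.parse import urlsplit

# Everything after '#' never leaves the browser.
url = "https://mcp.wiv.ai/connect#api_key=SECRET"
parts = urlsplit(url)

request_line = parts.path + (("?" + parts.query) if parts.query else "")
print(request_line)    # what the server (and its logs) would see
print(parts.fragment)  # what only client-side JavaScript can read
```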


Lessons Learned

1. Be your own OAuth Authorization Server

Modern MCP clients expect to discover OAuth endpoints at the MCP server’s own URL. They call /register, /authorize, and /token there. If your identity provider doesn’t match what the client expects, you need to act as the AS yourself and proxy authentication behind the scenes.

2. Prefer Streamable HTTP for hosted deployments

SSE holds persistent connections, which keeps App Runner instances warm and blocks scale-to-zero. Streamable HTTP is stateless: clients make on-demand requests. This aligns much better with auto-scaling infrastructure and cuts tool discovery latency to a single HTTP round-trip.

3. Use ASGI middleware to avoid framework routing quirks

Starlette’s Mount directive adds trailing-slash redirects (/mcp → /mcp/) that silently drop the HTTP method and auth headers. Handling /mcp directly in ASGI middleware, before Starlette’s router, solved this without fighting the framework.

4. Opaque tokens are not JWTs

PropelAuth’s OAuth flow issues opaque tokens, not JWTs. validate_access_token_and_get_user() only handles JWTs. Token introspection followed by API key creation adds a couple of round trips but gives users durable credentials and fits cleanly into the existing auth model.

5. Pre-cache tokens at exchange time

The /token endpoint already has the API key when it creates it. Caching immediately eliminates the introspect round-trip on the first /mcp request, removing a noticeable delay when tools first load in the client.

6. Use URL fragments for sensitive data

API keys should never appear in server logs. A URL fragment (#api_key=…) is handled entirely client-side; browsers don’t send fragments to servers. This is the same technique used by OAuth implicit flows.

7. Use context variables, not global state

contextvars.ContextVar is the right pattern for per-connection state in async Python. Global dictionaries keyed by connection ID are error-prone and leak between concurrent requests. Context variables are async-safe and scoped precisely to the task tree.

8. Path-aware well-known discovery

Cursor uses path-aware discovery: when connecting to https://mcp.wiv.ai/mcp, it fetches /.well-known/oauth-protected-resource/mcp, not just /.well-known/oauth-protected-resource. The resource URL in the response must match the actual transport path. Register both the root and path-suffixed endpoints.
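A sketch of serving both discovery documents, assuming the RFC 9728 protected-resource metadata shape (function names are illustrative, not from the Wiv codebase):

```python
def protected_resource_metadata_paths(transport_path: str) -> list:
    base = "/.well-known/oauth-protected-resource"
    # Serve both: the root document (older clients) and the
    # path-suffixed one that Cursor fetches for /mcp connections.
    return [base, base + transport_path]

def metadata_for(origin: str, transport_path: str) -> dict:
    # The advertised resource must match the actual transport URL.
    return {
        "resource": origin + transport_path,
        "authorization_servers": [origin],
    }

print(protected_resource_metadata_paths("/mcp"))
print(metadata_for("https://mcp.wiv.ai", "/mcp")["resource"])
```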

9. Always CNAME to App Runner. Never an A record

App Runner terminates TLS at its load balancer. If you point an A record directly at the App Runner IP, you bypass TLS termination and end up serving plain HTTP. Always use a CNAME to the App Runner default domain.

Summary

Building a hosted MCP server for Wiv took us from simple stdio tooling to a full production deployment with native OAuth 2.1, dual transport, and multi-client support. The key pieces:

  • AWS App Runner: zero-ops container hosting with automatic HTTPS
  • Streamable HTTP: stateless primary transport, scaling-friendly
  • SSE: fallback transport for backward compatibility
  • Wiv MCP as its own OAuth 2.1 AS: proxying authentication to PropelAuth
  • Dynamic Client Registration (RFC 7591): so clients like Cursor can self-register
  • Dual PKCE chains: one between client and Wiv MCP, one between Wiv MCP and PropelAuth
  • Token introspection + API key exchange: bridging opaque OAuth tokens to Wiv’s API key auth
  • Token pre-caching: instant first-request resolution
  • contextvars: safe per-connection state in async Python
  • ASGI middleware: transport routing without framework conflicts
  • Multi-client /connect page: OAuth and API key options for seven clients

The result: any Wiv user can connect their AI assistant in under 30 seconds, with native OAuth in Cursor, Claude, and ChatGPT, and simple API key configs for everything else.