Description
Have you searched for this feature request?
- I searched but did not find similar requests
Problem Statement
Today the MCP server requires every Spark server to be configured in advance in config.yaml.
This is unusable for teams that run dozens or hundreds of static or ephemeral EMR clusters.
Possible Solution
The MCP tools should accept a cluster ID, name, or ARN dynamically from the agent:
- If it is an ARN, create the client with the same logic.
- If it is an ID, get the cluster ARN by ID using the EMR API, then create the client with the same logic.
- If it is a name, find the cluster ARN by name using the EMR API, then create the client with the same logic.
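To make the three cases concrete, here is a rough sketch of how the resolution could look. It is not the draft PR's code. The function names and the regexes are my own assumptions, and `emr_client` is assumed to behave like boto3's EMR client (`describe_cluster`, `list_clusters` with `Marker` pagination). Whether name lookup should only consider active clusters is also an assumption.

```python
import re
from typing import Any, List, Literal

# Assumed shapes: IDs look like "j-XXXX"; ARNs end in "cluster/<id>".
_CLUSTER_ID_RE = re.compile(r"^j-[A-Z0-9]+$")
_CLUSTER_ARN_RE = re.compile(
    r"^arn:aws[a-z-]*:elasticmapreduce:[a-z0-9-]+:\d{12}:cluster/j-[A-Z0-9]+$"
)
# Assumption: only clusters that are still alive are matched by name.
_ACTIVE_STATES = ["STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING"]


def classify_cluster_ref(ref: str) -> Literal["arn", "id", "name"]:
    """Decide whether the agent passed an ARN, a cluster ID, or a name."""
    ref = ref.strip()
    if _CLUSTER_ARN_RE.match(ref):
        return "arn"
    if _CLUSTER_ID_RE.match(ref):
        return "id"
    return "name"


def resolve_cluster_arn(ref: str, emr_client: Any) -> str:
    """Turn an ARN, ID, or name into a cluster ARN (hypothetical helper)."""
    ref = ref.strip()
    kind = classify_cluster_ref(ref)
    if kind == "arn":
        return ref
    if kind == "id":
        return emr_client.describe_cluster(ClusterId=ref)["Cluster"]["ClusterArn"]

    # Name: page through list_clusters and collect exact name matches.
    matches: List[str] = []
    kwargs = {"ClusterStates": _ACTIVE_STATES}
    while True:
        page = emr_client.list_clusters(**kwargs)
        for cluster in page.get("Clusters", []):
            if cluster["Name"] == ref:
                matches.append(cluster["ClusterArn"])
        marker = page.get("Marker")
        if not marker:
            break
        kwargs["Marker"] = marker

    if len(matches) != 1:
        raise LookupError(f"expected exactly one cluster named {ref!r}, found {len(matches)}")
    return matches[0]
```

Raising on zero or multiple name matches is one possible policy; picking the most recently created cluster would be another.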
Caching:
- Global cache by ID/ARN.
- Session-scoped cache by name (since a different EMR cluster can reuse a terminated cluster's name).
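A minimal sketch of the two cache scopes, under my own assumptions: `ClusterClientCache`, its method names, and the three injected callables are hypothetical, and how a "session" is identified and ended depends on the MCP server framework.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")  # whatever client type the server builds per cluster


class ClusterClientCache(Generic[T]):
    """Global cache by ID/ARN, session-scoped cache by name (sketch)."""

    def __init__(
        self,
        build_client: Callable[[str], T],    # ARN -> client
        resolve_id: Callable[[str], str],    # cluster ID -> ARN (e.g. EMR API)
        resolve_name: Callable[[str], str],  # cluster name -> ARN (e.g. EMR API)
    ) -> None:
        self._build_client = build_client
        self._resolve_id = resolve_id
        self._resolve_name = resolve_name
        self._clients: Dict[str, T] = {}      # global: ARN -> client
        self._id_to_arn: Dict[str, str] = {}  # global: ID -> ARN
        # Per session: name -> ARN. Names can be reused by a new cluster,
        # so this mapping is never shared across sessions.
        self._session_names: Dict[str, Dict[str, str]] = {}

    def get_by_arn(self, arn: str) -> T:
        if arn not in self._clients:
            self._clients[arn] = self._build_client(arn)
        return self._clients[arn]

    def get_by_id(self, cluster_id: str) -> T:
        if cluster_id not in self._id_to_arn:
            self._id_to_arn[cluster_id] = self._resolve_id(cluster_id)
        return self.get_by_arn(self._id_to_arn[cluster_id])

    def get_by_name(self, session_id: str, name: str) -> T:
        names = self._session_names.setdefault(session_id, {})
        if name not in names:
            names[name] = self._resolve_name(name)
        return self.get_by_arn(names[name])

    def end_session(self, session_id: str) -> None:
        """Forget the name mappings of a finished session."""
        self._session_names.pop(session_id, None)
```

With this layout a name is re-resolved once per session, while the client built for a given ARN is still shared globally.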
The current static servers configuration options should be kept. To use the new feature, set `dynamic_emr_clusters_mode=true` in the configuration or environment.
When `dynamic_emr_clusters_mode=true`, `server` cannot be specified (the two modes are mutually exclusive).
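A sketch of how the flag and the mutual exclusion could be checked. The environment variable name `DYNAMIC_EMR_CLUSTERS_MODE`, the precedence of the environment over the config file, and both function names are assumptions for illustration, not existing behavior.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "DYNAMIC_EMR_CLUSTERS_MODE"  # hypothetical env var name
_TRUTHY = {"1", "true", "yes", "on"}


def dynamic_mode_enabled(
    config: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Read dynamic_emr_clusters_mode from env first, then from config."""
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    if raw is None:
        raw = config.get("dynamic_emr_clusters_mode", False)
    if isinstance(raw, bool):
        return raw
    return str(raw).strip().lower() in _TRUTHY


def check_mode(dynamic_mode: bool, server: Optional[str]) -> None:
    """Reject a `server` value when dynamic mode is on."""
    if dynamic_mode and server is not None:
        raise ValueError(
            "server cannot be specified when dynamic_emr_clusters_mode=true"
        )
```

Failing fast here keeps the two modes from being mixed silently.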
Example prompts:
- By name: use spark mcp to understand how long did application_1711941627784_93864 take on cluster in-site-graviton-prod
- By ID: use spark mcp to understand how long did application_1711941627784_93864 take on cluster j-17MUJH7WF1HKH
- By ARN: use spark mcp to understand how long did application_1711941627784_93864 take on cluster with ARN arn:aws:elasticmapreduce:us-east-1:135511037392:cluster/j-I4VIWMNGOIP7
* I have a draft PR for this that I will open soon
Alternatives Considered
No response