Skip to content

Commit 4faffd1

Browse files
authored
fix: built api route image (#108)
* fix: built api route image * fix: code clean up * fix: fixed stream output log level * fix: code cleanup
1 parent 112cecb commit 4faffd1

File tree

6 files changed

+157
-84
lines changed

6 files changed

+157
-84
lines changed

pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@ python = "^3.9"
2222
typer = {extras = ["all"], version = "^0.13.0"}
2323
rich = "^13.9.4"
2424
boto3 = "^1.35.0"
25-
questionary = "^2.0.1"
25+
questionary = "^2.1.0"
2626
requests = "^2.26"
2727
pydantic = "^2.9.2"
2828
sagemaker = "^2.237.0"

src/emd/cfn/ecs/template.yaml

Lines changed: 39 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -18,8 +18,10 @@ Parameters:
1818
Description: The listener to be used for the ECS Endpoint
1919
PublicLoadBalancerSecurityGroup:
2020
Type: String
21-
Default: "sg-0d059e1ba522921fe"
2221
Description: The security group to be used for the ECS Endpoint
22+
APIRouterSecurityGroup:
23+
Type: String
24+
Description: The security group to be used for the API Router
2325
LambdaDeploymentHelperArn:
2426
Type: String
2527
Description: The ARN of the Lambda function for capacity provider association
@@ -35,23 +37,19 @@ Parameters:
3537
Description: The instance type to be used for the ECS Endpoint
3638
ModelId:
3739
Type: String
38-
Default: "qwen2-5-7b"
3940
Description: The emd model ID to be used for the ECS Endpoint
4041
ModelTag:
4142
Type: String
4243
Default: "dev"
4344
Description: The model tag to be used for the ECS Endpoint
4445
FrameWorkType:
4546
Type: String
46-
Default: "awq"
4747
Description: The framework type to be used for the ECS Endpoint
4848
ServiceType:
4949
Type: String
50-
Default: "instruct"
5150
Description: The service type to be used for the ECS Endpoint
5251
EngineType:
5352
Type: String
54-
Default: "qwen2-5-72b"
5553
Description: The engine type to be used for the ECS Endpoint
5654
DesiredCapacity:
5755
Type: Number
@@ -171,6 +169,12 @@ Resources:
171169
Type: Custom::GetPriorityNumber
172170
Properties:
173171
ServiceToken: !Ref LambdaDeploymentHelperArn
172+
ConvertDnsName:
173+
Type: Custom::ConvertDnsName
174+
Properties:
175+
ServiceToken: !Ref LambdaDeploymentHelperArn
176+
ModelName: !Join ['', [!Ref ModelId, '/', !Ref ModelTag]]
177+
174178
ContainerHostSecurityGroup:
175179
Type: AWS::EC2::SecurityGroup
176180
Properties:
@@ -249,6 +253,8 @@ Resources:
249253
PortMappings:
250254
- ContainerPort: !Ref ContainerPort
251255
HostPort: !Ref ContainerPort
256+
Name: model-port
257+
AppProtocol: http
252258
- ContainerPort: 80
253259
HostPort: 80
254260
# Command: [--gpus, all]
@@ -265,7 +271,7 @@ Resources:
265271
interval: 120
266272
retries: 10
267273
timeout: 60
268-
startPeriod: 120
274+
startPeriod: 300
269275
LogConfiguration:
270276
LogDriver: awslogs
271277
Options:
@@ -310,11 +316,31 @@ Resources:
310316
- ContainerName: !Sub '${AWS::StackName}'
311317
ContainerPort: !Ref ContainerPort
312318
TargetGroupArn: !Ref ServiceTargetGroup
319+
ServiceConnectConfiguration:
320+
Enabled: true
321+
Namespace: emd-service-connect-namespace
322+
Services:
323+
- PortName: model-port
324+
DiscoveryName: model-service
325+
ClientAliases:
326+
- Port: !Ref ContainerPort
327+
DnsName: !GetAtt ConvertDnsName.DnsName
313328
ServiceSecurityGroup:
314329
Type: 'AWS::EC2::SecurityGroup'
315330
Properties:
316331
GroupDescription: Security group for service
317332
VpcId: !Ref VPCID
333+
334+
ServiceSecurityGroupIngress:
335+
Type: AWS::EC2::SecurityGroupIngress
336+
Properties:
337+
GroupId: !GetAtt ServiceSecurityGroup.GroupId
338+
IpProtocol: tcp
339+
FromPort: !Ref ContainerPort
340+
ToPort: !Ref ContainerPort
341+
SourceSecurityGroupId: !Ref APIRouterSecurityGroup
342+
Description: Allow traffic from API router service to model service
343+
318344
ServiceTargetGroup:
319345
Type: 'AWS::ElasticLoadBalancingV2::TargetGroup'
320346
Properties:
@@ -362,8 +388,11 @@ Resources:
362388

363389
Outputs:
364390
Model:
365-
Description: Model ID used to generate the response
391+
Description: Model ID used to generate the response.
366392
Value: !Join ['', [!Ref ModelId, '/', !Ref ModelTag]]
367-
PublicLoadBalancerDNSName:
368-
Description: The DNS name of the public load balancer. To use HTTPS, create an SSL certificate in AWS Certificate Manager and attach it to the load balancer.
369-
Value: !Join ['', ['http://', !Ref DNSName, '/', !Ref ModelId, '/', !Ref ModelTag]]
393+
BaseURL:
394+
Description: Please use this URL for the OpenAI-compatible model API, like /v1/chat/completions. To use HTTPS, create an SSL certificate in AWS Certificate Manager and attach it to the load balancer.
395+
Value: !Join ['', ['http://', !Ref DNSName]]
396+
ECSServiceConnect:
397+
Description: Service Connect allows for service-to-service communications with automatic discovery using short names and standard ports.
398+
Value: !Join ['', ['http://', !GetAtt ConvertDnsName.DnsName, ':', !Ref ContainerPort]]

src/emd/cfn/sagemaker_realtime/template.yaml

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,9 @@ Parameters:
3838
Type: String
3939
Description: The name of the SageMaker Endpoint
4040
Default: "Auto-generate"
41+
DNSName:
42+
Type: String
43+
Description: The DNS name of the public load balancer. To use HTTPS, create an SSL certificate in AWS Certificate Manager and attach it to the load balancer.
4144

4245
Conditions:
4346
UseDefaultEndpointName: !Equals [!Ref SageMakerEndpointName, "Auto-generate"]
@@ -149,8 +152,11 @@ Resources:
149152

150153
Outputs:
151154
Model:
152-
Description: Model ID used to generate the response
155+
Description: Model ID used to generate the response.
153156
Value: !Join ['', [!Ref ModelId, '/', !Ref ModelTag]]
157+
BaseURL:
158+
Description: Please use this URL for the OpenAI-compatible model API, like /v1/chat/completions. To use HTTPS, create an SSL certificate in AWS Certificate Manager and attach it to the load balancer.
159+
Value: !Join ['', ['http://', !Ref DNSName]]
154160
SageMakerEndpointName:
155-
Description: The name of the SageMaker Endpoint
161+
Description: You can use invoke the endpoint by using the SageMaker runtime API.
156162
Value: !GetAtt SageMakerEndpoint.EndpointName

src/emd/cfn/shared/ecs_cluster.py

Lines changed: 3 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -170,7 +170,7 @@ def deploy_vpc_template(region):
170170
return vpc_id, subnets
171171

172172

173-
def deploy_ecs_cluster_template(region, vpc_id, subnets, api_router_uri, use_spot):
173+
def deploy_ecs_cluster_template(region, vpc_id, subnets, use_spot):
174174
client = boto3.client("cloudformation", region_name=region)
175175
stack_name = "EMD-ECS-Cluster"
176176
template_path = f"{CFN_ROOT_PATH}/shared/ecs_cluster.yaml"
@@ -187,10 +187,6 @@ def deploy_ecs_cluster_template(region, vpc_id, subnets, api_router_uri, use_spo
187187
"ParameterKey": "Subnets",
188188
"ParameterValue": subnets,
189189
},
190-
{
191-
"ParameterKey": "APIRouterImageURI",
192-
"ParameterValue": api_router_uri,
193-
},
194190
{
195191
"ParameterKey": "UseSpot",
196192
"ParameterValue": "yes" if use_spot else "no",
@@ -222,10 +218,10 @@ def deploy_ecs_cluster(region, vpc_id=None, subnets=None, use_spot=False):
222218
update_parameters_file("parameters.json", {"VPCID": vpc_id, "Subnets": subnets})
223219

224220
# Build and push Fargate image to ECR as the OpenAI compatible API router
225-
api_router_uri = build_router_image(region)
221+
# api_router_uri = build_router_image(region)
226222

227223
# Deploy the ECS cluster
228-
deploy_ecs_cluster_template(region, vpc_id, subnets, api_router_uri, use_spot)
224+
deploy_ecs_cluster_template(region, vpc_id, subnets, use_spot)
229225

230226
if __name__ == "__main__":
231227
deploy_ecs_cluster("us-east-1")

src/emd/cfn/shared/ecs_cluster.yaml

Lines changed: 29 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@ Parameters:
1010
APIRouterImageURI:
1111
Type: String
1212
Description: The URI of OpenAI compatbile API router image. If provided, the router will be deployed.
13-
Default: ""
13+
Default: "public.ecr.aws/aws-gcr-solutions/dmaa/api-router:latest"
1414
UseSpot:
1515
Type: String
1616
Description: Use Fargate Spot capacity?
@@ -28,9 +28,11 @@ Resources:
2828
ECSCluster:
2929
Type: AWS::ECS::Cluster
3030
Properties:
31+
ServiceConnectDefaults:
32+
Namespace: emd-service-connect-namespace
3133
ClusterSettings:
3234
- Name: containerInsights
33-
Value: enabled
35+
Value: disabled
3436
PublicLoadBalancerSecurityGroup:
3537
Type: AWS::EC2::SecurityGroup
3638
Properties:
@@ -219,6 +221,25 @@ Resources:
219221
# Send response back to CloudFormation
220222
cfnresponse.send(event, context, response_status, response_data)
221223
224+
def convert_dns_name(event, context):
225+
"""
226+
Converts a DNS name to comply with AWS Service Connect naming rules.
227+
DNS names can only contain lowercase letters and numbers.
228+
"""
229+
import re
230+
231+
# Get the input DNS name from the event
232+
dns_name = event['ResourceProperties'].get('ModelName', '')
233+
234+
# Step 1: Convert to lowercase
235+
dns_name = dns_name.lower()
236+
237+
# Step 2: Replace any character that is not a lowercase letter or number with a hyphen
238+
dns_name = re.sub(r'[^a-z0-9]', '-', dns_name)
239+
240+
response_data = {'DnsName': dns_name}
241+
cfnresponse.send(event, context, cfnresponse.SUCCESS, response_data)
242+
222243
def handler(event, context):
223244
print(event)
224245
print(context)
@@ -371,6 +392,9 @@ Resources:
371392
Weight: 1
372393
- CapacityProvider: FARGATE
373394
Weight: 1
395+
ServiceConnectConfiguration:
396+
Enabled: true
397+
Namespace: emd-service-connect-namespace
374398

375399
APIRouterSecurityGroup:
376400
Type: AWS::EC2::SecurityGroup
@@ -472,3 +496,6 @@ Outputs:
472496
APIRouterTargetGroupArn:
473497
Description: The ARN of the OpenAI compatbile API outer Target Group
474498
Value: !If [DeployAPIRouterCondition, !Ref APIRouterTargetGroup, ""]
499+
APIRouterSecurityGroup:
500+
Description: The security group for the OpenAI compatbile API router
501+
Value: !GetAtt APIRouterSecurityGroup.GroupId

0 commit comments

Comments
 (0)