A REST API to support Reef Guide (AIMS), built with Express, TypeScript, Zod and Prisma, deployable to AWS using CDK.
- Features
- Quick Start
- Prerequisites
- Local Development Setup
- Testing Your Setup
- Development Workflow
- Database Management
- Testing
- Project Structure
- API Endpoints
- AWS Deployment
- CDK Infrastructure
- Configuring CDK
- Using Prisma ORM
- Job System
- Security
- Notes
- Troubleshooting
- Express.js backend with TypeScript
- Prisma ORM for database operations
- Passport based JWT authentication
- AWS CDK for infrastructure as code
- Serverless deployment using AWS Lambda and API Gateway
- Environment-based configuration with Zod validation
- Basic docker compose
Get up and running with the API in minutes:
# Clone the repository
git clone https://github.com/open-AIMS/reefguide-web-api.git
cd reefguide-web-api
# Install dependencies
nvm use 20 # Or your preferred Node.js v18+ version
npm install
# Configure environment
cp .env.example .env
npm run local-keys # Generate JWT keys
# Start PostgreSQL and setup database
docker-compose up -d
npm run prisma-generate
npm run db-reset
# Start development server
npm run dev
# Create a test user
curl -X POST http://localhost:5000/api/auth/register \
-H "Content-Type: application/json" \
-d '{"email":"test@example.com","password":"password123"}'
# Login to get JWT token
curl -X POST http://localhost:5000/api/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"test@example.com","password":"password123"}'
# Use the token for authenticated requests
curl -X GET http://localhost:5000/api/auth/profile \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
For detailed setup instructions, see the Local Development Setup section.
- Node.js (v18+)
- AWS CLI configured with appropriate permissions
- Docker (for local development with Prisma)
- Clone the repository
- Install dependencies:
npm install
- Generate Prisma client:
npm run prisma-generate
- Set up environment variables: Copy
.env.example
to.env
- Generate JWT keys:
npm run local-keys
- Start up psql
docker compose up
- Run DB migration
npm run db-reset
- Start server with auto-restart
npm run dev
If you experience connection issues with the database, try using 127.0.0.1
instead of localhost
in your connection strings:
DATABASE_URL=postgresql://admin:password@127.0.0.1:5432
DIRECT_URL=postgresql://admin:password@127.0.0.1:5432
Verify the API is running:
curl http://localhost:5000/api/health
# Expected response: {"status":"ok"}
Create a test user:
curl -X POST http://localhost:5000/api/auth/register \
-H "Content-Type: application/json" \
-d '{"email":"test@example.com","password":"password123"}'
Get an authentication token:
curl -X POST http://localhost:5000/api/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"test@example.com","password":"password123"}'
After obtaining your JWT token:
- In Postman, create a new request
- Go to the "Authorization" tab
- Select "Bearer Token" from the Type dropdown
- Paste your token into the "Token" field
Alternatively, use environment variables in Postman to store your token for reuse across multiple requests.
# Start the development server with auto-restart
npm run dev
# Run the linter
npm run lint
# Run type checking
npm run typecheck
# Run tests
npm run test
# View and edit data with Prisma Studio
npm run studio
# Reset the database (WARNING: Deletes all data)
npm run db-reset
# Create a new migration after changing schema.prisma
npx prisma migrate dev --name your_migration_name
# Other Prisma operations
npx prisma ...
WARNING: Running the tests involves resetting the specified DB in your .env file. Double check you are not targeting a production DB before running tests.
Setup a local psql DB for integration testing:
docker-compose up
Then, keeping this running, change your .env to specify the local DB.
If you get a connection error, try 127.0.0.1
instead of localhost
.
DATABASE_URL=postgresql://admin:password@localhost:5432
DIRECT_URL=postgresql://admin:password@localhost:5432
Then migrate the DB to latest:
npm run db-reset
Also ensure the rest of your .env file is suitable, specifically, generating local keys is needed:
npm run local-keys
Now run tests:
npm run test
Tests use Jest and Supertest for API testing.
src/
: Source codeapi/
: API-related codedb/
: Database schemas and migrationsinfra/
: AWS CDK infrastructure code
test/
: Test filesconfigs/
: Environment-specific configurationsscripts/
: Utility scriptsconfig.ts
: Loads and validates environment variablesinfra_config.ts
: Defines CDK stack configuration schema
All routes are prefixed with /api
.
All auth routes are prefixed with /auth
.
- Endpoint: POST
/register
- Body:
{ "email": "string", "password": "string" }
- Response:
- Success (200):
{ "userId": "string" }
- Error (400):
{ "message": "User already exists" }
- Success (200):
- Endpoint: POST
/login
- Body:
{ "email": "string", "password": "string" }
- Response:
- Success (200):
{ "token": "JWT_token_string", "refreshToken": "refresh_token_string" }
- Error (401):
{ "message": "Invalid credentials" }
- Success (200):
- Endpoint: POST
/token
- Body:
{ "refreshToken": "string" }
- Response:
- Success (200):
{ "token": "new_JWT_token_string" }
- Error (401):
{ "message": "Invalid refresh token" }
- Success (200):
- Endpoint: GET
/profile
- Headers:
Authorization: Bearer JWT_token_string
- Response:
- Success (200):
{ "user": { "id": "string", "email": "string", "roles": ["string"] } }
- Error (401):
{ "message": "Unauthorized" }
- Error (500):
{ "message": "User object was not available after authorisation." }
- Success (200):
Notes:
- All endpoints use JSON for request and response bodies.
- The register endpoint returns only the user ID on success.
- The login endpoint returns both a JWT token and a refresh token.
- The profile endpoint includes user roles in the response.
- Error responses may vary based on the specific error encountered.
Retrieves a specific polygon by ID.
- Authentication: Required (JWT)
- Authorization: User must own the polygon or be an admin
- Parameters:
id
(path parameter): ID of the polygon
- Response: Returns the polygon object
Retrieves all polygons for the authenticated user, or all polygons if the user is an admin.
- Authentication: Required (JWT)
- Response: Returns an array of polygon objects
Creates a new polygon.
- Authentication: Required (JWT)
- Request Body:
polygon
(JSON): GeoJSON representation of the polygon
- Response: Returns the created polygon object
Updates an existing polygon.
- Authentication: Required (JWT)
- Authorization: User must own the polygon or be an admin
- Parameters:
id
(path parameter): ID of the polygon to update
- Request Body:
polygon
(JSON): Updated GeoJSON representation of the polygon
- Response: Returns the updated polygon object
Deletes a polygon.
- Authentication: Required (JWT)
- Authorization: User must own the polygon or be an admin
- Parameters:
id
(path parameter): ID of the polygon to delete
- Response: 204 No Content on success
Retrieves all notes for the authenticated user, or all notes if the user is an admin.
- Authentication: Required (JWT)
- Response: Returns an array of note objects
Retrieves all notes for a specific polygon.
- Authentication: Required (JWT)
- Authorization: User must own the polygon or be an admin
- Parameters:
id
(path parameter): ID of the polygon
- Response: Returns an array of note objects associated with the polygon
Creates a new note for a given polygon.
- Authentication: Required (JWT)
- Authorization: User must own the polygon or be an admin
- Request Body:
content
(string): Content of the notepolygonId
(number): ID of the polygon to associate the note with
- Response: Returns the created note object
Updates an existing note.
- Authentication: Required (JWT)
- Authorization: User must own the note or be an admin
- Parameters:
id
(path parameter): ID of the note to update
- Request Body:
content
(string): Updated content of the note
- Response: Returns the updated note object
All endpoints require JWT authentication. Admin users have access to all resources, while regular users can only access their own resources. Invalid requests or unauthorized access attempts will result in appropriate error responses.
Routes for managing the ReefGuideAPI ECS cluster scaling and status. All routes are prefixed with /admin
.
Retrieves the current status of the ECS cluster.
- Authentication: Required (JWT)
- Authorization: Admin only
- Parameters: None
- Response: Returns the cluster status object
{
"runningCount": 5,
"pendingCount": 0,
"desiredCount": 3,
"deployments": [
{
"status": "PRIMARY",
"taskDefinition": "arn:aws:ecs:ap-southeast-2:xxx:task-definition/xxx:8",
"desiredCount": 3,
"pendingCount": 0,
"runningCount": 3,
"failedTasks": 0,
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/1311621013630114425 in progress."
},
{
"status": "ACTIVE",
"taskDefinition": "arn:aws:ecs:ap-southeast-2:xxx:task-definition/xxx:8",
"desiredCount": 0,
"pendingCount": 0,
"runningCount": 2,
"failedTasks": 0,
"rolloutState": "COMPLETED",
"rolloutStateReason": "ECS deployment ecs-svc/4408993298399279146 completed."
}
],
"events": [
{
"createdAt": "2024-10-23T01:03:38.329Z",
"message": "(service reefguide-reefguideapireefguideserviceService9CF43A7C-warFx9zMuB8k) registered 2 targets in (target-group arn:aws:elasticloadbalancing:ap-southeast-2:xxx:targetgroup/reefgu-reefg-PAREXETNHOCW/e5dd5bec27f8064b)"
},
{
"createdAt": "2024-10-23T01:03:17.851Z",
"message": "(service reefguide-reefguideapireefguideserviceService9CF43A7C-warFx9zMuB8k) registered 1 targets in (target-group arn:aws:elasticloadbalancing:ap-southeast-2:xxx:targetgroup/reefgu-reefg-PAREXETNHOCW/e5dd5bec27f8064b)"
},
{
"createdAt": "2024-10-23T01:02:36.941Z",
"message": "(service reefguide-reefguideapireefguideserviceService9CF43A7C-warFx9zMuB8k) has started 2 tasks: (task 04dfb8c773cc47a1b0442dc1cabc497a) (task 239bccf79b8841c2afc9009496987dc0)."
},
{
"createdAt": "2024-10-23T01:02:14.081Z",
"message": "(service reefguide-reefguideapireefguideserviceService9CF43A7C-warFx9zMuB8k, taskSet ecs-svc/4408993298399279146) has begun draining connections on 1 tasks."
},
{
"createdAt": "2024-10-23T01:02:14.077Z",
"message": "(service reefguide-reefguideapireefguideserviceService9CF43A7C-warFx9zMuB8k) deregistered 1 targets in (target-group arn:aws:elasticloadbalancing:ap-southeast-2:xxx:targetgroup/reefgu-reefg-PAREXETNHOCW/e5dd5bec27f8064b)"
}
],
"serviceStatus": "ACTIVE"
}
Scales the ECS cluster to a specified number of tasks.
- Authentication: Required (JWT)
- Authorization: Admin only
- Parameters: None
- Request Body:
{
"desiredCount": number // Between 0 and 10 inclusive
}
- Response: 200 OK
Forces a new deployment of the service, which will pull the latest version of the container image.
- Authentication: Required (JWT)
- Authorization: Admin only
- Parameters: None
- Request Body: None
- Response: 200 OK
- Notes:
- This operation will perform a rolling update with zero downtime
- Old tasks will be replaced with new tasks pulling the latest image
- The deployment uses a minimum healthy percent of 50% and maximum percent of 200% to ensure service availability
- Progress can be monitored via the
/admin/status
endpoint
Ensure you have your npm
environment setup and installed:
nvm use 20
npm install
Start by creating a configuration file, using either the local or remote method below.
- Create a config JSON file in
configs/
(e.g.,dev.json
). Thesample.json
includes example values. - Run
npm run aws-keys -- <secret name e.g. dev-reefguide-creds>
with appropriate AWS credentials. - Open the created secret in AWS Secrets Manager and add
DATABASE_URL
andDIRECT_URL
fields for the PostgreSQL DB provider.
With a configuration repo that follows the specifications in the Configuring CDK section:
./config <stage> <namespace> --target <repo clone string>
Example:
./config org dev --target git@github.com:org/repo.git
-
Export your config file path:
export CONFIG_FILE_NAME=<file name.json e.g. dev.json>
-
Ensure you have AWS credentials active for the target environment:
aws sts get-caller-identity
-
Bootstrap CDK environment (if on new account):
npx cdk bootstrap
-
Preview changes:
npx cdk diff
-
Deploy:
npx cdk deploy
The above npm run aws-keys
script will create a secret with the following keys:
JWT_PRIVATE_KEY
JWT_PUBLIC_KEY
JWT_KEY_ID
However it will not include the
DATABASE_URL
DIRECT_URL
This has been intentionally left out of the CDK managed portion of the infrastructure to allow easily plugging in any PostgreSQL DB connection string, however it does mean additional setup.
You can either:
- use the provided RDS DB
- use your own DB
In your configuration, include the db
field like so:
"db": {
"instanceSize": "small",
"storageGb": 50
},
The instance size is one of the standard CDK RDS T4G instance size enums - defined here for reference ec2 instance size.
Deploy the system, then go and look at the auto generated credentials secret in AWS Secret Manager, and retrieve the host, username and password. Build the connection strings with the following format:
DATABASE_URL="postgresql://reefguide:<PASSWORD>@<HOST>/reefguide?sslmode=require?connect_timeout=15&pool_timeout=15"
DIRECT_URL="postgresql://reefguide:<PASSWORD>@<HOST>/reefguide?connect_timeout=15&pool_timeout=15"
replacing the password and host from the connection secret.
Update the secret which you generated earlier, using the AWS console, to include the
DATABASE_URL
DIRECT_URL
keys/values as above.
Your web-api will now use this database.
If you would like to use an alternative psql DB, then
- do not include the
db
field in the config (to avoid deploying an unused instance) - build your own db and associated connection strings in the below format (noting you'll need to specify the db name as prisma will target this DB)
DATABASE_URL="postgresql://reefguide:<PASSWORD>@<HOST>/<DB NAME>?sslmode=require?connect_timeout=15&pool_timeout=15"
DIRECT_URL="postgresql://reefguide:<PASSWORD>@<HOST>/<DB NAME>?connect_timeout=15&pool_timeout=15"
- update your generated keys secret with these key/values
Your web-api will now target your DB.
- VPC: A Virtual Private Cloud with public subnets.
- ECS Cluster: Hosts the ReefGuideAPI Fargate service.
- Application Load Balancer (ALB): Handles incoming traffic and distributes it to the ECS services.
- API Gateway: Manages the REST API for the Web API service.
- Lambda Function: Runs the Web API service.
- EFS (Elastic File System): Provides persistent storage for the ReefGuideAPI service.
- S3 Bucket: Used for intermediary data transfer between the user and the EC2 service instance which mounts the EFS.
- EC2 Instance: Manages the EFS filesystem.
- Route 53: Handles DNS routing.
- ACM (AWS Certificate Manager): Manages SSL/TLS certificates.
- Secrets Manager: Stores sensitive configuration data.
- Runs as a Fargate service in the ECS cluster.
- Uses an Application Load Balancer for traffic distribution.
- Implements auto-scaling based on CPU and memory utilization.
- Utilizes EFS for persistent storage.
- Deployed as a Lambda function.
- Exposed via API Gateway.
- Uses AWS Secrets Manager for storing sensitive data.
- Uses a shared Application Load Balancer for the ReefGuideAPI.
- API Gateway handles routing for the WebAPI.
- Route 53 manages DNS records for both services.
- SSL/TLS certificates are managed through ACM.
- Secrets are stored in AWS Secrets Manager.
- IAM roles control access to various AWS services.
- Ensure AWS CLI is configured with appropriate permissions.
- Create a configuration file in
configs/
(e.g.,dev.json
). - Run
npm run aws-keys -- <secret name>
to set up JWT keys in Secrets Manager. - Add database connection strings to the created secret.
- Bootstrap CDK environment:
npx cdk bootstrap
- Review changes:
npx cdk diff
- Deploy:
CONFIG_FILE_NAME=[env].json npx cdk deploy
- Modify
src/infra/components/
files to adjust individual service configurations. - Update
src/infra/infra.ts
to change overall stack structure. - Adjust auto-scaling, instance types, and other parameters in the configuration JSON files.
This config management system is courtesy of github.com/provena/provena
This repo features a detached configuration management approach. This means that configuration should be stored in a separate private repository. This repo provides a set of utilities which interact with this configuration repository, primarily the ./config
bash script.
config - Configuration management tool for interacting with a private configuration repository
Usage:
config NAMESPACE STAGE [OPTIONS]
config --help | -h
config --version | -v
Options:
--target, -t REPO_CLONE_STRING
The repository clone string
--repo-dir, -d PATH
Path to the pre-cloned repository
--help, -h
Show this help
--version, -v
Show version number
Arguments:
NAMESPACE
The namespace to use (e.g., 'rrap')
STAGE
The stage to use (e.g., 'dev', 'stage')
Environment Variables:
DEBUG
Set to 'true' for verbose output
The central idea of this configuration approach is that each namespace/stage combination contains a set of files, which are gitignored by default in this repo, which are 'merged' into the user's clone of the this repository, allowing temporary access to private information without exposing it in git.
The script builds in functionality to cache the repo which makes available a given namespace/stage combination. These are stored in env.json
, at the repository root, which has a structure like so:
{
"namespace": {
"stage1": "git@github.com:org/repo.git",
"stage2": "git@github.com:org/repo.git",
"stage3": "git@github.com:org/repo.git"
}
}
This saves using the --target
option on every ./config
invocation. You can share this file between team members, but we do not recommend committing it to your repository.
Namespace: This is a grouping that we provide to allow you to separate standalone sets of configurations into distinct groups. For example, you may manage multiple organisation's configurations in one repo. You can just use a single namespace if suitable.
Stage: A stage is a set of configurations within a namespace. This represents a 'deployment' of Provena.
The configuration repository contains configuration files for the this project.
The configuration repo does not contain sample cdk.context.json
files, but we recommend including this in this repo to make sure deployments are deterministic. This will be generated upon first CDK deploy.
The configuration repository is organized using a hierarchical structure based on namespaces and stages:
.
├── README.md
└── <your-namespace>
├── base
├── dev
└── feat
A namespace represents a set of related deployment stages, usually one namespace per organisation/use case.
Within each namespace, there are multiple stages representing different environments/deployment specifications:
base
: Contains common base configurations shared across all stages within the namespacedev
: Sample development environment configurationsfeat
: Sample feature branch workflow environment configurations
The feat stage supports the feature branch deployment workflow which is now a part of the open-source workflow. This makes use of environment variable substitution which is described later.
Configuration files are placed within the appropriate namespace and stage directories. Currently:
.
├── README.md
└── your-namespace
└──── dev
└── configs
└── dev.json
Files in the base
directory of a namespace are applied first, regardless of the target stage. This allows you to define common configurations that are shared across all stages within a namespace.
Files in stage-specific directories (e.g., dev
, test
, prod
) are applied after the base configurations. They can override or extend the base configurations as needed.
The main repository contains a configuration management script that interacts with this configuration repository. Here's how it works:
- The script clones or uses a pre-cloned version of this configuration repository.
- It then copies the relevant configuration files based on the specified namespace and stage.
- The process follows these steps:
a. Copy all files from the
<namespace>/base/
directory (if it exists). b. Copy all files from the<namespace>/<stage>/
directory, potentially overwriting files from the base configuration. - The copied configuration files are then used by the this system for the specified namespace and stage.
- Use version control: Commit and push changes to this repository regularly.
- Document changes: Use clear, descriptive commit messages and update this README if you make structural changes.
- Minimize secrets: Avoid storing sensitive information like passwords or API keys directly in these files. Instead, use secure secret management solutions.
- Update the
src/db/schema.prisma
with your new models - Apply the migration - heeding warnings
npx prisma migrate dev
A job processing system built on top of Postgres and ECS. Uses Postgres as a job queue and tracks job status, assignments, and results.
The system consists of three main components:
- Job Queue (Postgres table)
- API Server (Express)
- Worker Nodes (ECS Tasks)
src/
├── api/
│ ├── routes/
│ │ └── jobs.ts # API routes for job management
│ └── services/
│ └── jobs.ts # Business logic for job processing
prisma/
└── schema.prisma # Database schema including job tables
erDiagram
Job ||--o{ JobAssignment : "has"
Job ||--o{ JobResult : "has"
JobAssignment ||--o| JobResult : "produces"
User ||--o{ Job : "creates"
Job {
int id
datetime created_at
datetime updated_at
enum type
enum status
int user_id
json input_payload
}
JobAssignment {
int id
datetime created_at
datetime updated_at
int job_id
string ecs_task_arn
string ecs_cluster_arn
datetime expires_at
enum storage_scheme
string storage_uri
datetime heartbeat_at
datetime completed_at
}
JobResult {
int id
datetime created_at
int job_id
int assignment_id
json result_payload
enum storage_scheme
string storage_uri
json metadata
}
stateDiagram-v2
[*] --> PENDING: Job Created
PENDING --> IN_PROGRESS: Worker Assignment
note right of IN_PROGRESS
Job is assigned to ECS task
Has 1:1 assignment record
end note
IN_PROGRESS --> SUCCEEDED: Task Completed
IN_PROGRESS --> FAILED: Task Error
IN_PROGRESS --> TIMED_OUT: Assignment Expired
PENDING --> CANCELLED: User Cancelled
IN_PROGRESS --> CANCELLED: User Cancelled
SUCCEEDED --> [*]
FAILED --> [*]
CANCELLED --> [*]
TIMED_OUT --> [*]
sequenceDiagram
critical Launch Job
User->>Web API: launch(payload)
Web API->>DB: create Job
DB-->>Web API: job_id
Web API-->>User: job_id
end
critical Job Manager Polling
loop Poll loop
Manager->>Web API: /poll
Web API->>DB: query (unassigned)
DB-->>Web API: jobs[]
Web API-->>Manager: jobs[]
Note left of Manager: If jobs.length > 0...
end
critical Job Manager Capacity
option Cooldown period complete
Manager->>ECS: launch task type
Note left of Manager: Task dfn corresponds<br/>to task type
end
critical Worker lifecycle
loop Worker poll loop
Worker->>Web API: /poll
Web API-->>Worker: jobs[]
end
option Assign job
Worker->>Web API: assign(task info)
Web API->>DB: Create assignment<br/>update status
option Complete job
Worker->>Worker: Start job
activate Worker
Worker-->>Worker: Job finished
deactivate Worker
option Store result files
Note left of Worker: job includes S3 location
Worker->>S3: Upload files
option Report result
Worker->>Web API: Result(status, payload)
Web API->>DB: Create JobResult<br/>Update JobAssignment<br/>Update Job
end
critical User workflow
loop User status checks
User->>Web API: GET /:id
Web API-->>User: Job status + details
end
option complete task
User->>Web API: GET /:id
Web API-->>User: Job results
User->>Web API: GET /:id/download
Web API->>S3: Presign URLs
S3-->>Web API: Presigned URLs[]
Web API-->>User: urls: Map<key, url>
User->>User: download files
User->>User: visualise results
end
end
- POST /api/jobs - Create new job
- GET /api/jobs/:id - Get job details
- POST /api/jobs/:id/cancel - Cancel a job
- GET /api/jobs/poll - Get available jobs, optional query param: jobType. Returns jobs that are PENDING and have no valid assignments.
- POST /api/jobs/assign - Assign job to worker. Creates assignment record with storage location.
- POST /api/jobs/assignments/:id/result - Submit job results - Updates job status and stores result if successful.
Each job type must define:
- Input payload schema (required)
- Result payload schema (optional)
Example job type configuration:
const jobTypeSchemas = {
SUITABILITY_ASSESSMENT: {
input: z.object({
fieldsHere: z.string(),
}),
result: z
.object({
fieldsHere: z.string(),
})
.optional(),
},
};
- Uses
helmet
for HTTP headers - JWT-based authentication with RS256 algorithm
- Secrets management using AWS Secrets Manager
- Include the JWT token in the Authorization header for authenticated requests
- Handle token expiration (default 1 hour) by refreshing or redirecting to login
-
Issue: Can't connect to database with
localhost
- Solution: Try using
127.0.0.1
instead in your connection string
- Solution: Try using
-
Issue: "Connection refused" error
- Solution: Make sure Docker is running and the container is up
- Check:
docker ps
to confirm PostgreSQL container is running
- Issue: Invalid token errors
- Solution: Make sure you've run
npm run local-keys
to generate keys - Check:
.env
file for JWT-related environment variables
- Solution: Make sure you've run
-
Issue: Prisma client not found
- Solution: Run
npm run prisma-generate
- Solution: Run
-
Issue: Migration errors
- Solution: Check your database connection and try
npm run db-reset
- Solution: Check your database connection and try
- connect to the instance using AWS SSM Connect (go to AWS -> EC2 -> Find the service instance -> Connect -> Connect using SSM)
then get into the ubuntu user and mount the volume
sudo su - ubuntu
cd ~
./mountefs.sh
This should mount the data into /efs/data
with /efs/data/reefguide
being the targeted data directory.