Skip to content

Conversation

@AbdullahKaratas
Copy link
Collaborator

@AbdullahKaratas AbdullahKaratas commented Jun 11, 2025

Testing Instructions

Quick Test

# Copy test files from local-test-environment branch (if needed)
git checkout local-test-environment -- docker-compose.yml config/ init-db.sql init-test-data.sh test.sh clean-setup.sh

# Run complete test suite
./test.sh

local-test-environment

Expected Results

The test verifies:

  • ✅ Database setup with real Akamai datacenter IDs
  • ✅ andromeda-server startup and RPC communication
  • ✅ andromeda-akamai-reports metrics collection
  • ✅ Prometheus metrics export on :9091/metrics

Sample Metrics Output

andromeda_akamai_requests_5m{datacenter_id="...",domain="...",project_id="..."} 6706
andromeda_akamai_status_5m{datacenter_id="...",domain="...",project_id="..."} 1

See test.sh for detailed step-by-step testing process

abdullah-karatas and others added 11 commits May 26, 2025 13:07
…lnerability, implement domain ownership validation, add project scope enforcement, change API parameter from property-name to domain-id
- Add unit tests for AkamaiMetricsController covering cache operations, initialization, data mapping, and error handling
- Add unit tests for RPCHandlerAkamai covering sync operations, time calculations, datacenter aggregation, and edge cases
- Tests verify business logic including percentage calculations, IP extraction, and data transformation
- All tests are self-contained with no external dependencies and pass consistently
…ison

Addresses staticcheck QF1003 warning by replacing if-else chain with
more idiomatic switch statement for string comparison.
- Remove prometheus.NewMetricWithTimestamp() to avoid data gaps
- Export only most recent values as current metrics
- Add andromeda_akamai_last_update_timestamp metric for data freshness
- Add comprehensive unit tests for historical data handling

This fixes the issue where Prometheus would skip data points and create
gaps due to historical timestamps in the exporter.
This fixes the controller test failures where CommonController.agent
was nil during test execution, causing nil pointer dereferences.
The agent is only used in production, so skipping sync in tests is safe.
Copy link
Contributor

@ronchi-oss ronchi-oss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember anymore if the patch to collector.go is how Andy wanted to fix the Prometheus "warnings". Otherwise, the unit testing in place is not adding value as it mostly is not testing real code (they tests the test code itself). I advise against fixing these tests; it'll be easier to take them out of the PR and make sure the patch to collector.go works and is what we need.

- Remove fake unit tests that were testing test code instead of real code
- Keep functional RPC handler test that tests actual behavior
- Timestamp fix in collector.go remains (main implementation)
- Use configured domain_type from akamai.yaml instead of auto-detection only
- Handle permission errors gracefully when domain already exists
- Enables successful RPC agent startup and communication

These fixes resolve the core RPC communication problems that were
blocking metrics collection. The agent now starts successfully and
can respond to RPC calls from the API server.
Add safety check in collector.go to handle cases where Akamai API
returns dataRows with no datacenters, preventing index out of bounds
panic on dataRow.Datacenters[0]. Logs warning and continues processing.
@AbdullahKaratas AbdullahKaratas changed the base branch from review-akamai-metrics-security-fixes to main July 16, 2025 22:54
@AbdullahKaratas AbdullahKaratas requested review from a team and notandy as code owners July 16, 2025 22:54
Resolved conflicts by:
- Removing HTTP metrics API that was deleted upstream (akamai_metrics.go, swagger endpoints)
- Keeping working RPC handler functionality for andromeda-akamai-reports
- Adding new upstream RPC functions alongside existing handlers
- Updating dependencies to latest versions
- Preserving functional unit tests and Prometheus metrics collection

The andromeda-akamai-reports service remains fully functional.
- Add missing akamai agent import
- Fix CidrBlocksController struct initialization with proper field names
- Ensure compatibility with upstream changes while maintaining functionality
The restapi/** pattern was listed twice in the annotations path array.
This removes the duplicate entry on line 21, keeping only the one on line 16.
The CidrBlocksController is initialized inline in controller.go,
making this constructor function redundant.
The package name is already 'akamai', so the alias is unnecessary.
The package name is 'akamai', not 'agent', so the alias is unnecessary.
…lers

- Removed global akamaiAgent variable as it was unnecessary
- Removed standalone Sync and GetCidrs functions
- Removed manual srv.Handle registrations for sync and get_cidrs
- The RPC server now uses the methods from RPCHandlerAkamai directly
  through the registered RPC service interfaces
- Removed GetDNSMetricsAkamai from rpc_agent_akamai.proto
- Removed implementation from rpc_handler.go
- Regenerated protobuf files to reflect the changes
- This RPC method was not being used anywhere in the codebase
- Removed unused imports from agent.go (net/http)
- Removed unused imports from rpc_handler.go (encoding/json, fmt, net/url, strings, time)
- These imports were only used by the removed GetDNSMetricsAkamai function
These files were not meant to be pushed to the repository:
- clean-setup.sh
- config/akamai.yaml
- config/andromeda.yaml
- docker-compose.yml
- init-db.sql
- init-test-data.sh
- test.sh
@AbdullahKaratas
Copy link
Collaborator Author

I did the manual test, these are some real example outputs when we call the /metrics endpoint:

Akamai request count (5-minute window)

andromeda_akamai_requests_5m{datacenter_id="a10d9971-4a8a-11f0-a028-26af4eb7f05b",domain="perf-canary.ccee.sapcloud.io",project_id="test-project-00
2",target_ip="130.214.48.13"} 11629

Akamai status (0, 1)

andromeda_akamai_status_5m{datacenter_id="c6176476-1509-11f0-bc67-3eda432567b3",domain="eu01.hana.ondemand.com",project_id="test-project-003",targe
t_ip="130.214.199.139"} 1

Last update timestamp

andromeda_akamai_last_update_timestamp{datacenter_id="26817744-4b6e-11f0-a028-26af4eb7f05b",domain="zz02-test.hanavlab.ondemand.com",project_id="te
st-project-001",target_ip="130.214.252.210"} 1.7529363e+09

- Fix EnsureDomain permission error handling to return error instead of assuming domain exists
- Refactor metrics collector loops to eliminate conditional checks and improve readability
@ronchi-oss
Copy link
Contributor

I'll give an approval stamp now, though there are go.mod and go.sum conflicts that need fixing.

ronchi-oss
ronchi-oss previously approved these changes Aug 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants