johnlam90
Owner

Release v1.3.8: GitHub Container Registry Migration and Configuration Synchronization

This release focuses on migrating from Docker Hub to GitHub Container Registry and ensuring all configurations are properly synchronized.

Key Changes

Container Registry Migration

  • Migrated all image references from Docker Hub to GitHub Container Registry (ghcr.io)
  • Updated Helm chart values.yaml to use ghcr.io/johnlam90/aws-multi-eni-controller
  • Updated deployment.yaml and eni-manager-daemonset.yaml manifests
  • Consistent v1.3.8 image tags across all configurations

Configuration Synchronization

  • Synchronized YAML manifests with Helm chart templates
  • Added missing NODE_NAME environment variable to controller deployment
  • Updated resource limits to match Helm chart values (500m CPU, 512Mi memory)
  • Added LOG_LEVEL environment variable to ENI manager daemonset
  • Set MAX_CONCURRENT_RECONCILES to 5 consistently across all manifests

Code Quality Improvements

  • Fixed golint issue in network manager (removed unnecessary else block)
  • All Go code quality checks passing:
    • go vet, go fmt, golint, gocyclo, ineffassign, misspell
  • All unit and integration tests passing
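For context on the golint fix above, here is a minimal sketch of the pattern golint flags when an `if` branch ends in `return` and is still followed by `else`. The function names and body are hypothetical; the actual code in pkg/eni-manager/network/manager.go is not shown in this PR description.

```go
package main

import "fmt"

// describeMTUBefore is a hypothetical example of the flagged shape:
// the if branch already returns, so the else block only adds nesting.
func describeMTUBefore(mtu int) string {
	if mtu <= 0 {
		return "default"
	} else {
		return fmt.Sprintf("%d", mtu)
	}
}

// describeMTUAfter drops the else and outdents its body.
// Behavior is unchanged; only the control flow is flatter.
func describeMTUAfter(mtu int) string {
	if mtu <= 0 {
		return "default"
	}
	return fmt.Sprintf("%d", mtu)
}
```

Both functions return the same value for every input, which is why this kind of change is safe to make in a release that otherwise only touches configuration.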

Version Management

  • Bumped Chart.yaml version to 1.3.8
  • Updated appVersion to v1.3.8
  • Consistent version tagging across all components

Compatibility

  • Fully backward compatible with existing deployments
  • Helm chart can be upgraded seamlessly
  • No breaking changes to APIs or configurations

Deployment

  • Use ghcr.io/johnlam90/aws-multi-eni-controller:v1.3.8 for container images
  • Helm chart version 1.3.8 available in GitHub Container Registry
  • All deployment manifests updated and tested

Testing

  • ✅ All Go code quality checks passed
  • ✅ All unit tests passed
  • ✅ All integration tests passed
  • ✅ Build verification successful
  • ✅ Configuration synchronization verified

Files Changed

  • charts/aws-multi-eni-controller/Chart.yaml - Version bump to 1.3.8
  • charts/aws-multi-eni-controller/values.yaml - Updated to use ghcr.io registry
  • deploy/deployment.yaml - Updated image reference and added NODE_NAME env var
  • deploy/eni-manager-daemonset.yaml - Updated image references and added LOG_LEVEL
  • pkg/eni-manager/network/manager.go - Fixed golint issue

This PR is ready for merge and will trigger the automated GitHub Actions workflows for Helm chart packaging and container image publishing to GitHub Container Registry.


Pull Request opened by Augment Code with guidance from the PR author

John Lam added 14 commits June 10, 2025 18:33

Implements full support for Instance Metadata Service Version 2 (IMDSv2) to ensure compatibility with Amazon Linux 2023 nodes that enforce IMDSv2 by default.

Key changes:
- Configure AWS SDK environment variables to enforce IMDSv2 usage
- Add timeout and retry settings for reliable credential retrieval
- Update DPDK setup to work properly on Amazon Linux 2023
- Improve ENI pattern detection for AL2023 network interfaces
- Add comprehensive documentation on IMDSv2 support
- Add test scripts and unit tests for IMDSv2 functionality

This ensures the controller works seamlessly on both Amazon Linux 2 and Amazon Linux 2023 without requiring manual IMDS configuration changes.

Implements automatic configuration of EC2 instance metadata hop limit to ensure IMDS requests work reliably from containerized environments.

- Adds environment variables to control hop limit configuration
- Implements detection of current instance ID and IMDS settings
- Updates hop limit only when needed using EC2 API
- Documents required IAM permissions and common troubleshooting steps
- Updates README and documentation with enhanced IMDSv2 support details

This improves reliability in Kubernetes environments where the default hop limit of 1 can cause credential retrieval failures.

Implement multi-strategy IMDS configuration to solve the chicken-and-egg problem
with IMDSv2 access on new instances. This enables automatic node replacement
recovery without manual intervention.

Key improvements:
- Add IRSA (IAM Roles for Service Accounts) support for cloud-native auth
- Implement private IP-based instance ID discovery for new nodes
- Add VPC-wide IMDS configuration as fallback strategy
- Create background IMDS configuration retry mechanism
- Add comprehensive documentation for IRSA setup

Bump version to v1.3.6 to reflect these improvements.

- Simplifies authentication flow by using sequential variable assignment
  instead of nested if-else blocks for better readability
- Promotes AWS SDK dependencies (credentials, imds, sts) from indirect to
  direct dependencies to make them explicit requirements
- Cleans up whitespace in IMDS test environment variable declarations

These changes improve code maintainability while preserving the existing
authentication strategy sequence.

Implements robust retry mechanisms for handling resource version conflicts
when updating NodeENI resources and their status. This helps prevent
reconciliation failures in concurrent update scenarios.

Key improvements:
- Adds exponential backoff retry logic for NodeENI updates and status updates
- Preserves intended changes during retry attempts
- Updates all NodeENI modification points to use retry-capable methods
- Adds comprehensive unit tests to verify retry behavior
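
The retry behavior described in this commit can be sketched roughly as follows. This is a stdlib-only illustration, not the controller's implementation: `errConflict` and `retryOnConflict` are hypothetical names, and a real controller would more likely detect conflicts from the Kubernetes API error and could use `retry.RetryOnConflict` from `k8s.io/client-go/util/retry` instead.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConflict is a hypothetical stand-in for a resource version conflict
// returned by the API server.
var errConflict = errors.New("resource version conflict")

// retryOnConflict calls update until it succeeds, fails with a non-conflict
// error, or the attempt budget runs out. The delay doubles after each
// conflict (exponential backoff).
//
// To preserve the intended change across retries, update should re-read the
// latest object and re-apply the change on every call.
func retryOnConflict(attempts int, initial time.Duration, update func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	delay := initial
	var err error
	for i := 0; i < attempts; i++ {
		err = update()
		if err == nil || !errors.Is(err, errConflict) {
			// Success, or an error that retrying will not fix.
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}
```

A caller would wrap each NodeENI update or status update in a closure passed to `retryOnConflict`, so that every modification point shares the same backoff policy.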

Also updates container image repository to use ghcr.io and updates tag to beta-332d180.

Corrects the device index mapping for ENS interfaces by subtracting 5 instead of 4,
which aligns with EKS configurations where ens5 corresponds to device index 0.
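
As an illustration of the corrected mapping, a name-based calculation might look like the sketch below. The helper name `deviceIndexFromName` and its error handling are assumptions; only the `ensN` to `N - 5` rule comes from this commit message.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ensOffset reflects the rule stated in the commit: ens5 is device index 0.
const ensOffset = 5

// deviceIndexFromName is a hypothetical helper that maps an interface name
// such as "ens7" to its device index by subtracting ensOffset.
func deviceIndexFromName(ifaceName string) (int, error) {
	if !strings.HasPrefix(ifaceName, "ens") {
		return 0, fmt.Errorf("interface %q is not an ens interface", ifaceName)
	}
	n, err := strconv.Atoi(strings.TrimPrefix(ifaceName, "ens"))
	if err != nil {
		return 0, fmt.Errorf("interface %q has no numeric suffix: %w", ifaceName, err)
	}
	if n < ensOffset {
		return 0, fmt.Errorf("interface %q is below ens%d", ifaceName, ensOffset)
	}
	return n - ensOffset, nil
}
```

With the previous offset of 4, `ens5` would have mapped to index 1 rather than 0, which is the mismatch this commit corrects.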

Changes behavior for handling stale attachments when nodes no longer match NodeENI
selectors - now properly marks these as stale for cleanup rather than keeping them.

Improves E2E tests with better retry logic, more robust label management, and
additional logging for better test diagnostics.

Updates container images to latest versions with relevant fixes.

Improves network interface device index detection reliability by:
- Adding a primary method that reads device index directly from sysfs
- Maintaining the name-based parsing as a fallback mechanism
- Adding logging to indicate which method was used

This makes the interface detection more robust across different system configurations where sysfs information is available, reducing reliance on interface naming conventions.

Fix ENS interface device index calculation by implementing a more robust hybrid approach:
- Correct ens8 interface device index calculation (ens_number - 5 formula)
- Add comprehensive unit tests for device index calculation logic
- Implement sysfs-based calculation with name-based fallback
- Reorganize sample files
- Support cross-platform compatibility between Amazon Linux versions
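
The hybrid approach in this list (sysfs first, interface name as fallback) could be structured along these lines. Everything here is a sketch under assumptions: the type and function names are hypothetical, and the sysfs file that holds the index is passed in as a parameter because this PR does not say which file the controller reads.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// indexReader resolves a device index for an interface name.
type indexReader func(iface string) (int, error)

// sysfsReader builds a reader that parses an integer from
// <root>/<iface>/<relPath>. Both root and relPath are supplied by the
// caller; the real path used by the controller is an unknown here.
func sysfsReader(root, relPath string) indexReader {
	return func(iface string) (int, error) {
		raw, err := os.ReadFile(filepath.Join(root, iface, relPath))
		if err != nil {
			return 0, err
		}
		return strconv.Atoi(strings.TrimSpace(string(raw)))
	}
}

// detectDeviceIndex tries the primary reader and falls back to the second
// one on any error. It also reports which method produced the result so
// the caller can log it.
func detectDeviceIndex(iface string, primary, fallback indexReader) (int, string, error) {
	if idx, err := primary(iface); err == nil {
		return idx, "primary", nil
	}
	idx, err := fallback(iface)
	if err != nil {
		return 0, "", fmt.Errorf("no device index for %q: %w", iface, err)
	}
	return idx, "fallback", nil
}
```

Passing the readers in as functions keeps the fallback order testable without a real sysfs tree, which fits the unit tests this commit mentions.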

Updates Chart version to 1.3.7 with comprehensive test suite and changes image repository.

Changes the image repository from Docker Hub to GitHub Container Registry (ghcr.io)
and updates the tag from v1.3.6-comprehensive-tests to beta-a86ce7b.

This switch to GitHub Container Registry provides better integration with
GitHub workflows and actions, while the new tag reflects the current
development branch.

Implements a more granular locking strategy for ENI cleanup operations:

- Adds node-level coordination for DPDK/SR-IOV ENIs while using granular locking for standard ENIs
- Enhances interface detection with better logging and retry mechanisms
- Improves wait logic for expected interfaces to appear based on NodeENI attachments
- Updates image references to use the fixed AL2023 DPDK setup version
- Adds more detailed logging for troubleshooting coordination issues
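
One way to picture the locking strategy in this list: DPDK/SR-IOV ENIs share a single lock per node, while standard ENIs each get their own lock. The sketch below is a guess at the shape, with hypothetical names throughout; the controller's actual coordination code is not part of this description.

```go
package main

import "sync"

// cleanupLocks hands out one mutex per key, created on first use.
type cleanupLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newCleanupLocks() *cleanupLocks {
	return &cleanupLocks{locks: make(map[string]*sync.Mutex)}
}

// lockKey picks the lock scope. DPDK/SR-IOV cleanup is serialized per node;
// standard ENI cleanup is serialized only per ENI.
func lockKey(nodeName, eniID string, usesDPDK bool) string {
	if usesDPDK {
		return "node/" + nodeName
	}
	return "eni/" + eniID
}

// withLock runs fn while holding the mutex for key.
func (c *cleanupLocks) withLock(key string, fn func()) {
	c.mu.Lock()
	l, ok := c.locks[key]
	if !ok {
		l = &sync.Mutex{}
		c.locks[key] = l
	}
	c.mu.Unlock()

	l.Lock()
	defer l.Unlock()
	fn()
}
```

Under this scheme, two standard ENIs on the same node can be cleaned up in parallel, while two DPDK ENIs on that node wait for each other.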

These changes help prevent race conditions during cleanup operations while allowing higher parallelism for standard ENI operations.

Implements a new GetInstanceENIs method to retrieve all ENIs attached to an EC2 instance, returning a map of device index to ENI ID. This helps prevent race conditions during ENI creation.
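
To show how a map of device index to ENI ID can guard against picking an index that is already in use, here is a small sketch. The helper `nextFreeDeviceIndex` is hypothetical and the ENI IDs in the usage note are placeholders; only the map shape comes from the description of GetInstanceENIs.

```go
package main

// nextFreeDeviceIndex returns the first device index at or after start that
// has no ENI attached, given a map of device index to ENI ID such as the
// one GetInstanceENIs is described as returning.
func nextFreeDeviceIndex(attached map[int]string, start int) int {
	idx := start
	for {
		if _, taken := attached[idx]; !taken {
			return idx
		}
		idx++
	}
}
```

For example, with indexes 0, 1, and 3 attached (`map[int]string{0: "eni-a", 1: "eni-b", 3: "eni-c"}`), a search starting at 1 would settle on index 2.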

Enhances logging in the NodeENI controller to better track ENI attachment operations, particularly during the subnet processing and device index determination phases.

Updates the NodeENI controller to immediately update status after ENI attachment to prevent race conditions in subsequent reconciliations.

Updates container image tags to the latest version with cleanup fixes.

…ronize configurations

- Update all image references from Docker Hub to GitHub Container Registry (ghcr.io)
- Synchronize YAML manifests with Helm chart configurations
- Bump version to v1.3.8 in Chart.yaml and all image tags
- Add missing NODE_NAME environment variable to deployment.yaml
- Update resource limits to match Helm chart values (500m CPU, 512Mi memory)
- Add LOG_LEVEL environment variable to eni-manager-daemonset.yaml
- Fix golint issue in pkg/eni-manager/network/manager.go (remove unnecessary else block)
- Ensure MAX_CONCURRENT_RECONCILES is set to 5 consistently across all manifests
- All Go code quality checks passing (go vet, golint, gocyclo, ineffassign, misspell)
- All tests passing successfully

johnlam90 merged commit ebc2c8e into main Jul 3, 2025
13 checks passed