Skip to content

Conversation

lsiepel
Copy link
Contributor

@lsiepel lsiepel commented Oct 22, 2025

As a follow-up to #19159 specific for the http binding

Signed-off-by: Leo Siepel <leosiepel@gmail.com>
@lsiepel lsiepel requested a review from J-N-K as a code owner October 22, 2025 09:07
@lsiepel lsiepel requested a review from Copilot October 22, 2025 09:07
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR replaces the deprecated java.net.URL constructor with a custom URI parsing implementation to modernize the HTTP binding's URL handling approach.

Key Changes:

  • Removed dependency on deprecated URL class constructor
  • Implemented custom regex-based URL parsing with parse() method
  • Maintained existing functionality for IDN (Internationalized Domain Names) conversion and special character handling

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

return URI.create(uri.toASCIIString().replace("+", "%2B").replace("%25%25", "%"));
}

private static final Pattern URL_PATTERN = Pattern.compile("^(?:(?<scheme>[a-zA-Z][a-zA-Z0-9+.-]*):)?//?"
Copy link

Copilot AI Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern makes the '//' after the scheme optional with '//?', which incorrectly allows malformed URLs like 'http:example.com'. RFC 3986 requires '//' for authority-based URIs. The pattern should use '//(' to make the entire authority section optional, or '//' if authority is required.

Suggested change
private static final Pattern URL_PATTERN = Pattern.compile("^(?:(?<scheme>[a-zA-Z][a-zA-Z0-9+.-]*):)?//?"
private static final Pattern URL_PATTERN = Pattern.compile("^(?:(?<scheme>[a-zA-Z][a-zA-Z0-9+.-]*):)?//"

Copilot uses AI. Check for mistakes.

}

private static final Pattern URL_PATTERN = Pattern.compile("^(?:(?<scheme>[a-zA-Z][a-zA-Z0-9+.-]*):)?//?"
+ "(?:(?<userinfo>[^@/?#]*)@)?" + "(?<host>[^:/?#]*)(?::(?<port>\\d+))?" + "(?<path>/[^?#]*)?"
Copy link

Copilot AI Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path pattern requires a leading slash with '/[^?#]', but paths can be relative or empty. This will fail to parse valid URIs like 'http://example.com' (empty path) or 'http://example.com?query' (no path). Change to '(?[^?#])?' to allow any path including empty.

Suggested change
+ "(?:(?<userinfo>[^@/?#]*)@)?" + "(?<host>[^:/?#]*)(?::(?<port>\\d+))?" + "(?<path>/[^?#]*)?"
+ "(?:(?<userinfo>[^@/?#]*)@)?" + "(?<host>[^:/?#]*)(?::(?<port>\\d+))?" + "(?<path>[^?#]*)?"

Copilot uses AI. Check for mistakes.

lsiepel and others added 2 commits October 22, 2025 11:23
…ding/http/internal/Util.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: lsiepel <leosiepel@gmail.com>
…ding/http/internal/Util.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: lsiepel <leosiepel@gmail.com>
@lsiepel
Copy link
Contributor Author

lsiepel commented Oct 22, 2025

Regarding the co-pilot suggestions about the regex. Unless some one is absolutely sure, i like to stick to the current implementation to prevent regression.

Signed-off-by: Leo Siepel <leosiepel@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant