Wrong encoding in url_parse #102

@strohne

Everybody loves encoding issues ;) When parsing URLs that contain non-ASCII characters, the encoding of the domain gets messed up, and I have not found a way to fix it yet. Here is how to reproduce it:

# Create UTF-8 string
url <- "https://exämple.org"

# Conversion is necessary in my RStudio environment
url <- iconv(url, "latin1", "UTF-8")
Encoding(url)  # UTF-8
print(url)     # https://exämple.org

# Parse
url_parse(url)

Output for the domain part is ex<e3><U+00A4>mple.org. Expected: exämple.org.
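
To rule out the input string as the source of the problem, the bytes can be inspected before parsing. This is a minimal sketch using base R only. Writing the literal with a `\u` escape is an assumption made here to avoid depending on the editor's encoding; it is not part of the original reproduction.

```r
# Build the URL with a \u escape so the string is UTF-8 regardless of
# how the source file or console is encoded (assumption: this is
# equivalent to the iconv() step above).
url <- "https://ex\u00e4mple.org"

Encoding(url)     # declared encoding of the string
validUTF8(url)    # TRUE if the bytes form valid UTF-8

# In valid UTF-8, "ä" is stored as the two bytes c3 a4.
charToRaw("\u00e4")
```

If these checks pass, the string is correct on input and the garbling happens inside the parsing step.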
