Reparsing problem with non-special URL and double-dot path component #415

@zealousidealroll

When the URL a:/a/..//a is parsed, the resulting URL object has no hostname and has the path //a:

c = 'a'
state = scheme start state
buffer = ""
pointer = 0


c = ':'
state = scheme state
buffer = "a"
pointer = 1


scheme = "a"
buffer = ""
state = path or authority state
pointer = 2


pointer = 3
c = 'a'
state = path state
pointer = 2


state = path state
pointer = 3
c = 'a'
buffer = "a"


state = path state
pointer = 4
c = '/'
path = [ "a" ]
buffer = ""


state = path state
pointer = 5
c = '.'
path = [ "a" ]
buffer = "."


state = path state
pointer = 6
c = '.'
path = [ "a" ]
buffer = ".."


state = path state
pointer = 7
c = '/'
path = []
buffer = ""


state = path state
pointer = 8
c = '/'
path = [""]
buffer = ""


state = path state
pointer = 9
c = 'a'
path = [""]
buffer = "a"


state = path state
pointer = 10
c = EOF
path = ["", "a"]
buffer = ""

The resulting URL serializes as a://a. If that string is reparsed, the result has the hostname a and an empty path, so parsing followed by serializing does not round-trip.
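The round trip can be reproduced with a small model. The following is a minimal sketch, not the spec algorithm: `parse` and `serialize` are hypothetical helpers that cover only a scheme, an optional `//host`, and a slash-separated path for non-special URLs, and they assume a serializer that writes each path segment prefixed with `/` with no special handling for a missing host.

```python
from typing import List, Optional, Tuple

# (scheme, host, path segments); host is None when there is no authority.
Url = Tuple[str, Optional[str], List[str]]


def parse(url: str) -> Url:
    """Toy parser for non-special URLs (hypothetical helper).

    Ignores queries, fragments, ports, credentials, percent-encoding,
    and opaque paths.
    """
    scheme, rest = url.split(":", 1)
    host: Optional[str] = None
    if rest.startswith("//"):
        # Authority present: the host runs up to the next '/' or the end.
        host, slash, tail = rest[2:].partition("/")
        rest = slash + tail
    path: List[str] = []
    if rest.startswith("/"):
        segments = rest[1:].split("/")
        for i, segment in enumerate(segments):
            is_last = i == len(segments) - 1
            if segment == "..":
                # Double-dot segment: drop the last path segment, if any.
                if path:
                    path.pop()
                if is_last:
                    path.append("")
            elif segment == ".":
                if is_last:
                    path.append("")
            else:
                path.append(segment)
    return scheme, host, path


def serialize(url: Url) -> str:
    """Toy serializer: '//' + host only when a host exists, then '/' + segment."""
    scheme, host, path = url
    out = scheme + ":"
    if host is not None:
        out += "//" + host
    return out + "".join("/" + segment for segment in path)


first = parse("a:/a/..//a")
second = parse(serialize(first))
print(first)             # ('a', None, ['', 'a'])
print(serialize(first))  # a://a
print(second)            # ('a', 'a', [])
```

In this model the first parse has no host and two path segments, while the reparse of the serialized form has a host and no path segments, which is the mismatch reported here.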

Found while looking for a spec-compliant resolution for servo/rust-url#459.

This isn't a problem for special URLs, which always have a host.

Labels: interop (implementations are not interoperable with each other), needs tests (moving the issue forward requires someone to write tests), topic: parser