chapter not parsing  #1

@MaqAnquor

For the URL http://www.wuxiaworld.com/st-index/book-11-chapter-33/ (Book 11, Chapter 33), the page contains a total of 3 "hr" tags, so it is not parsed properly. The same happens on other pages with a similar HTML structure. The parsing loop breaks at the first "hr" it meets after the starting one:

    start_tag = ch_soup.find("hr")
    start_tag = start_tag.find_next(True)
    for p in start_tag.find_all_next(True):
        if p.name == "hr":
            break
        elif "Previous Chapter" in p.text and "Next Chapter" in p.text:
            break
        elif p.name == "p":
            out.write(unicode(p))
            out.write("\n")
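One possible way around this, offered only as a rough sketch: treat the first and the last "hr" as the chapter boundaries and skip over any "hr" in between. This assumes the chapter body sits between the first and last "hr" on the page, which I have not verified against the site. The function name `extract_paragraphs` is made up for illustration, and the sketch uses Python 3 (`str` instead of `unicode`) with BeautifulSoup's built-in "html.parser" backend.

```python
from bs4 import BeautifulSoup


def extract_paragraphs(html):
    """Return the <p> elements after the first <hr> and before the last one.

    Hypothetical helper: assumes the chapter text lies between the first
    and last <hr>, with any <hr> in between being part of the chapter.
    """
    soup = BeautifulSoup(html, "html.parser")
    rules = soup.find_all("hr")
    if not rules:
        return []
    first, last = rules[0], rules[-1]

    paragraphs = []
    for tag in first.find_all_next(True):
        # Compare by identity: two empty <hr> tags compare equal with ==.
        if tag is last:
            break
        if tag.name == "p":
            text = tag.get_text()
            # Same navigation check as the original loop, applied to <p> only.
            if "Previous Chapter" in text and "Next Chapter" in text:
                break
            paragraphs.append(str(tag))
    return paragraphs
```

With a page that has three "hr" tags, the middle one is ignored and only the last one ends the chapter. If a page has a single "hr", the loop simply runs until the navigation links or the end of the document.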
