google cannot parse "Tallest mountain in the world" #149

@fanzhuyifan

Description
The Google engine fails to parse the results returned for the query "Tallest mountain in the world".

To Reproduce
Steps to reproduce the behavior:

from search_engine_parser.core.engines.google import Search
searcher = Search()
results = searcher.search("Tallest mountain in the world")

Expected behavior
Correctly parsed results

Screenshots

Traceback (most recent call last):
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 240, in get_results
    search_results = self.parse_result(results, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 151, in parse_result
    rdict = self.parse_single_result(each, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/engines/google.py", line 74, in parse_single_result
    title = r_elem.find('div', class_='BNeawe').text
AttributeError: 'NoneType' object has no attribute 'text'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "XXXXX/temp.py", line 4, in <module>
    results = searcher.search("Tallest mountain in the world")
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 270, in search
    return self.get_results(soup, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 243, in get_results
    raise NoResultsOrTrafficError(
search_engine_parser.core.exceptions.NoResultsOrTrafficError: The returned results could not be parsed. This might be due to site updates or server errors. Drop an issue at https://github.com/bisoncorps/search-engine-parser if this persists

Desktop (please complete the following information):

  • OS: [Linux]
  • Python Version [3.9.5]
  • Search-engine-parser version [0.6.2] (latest)

Additional context
The result that cannot be parsed:

<div class="ZINbbc xpd O9g5cc uUPGi"><div><div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQCw&amp;usg=AOvVaw1pflhmM0gRBSRK5KlKcTT6"><span></span></a></div><div class="CgE3Ac I9mEQ"><table class="LnMnt"><thead><tr><td class="IxZjcf sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Rank</div></div></td><td class="IxZjcf sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Mountain</div></div></td><td class="IxZjcf sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Country</div></div></td></tr></thead><tbody><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">1.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Everest</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Nepal/Tibet</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">2.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">K2 (Mount Godwin Austen)</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Pakistan/China</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">3.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Kangchenjunga</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">India/Nepal</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">4.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Lhotse</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd 
AP7Wnd">Nepal/Tibet</div></div></td></tr></tbody></table></div><div class="hwc"><div class="Q0HXG"></div><div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQDA&amp;usg=AOvVaw39wAm-G8SzoUzVMu-r2DX6"><div><span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span><span><div class="BNeawe UPmit AP7Wnd">www.infoplease.com &gt; world &gt; geography &gt; top-ten-worlds-highest-mount...</div></span></div></a></div></div></div></div>

For this result, the code at https://github.com/bisoncorps/search-engine-parser/blob/0418867b3529980d5a4eb71899dec37092fe7df1/search_engine_parser/core/engines/google.py#L66 produces:

[<div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQCw&amp;usg=AOvVaw1pflhmM0gRBSRK5KlKcTT6"><span></span></a></div>,
 <div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U&amp;ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQDA&amp;usg=AOvVaw39wAm-G8SzoUzVMu-r2DX6"><div><span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span><span><div class="BNeawe UPmit AP7Wnd">www.infoplease.com &gt; world &gt; geography &gt; top-ten-worlds-highest-mount...</div></span></div></a></div>]

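The first div contains only an empty <span> and no title element, so r_elem.find('div', class_='BNeawe') returns None and accessing .text raises the AttributeError shown in the traceback.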
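
A minimal sketch of one possible guard, assuming the elements come from BeautifulSoup (bs4), as the `r_elem.find('div', class_='BNeawe')` call in the traceback suggests. `first_title` is a hypothetical helper and not part of search_engine_parser, and the HTML is a trimmed copy of the result above with shortened hrefs:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the two "kCrYT" blocks from the report (hrefs shortened).
HTML = """
<div class="ZINbbc xpd O9g5cc uUPGi">
  <div class="kCrYT"><a href="/url?q=https://www.infoplease.com/"><span></span></a></div>
  <div class="kCrYT"><a href="/url?q=https://www.infoplease.com/"><div>
    <span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span>
    <span><div class="BNeawe UPmit AP7Wnd">www.infoplease.com</div></span>
  </div></a></div>
</div>
"""


def first_title(result):
    """Hypothetical helper: text of the first title div in any kCrYT block, or None."""
    for block in result.find_all("div", class_="kCrYT"):
        title_div = block.find("div", class_="BNeawe")
        if title_div is not None:  # skip blocks that hold only an empty <span>
            return title_div.get_text(strip=True)
    return None


soup = BeautifulSoup(HTML, "html.parser")
blocks = soup.find_all("div", class_="kCrYT")

# The first block has no BNeawe div, so find() returns None here;
# calling .text on it is what raised the AttributeError.
print(blocks[0].find("div", class_="BNeawe"))

# Checking for None and moving on to the next block recovers the title.
print(first_title(soup))
```

Whether the parser should skip such blocks or treat the table as a separate kind of result is a design decision for the maintainers. The sketch only shows that the title is still reachable from the second block.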

Labels: bug, help wanted
