Open
Labels: bug (Something isn't working), help wanted (Extra attention is needed)
Description
The Google engine fails to parse the results returned for the query "Tallest mountain in the world".
To Reproduce
Steps to reproduce the behavior:
from search_engine_parser.core.engines.google import Search
searcher = Search()
results = searcher.search("Tallest mountain in the world")
Expected behavior
Correctly parsed results
Screenshots
Traceback (most recent call last):
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 240, in get_results
    search_results = self.parse_result(results, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 151, in parse_result
    rdict = self.parse_single_result(each, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/engines/google.py", line 74, in parse_single_result
    title = r_elem.find('div', class_='BNeawe').text
AttributeError: 'NoneType' object has no attribute 'text'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "XXXXX/temp.py", line 4, in <module>
    results = searcher.search("Tallest mountain in the world")
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 270, in search
    return self.get_results(soup, **kwargs)
  File "XXXXX/.conda/envs/info/lib/python3.9/site-packages/search_engine_parser/core/base.py", line 243, in get_results
    raise NoResultsOrTrafficError(
search_engine_parser.core.exceptions.NoResultsOrTrafficError: The returned results could not be parsed. This might be due to site updates or server errors. Drop an issue at https://github.com/bisoncorps/search-engine-parser if this persists
Desktop:
- OS: Linux
- Python version: 3.9.5
- search-engine-parser version: 0.6.2 (latest)
Additional context
The result that cannot be parsed:
<div class="ZINbbc xpd O9g5cc uUPGi"><div><div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&sa=U&ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQCw&usg=AOvVaw1pflhmM0gRBSRK5KlKcTT6"><span></span></a></div><div class="CgE3Ac I9mEQ"><table class="LnMnt"><thead><tr><td class="IxZjcf sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Rank</div></div></td><td class="IxZjcf sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Mountain</div></div></td><td class="IxZjcf sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe uEec3 AP7Wnd">Country</div></div></td></tr></thead><tbody><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">1.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Everest</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Nepal/Tibet</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">2.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">K2 (Mount Godwin Austen)</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Pakistan/China</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">3.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Kangchenjunga</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">India/Nepal</div></div></td></tr><tr><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">4.</div></div></td><td class="sjsZvd OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd AP7Wnd">Lhotse</div></div></td><td class="sjsZvd s5aIid OE1use"><div class="hfgVwf"><div class="BNeawe s3v9rd 
AP7Wnd">Nepal/Tibet</div></div></td></tr></tbody></table></div><div class="hwc"><div class="Q0HXG"></div><div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&sa=U&ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQDA&usg=AOvVaw39wAm-G8SzoUzVMu-r2DX6"><div><span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span><span><div class="BNeawe UPmit AP7Wnd">www.infoplease.com > world > geography > top-ten-worlds-highest-mount...</div></span></div></a></div></div></div></div>
For this result, the code at https://github.com/bisoncorps/search-engine-parser/blob/0418867b3529980d5a4eb71899dec37092fe7df1/search_engine_parser/core/engines/google.py#L66 returns:
[<div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&sa=U&ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQCw&usg=AOvVaw1pflhmM0gRBSRK5KlKcTT6"><span></span></a></div>,
<div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&sa=U&ved=2ahUKEwih5sjUusjxAhWPFjQIHbqKDhEQFnoECAoQDA&usg=AOvVaw39wAm-G8SzoUzVMu-r2DX6"><div><span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span><span><div class="BNeawe UPmit AP7Wnd">www.infoplease.com > world > geography > top-ten-worlds-highest-mount...</div></span></div></a></div>]
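The first div does not contain the title, so the lookup r_elem.find('div', class_='BNeawe') on line 74 of google.py returns None, and accessing .text on it raises the AttributeError shown in the traceback.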
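One possible way to tolerate this layout is to scan every kCrYT div and take the first one that actually holds a BNeawe title, instead of assuming the first div has it. The sketch below is not the library's code: `find_title` is a hypothetical helper, the HTML is a trimmed copy of the result above (table rows and the `ved`/`usg` URL parameters omitted), and it assumes BeautifulSoup (`bs4`) is installed.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the failing result from this issue (table rows omitted).
HTML = """
<div class="ZINbbc xpd O9g5cc uUPGi"><div>
<div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U"><span></span></a></div>
<div class="kCrYT"><a href="/url?q=https://www.infoplease.com/world/geography/top-ten-worlds-highest-mountains&amp;sa=U"><div>
<span><div class="BNeawe vvjwJb AP7Wnd">The Top Ten: The World's Highest Mountains - Infoplease</div></span>
</div></a></div>
</div></div>
"""


def find_title(result):
    """Hypothetical helper: return the first title in any kCrYT div, or None."""
    for block in result.find_all("div", class_="kCrYT"):
        title_elem = block.find("div", class_="BNeawe")
        # find() returns None when the div has no title; skip it instead of
        # calling .text on None.
        if title_elem is not None:
            return title_elem.get_text(strip=True)
    return None


soup = BeautifulSoup(HTML, "html.parser")
result = soup.find("div", class_="ZINbbc")
title = find_title(result)
print(title)
```

Returning None when no div carries a title lets the caller skip that single result rather than failing the whole search; whether skipping is the right behaviour for the library is a design choice left to the maintainers.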