-
I was going to open an issue about this, but I would like to discuss it here first to make sure I am not making a mistake. Is the output JSON supposed to use single quotes? I am trying to crawl this webpage using the following code:
from haystack.connector import Crawler
crawler = Crawler(output_dir="crawled_files")
# crawl a single Business Wire page; filter_urls keeps only URLs that match "Baobab"
docs = crawler.crawl(urls=["https://www.businesswire.com/news/home/20200717005310/en/Global-8.5-Bn-Baobab-Powder-Market-Outlook-2020-2027---ResearchAndMarkets.com"],
filter_urls= ["Baobab"]) Output: {'meta': {'url': 'https://www.businesswire.com/news/home/20200717005310/en/Global-8.5-Bn-Baobab-Powder-Market-Outlook-2020-2027---ResearchAndMarkets.com', 'base_url': 'https://www.businesswire.com/news/home/20200717005310/en/Global-8.5-Bn-Baobab-Powder-Market-Outlook-2020-2027---ResearchAndMarkets.com'}, 'text': 'Global $8.5 Bn Baobab Powder Market Outlook 2020-2027 - ResearchAndMarkets.com\nJuly 17, 2020 09:20 AM Eastern Daylight Time\nDUBLIN--(BUSINESS WIRE)--The "Baobab Powder - Global Market Trajectory & Analytics" report has been added to ResearchAndMarkets.com\'s offering.\n“Baobab Powder - Global Market Trajectory & Analytics”\nTweet this\nThe publisher brings years of research experience to this 7th edition of this report. The 276-page report presents concise insights into how the pandemic has impacted production and the buy side for 2020 and 2021. A short-term phased recovery by key geography is also addressed.\nGlobal Baobab Powder Market to Reach US$8.5 Billion by the Year 2027\nAmid the COVID-19 crisis, the global market for Baobab Powder, estimated at US$6 Billion in the year 2020, is projected to reach a revised size of US$8.5 Billion by 2027, growing at a CAGR of 5.1% over the period 2020-2027.\nOrganic Baobab Powder, one of the segments analyzed in the report, is projected to grow at a 5.4% CAGR to reach US$7.2 Billion by the end of the analysis period. After an early analysis of the business implications of the pandemic and its induced economic crisis, growth in the Conventional Baobab Powder segment is readjusted to a revised 3.6% CAGR for the next 7-year period. This segment currently accounts for a 16.7% share of the global Baobab Powder market.\nThe U.S. Accounts for Over 27.1% of Global Market Size in 2020, While China is Forecast to Grow at a 7.8% CAGR for the Period of 2020-2027\nThe Baobab Powder market in the U.S. is estimated at US$1.6 Billion in the year 2020. 
The country currently accounts for a 27.09% share in the global market. China, the world second largest economy, is forecast to reach an estimated market size of US$1.8 Billion in the year 2027 trailing a CAGR of 7.8% through 2027.\nAmong the other noteworthy geographic markets are Japan and Canada, each forecast to grow at 2.8% and 4.5% respectively over the 2020-2027 period. Within Europe, Germany is forecast to grow at approximately 3.1% CAGR while Rest of European market (as defined in the study) will reach US$1.8 Billion by the year 2027.\nCompetitors identified in this market include, among others:\nADUNA Ltd.\nALAFFIA\nAtacora Essential, Inc.\nBaobab Foods, Inc.\nBaobab Fruit Company Senegal\nB\'Ayoba (Pvt) Ltd.\nEcoProducts\nEcuadorian Rainforest LLC\nFarafena\nHolland & Barrett Retail Ltd.\nIndigo Herbs Ltd.\nKiki Ltd.\nOrganic Africa\nOrganic Burst UK Ltd.\nOrganic Herb Trading Company\nPowbab, Inc.\nStern Ingredients, Inc.\nSuperfruit Scandinavia AB\nZ Natural Foods, LLC\nTotal Companies Profiled: 42\nFor more information about this report visit https://www.researchandmarkets.com/r/2hnl1y\nContacts\nResearchAndMarkets.com\nLaura Wood, Senior Press Manager\npress@researchandmarkets.com\n\nFor E.S.T Office Hours Call 1-917-300-0470\nFor U.S./CAN Toll Free Call 1-800-526-8630\nFor GMT Office Hours Call +353-1-416-8900\n\n\n\n\n\n\nLog In\nSign Up\nMore from Business Wire\nBlog\nUK/Ireland\nDeutschland\nFrance\nHong Kong\nItaly\nJapan\nTradeshownews.com\nContact Us\nUK Tax Strategy\nPrivacy\nManage Cookies\nTerms of Use\n© 2021 Business Wire, Inc.\nBy clicking “Accept All Cookies”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage, and assist in our marketing efforts. Cookie Policy\nCookies Settings Accept All Cookies'} Then, I used the JsonFormatter Website, and the correct output json is:
When I try to pass the JSON output from the crawler module (the corrupt one) into the PreProcessor:
from haystack.preprocessor.preprocessor import PreProcessor
preprocessor = PreProcessor(
clean_empty_lines=True,
clean_whitespace=True,
clean_header_footer=False,
split_by="word",
split_length=100,
split_respect_sentence_boundary=True
)
docs_default = preprocessor.process(docs)
print(f"n_docs_input: 1\nn_docs_output: {len(docs_default)}")
It gives an error:
So, does the |
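On the single-quote question: the output shown above looks like Python's own text representation of a dictionary (single quotes, `\'` escapes), which is not the same thing as serialized JSON. The following is a minimal sketch of the difference using only the standard library; the `doc` dictionary and its URL are made-up stand-ins, not real crawler output.

```python
import json

# Hypothetical stand-in for one crawled document (not real crawler output).
doc = {
    "meta": {"url": "https://example.com/page"},
    "text": 'The "Baobab Powder" report',
}

# str() / print() give Python's repr of the dict: single-quoted, not valid JSON.
python_repr = str(doc)

# json.dumps() serializes the same dict as valid, double-quoted JSON.
json_text = json.dumps(doc)

print(python_repr)
print(json_text)

# The JSON text parses back to the original dict; the repr does not parse as JSON.
assert json.loads(json_text) == doc
try:
    json.loads(python_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False
```

If this is what is happening, the single quotes only reflect how the dictionary was printed; the dictionary itself is an ordinary Python object.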
Replies: 4 comments 6 replies
-
Hi @prikmm, thank you for your question. Maybe @DIVYA-19 can shed some light on this problem based on the contribution made in #775?
-
Hi @prikmm, Crawler.crawl() returns a list of file paths. As @julian-risch said, PreProcessor.process() takes a document as a dictionary, or a list of dictionaries, as input.
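A minimal sketch of bridging the two, assuming each path returned by the crawler points to a JSON file holding one document (with keys such as `text` and `meta`). The helper name `load_crawled_docs` is hypothetical and not part of Haystack.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List, Union


def load_crawled_docs(paths: Iterable[Union[str, Path]]) -> List[Dict]:
    """Read each crawled JSON file and return its content as a dictionary.

    Hypothetical helper: assumes every path points to a UTF-8 JSON file
    that contains a single document.
    """
    docs = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            docs.append(json.load(f))
    return docs


# Intended usage with the objects from the question (not executed here):
# paths = crawler.crawl(urls=[...], filter_urls=["Baobab"])
# docs = load_crawled_docs(paths)
# docs_default = preprocessor.process(docs)
```

Loading the files first means the preprocessor receives dictionaries rather than file paths.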
-
@julian-risch I have created a PR to add a Colab use case for the crawler, as per the discussion here.
-
I am new to Haystack. I am trying the same example; see the code below:
print(f"n_docs_input: 1\nn_docs_output: {(docs)}")
But my response contains HTML tags. I tried using the preprocessor above, but it did not help. How can I get the clean text from the webpage?
').html(rawHeadline).text();\n var len = twttr.txt.getTweetLength(rawHeadline);\n var maxLength = Number('24') - 1;\n while (len > maxLength) {\n rawHeadline = rawHeadline.substring(0,rawHeadline.length-1);\n len = twttr.txt.getTweetLength(rawHeadline);\n }\n return rawHeadline;\n }\n\n var maxTotal = 280 - Number('24');\n function twitterTruncatePullQuote(text) {\n text = $('
').html(text).text();\n var len = twttr.txt.getTweetLength(text);\n if (len<=maxTotal) {\n return text;\n }\n var textLength = twttr.txt.getUnicodeTextLength(text);\n var urlsWithIndices = twttr.txt.extractUrlsWithIndices(text);\n twttr.txt.modifyIndicesFromUTF16ToUnicode(text, urlsWithIndices);\n var i = urlsWithIndices.length-1;\n\n if (i<0) { //if pullquote doesn't contain any urls, which is almost all the case.\n var shorten = 'Baobab Powder - Global Market Trajectory & Analytics';\n return $(' ').html(shorten).text();\n }\n var maxLen = maxTotal-3; //3 is for trailing ...\n var newStr = text;\n var newLen = twttr.txt.getTweetLength(newStr);\n\n while (newLen>maxLen && i>=0) {\n var buildStr = newStr.substring(0,urlsWithIndices[i].indices[1]);\n var buildLen = twttr.txt.getTweetLength(buildStr);\n\n if (buildLen>maxLen ) {\n newStr = newStr.substring(0,urlsWithIndices[i].indices[0]);\n newLen = twttr.txt.getTweetLength(newStr);\n } else {\n var lack = maxLen - buildLen;\n if (twttr.txt.getUnicodeTextLength(newStr)<=urlsWithIndices[i].indices[1]+lack) {\n newStr = newStr;\n } else {\n newStr = newStr.substring(0,urlsWithIndices[i].indices[1]+lack);\n }\n }\n\n newLen = twttr.txt.getTweetLength(newStr);\n i=i-1;\n }\n\n if (newLen>maxLen) {\n newStr = newStr.substring(0,maxLen);\n }\n\n return newStr+'...';\n }\n\n function addPullQuote() {\n jQuery('#pull-quote').append('“Baobab Powder - Global Market Trajectory & Analytics”');\n\n if (1==0 || true) {\n var pullQuoteEle = document.getElementById('pull-quote');\n var tweettthis = 'Tweet this';\n var tweetthisUrl = ''+tweettthis+''\n\n pullQuoteEle.insertAdjacentHTML('afterend', tweetthisUrl);\n var pullQuoter = document.getElementById('tweet-pull-quote');\n pullQuoter.addEventListener('click', function(e) {\n e.preventDefault();\n var pullQuote = 'Baobab Powder - Global Market Trajectory & Analytics';\n\n if (0==0 || twttr.txt.getTweetLength(pullQuote)>maxTotal) {\n pullQuote = 
twitterTruncatePullQuote(pullQuote);\n } else {\n pullQuote = $(' ').html(pullQuote).text();\n }\n pullQuote = encodeURIComponent(pullQuote);\n\n var rlPermalink = twitterTruncateHeadline('https://www.businesswire.com/news/home/20200717005310/en/Global-8.5-Bn-Baobab-Powder-Market-Outlook-2020-2027---ResearchAndMarkets.com');\n\n twitterAPIUrl = "https://twitter.com/intent/tweet?text=" + pullQuote;\n twitterAPIUrl += '&url='+encodeURIComponent(rlPermalink);\n\n window.open(twitterAPIUrl, '_blank', "height=420,width=520");\n })\n }\n };\n\n // Add popup links\n var ndmPopups = new BWLinkEnhancement();\n ndmPopups.addPopupOnclicks();\n\n jQuery('#ajaxRecentStories').load('/portal/site/home/template.BINARYPORTLET/permalink/resource.process/;portal.JSESSIONID=a0OvkLTiPet-AhZZniluztVCW_5xMKgEbUzx0KP5kWNHlkCZTDyp!705051433!-967996342?javax.portlet.tpst=e8d55157ef2522ec12306b100d908a0c&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_releaseid=20200717005310&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_mmgroupid=&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_language=en&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_displayLanguage=en&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_displayReleaseId=20200717005310&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_language=en&javax.portlet.rid_e8d55157ef2522ec12306b100d908a0c=recentStories&javax.portlet.rcl_e8d55157ef2522ec12306b100d908a0c=cacheLevelPage&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken');\n 
jQuery('#ajaxReleaseVersions').load('/portal/site/home/template.BINARYPORTLET/permalink/resource.process/;portal.JSESSIONID=a0OvkLTiPet-AhZZniluztVCW_5xMKgEbUzx0KP5kWNHlkCZTDyp!705051433!-967996342?javax.portlet.tpst=e8d55157ef2522ec12306b100d908a0c&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_releaseid=20200717005310&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_mmgroupid=&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_language=en&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_displayLanguage=en&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_displayReleaseId=20200717005310&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_language=en&javax.portlet.rid_e8d55157ef2522ec12306b100d908a0c=releaseVersions&javax.portlet.rcl_e8d55157ef2522ec12306b100d908a0c=cacheLevelPage&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken');\n jQuery('#companyInformation').load('/portal/site/home/template.BINARYPORTLET/permalink/resource.process/;portal.JSESSIONID=a0OvkLTiPet-AhZZniluztVCW_5xMKgEbUzx0KP5kWNHlkCZTDyp!705051433!-967996342?javax.portlet.tpst=e8d55157ef2522ec12306b100d908a0c&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_releaseid=20200717005310&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_mmgroupid=&javax.portlet.prp_e8d55157ef2522ec12306b100d908a0c_language=en&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_displayLanguage=en&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_displayReleaseId=20200717005310&javax.portlet.rst_e8d55157ef2522ec12306b100d908a0c_language=en&javax.portlet.rid_e8d55157ef2522ec12306b100d908a0c=companyInformation&javax.portlet.rcl_e8d55157ef2522ec12306b100d908a0c=cacheLevelPage&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken');\n\n \n\n \n addPullQuote();\n \n\n\n\n\n\n\n\n\t \n\t \n\t\n\n\n\t\t\t \n\t\n\n\n \n\n\n\n\n\n\t\t\n\t \n\t\n\t\t\n\n\n\n\n\n\n\n\n\n\n\n\n\t\t\t\n\t\t\t\tSite 
Navigation\n\t\t\t\t\n\n\n\t\t\t\t\t\n\t\t\t\t\t\tHome\n\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tHome\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tSubmit a Press Release\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\n\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\tServices\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\tNews\n\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tAll News\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tNews with Multimedia\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tNews by Industry\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tNews by Subject\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tNews by Language\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tTradeshows & Events\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\n\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\tEducation\n\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tOverview\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBlog\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tDistribution & Media\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tMedia & Journalist Tools\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tSample Press Release\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tFind Your News Online\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tDisclosure Resources\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\n\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\tAbout Us\n\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tOverview\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBecome a Member\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tContact Us\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tCareers\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBusiness Wire Newsroom\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBusiness Wire Events\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\n\n\t\t\t\t\t\n\n\t\t\t\t\n\t\t\t\n\t\t\t\n\n\n\n\n\n\n\t\t\t\n\t\t\t\t\n\t\t\t\t\t\n\n\n\n\n\n\n\n\n\n\n\n\n\n\t\n\t\tSearch\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\tSearch Options\n\t\t\t\t\tSearch AllSearch NewsSearch SiteAdvanced News Search\n\t\t\t\t\tAdvanced 
News Search\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t \t\n\n\t\t\t\t\n\t\t\t\n\t\t\n\t\n\n\n\n\n //set ie expire flag here\n var isIEExpired = true;\n\t// Add search options toggle\n\t\n\t// **I18N STRINGS\n\tvar strAll = 'Search All';\n\tvar strNews = 'Search News';\n\tvar strSite = 'Search Site';\n\tvar strAdv = 'Advanced News Search';\n\tvar strOptions = 'Search Options';\n\n\t// **define search action and advanced-search target\n\tvar strAdvLocation = '/portal/site/home/search?javax.portlet.tpst=503a8767054f0df6a9e77a100d908a0c&javax.portlet.pbp_503a8767054f0df6a9e77a100d908a0c_view=advancedSearch&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken';\n\n\tfunction searchByUrl() {\n\t\twindow.location.href = '/portal/site/home/search/?searchType=' + $('#bw-search-type').val() + '&searchTerm=' + encodeURIComponent($('#bw-search-input').val()) + '&searchPage=1';\n\t\treturn false;\n\t}\n\t\n\n\n\t\t\t\t\n\t\t\t\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\t\t\t\n\t\t\t\t\n\n\t\t\t\t\t\n\t\t\t\t\t\tLog In\n\t\t\t\t\t\tSign Up\n\n\t\t\t\t\t\n\n\t\t\t\t\n\t\t\t\n\t\t\t\n\n\t\t\n\t\n\n\n\n\n\n\n\n\n\n\n\n\n\n\t\n\t\t\n\t\t\t\n\t\t\t\tFollow Us\n\t\t\t\t\n\t\t\t\t\tTwitter\n\t\t\t\t\tLinkedIn\t\t\t\t\t\n\t\t\t\t\n\t\t\t\n\t\t\t\n\t\t\t\tMore from Business Wire\n\t\t\t\t\n\t\t\t\t\tBlog\n\t\t\t\t\tUK/Ireland\n\t\t\t\t\tDeutschland\n\t\t\t\t\tFrance\n\t\t\t\t\tHong Kong\n\t\t\t\t\tItaly\n\t\t\t\t\tJapan\n\t\t\t\t\tTradeshownews.com\n\t\t\t\t\n\t\t\t\n\t\t\t\n\t\t\t\tBusiness Wire Information\n\t\t\t\t\n\t\t\t\t\tContact Us\n\t\t\t\t\tUK Tax Strategy\n\t\t\t\t\tPrivacy Statement\n\t\t\t\t\tManage Cookies\n\t\t\t\t\tTerms of Use\n\t\t\t\t\t© 2023 Business Wire\n\t\t\t\t\n\t\t\t\n\t\t\n\t \n\t \n\t \n\t \n\t Internet Explorer presents a security risk. To ensure the most secure and best overall experience on our website we recommend the latest versions of \n\t Chrome, \n Edge, \n Firefox, or \n Safari. 
Internet Explorer will not be supported as of August 17, 2021.\t \n\t \n\t \n Internet Explorer is no longer supported. To ensure the most secure and best overall experience on our website, we recommend the latest versions of \n\t Chrome, \n\t Edge, \n\t Firefox, or \n\t Safari.\n\t \n\t \n\t \n \n \n \n\t$(function () {\n\t\tvar expiredate = new Date('2021-08-17T00:00:00');\n\t\tvar now = new Date();\n\t\tvar ieExpired = now.getTime() > expiredate.getTime();\n\t\t//if we find var from config, use it\n\t\tif (typeof isIEExpired !== 'undefined') {\n\t\t\tieExpired = isIEExpired;\n\t\t}\n\t\t//Name cookie\n\t\tvar cookieName = 'IEWarningCookie';\n\t\t//Read cookie\n\t\tvar cookieFound = getCookie(cookieName);\n\t\t// Microsoft Internet Explorer detected in. \n\t\tif (!cookieFound\n\t\t\t&& isIE()) {\n\t\t\tshowIEWarning(true);\n\t\t} else {\n\t\t\tshowIEWarning(false);\n\t\t}\n\n\t\t$('#closeButton').on('click', function () {\n\t\t\tsetCookie(cookieName, 'true');\n\t\t\tshowIEWarning(false);\n\t\t});\n\n\t\tfunction showIEWarning(show) {\n\t\t\tif (show) {\n\t\t\t\tif ($('#bw-home').length) {\n\t\t\t\t\t$('#bw-home').addClass('ie-warn');\n\t\t\t\t}\n\t\t\t\tif ($('#bw-main').length) {\n\t\t\t\t\t$('#bw-main').addClass('ie-warn');\n\t\t\t\t}\n\t\t\t\tif($('#ie-warn').length) {\n\t\t\t\t\t$('#ie-warn').css('display', 'block');\n\t\t\t\t\t$('.ieWarningSpan').css('display', 'block');\n\t\t\t\t\t$('#closeButton').css('display', 'block');\t\n\t\t\t\t\tif (ieExpired) {\t\t\t\t\t\t\n\t\t\t\t\t\t$('#expireContent').css('display', 'block');\t\t\t\t\t\t\n\t\t\t\t\t} else {\n\t\t\t\t\t\t$('#warningContent').css('display', 'block');\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t} else {\n\t\t\t\tif ($('#bw-home').length) {\n\t\t\t\t\t$('#bw-home').removeClass('ie-warn');\n\t\t\t\t}\n\t\t\t\tif ($('#bw-main').length) {\n\t\t\t\t\t$('#bw-main').removeClass('ie-warn');\n\t\t\t\t}\n\t\t\t\tif($('#ie-warn').length) {\n\t\t\t\t\t$('#ie-warn').css('display', 
'none');\n\t\t\t\t\t$('.ieWarningSpan').css('display', 'none');\n\t\t\t\t\t$('#closeButton').css('display', 'none');\n\t\t\t\t\tif (ieExpired) {\t\t\t\t\t\t\n\t\t\t\t\t\t$('#warningContent').css('display', 'none');\t\t\t\t\t\t\n\t\t\t\t\t} else {\n\t\t\t\t\t\t$('#warningContent').css('display', 'none');\n\t\t\t\t\t}\t\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\t\t// Create cookie\n\t\tfunction setCookie(key, value) {\n\t\t\tdocument.cookie = key + "=" + value + "; expires=30; path=/;secure=true;domain=" + '.businesswire.com';\n\t\t}\n\n\t\tfunction getCookie(key) {\n\t\t\tvar allCookies = document.cookie;\n\t\t\tvar thisCookiePos = allCookies.indexOf(key);\n\t\t\tif (thisCookiePos != -1) {\n\t\t\t\treturn true;\n\t\t\t}\n\t\t\treturn false;\n\t\t}\n\n\t\tfunction isIE() {\n\t\t\tif (navigator && navigator.userAgent) {\n\t\t\t\tvar ua = navigator.userAgent.toUpperCase();\n\t\t\t\tif (ua.indexOf("MSIE ") > -1\n\t\t\t\t\t|| ua.indexOf("TRIDENT/") > -1) {\n\t\t\t\t\treturn true;\n\t\t\t\t}\n\t\t\t}\n\t\t\treturn false;\n\t\t}\n\t\t\n\t\t$('#login_form').submit(function() {\n\t\t\tif(isIE() && ieExpired ) {\n\t\t\t\tif ($('#ie-expire').length) {\n\t\t\t\t\t$('#ie-expire').css("display","block");\n\t\t\t\t\treturn false;\n\t\t\t\t}\n\t\t\t} else {\n\t\t\t\treturn true;\n\t\t\t}\n\t\t});\n\n\t\t$('#closeIEExpireIcon').on('click', function() {\n\t\t\t$('#ie-expire').css('display','none');\n\t\t});\n\t});\n \n\n\t\n\t\t// Cancel hover menus for touch devices, then change menu link to toggle the hover (but not if whole nav is toggled)\n\t\tvar hasTouch = ("ontouchstart" in window);\n\t\tif ( hasTouch && jQuery('#bw-nav h2').css('display')=='none' ) {\n\t\t\tnew BWNav("#bw-nav > ul", "#bw-nav > ul > li > a").toggleMenus();\n\t\t};\n\t\t\n\t\t// Track outbound links\n\t\tvar followLinks = jQuery('a.bw-outbound');\n\t\tfollowLinks.each(function() {\n\t\t\tjQuery(this).click(function() {\n\t\t\t\tvar thisTrack = jQuery(this).attr('href');\n\t\t\t\t_gaq.push(['_trackEvent', 'Ads', 'Click', 
thisTrack]);\n\t\t\t});\n\t\t});\n\t\n\t\n\n\n', 'content_type': 'text', 'score': None, 'meta': {'url': 'https://www.businesswire.com/news/home/20200717005310/en/Global-8.5-Bn-Baobab-Powder-Market-Outlook-2020-2027---ResearchAndMarkets.com'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '841a230c9040a5f1af80647795bf425d'}>] |
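One possible way to get cleaner text, sketched here under the assumption that the raw HTML of the page is available, is to extract only the visible text and skip the contents of `<script>` and `<style>` elements. This uses Python's standard `html.parser` module and is not part of Haystack; the names `VisibleTextExtractor` and `html_to_text` are hypothetical.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside script/style/noscript."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one text node per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style>"
    "<script>var x = '<p>no</p>';</script></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
)
print(html_to_text(sample))
```

Navigation menus and cookie banners are ordinary visible text, so they would still come through and need separate filtering.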