This repository was archived by the owner on Dec 17, 2018. It is now read-only.

Problem when running #11

@my-dady

2018-05-30 15:33:15 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: crawler)
2018-05-30 15:33:15 [scrapy.utils.log] INFO: Versions: lxml 4.1.1.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.5.0, Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 17.5.0 (OpenSSL 1.0.2n 7 Dec 2017), cryptography 2.1.4, Platform Windows-10-10.0.16299-SP0
2018-05-30 15:33:15 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'crawler', 'COOKIES_DEBUG': True, 'DOWNLOAD_DELAY': 1.0, 'DOWNLOAD_TIMEOUT': 10, 'LOG_FILE': 'C:\Users\myh\Desktop\PatentCrawler-master\output\20180530_153315\PatentCrawler.log', 'NEWSPIDER_MODULE': 'crawler.spiders', 'RETRY_TIMES': 3, 'SPIDER_MODULES': ['crawler.spiders']}
2018-05-30 15:33:15 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2018-05-30 15:33:16 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'crawler.middlewares.PatentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-05-30 15:33:16 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-05-30 15:33:16 [scrapy.middleware] INFO: Enabled item pipelines:
['crawler.pipelines.CrawlerPipeline']
2018-05-30 15:33:16 [scrapy.core.engine] INFO: Spider opened
2018-05-30 15:33:16 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-05-30 15:33:16 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-05-30 15:33:17 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:17 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/pageIsUesd-pageUsed.shtml HTTP/1.1" 200 None
2018-05-30 15:33:17 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:17 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/patentsearch/tableSearch-showTableSearchIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:18 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:18 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/login-showPic.shtml HTTP/1.1" 200 None
2018-05-30 15:33:18 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:18 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/wee/platform/wee_security_check HTTP/1.1" 302 None
2018-05-30 15:33:18 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uilogin-loginSuccess.shtml?params=991CFE73D4DF553253D44E119219BF31366856FF4B15222669397E093A956A2C&j_loginsuccess_url= HTTP/1.1" 302 None
2018-05-30 15:33:18 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uiIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:19 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:19 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/showViewList-jumpToView.shtml HTTP/1.1" 200 None
2018-05-30 15:33:19 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml> (failed 1 times): unlogin
2018-05-30 15:33:19 [scrapy.downloadermiddlewares.cookies] DEBUG: Sending cookies to: <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml>
Cookie: JSESSIONID=x1Sv9YxmnHdXesCJk04Y3SMqTX3yBIpnhcwf0uKlEOg9TlE-gYYY!309799008!187544033; IS_LOGIN=true; WEE_SID=x1Sv9YxmnHdXesCJk04Y3SMqTX3yBIpnhcwf0uKlEOg9TlE-gYYY!309799008!187544033!1527665495142

2018-05-30 15:33:19 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:19 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/pageIsUesd-pageUsed.shtml HTTP/1.1" 200 None
2018-05-30 15:33:19 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:19 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/patentsearch/tableSearch-showTableSearchIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:19 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/login-showPic.shtml HTTP/1.1" 200 None
2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/wee/platform/wee_security_check HTTP/1.1" 302 None
2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uilogin-loginSuccess.shtml?params=991CFE73D4DF553253D44E119219BF31366856FF4B15222669397E093A956A2C&j_loginsuccess_url= HTTP/1.1" 302 None
2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uiIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/showViewList-jumpToView.shtml HTTP/1.1" 200 None
2018-05-30 15:33:20 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml> (failed 2 times): unlogin
2018-05-30 15:33:20 [scrapy.downloadermiddlewares.cookies] DEBUG: Sending cookies to: <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml>
Cookie: JSESSIONID=enOv9ZPDdp7oeLhqlYjU_gHhiJA63dF52InwKDPUfwSJwT4OC0x4!309799008!187544033; IS_LOGIN=true; WEE_SID=enOv9ZPDdp7oeLhqlYjU_gHhiJA63dF52InwKDPUfwSJwT4OC0x4!309799008!187544033!1527665497027

2018-05-30 15:33:20 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:21 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/pageIsUesd-pageUsed.shtml HTTP/1.1" 200 None
2018-05-30 15:33:21 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:21 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/patentsearch/tableSearch-showTableSearchIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:21 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:21 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/login-showPic.shtml HTTP/1.1" 200 None
2018-05-30 15:33:21 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/wee/platform/wee_security_check HTTP/1.1" 302 None
2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uilogin-loginSuccess.shtml?params=991CFE73D4DF553253D44E119219BF31366856FF4B15222669397E093A956A2C&j_loginsuccess_url= HTTP/1.1" 302 None
2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uiIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/showViewList-jumpToView.shtml HTTP/1.1" 200 None
2018-05-30 15:33:22 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml> (failed 3 times): unlogin
2018-05-30 15:33:22 [scrapy.downloadermiddlewares.cookies] DEBUG: Sending cookies to: <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml>
Cookie: JSESSIONID=fdyv9Zmxa7oMcWvdvBHwiuh8nvKhmeaYnZ03iat0rUfX2SfDs-5E!309799008!187544033; IS_LOGIN=true; WEE_SID=fdyv9Zmxa7oMcWvdvBHwiuh8nvKhmeaYnZ03iat0rUfX2SfDs-5E!309799008!187544033!1527665498545

2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/pageIsUesd-pageUsed.shtml HTTP/1.1" 200 None
2018-05-30 15:33:22 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:23 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/patentsearch/tableSearch-showTableSearchIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:23 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:23 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/login-showPic.shtml HTTP/1.1" 200 None
2018-05-30 15:33:23 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:23 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/wee/platform/wee_security_check HTTP/1.1" 302 None
2018-05-30 15:33:23 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uilogin-loginSuccess.shtml?params=991CFE73D4DF553253D44E119219BF31366856FF4B15222669397E093A956A2C&j_loginsuccess_url= HTTP/1.1" 302 None
2018-05-30 15:33:24 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "GET /sipopublicsearch/portal/uiIndex.shtml HTTP/1.1" 200 None
2018-05-30 15:33:24 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): www.pss-system.gov.cn
2018-05-30 15:33:24 [urllib3.connectionpool] DEBUG: http://www.pss-system.gov.cn:80 "POST /sipopublicsearch/patentsearch/showViewList-jumpToView.shtml HTTP/1.1" 200 None
2018-05-30 15:33:24 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml> (failed 4 times): unlogin
2018-05-30 15:33:24 [scrapy.core.scraper] ERROR: Error downloading <POST http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml>
Traceback (most recent call last):
File "D:\Program Files (x86)\anaconda\lib\site-packages\twisted\internet\defer.py", line 1386, in _inlineCallbacks
result = g.send(result)
File "D:\Program Files (x86)\anaconda\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
File "D:\Program Files (x86)\anaconda\lib\site-packages\twisted\internet\defer.py", line 1363, in returnValue
raise _DefGen_Return(val)
twisted.internet.defer._DefGen_Return: <404 http://www.pss-system.gov.cn/sipopublicsearch/patentsearch/executeTableSearch0402-executeCommandSearch.shtml>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\Program Files (x86)\anaconda\lib\site-packages\twisted\internet\defer.py", line 1386, in _inlineCallbacks
result = g.send(result)
File "D:\Program Files (x86)\anaconda\lib\site-packages\scrapy\core\downloader\middleware.py", line 56, in process_response
(six.get_method_self(method).__class__.__name__, type(response))
AssertionError: Middleware PatentMiddleware.process_response must return Response or Request, got <class 'NoneType'>
2018-05-30 15:33:24 [scrapy.core.engine] INFO: Closing spider (finished)
2018-05-30 15:33:24 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 4368,
'downloader/request_count': 4,
'downloader/request_method_count/POST': 4,
'downloader/response_bytes': 6301,
'downloader/response_count': 4,
'downloader/response_status_count/404': 4,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2018, 5, 30, 7, 33, 24, 666286),
'log_count/DEBUG': 56,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'retry/count': 3,
'retry/max_reached': 1,
'retry/reason_count/unlogin': 3,
'scheduler/dequeued': 4,
'scheduler/dequeued/memory': 4,
'scheduler/enqueued': 4,
'scheduler/enqueued/memory': 4,
'start_time': datetime.datetime(2018, 5, 30, 7, 33, 16, 985230)}
2018-05-30 15:33:24 [scrapy.core.engine] INFO: Spider closed (finished)
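
The AssertionError in the log says that `PatentMiddleware.process_response` returned `None` after the last "unlogin" retry was given up. The source of `PatentMiddleware` is not shown here, so the following is only a minimal sketch of a downloader middleware whose `process_response` returns either a Request or a Response on every path. The class name `SafePatentMiddleware`, the `relogin_attempts` meta key, the `max_relogin` limit, and the "404 means logged out" check are all assumptions made for illustration, not the project's actual code.

```python
class SafePatentMiddleware:
    """Hypothetical downloader middleware: process_response never returns None."""

    # Assumed cap on re-login attempts (the log shows RETRY_TIMES = 3).
    max_relogin = 3

    def is_logged_out(self, response):
        # Assumption: the log counts four 404 responses for the search URL,
        # so a 404 is treated here as the "unlogin" signal.
        return response.status == 404

    def process_response(self, request, response, spider):
        # Normal case: pass the response on to the next middleware.
        if not self.is_logged_out(response):
            return response

        attempts = request.meta.get("relogin_attempts", 0)
        if attempts < self.max_relogin:
            # Re-schedule a copy of the request; dont_filter keeps the
            # duplicate filter from dropping it.
            retry = request.replace(dont_filter=True)
            retry.meta["relogin_attempts"] = attempts + 1
            return retry

        # Retries exhausted: still return the response instead of None,
        # so Scrapy's type check on the return value passes.
        return response
```

If the real middleware delegates to `RetryMiddleware._retry`, note that this helper returns `None` once the retry limit is reached, so a fallback such as `return retry_request or response` may be what is missing.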
