AYJAYY/vB-mTurk-Scraper-1.5

vB-mTurk-Scraper v1.5

Scrapes vBulletin forums for links to mTurk HITs

v1.5 introduces an improved Windows version and no longer writes duplicate entries to the logs.

Written in: Python 2.* (not compatible with, or tested on, Python 3)
Linux Requirements: BeautifulSoup and requests

To install BeautifulSoup run: pip install beautifulsoup4 OR: easy_install beautifulsoup4

To install requests run: pip install requests OR: easy_install requests

Information required to run program:

  • HITs Thread Number (Changes Daily: 5 digit number found in the thread URL)
  • Forum URL
  • Page To Start From (Default: 0)
  • Number Of Pages To Scrape

This is a simple command-line script. Run python vB-mTurk-Scraper.py and it will prompt you for the forum to scrape, today's thread number, the page to start from, and the number of pages to go through.

Note: When entering the address to the forum, enter only the domain, ex: forum.com
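As a rough sketch of how these four inputs could map to page URLs: the helper below is hypothetical (it is not taken from vB-mTurk-Scraper.py). It assumes the classic vBulletin showthread.php?t=<thread>&page=<n> URL scheme, plain http, and that a start page of 0 means the first page. A given forum may use a different URL layout.

```python
def build_page_urls(domain, thread_number, start_page=0, num_pages=1):
    """Return one URL per thread page to fetch (hypothetical helper)."""
    # The script asks for a bare domain such as "forum.com"; tolerate a
    # pasted scheme or trailing slash anyway.
    domain = domain.strip()
    for prefix in ("http://", "https://"):
        if domain.startswith(prefix):
            domain = domain[len(prefix):]
    domain = domain.rstrip("/")

    # Assumption: 0 and 1 both refer to the first page of the thread.
    first = max(int(start_page), 1)
    return [
        "http://%s/showthread.php?t=%d&page=%d" % (domain, int(thread_number), page)
        for page in range(first, first + int(num_pages))
    ]


urls = build_page_urls("forum.com", 12345, start_page=0, num_pages=3)
for url in urls:
    print(url)
```

Here 12345 stands in for the daily thread number, and the three resulting URLs cover pages 1 through 3.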

It outputs all HITs to HITs/mturklinks-todaysdate.html by default.

The final prompt asks whether to also write the HITs/forumlog-todaysdate.txt file. If you answer True, the script writes just the links to that file, so you can share them as plain text.
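A minimal sketch of how the two dated output files could be named and how the plain-text log could be written. The function names are hypothetical, and the YYYY-MM-DD date stamp is an assumption; the script may format "todaysdate" differently.

```python
import datetime
import os


def output_paths(day=None, out_dir="HITs"):
    """Return (html_path, log_path) for the given date (hypothetical helper)."""
    day = day or datetime.date.today()
    stamp = day.isoformat()  # assumption: YYYY-MM-DD date stamp
    return (
        os.path.join(out_dir, "mturklinks-%s.html" % stamp),
        os.path.join(out_dir, "forumlog-%s.txt" % stamp),
    )


def write_plain_log(links, log_path):
    """Write one link per line so the file can be shared as plain text."""
    folder = os.path.dirname(log_path)
    if folder and not os.path.isdir(folder):
        os.makedirs(folder)
    with open(log_path, "w") as handle:
        for link in links:
            handle.write(link + "\n")


html_path, log_path = output_paths(datetime.date(2015, 1, 2))
print(html_path)
print(log_path)
```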

This is perfect for when you wake up! Run it on the HITs thread of any vBulletin mTurk forum you like, and you get an HTML file full of all the HITs you missed. (Note: As of v1.5 there are no duplicates.)
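One way duplicates can be avoided is an order-preserving filter over the collected links. This is only an illustrative sketch: the README does not say how v1.5 detects duplicates, so the exact-match rule and the function name are assumptions.

```python
def unique_links(links):
    """Drop repeated links, keeping the first occurrence of each."""
    seen = set()
    result = []
    for link in links:
        key = link.strip()  # ignore stray surrounding whitespace
        if key and key not in seen:
            seen.add(key)
            result.append(key)
    return result


collected = ["http://a.example/1", "http://a.example/2", "http://a.example/1"]
print(unique_links(collected))
```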

This has only been tested on Mturkforum.com and Turkernation.com, but any vBulletin forum that allows you to read posts without being logged in should work.
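To illustrate the kind of link extraction involved, here is a small sketch that pulls mTurk links out of a page's HTML. It uses the standard library's HTMLParser as a stand-in for BeautifulSoup so that it runs without extra packages. The class name, the sample markup, and the rule "an href containing mturk.com is a HIT link" are all assumptions, not the script's actual logic.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class MturkLinkParser(HTMLParser):
    """Collect href values of <a> tags that point at mturk.com."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumption: a crude substring test is enough to spot HIT links.
        if "mturk.com" in href:
            self.links.append(href)


def extract_mturk_links(html):
    parser = MturkLinkParser()
    parser.feed(html)
    return parser.links


# Made-up post markup; EXAMPLE is a placeholder, not a real group ID.
sample = (
    '<div class="postbody">'
    '<a href="https://www.mturk.com/mturk/preview?groupId=EXAMPLE">HIT</a>'
    '<a href="http://forum.com/member.php?u=1">profile</a>'
    '</div>'
)
links = extract_mturk_links(sample)
print(links)
```

In the real script the HTML would come from a page fetched without logging in, which is why the forum must allow guests to read posts.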

NOTE: When choosing the number of pages, make sure the forum is displaying 40 posts per page when you check the thread; a different posts-per-page setting changes the page count.
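The arithmetic behind this note, as a small sketch (the helper name is hypothetical): the same thread spans a different number of pages depending on the posts-per-page setting, so a page count read at another setting will not match what the script sees at 40 per page.

```python
def pages_needed(post_count, posts_per_page=40):
    """Number of thread pages needed to hold post_count posts."""
    if post_count <= 0:
        return 0
    # Ceiling division: a partly filled last page still counts as a page.
    return (post_count + posts_per_page - 1) // posts_per_page


print(pages_needed(400))      # 10 pages at 40 posts per page
print(pages_needed(401))      # 11 pages: one extra post spills over
print(pages_needed(400, 10))  # 40 pages at 10 posts per page
```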

Windows Users: Click "Download ZIP" on the right side of the page. Extract the Windows-Version-1.5 folder and run vB-mTurk-Scraper.exe

Screenshots:


Program Running In Windows:


Program Running In Xubuntu:


HTML File (note that visited HITs are red):

MIT Licensed
