piotrmasior/scrap_task

WATIR with headless example

Scrapes the first 30 results from Google (limited because Google blocks the IP when too many requests occur) and prints only the headlines to the console. For demonstration purposes only.

INSTALL

sudo apt-get install xvfb

HOW TO RUN:

ruby lib/core.rb "other text"

or, with no argument:

ruby lib/core.rb

With no argument, the search query defaults to "charlie sheen winning".
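
The fallback to a default query could be handled roughly as in the sketch below. It assumes the query is read from the first command-line argument; `DEFAULT_QUERY` and `search_query` are hypothetical names used for illustration and are not taken from lib/core.rb.

```ruby
# Hypothetical helper: the real lib/core.rb may parse its arguments differently.
DEFAULT_QUERY = "charlie sheen winning"

# Returns the first argument as the search query, or the default
# when no argument (or only whitespace) was given.
def search_query(argv)
  query = argv.first.to_s.strip
  query.empty? ? DEFAULT_QUERY : query
end

# Print the chosen query when this file is run directly.
puts search_query(ARGV) if __FILE__ == $PROGRAM_NAME
```

Treating a blank argument the same as a missing one is a design choice of this sketch, so that `ruby lib/core.rb ""` does not send an empty search.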

TODO (possible improvements):

  • add a shell script to run it from the root directory
  • refactor the Google method into a general-purpose scraping path with custom strategies, for example:
  scrap_strategy("www.somesite.com") do |wrapper|
    #browser.start
    wrapper.specific.methods
    wrapper.goes.here
    #scraping iteration
    #browser.stop
  end
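
The strategy idea from the TODO could be sketched in plain Ruby as a method that starts the browser, yields a wrapper to the block, and always stops the browser afterwards. This is only an illustration under stated assumptions: `FakeBrowser` and `ScrapeWrapper` are hypothetical stand-ins (a real version would wrap the Watir browser and the headless display), and `visit` is an invented wrapper method.

```ruby
# Hypothetical stand-in for a real browser session; it only records calls.
class FakeBrowser
  attr_reader :log

  def initialize
    @log = []
  end

  def start
    @log << :start
  end

  def stop
    @log << :stop
  end

  def goto(url)
    @log << [:goto, url]
  end
end

# Hypothetical wrapper handed to the strategy block.
class ScrapeWrapper
  def initialize(browser, site)
    @browser = browser
    @site = site
  end

  # Navigate to a path on the strategy's site.
  def visit(path = "")
    @browser.goto("#{@site}#{path}")
  end
end

# Starts the browser, runs the site-specific block, and stops the
# browser even if the block raises. Returns the block's value.
def scrap_strategy(site, browser: FakeBrowser.new)
  browser.start
  yield ScrapeWrapper.new(browser, site)
ensure
  browser.stop
end

# Usage: the block holds only the site-specific steps.
browser = FakeBrowser.new
scrap_strategy("www.somesite.com", browser: browser) do |wrapper|
  wrapper.visit("/search")
end
```

Keeping start and stop inside `scrap_strategy` means each strategy block contains only the scraping steps, and the `ensure` clause guards against leaving a browser running after an error.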

WARNING

This program violates Google's search ToS; use it at your own risk. It was written for tutorial purposes only. Your IP can be temporarily blocked after heavy use of this software.
