2/02/2013 Google Search with bash


Yesterday I got curious about lynx and the Google search engine: with bash we can get quick results and automate the process. I also wanted to filter the URLs using sed or awk.


The first thing is to establish the URL for a proper search. For this example I want to use this:

http://www.google.com/search?q=keywordforsearchhere&start=pagenumberhere
where search?q= takes the keyword and &start= selects the page of results (as far as I can tell, Google treats it as the offset of the first result rather than a literal page number). As a text browser I use lynx with the -dump and -listonly options. lynx provides many command-line options, but for this test I only use these two: -dump prints the formatted document to standard output, and -listonly shows only the list of links.
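
To make the URL layout above concrete, here is a minimal sketch that assembles it from its two parts. The function name build_search_url is made up for this illustration, and it assumes the keyword is a single plain word with nothing that needs escaping.

```bash
#!/bin/bash
# Hypothetical helper (not part of lynx or Google): builds the search URL
# from a keyword ($1) and a start value ($2, defaults to 0).
build_search_url() {
    local keyword="$1"
    local start="${2:-0}"
    printf 'http://www.google.com/search?q=%s&start=%s\n' "$keyword" "$start"
}

# Example: keyword "house", start value 1
build_search_url house 1
```

The printed URL can then be handed to lynx together with -dump and -listonly.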

For the first test I use keyword=house and page=1:

lynx "http://www.google.com/search?q=house&start=1" -dump -listonly

It gives a result like the one in this pastie:

http://pastie.org/private/jlaakeglj0fsfga27tmoqg
The final command keeps only the result links and strips Google's redirect wrapper from each one:
lynx "http://www.google.com/search?q=house&start=1" -dump -listonly | grep 'url?q=' | cut -d ' ' -f4 | sed 's/http:\/\/www.google.com\/url?q=//' | sed 's/\(&sa=\).*//' 
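
To see what the filter stages do without touching the network, here is a small sketch that runs the same grep, cut and sed steps on two invented lines shaped like lynx -listonly output. The function name extract_result_urls and the sample links are assumptions made for this illustration. The sample uses two-digit link numbers, where the URL is the fourth space-separated field.

```bash
#!/bin/bash
# Hypothetical wrapper around the same filter stages as the pipeline above:
#   grep : keep only search-result links (they contain "url?q=")
#   cut  : take the 4th space-separated field, which is the URL here
#   sed  : drop Google's redirect prefix, then everything from "&sa=" on
extract_result_urls() {
    grep 'url?q=' |
    cut -d ' ' -f4 |
    sed 's/http:\/\/www.google.com\/url?q=//' |
    sed 's/\(&sa=\).*//'
}

# Invented sample lines, not real lynx output.
sample='  14. http://www.google.com/preferences?hl=en
  15. http://www.google.com/url?q=http://example.com/page&sa=U&ei=abc'

printf '%s\n' "$sample" | extract_result_urls
# prints: http://example.com/page
```

The first sample line is dropped by grep because it is not a result link; the second one is reduced to the bare target URL.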
Finally, the complete script:
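#!/bin/bash
# Google search using bash tools
# Usage: pass the keyword as the first argument ($1)
[ -n "$1" ] || { echo "usage: $0 keyword" >&2; exit 1; }
count=0 # page number (value sent as &start=)

while [ "$count" -le 200 ]
do
    # list the links, keep only result links, strip Google's redirect wrapper
    lynx "http://www.google.com/search?q=$1&start=$count" -dump -listonly | grep 'url?q=' | cut -d ' ' -f4 |
    sed 's/http:\/\/www.google.com\/url?q=//' | sed 's/\(&sa=\).*//'
    count=$(( count + 5 ))
done
echo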
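
One thing the script does not handle is a keyword with spaces or characters such as &, which would break the query string. A possible way around it is to percent-encode the keyword first. The sketch below is only an illustration: urlencode is a made-up helper name, and it assumes the keyword is plain ASCII.

```bash
#!/bin/bash
# Hypothetical helper: percent-encode a keyword for use in a query string.
# Assumes ASCII input; letters, digits and . ~ _ - are left untouched.
urlencode() {
    local LC_ALL=C
    local input="$1" output="" char hex i
    for (( i = 0; i < ${#input}; i++ )); do
        char="${input:i:1}"
        case "$char" in
            [a-zA-Z0-9.~_-]) output+="$char" ;;
            *)
                # "'$char" makes printf use the character's numeric code
                printf -v hex '%%%02X' "'$char"
                output+="$hex"
                ;;
        esac
    done
    printf '%s\n' "$output"
}

urlencode 'big house'
# prints: big%20house
```

In the script, the encoded value could be stored once, for example keyword=$(urlencode "$1"), and used in the URL in place of $1.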
Ciao