好得很程序员自学网


Ruby web scraper


A quick way to get a train ticket home

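Update (2010/02/07): I did get my train ticket, but not this way. I got it by queuing in person and asking people for help. The ticket scalpers are out of control.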

It just scrapes, and scrapes, and scrapes...

require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'active_support'
require 'action_mailer'
$KCODE = 'u'        # Ruby 1.8: handle strings and regexps as UTF-8
require 'jcode'
require 'tlsmail'   # needs the tlsmail gem installed to support TLS connections
Net::SMTP.enable_tls(OpenSSL::SSL::VERIFY_NONE)

ActionMailer::Base.smtp_settings = {
  :address        => 'smtp.gmail.com',   # default: localhost
  :port           => '25',               # default: 25
  :user_name      => 'xxxx@gmail.com',   # login name
  :password       => 'xxxx',             # login password
  :authentication => :login              # :plain, :login or :cram_md5
}

class SimpleMailer < ActionMailer::Base
  def simple_message(recipient, mail_subject, mail_body)
    from 'weekface@gmail.com'
    recipients recipient
    subject mail_subject
    body mail_body
  end
end

# Listing search for train tickets (火车票) from Beijing (北京) to Handan (邯郸)
url = "http://shenghuo.google.cn/shenghuo/search?a_y0=9&a_n0=%E7%81%AB%E8%BD%A6%E7%A5%A8&view=Table&a_n1=%E5%A7%8B%E5%8F%91%E7%AB%99&a_y1=1&a_o1=0&a_v1=%E5%8C%97%E4%BA%AC&a_n2=%E5%88%B0%E8%BE%BE%E7%AB%99&a_y2=1&a_o2=0&a_v2=%E9%82%AF%E9%83%B8"
doc = Hpricot(open(url))

arrivals = []   # never used below
doc.search("tr.cssf").each do |ht|
  # Ticket count and departure date, e.g. "2张 发车日期:2010-02-11"
  match = ht.to_s.match(/(\d)张 发车日期:2010-02-(\d\d)/m)
  # Age of the listing in minutes ("N分钟前" means "N minutes ago")
  match_time = ht.to_s.match(/<td>\s?(\d+)分钟前/)
  if match_time
    # More than one ticket, departing Feb 11 or 12, posted within the last hour
    if match && match[1].to_i > 1 &&
       (match[2].to_i == 11 || match[2].to_i == 12) &&
       match_time[1].to_i <= 60
      a = ht.search("td>a")
      # Mail the link text as the subject and the link target as the body
      SimpleMailer.deliver_simple_message 'weekface@gmail.com', a.first.inner_html, a.first.attributes['href']
    end
  end
end
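
The filtering inside the loop can be tried without the network or a mail server. Below is a minimal sketch that assumes a current Ruby (2.x or later) and uses only the standard library. The two regular expressions are copied from the script. The names `wanted_row?` and `sample_row`, the keyword arguments, and the sample HTML are all hypothetical, made up for illustration and shaped only to satisfy the two patterns. They are not the real markup of the listing page.

```ruby
# Same two patterns as in the script above.
TICKETS_AND_DATE = /(\d)张 发车日期:2010-02-(\d\d)/m
MINUTES_AGO      = /<td>\s?(\d+)分钟前/

# Hypothetical helper: true when a row's HTML passes the same three checks
# as the loop above (more than one ticket, wanted day, recent enough).
def wanted_row?(html, days: [11, 12], max_age_minutes: 60)
  ticket = html.match(TICKETS_AND_DATE)
  age    = html.match(MINUTES_AGO)
  return false unless ticket && age

  ticket[1].to_i > 1 &&
    days.include?(ticket[2].to_i) &&
    age[1].to_i <= max_age_minutes
end

# Hypothetical sample row builder; the markup is invented for this sketch.
def sample_row(count, day, minutes)
  "<tr class=\"cssf\"><td> #{minutes}分钟前</td>" \
    "<td><a href=\"/item/1\">example</a></td>" \
    "<td>#{count}张 发车日期:2010-02-#{format('%02d', day)}</td></tr>"
end

puts wanted_row?(sample_row(2, 11, 15))   # two tickets, Feb 11, 15 minutes old
puts wanted_row?(sample_row(1, 11, 15))   # only one ticket
puts wanted_row?(sample_row(2, 13, 15))   # wrong day
puts wanted_row?(sample_row(2, 12, 90))   # older than an hour
```

Only the first row passes. Pulling the checks into one method also makes it easy to change the wanted days or the age limit without touching the scraping and mailing code.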
