Python Web Scraping: Xiaohuar (校花网)
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

import requests

# Fetch the Xiaohuar home page
response = requests.get('http://www.xiaohuar.com/')
data = response.text

# Collect every image link found in the lazysrc attributes
results = re.findall('lazysrc="(.*?)"', data)
for result in results:  # type: str
    # Skip links that are already absolute URLs
    if result.startswith('htt'):
        continue
    img_result = 'http://www.xiaohuar.com/' + result
    # Download the image content
    img_response = requests.get(img_result)
    img_data = img_response.content
    img_name = result.split('/')[3]
    img_filename = img_name + '.jpg'
    print(img_filename)
    # Save the image; 'wb' opens the file for writing in binary mode
    with open(img_filename, 'wb') as f:
        f.write(img_data)
        print('Scraped one image')
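The link-handling step in the script can also be sketched as a pair of pure functions, which makes it testable without touching the network. This is a minimal sketch and not the author's method: `extract_image_urls` and `filename_for` are hypothetical helper names, the sample markup and `http://example.com/` are placeholders, and it swaps string concatenation for `urllib.parse.urljoin` from the standard library. Unlike the script, it keeps absolute `lazysrc` links rather than skipping them.

```python
import posixpath
import re
from urllib.parse import urljoin, urlparse


def extract_image_urls(html, base_url):
    """Return an absolute URL for every lazysrc attribute found in html."""
    links = re.findall(r'lazysrc="(.*?)"', html)
    # urljoin leaves absolute links untouched and resolves relative ones
    return [urljoin(base_url, link) for link in links]


def filename_for(url):
    """Use the last path segment of the URL as the local file name."""
    return posixpath.basename(urlparse(url).path)


# Hypothetical sample markup and base URL, for illustration only
sample = '<img lazysrc="/d/file/a.jpg"><img lazysrc="http://example.com/b.png">'
urls = extract_image_urls(sample, 'http://example.com/')
names = [filename_for(u) for u in urls]
```

Each URL in `urls` could then be passed to `requests.get` and the response bytes written to the matching name in `names`, in the same way the loop in the script saves each image.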