

Web Crawling: Database Connections

 

Connecting to MySQL:

First, make sure pymysql is installed:

import pymysql

# host is your server's address; port defaults to 3306; user/password are your
# credentials. Pass database= to select a database at connect time.
conn = pymysql.connect(host='172.16.70.130', port=3306, user='user', password='passwd')

cur = conn.cursor()
cur.execute('select version()')
data = cur.fetchall()
print(data)  # print the server version

The output looks like:

(('5.7.27',),)
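With the connection verified, a crawler usually inserts its scraped rows with parameterized SQL so arbitrary page text is escaped safely. A minimal sketch (the `items` table, its columns, and the `spider` database name are assumptions, not from the article; the connection call mirrors the snippet above):

import pymysql  # noqa: imported in main() below so the helpers stay testable

def build_insert(table, row):
    """Build a parameterized INSERT for a dict of column -> value.

    %s placeholders let pymysql escape the values, which matters when
    the values are arbitrary scraped text."""
    cols = ", ".join(row)
    marks = ", ".join(["%s"] * len(row))
    return f"INSERT INTO {table} ({cols}) VALUES ({marks})", list(row.values())

def save_item(conn, item, table="items"):
    """Insert one scraped item and commit (pymysql does not autocommit by default)."""
    sql, params = build_insert(table, item)
    with conn.cursor() as cur:
        cur.execute(sql, params)
    conn.commit()

def main():
    # Run against a real server; host/credentials echo the snippet above.
    conn = pymysql.connect(host='172.16.70.130', port=3306, user='user',
                           password='passwd', database='spider')
    save_item(conn, {'title': 'example', 'url': 'http://example.com'})
    conn.close()

Keeping the SQL-building step separate from the connection makes it easy to unit-test without a live MySQL server.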

 

Connecting to Redis:

First, make sure the redis package is installed:

import redis

# host: server address; port defaults to 6379; pass password= if auth is enabled;
# db selects the database (default 0); decode_responses=True returns str instead of bytes.
conn = redis.StrictRedis(host='172.16.70.130', port=6379, decode_responses=True, db=1)
print(conn.info())

The output looks like:

{'redis_version': '5.0.5', 'redis_git_sha1': 0, 'redis_git_dirty': 0, 'redis_build_id': '6a23e5766d3175f5', 'redis_mode': 'standalone', 'os': 'Linux 3.10.0-1160.21.1.el7.x86_64 x86_64', 'arch_bits': 64, 'multiplexing_api': 'epoll', 'atomicvar_api': 'atomic-builtin', 'gcc_version': '4.8.5', 'process_id': 3702, 'tcp_port': 6379, 'uptime_in_seconds': 6837, 'connected_clients': 2, 'used_memory_human': '4.79M', 'maxmemory_policy': 'noeviction', 'role': 'master', 'connected_slaves': 0, 'cluster_enabled': 0, ..., 'db1': {'keys': 1, 'expires': 0, 'avg_ttl': 0}}

(output abridged; info() returns the server's full INFO statistics as one dict)

 

Connecting to MongoDB:

First, make sure pymongo is installed:

import pymongo

# host: server address; port defaults to 27017. If access control is enabled,
# pass username/password here (the older client['admin'].authenticate() call
# was removed in PyMongo 4.x).
client = pymongo.MongoClient(host='172.16.70.130', port=27017,
                             username='user', password='passwd')

db = client['databasename']  # select a database
 
Then insert or query some data to verify the connection works.
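That verification step can be a single insert followed by a read-back. A minimal sketch (the `spider` database, `items` collection, and `fetched_at` field are assumed names, not from the article):

from datetime import datetime, timezone

import pymongo  # noqa: imported in main() below so make_doc stays testable

def make_doc(item):
    """Attach a UTC fetch timestamp to a scraped item before storing it."""
    doc = dict(item)
    doc["fetched_at"] = datetime.now(timezone.utc)
    return doc

def main():
    # Run against a real server; the host echoes the snippet above.
    client = pymongo.MongoClient(host='172.16.70.130', port=27017)
    coll = client['spider']['items']  # database/collection names are assumptions
    coll.insert_one(make_doc({'title': 'example', 'url': 'http://example.com'}))
    print(coll.find_one({'url': 'http://example.com'}))

MongoDB creates the database and collection lazily on the first insert, so no schema setup is needed before this runs.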

 

Once connected, you can create the databases and tables you need and save your data there; alternatively, you can write to a local file handle and keep the data on disk.
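The file-handle route can be as simple as appending one JSON object per line (the JSON Lines convention), which lets a crawler flush items as it goes without rewriting the file. A minimal, runnable sketch:

import json

def dump_items(items, path):
    """Append scraped items to a local file, one JSON object per line.

    ensure_ascii=False keeps non-ASCII text (e.g. Chinese titles) readable."""
    with open(path, 'a', encoding='utf-8') as f:
        for item in items:
            f.write(json.dumps(item, ensure_ascii=False) + '\n')

def load_items(path):
    """Read every item back from a JSON Lines file."""
    with open(path, encoding='utf-8') as f:
        return [json.loads(line) for line in f]

Opening in append mode means repeated crawler runs accumulate into the same file, and a crash loses at most the line being written.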
