Problems and Solutions When Saving Web Page Data to a MySQL Database

#coding:utf-8
from bs4 import BeautifulSoup
import pymysql
import sys
import importlib

# Kept from the original; on Python 3 this does not change any encoding.
importlib.reload(sys)

# The HTML to parse goes here (its content is omitted in the original post).
html = """
"""
soup = BeautifulSoup(html, 'html.parser')
links = soup.find_all('a')

# Recent PyMySQL releases expect keyword arguments here.
conn = pymysql.connect(host='localhost', user='zoe', password='1235789y',
                       database='tianyadb', charset='utf8')
cursor = conn.cursor()

for l in links:
    try:
        print('there')
        # %s placeholders are filled in by the driver, which escapes the values.
        sql = "insert into citys(city,c_url) values(%s,%s)"
        print(sql)
        cursor.execute(sql, (str(l.string), str(l['href'])))
        print('ok')
        conn.commit()
    except pymysql.Error as e:
        print('************error:', e)
        conn.rollback()

cursor.close()
conn.close()

The above is the source code.

Problems encountered along the way

1. Chinese character encoding

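This takes four steps:

(1) Declare the encoding of the Python file by putting #encoding=utf-8 at the top of the file. Python 3 already reads source files as UTF-8 by default, so this mainly matters on Python 2.

(2) Set the database character set to UTF-8 (charset=utf8).

(3) Pass the argument charset='utf8' when connecting to the MySQL database from Python.

(4) On Python 2, set Python's default encoding to UTF-8 with sys.setdefaultencoding('utf-8').

Note: Python 3 removed sys.setdefaultencoding. Strings are Unicode and the default encoding is already UTF-8, so step (4) is not needed there. The lines import importlib and importlib.reload(sys) run without error on Python 3 but do not change any encoding.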
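As a quick check of the last step on Python 3, here is a minimal sketch. The sample string `city` is an illustrative value, not taken from the post. It suggests that the interpreter already defaults to UTF-8 and no longer exposes `sys.setdefaultencoding`, which leaves the database-side settings as the part that still matters:

```python
import sys

# In Python 3, str is Unicode and the default encoding is already UTF-8.
default_encoding = sys.getdefaultencoding()
has_setter = hasattr(sys, "setdefaultencoding")  # existed on Python 2 only

city = "北京"  # illustrative sample value
encoded = city.encode("utf-8")     # the bytes that would travel to the server
decoded = encoded.decode("utf-8")  # round-trips back to the same str

print(default_encoding, has_setter, len(encoded), decoded == city)
```

If the round trip works in Python but the stored text is still garbled, the likely place to look is the table or connection character set rather than the script.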

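2. Inserting variables into the insert statement

Mark the placeholders in the SQL statement, then pass the variables as arguments to cursor.execute(). This approach is safe; building the statement with Python string formatting (the % operator after the string) carries a risk of SQL injection.

sql = "insert into citys(city,c_url) values(%s,%s)"
print(sql)
cursor.execute(sql, (str(l.string), str(l['href'])))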
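To make the difference concrete, the following sketch contrasts the two ways of building the statement. It is an illustration under stated assumptions, not the author's code: the standard library's `sqlite3` stands in for MySQL so that it runs without a server, and the hostile `city` value and the `example.com` URL are made up. Note that `sqlite3` writes placeholders as `?`, where PyMySQL uses `%s`.

```python
import sqlite3

# sqlite3 stands in for MySQL so the sketch runs without a server.
# Its placeholder is "?"; with PyMySQL the same query would use "%s".
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table citys (city text, c_url text)")

# A hostile value (made up) that breaks a query built by string formatting.
city = "x'); drop table citys; --"
c_url = "http://example.com/"

# Unsafe: the value becomes part of the SQL text itself. Never execute this.
unsafe_sql = "insert into citys(city,c_url) values('%s','%s')" % (city, c_url)

# Safe: the SQL text is fixed and the values are passed separately.
safe_sql = "insert into citys(city,c_url) values(?,?)"
cursor.execute(safe_sql, (city, c_url))
conn.commit()

cursor.execute("select city, c_url from citys")
rows = cursor.fetchall()
print(unsafe_sql)
print(rows)
conn.close()
```

In the unsafe string the quote inside the value closes the SQL literal early, so the rest of the value is read as SQL. With placeholders the same value is stored as ordinary text.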

3. Error and exception handling
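The source code handles this by wrapping each insert in try/except and rolling back on failure. A minimal sketch of that pattern follows. As assumptions for illustration, `sqlite3` stands in for MySQL (`sqlite3.Error` plays the role of `pymysql.Error`), and the `unique` constraint and sample rows are invented to provoke a failure:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
# The unique constraint is an assumption, added so one insert fails.
cursor.execute("create table citys (city text unique, c_url text)")

rows = [("Beijing", "/bj"), ("Beijing", "/dup"), ("Shanghai", "/sh")]
failed = []

for city, c_url in rows:
    try:
        cursor.execute("insert into citys(city,c_url) values(?,?)", (city, c_url))
        conn.commit()    # make the successful row permanent
    except sqlite3.Error as e:
        conn.rollback()  # discard the failed statement and carry on
        failed.append((city, str(e)))

cursor.execute("select city from citys order by city")
saved = [r[0] for r in cursor.fetchall()]
print(saved, failed)
conn.close()
```

Committing inside the loop means one bad row does not discard the rows saved before it, at the cost of one transaction per row.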

4. The position of the cursor
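The post does not elaborate on this point. One common reading is that a cursor remembers how far it has read, so a second fetch returns only what is left. The sketch below illustrates that behavior under this assumption, again using `sqlite3` as a stand-in and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("create table citys (city text)")
cursor.executemany("insert into citys(city) values(?)",
                   [("Beijing",), ("Shanghai",), ("Guangzhou",)])

cursor.execute("select city from citys order by rowid")
first = cursor.fetchone()      # reads row 1 and moves past it
rest = cursor.fetchall()       # returns only the rows not yet read
after_end = cursor.fetchone()  # nothing left, so this is None

# To read the rows again, run the query again.
cursor.execute("select city from citys order by rowid")
again = cursor.fetchall()
print(first, rest, after_end, again)
conn.close()
```

PyMySQL's default cursor fetches in the same forward-only way, and it also provides `scroll()` for moving the position within an already fetched result.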
