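Today I wanted to load the data from some CSV files into a MySQL database with pandas. Some of the data contains special characters, and the import failed with this error: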

"Incorrect string value: '\\xF0\\x9F\\x99\\x8F'

Here is my code:

import glob

import pandas as pd
from sqlalchemy import create_engine

# charset=utf8 is the setting that gets changed in the fix further down
engine = create_engine('mysql+pymysql://username:password@localhost/zhihudemo?charset=utf8')
csv_files = glob.glob('data-science-for-good-careervillage/*.csv')
for csv_name in csv_files:
    try:
        df = pd.read_csv(csv_name)
        # Use the file name without its extension as the table name
        table_name = csv_name.split('/')[-1]
        table_name = table_name.split('.')[0]
        df.to_sql(table_name, engine, index=False, if_exists='replace')
    except Exception as e:
        # Report which file failed and why, then continue with the next file
        print(csv_name)
        print(e)

As a result, emoji and other special characters simply would not import, no matter what I tried.
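To see what MySQL is complaining about, it can help to decode the bytes from the error message. The sketch below is only an illustration: the helper `four_byte_chars` and the sample string are made up for this post. MySQL's `utf8` character set (an alias for `utf8mb3`) stores at most 3 bytes per character, so any character that needs 4 bytes in UTF-8 is rejected.

```python
# The escaped bytes from the error message, written as raw bytes.
raw = b"\xf0\x9f\x99\x8f"

# Decoded as UTF-8, the four bytes form a single character (an emoji).
char = raw.decode("utf-8")
print(len(raw), len(char), hex(ord(char)))


def four_byte_chars(text):
    """Return the characters in `text` that need 4 bytes in UTF-8.

    Hypothetical helper, not part of pandas or SQLAlchemy.
    """
    return [c for c in text if len(c.encode("utf-8")) == 4]


# Made-up sample value, only for illustration.
sample = "thank you " + char
print(four_byte_chars(sample) == [char])
```

Running a helper like this over the text columns of a DataFrame is one way to find out which columns actually contain such characters.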

Solution

Change the character set of the affected columns to utf8mb4, which can store 4-byte characters, and set the collation to utf8mb4_general_ci.
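If you would rather make this change with SQL than through a GUI client, the statements could be generated roughly as follows. This is a minimal sketch: `charset_statements` is a hypothetical helper and `answers` is an assumed table name. Only the `ALTER DATABASE` and `ALTER TABLE ... CONVERT TO CHARACTER SET` statements themselves are standard MySQL.

```python
def charset_statements(database, tables,
                       charset="utf8mb4", collation="utf8mb4_general_ci"):
    """Build MySQL statements that switch a database and its tables to `charset`.

    Hypothetical helper: it only builds strings and does not touch the database.
    """
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET {charset} COLLATE {collation}"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} COLLATE {collation}"
        )
    return statements


# "answers" is an assumed table name, used only for illustration.
for stmt in charset_statements("zhihudemo", ["answers"]):
    print(stmt)
```

The resulting strings can be run in any MySQL client. Since `to_sql(..., if_exists='replace')` drops and recreates each table, the new tables should pick up the database default, so the `ALTER DATABASE` statement is probably the one that matters most for this script.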


Then change the charset in the connection string to utf8mb4 as well:

import glob

import pandas as pd
from sqlalchemy import create_engine

# charset=utf8mb4 lets the connection carry 4-byte characters such as emoji
engine = create_engine('mysql+pymysql://username:password@localhost/zhihudemo?charset=utf8mb4')
csv_files = glob.glob('data-science-for-good-careervillage/*.csv')
for csv_name in csv_files:
    try:
        df = pd.read_csv(csv_name)
        # Use the file name without its extension as the table name
        table_name = csv_name.split('/')[-1]
        table_name = table_name.split('.')[0]
        df.to_sql(table_name, engine, index=False, if_exists='replace')
    except Exception as e:
        # Report which file failed and why, then continue with the next file
        print(csv_name)
        print(e)
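A side note on the loop: `csv_name.split('/')` assumes forward slashes, but `glob` returns paths with the operating system's own separator, so on Windows the table name may end up containing the directory part. A possible alternative is sketched below. `table_name_from_path` is a hypothetical helper and `answers.csv` is an assumed file name.

```python
from pathlib import Path


def table_name_from_path(csv_name):
    """Return the file name without its directory and final extension."""
    return Path(csv_name).stem


# Assumed file name, only for illustration.
print(table_name_from_path("data-science-for-good-careervillage/answers.csv"))
```

One difference to keep in mind: `Path.stem` strips only the last extension, whereas `split('.')[0]` cuts at the first dot, so the two give different results for file names that contain several dots.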
