Serialization and Deserialization Code: Implementing Serialization and Deserialization with Hessian

Reposted

轩辕 2024-07-17 14:40:01


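Serialization is the process of turning a variable in memory into a form that can be stored or transmitted (a byte sequence, that is, a run of binary data). In Python it is called pickling. Once an object has been serialized, the result can be written to disk or sent over the network to another machine. The reverse, reading a serialized object back into memory as a variable, is called deserialization, or unpickling.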

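Because a computer can only store binary data, anything held in memory must be encoded into a binary sequence (a byte array) before it can be saved, and decoded again when it is read back (for text, with a character encoding such as ASCII or UTF-8) so that the original data can be displayed.
Transport protocols such as TCP/IP likewise carry only bytes and cannot transmit objects directly, so an object sent over the network always ends up as a byte array. Two processes that communicate remotely can exchange data of any type, but every type travels over the network as a binary sequence: the sender converts the object into a byte sequence, and the receiver restores the object from that byte sequence.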
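
The encode-then-decode idea can be sketched with the standard library alone. This is only an illustration: the layout of the payload (three bytes of UTF-8 text followed by a 4-byte big-endian integer) is an assumption made up for this example, not a format used elsewhere in the article.

```python
import struct

text = "Bob"
age = 20

# Encode: a str becomes bytes under a chosen character encoding, and an
# int becomes 4 bytes in network (big-endian) byte order.
text_bytes = text.encode("utf-8")
age_bytes = struct.pack("!i", age)

payload = text_bytes + age_bytes

# Decode: the receiver must know the layout (3 bytes of text, then the int).
decoded_text = payload[:3].decode("utf-8")
(decoded_age,) = struct.unpack("!i", payload[3:])

print(payload)                    # b'Bob\x00\x00\x00\x14'
print(decoded_text, decoded_age)  # Bob 20
```

Serialization libraries such as pickle and json exist so that this layout does not have to be agreed on by hand for every object.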

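2.2.1 Custom default conversion function: turn other objects into JSON and write it to a file
2.2.2 Custom object_hook conversion function: read JSON data from a file and deserialize it into a custom object
2.2.3 Create a class that inherits from json.JSONEncoder and override default to serialize
2.2.4 Create a class that inherits from json.JSONDecoder and override decode to deserialize

3. A combined example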

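1. Serialization and deserialization with pickle in Python

Python provides the pickle module for serialization.

1.1 Serializing an object and writing it to a file

pickle.dumps() serializes an object into a bytes value, which can then be written to a file.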

import pickle
d = dict(name='Bob', age=20, score=88)

# Serialize to a bytes object
d_pickling = pickle.dumps(d)

# Write the serialized bytes to pickling.txt
with open('pickling.txt', 'wb') as output:
    output.write(d_pickling)

# Read the serialized bytes back from pickling.txt
with open('pickling.txt', 'rb') as f:
    d_pickling = f.read()
print(d_pickling)


Alternatively, pickle.dump() serializes the object and writes it straight to a file-like object:

import pickle
d = dict(name='Bob', age=20, score=88)

# Serialize the object and store the resulting bytes in pickling.txt
with open('pickling.txt', 'wb') as f:
    pickle.dump(d, f)

# Read the contents of pickling.txt back
with open('pickling.txt', 'rb') as f:
    d_pickling = f.read()
print(d_pickling)


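1.2 Reading a byte sequence from a file and deserializing it

To bring an object back from disk into memory, either read the contents into a bytes value and deserialize it with pickle.loads(), or call pickle.load() to deserialize directly from a file-like object.
Using pickle.loads():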

# Deserialize the d_pickling bytes directly
print(d_pickling)
d_unpickling = pickle.loads(d_pickling)
print(d_unpickling)


Using pickle.load():

with open('pickling.txt', 'rb') as f:
    d_unpickling = pickle.load(f)

print(d_unpickling)
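
The file-like object does not have to be a file on disk. As a minimal sketch, an in-memory io.BytesIO buffer (chosen here only to keep the example self-contained) shows that each dump() call appends one object to the stream and each load() call reads exactly one back:

```python
import io
import pickle

buffer = io.BytesIO()  # in-memory stand-in for a file opened in 'wb' mode

# Each dump() call appends one pickled object to the stream.
pickle.dump({"name": "Bob"}, buffer)
pickle.dump([1, 2, 3], buffer)

buffer.seek(0)  # rewind before reading

# Each load() call consumes exactly one object, in the order written.
first = pickle.load(buffer)
second = pickle.load(buffer)

print(first)   # {'name': 'Bob'}
print(second)  # [1, 2, 3]
```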


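Like every language-specific serialization format, pickle has a drawback: it works only with Python, and different Python versions may not be compatible with one another. Use pickle only for data that is not important, where failing to deserialize it would not matter.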
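
Part of the version issue comes from the pickle protocol number. The sketch below assumes compatibility with older interpreters matters more than size or speed, and pins an old protocol with the protocol argument of pickle.dumps(); the variable names are illustrative only.

```python
import pickle

d = dict(name="Bob", age=20, score=88)

# Protocol 2 can be read by Python 2.3+ and by every Python 3 release.
portable = pickle.dumps(d, protocol=2)

# The highest protocol is the most efficient but needs a recent interpreter.
newest = pickle.dumps(d, protocol=pickle.HIGHEST_PROTOCOL)

# These two values depend on the interpreter that runs the script.
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)

# loads() detects the protocol from the data itself.
print(pickle.loads(portable) == pickle.loads(newest) == d)  # True
```

Pinning the protocol helps only with the byte format. It does not help when the classes referenced by a pickle have been renamed or changed between versions of the program.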

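2. Serialization and deserialization with the json package in Python

To pass objects between different programming languages, they must be serialized into a standard format such as XML. A better choice is JSON: its serialized form is a string that every language can read, and it is easy to store on disk or send over the network. JSON is not only a standard format but is also generally faster to process than XML, and it can be read directly in a web page, which is very convenient.

An object expressed in JSON is a standard JavaScript object. JSON types map to Python's built-in data types as follows: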

JSON type	Python type
{}	dict
[]	list
"string"	str
1234.56	int or float
true/false	True/False
null	None
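
The mapping in the table can be checked directly. In this small sketch the JSON text and its key names are made up for the demonstration:

```python
import json

text = ('{"obj": {}, "arr": [], "s": "string", '
        '"i": 1234, "f": 1234.56, "t": true, "n": null}')
data = json.loads(text)

# Print the Python type that each JSON value was decoded into.
for key, value in data.items():
    print(key, type(value).__name__)
```

A JSON number without a fractional part is decoded as int, and one with a fractional part as float.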

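Data of Python's built-in types can be serialized to JSON directly. Often, though, we prefer to represent objects with a class, for example by defining a Student class and serializing its instances, and types from other packages, such as NumPy's ndarray, cannot be serialized or deserialized directly either. Data that is not of a built-in type must first be converted to built-in types. To be written as a JSON object ({}), it must be converted to a dict, because only key-value data of that kind maps to a JSON object. During deserialization, the corresponding data in the JSON object is converted back to its original type and then into the custom object.

2.1 JSON serialization and deserialization: Python's built-in data types

Python's built-in json module provides thorough conversion from Python objects to JSON.

2.1.1 Turning a Python object into JSON and writing it to a file

Method 1: dumps()
dumps() returns a str whose content is standard JSON, which is then written to a file.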

import json

d = dict(name='Bob', age=20, score=88)
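print(d)  # keys are shown in single quotes
# Serialize d to JSON
d_pickling = json.dumps(d)
print(d_pickling)  # keys in the JSON string use double quotes
print(type(d_pickling))
# Write the JSON to pickling.json (a .txt file or similar would also work)
with open('pickling.json', 'w') as f:
    f.write(d_pickling)

# Read the JSON back from the file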
with open('pickling.json', 'r') as f:
    d_pickling = f.read()
print(d_pickling)
print(type(d_pickling))
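
dumps() also accepts optional formatting parameters. As a rough sketch, ensure_ascii, sort_keys and indent change only how the text looks, not the data it carries; the sample record is made up for the demonstration.

```python
import json

d = {"name": "José", "score": 88, "age": 20}

# Default output: one line, non-ASCII characters escaped as \uXXXX.
compact = json.dumps(d)

# Human-readable output: keep non-ASCII characters, sort keys, indent.
readable = json.dumps(d, ensure_ascii=False, sort_keys=True, indent=2)

print(compact)   # {"name": "Jos\u00e9", "score": 88, "age": 20}
print(readable)

# Both strings decode to the same dict.
print(json.loads(compact) == json.loads(readable) == d)  # True
```

When writing text produced with ensure_ascii=False to a file, open the file with an explicit encoding such as encoding='utf-8'.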


Method 2: dump()

dump() writes the JSON directly to a file-like object.

import json

d = dict(name='Bob', age=20, score=88)

with open('pickling.json', 'w') as f:
    json.dump(d, f)

with open('pickling.json', 'r') as f:
    d_pickling = f.read()
    
print(d_pickling)
print(type(d_pickling))


2.1.2 Reading JSON data from a file and deserializing it

Method 1: loads()
loads() deserializes a JSON string into a Python object.

import json

d = dict(name='Bob', age=20, score=88)
# Serialize d to JSON
d_pickling = json.dumps(d)
# Deserialize the JSON string
d_unpickling = json.loads(d_pickling)
print(d_unpickling)
print(type(d_unpickling))


Method 2: load()

load() reads a string from a file-like object and deserializes it:

import json

with open('pickling.json', 'r') as f:
    d_unpickling = json.load(f)
print(d_unpickling)
print(type(d_unpickling))
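
Even for built-in types, a JSON round trip does not always give back exactly what went in, because JSON has no tuple type and its object keys are always strings. A small sketch with made-up data:

```python
import json

original = {1: "one", "point": (3, 4)}
restored = json.loads(json.dumps(original))

# The int key became a str, and the tuple became a list.
print(restored)              # {'1': 'one', 'point': [3, 4]}
print(restored == original)  # False
```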


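2.2 JSON serialization and deserialization: other objects

A Python dict can be serialized directly to a JSON {}. Often, though, we prefer to represent objects with a class, for example a Student class, and sometimes objects from other packages, such as ndarray, also need to be serialized and deserialized. None of these objects can be serialized or deserialized directly.

In that case the JSON serialization and deserialization must be customized. Besides the required first argument obj, dumps() accepts many optional parameters, and these are what let us customize serialization. See the official documentation for dumps: https://docs.python.org/3/library/json.html#json.dumps

One of them, default, converts any object into one that can be serialized to JSON. We only need to write a dedicated conversion function and pass it in, and then objects of that kind can be serialized to JSON.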
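
To see when default comes into play, the sketch below first triggers the TypeError that dumps() raises for unsupported objects and then supplies a converter. The function name to_serializable and the sample record are hypothetical; the converter is called only for objects that json cannot handle on its own.

```python
import json
from datetime import date

record = {"name": "Bob", "enrolled": date(2024, 7, 17), "tags": {"math"}}

try:
    json.dumps(record)
except TypeError as exc:
    print("without default:", exc)


def to_serializable(obj):
    """Hypothetical converter for the types json does not know."""
    if isinstance(obj, date):
        return obj.isoformat()
    if isinstance(obj, set):
        return sorted(obj)
    # Anything else is still an error, as it would be without default.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


print(json.dumps(record, default=to_serializable))
# {"name": "Bob", "enrolled": "2024-07-17", "tags": ["math"]}
```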

2.2.1 Custom default conversion function: turn other objects into JSON and write it to a file

Method 1: dumps()
For example:

import json
class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


s = Student('Bob', 20, 88)

# Conversion function: turn the custom object into a JSON-serializable dict
def studentdict(std):
    return {
        'name': std.name,
        'age': std.age,
        'score': std.score
    }


# Pass the conversion function as default, then serialize to JSON
stu_pickling = json.dumps(s, default=studentdict)
print(stu_pickling)

# Store the serialized JSON in a file
with open('pickling.json', 'w') as f:
    f.write(stu_pickling)

# Read the JSON back from the file
with open('pickling.json', 'r') as f:
    d_pickling = f.read()


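Extension:

The next time we meet an instance of, say, a Teacher class, it still cannot be serialized to JSON. We can take a shortcut and turn an instance of any class into a dict: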

import json


class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


s = Student('Bob', 20, 88)
print(s.__dict__)  # {'name': 'Bob', 'age': 20, 'score': 88}

# Pass a lambda as default: any instance is converted through its __dict__,
# so no class-specific conversion function is needed
stu_pickling = json.dumps(s, default=lambda obj: obj.__dict__)
print(stu_pickling)

with open('pickling.json', 'w') as f:
    f.write(stu_pickling)

# Read the JSON back from the file
with open('pickling.json', 'r') as f:
    d_pickling = f.read()

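This works because instances of a class usually have a __dict__ attribute, a dict that stores the instance variables. There are a few exceptions, such as classes that define __slots__.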
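
The __dict__ shortcut also reaches nested objects, because the encoder calls default again for every unsupported object it meets inside the returned dict. The sketch below uses hypothetical Teacher and Point classes to show both the nested case and an instance that has no __dict__:

```python
import json


class Student:
    def __init__(self, name, score):
        self.name = name
        self.score = score


class Teacher:
    def __init__(self, name, students):
        self.name = name
        self.students = students  # a list of Student instances


class Point:
    __slots__ = ("x", "y")  # instances of this class have no __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y


t = Teacher("Alice", [Student("Bob", 88)])

# default is applied again to each nested Student the encoder meets.
nested_json = json.dumps(t, default=lambda obj: obj.__dict__)
print(nested_json)
# {"name": "Alice", "students": [{"name": "Bob", "score": 88}]}

try:
    json.dumps(Point(1, 2), default=lambda obj: obj.__dict__)
except AttributeError as exc:
    print("no __dict__:", exc)
```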

Method 2: dump()

import json


class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


s = Student('Bob', 20, 88)
# print(s.__dict__)


def studentdict(std):
    return {
        'name': std.name,
        'age': std.age,
        'score': std.score
    }

# Serialize s and store it in pickling.json
with open('pickling.json', 'w') as f:
    json.dump(s, f, default=studentdict)

# Read the serialized JSON back from the file
with open('pickling.json', 'r') as f:
    stu_pickling = f.read()
    
print(stu_pickling)
print(type(stu_pickling))


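2.2.2 Custom object_hook conversion function: read JSON data from a file and deserialize it into a custom object

To deserialize JSON into a Student instance, loads() first produces a dict, and the object_hook function we pass in then converts that dict into a Student instance:
Method 1: loads()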

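import json


class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


with open('pickling.json', 'r') as f:
    stu_pickling = f.read()
print(stu_pickling)
print(type(stu_pickling))


# Conversion function: build a Student from the decoded dict
def dictstudent(d):
    return Student(d['name'], d['age'], d['score'])


stu_unpickling = json.loads(stu_pickling, object_hook=dictstudent)

print(stu_unpickling)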


Method 2: load()

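import json


class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


with open('pickling.json', 'r') as f:
    stu_pickling = f.read()
print(stu_pickling)
print(type(stu_pickling))


# Conversion function: build a Student from the decoded dict
def dictstudent(d):
    return Student(d['name'], d['age'], d['score'])


with open('pickling.json', 'r') as f:
    stu_unpickling = json.load(f, object_hook=dictstudent)

print(stu_unpickling)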
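
Note that object_hook is called for every JSON object in the input, including nested ones, so a hook that assumes every dict is a Student fails as soon as the data contains any other dict. One possible way around this, sketched here with a hypothetical "__type__" tag that is not part of the json module, is to mark the dicts that should be rebuilt:

```python
import json


class Student:
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


def encode(obj):
    """default function: serialize a Student with a '__type__' tag."""
    if isinstance(obj, Student):
        return {"__type__": "Student", **obj.__dict__}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def decode(d):
    """object_hook: rebuild tagged dicts, pass every other dict through."""
    if d.get("__type__") == "Student":
        return Student(d["name"], d["age"], d["score"])
    return d


classroom = {"room": "3A", "students": [Student("Bob", 20, 88)]}
text = json.dumps(classroom, default=encode)
restored = json.loads(text, object_hook=decode)

print(type(restored).__name__)                 # dict
print(type(restored["students"][0]).__name__)  # Student
print(restored["students"][0].name)            # Bob
```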


2.2.3 Create a class that inherits from json.JSONEncoder and override default to serialize

Inherit from json.JSONEncoder and override its default method to add handling for Student objects.
Method 1: dumps()

import json


class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


# Conversion function: turn the custom object into a JSON-serializable dict
def studentdict(std):
    return {
        'name': std.name,
        'age': std.age,
        'score': std.score
    }


s = Student('Bob', 20, 88)


class StuEncoder(json.JSONEncoder):
    def default(self, obj):
        # Check the type of obj and convert it to a serializable object
        if isinstance(obj, Student):
            return studentdict(obj)
        else:
            return super().default(obj)

# Serialize to JSON
stu_pickling = json.dumps(s, cls=StuEncoder)
with open('pickling.json', 'w') as f:
    f.write(stu_pickling)

print(stu_pickling)
print(type(stu_pickling))

Method 2: dump()

import json
class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score

# Conversion function: turn the custom object into a JSON-serializable dict
def studentdict(std):
    return {
        'name': std.name,
        'age': std.age,
        'score': std.score
    }
s = Student('Bob', 20, 88)

class StuEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Student):
            return studentdict(obj)
        else:
            return super(StuEncoder, self).default(obj)
            
# Serialize the Student and write it to the file
with open("pickling.json", 'w') as f:
    json.dump(s, f, cls=StuEncoder)


with open('pickling.json', 'r') as f:
    stu_pickling = f.read()
print(stu_pickling)
print(type(stu_pickling))


2.2.4 Create a class that inherits from json.JSONDecoder and override decode to deserialize

Method 1: loads()

import json


class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


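# Conversion function: build a Student from the decoded dict
def dictstudent(d):
    return Student(d['name'], d['age'], d['score'])


class StuDecoder(json.JSONDecoder):
    def decode(self, obj):
        dic = super().decode(obj)
        # Convert only when the top-level value is a dict with a 'name' value;
        # indexing dic['name'] directly would raise KeyError when it is missing
        if isinstance(dic, dict) and dic.get('name') is not None:
            return dictstudent(dic)
        return dic

# Read the serialized JSON from pickling.json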
with open('pickling.json', 'r') as f:
    stu_pickling = f.read()
print(stu_pickling)
print(type(stu_pickling))

stu_unpickling = json.loads(stu_pickling, cls=StuDecoder)

print(stu_unpickling)

Method 2: load()

import json
class Student(object):
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score

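# Conversion function: build a Student from the decoded dict
def dictstudent(d):
    return Student(d['name'], d['age'], d['score'])

class StuDecoder(json.JSONDecoder):
    def decode(self, obj):
        dic = super().decode(obj)
        # Convert only when the top-level value is a dict with a 'name' value
        if isinstance(dic, dict) and dic.get('name') is not None:
            return dictstudent(dic)
        return dic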

with open('pickling.json', 'r') as f:
    stu_unpickling = json.load(f, cls=StuDecoder)
print(stu_unpickling)
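
Overriding decode() only sees the top-level value, so it does not help when the Student objects sit inside a list or another dict. An alternative, shown here as a sketch with a hypothetical StuHookDecoder class, is to keep the JSONDecoder subclass but register an object_hook in its constructor. It assumes the caller does not pass an object_hook of its own.

```python
import json


class Student:
    def __init__(self, name, age, score):
        self.name = name
        self.age = age
        self.score = score


class StuHookDecoder(json.JSONDecoder):
    """Hypothetical variant: register a hook instead of overriding decode()."""

    def __init__(self, **kwargs):
        # Assumes no object_hook is supplied by the caller.
        super().__init__(object_hook=self._to_student, **kwargs)

    @staticmethod
    def _to_student(d):
        # Called for every decoded JSON object, innermost first.
        if d.keys() >= {"name", "age", "score"}:
            return Student(d["name"], d["age"], d["score"])
        return d


text = '[{"name": "Bob", "age": 20, "score": 88}, {"id": 1}]'
items = json.loads(text, cls=StuHookDecoder)

print(type(items[0]).__name__)  # Student
print(items[1])                 # {'id': 1}
```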


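3. A combined example

Take a dictionary such as parameters = {'w1': np.arange(10), 'w2': np.array([[1, 2, 3], [3, 2, 1]])}. Serialize it and save it to a file, then read the file back and deserialize it into the original data, without changing the data types.

Analysis:
The values are ndarray objects, which are not Python built-in types and cannot be serialized directly, so they must first be converted to built-in types. The data is already a dictionary, so no conversion to a dict is needed. During deserialization, the values must be converted back to the corresponding NumPy types.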

import numpy as np
import json

parameters = {'w1': np.arange(10), 'w2': np.array([[1, 2, 3], [3, 2, 1]])}


class NpEncoder(json.JSONEncoder):
    def default(self, obj):
        # Choose the conversion based on the object's type
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super().default(obj)


class NpDecoder(json.JSONDecoder):
    def decode(self, obj):
        dic = super().decode(obj)
        # Convert the values back to arrays only when the top-level value
        # is a dict that contains the key 'w1'
        if isinstance(dic, dict) and dic.get('w1') is not None:
            for i in dic.keys():
                dic[i] = np.array(dic[i])
            return dic
        return dic


# Serialize the dict to JSON and save it to a file
with open("para_new.csv", 'w', encoding='utf-8') as f:
    json.dump(parameters, f, cls=NpEncoder)
# Read the JSON text back
with open('para_new.csv', 'r', encoding='utf-8') as f:
    para_pickling = f.read()

print(para_pickling)
print(type(para_pickling))

# Read the JSON from the file and deserialize it without converting types
with open('para_new.csv', 'r', encoding='utf-8') as f:
    para_unpickling = json.load(f)

print(type(para_unpickling['w1']), type(para_unpickling['w2']))  # both are list

# Read the JSON from the file and deserialize it with cls=NpDecoder
# to turn the values back into ndarrays
with open('para_new.csv', 'r', encoding='utf-8') as f:
    para_unpickling = json.load(f, cls=NpDecoder)

print(type(para_unpickling['w1']), type(para_unpickling['w2']))
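
The decoder above turns every value into an array and relies on NumPy to infer the dtype again. If the dictionary may also hold plain values, or the exact dtype matters, one option is to tag each array when encoding. This is a sketch under assumptions: the "__ndarray__" and "dtype" keys and the extra "lr" entry are hypothetical names chosen for this example.

```python
import json
import numpy as np


def encode_array(obj):
    """default function: tag arrays so the decoder can recognize them."""
    if isinstance(obj, np.ndarray):
        return {"__ndarray__": obj.tolist(), "dtype": str(obj.dtype)}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def decode_array(d):
    """object_hook: rebuild tagged arrays, leave other dicts untouched."""
    if "__ndarray__" in d:
        return np.array(d["__ndarray__"], dtype=d["dtype"])
    return d


parameters = {
    "w1": np.arange(10),
    "w2": np.array([[1.5, 2, 3], [3, 2, 1]]),
    "lr": 0.01,  # a plain float that must stay a float
}

text = json.dumps(parameters, default=encode_array)
restored = json.loads(text, object_hook=decode_array)

print(type(restored["w1"]).__name__)                  # ndarray
print(restored["w1"].dtype == parameters["w1"].dtype)  # True
print(restored["w2"].shape)                           # (2, 3)
print(type(restored["lr"]).__name__)                  # float
```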
