1、json的两种结构:
1、对象:对象在json中表现为“{ }”括起来的内容,数据结构为键值对结构{key:value,key:value},在面向对象的语言中key为对象的属性,value为对应的属性值,。取值方法为 对象.key 获取属性值,这个属性值的类型可以是数字、字符串、数组、对象几种。 大括号{}用来描述一组“不同类型的无序键值对集合,即各个key之间没有什么明确的关系。
下面举例json的对象结构:
{"firstName":"Brett","lastName":"McLaughlin","email":"aaaa"}
每个属性和值之间用":"分隔
不同属性之间用","分隔
json中,对象的键名必须用双引号括起来;同一个对象中不应该出现两个同名的属性。
2、数组:在json中数组是用[ ]括起来的内容,数据结构为{ item1,item2,…,itemn} 其中,item是key-value键值对的形式,取值方法和数组一样,使用索引获取,其中属性值的类型可以是:数字,字符串,数组,对象(这里暗示了数组[ ]里面可以前台对象{ })几种。方括号[]用来描述一组“相同类型的有序数据集合”
{
"people":[
{"firstName":"Brett","lastName":"McLaughlin","email":"aaaa"},
{"firstName":"Jason","lastName":"Hunter","email":"bbbb"},
{"firstName":"Elliotte","lastName":"Harold","email":"cccc"}
]
}
数组值之间用","分隔
访问形式为:people[0].firstname
输出:Brett
这个例子中people数组包含了三个值,这些值又是json对象。
数组和对象的最后一个成员的后面不能加 “,” 。经过对象、数组这2种结构就可以组合成复杂的数据结构了。
2、python的dict结构:
字典是另一种可变容器模型,且可存储任意类型对象。
字典的每个键值(key=>value)对用冒号(:)分割,每个对之间用逗号(,)分割,整个字典包括在花括号({})中 ,格式如下所示:
dict = {'Name': 'Zara', 'Age': 7, 'Name': 'Manni'};
.Attention
1、python字典的值可以去任何对象,既可以是python的标准对象,也可以是用户自定义的对象,但是dict的键不行。
2、不允许同一个键出现两次。创建时如果同一个键被赋值两次,后一个值会被记住。
3、键必须不可变,所以可以用数字,字符串或元组充当,所以用列表就不行。
3、序列化(serialization)和反序列化(deserialization):
序列化:将对象的状态信息(如python的简单数据类型:list,dict,tuple,int,float,unicode)转化为可存储或者可传输的内容(如json 、xml)的过程。
反序列化:从存储文件或者存储区域(如json 、xml)读取需要反序列化的对象状态,并重建该对象。
json:(javascript object notation)一种轻量级数据交换格式,相对于xml更简单、易阅读、编写、解析和生成,json是javascript中的一个子集
python中集成了json模块:
序列化——-encoding :将一个python对象编码成json字符串(注意:这里是指json字符串,言外之意是可以将非json字符串的文件转化为json字符串,然后再反序列化)
反序列化-—decoding:将一个字符串解码成python对象
4、序列化(serialization)和反序列化(deserialization):
序列化方式:(Python对象转换成json的对照图)
对于python对象内的各个元素,python的字典会被转化为json的对象,python的list,tuple会被转化为array,以此类推。
反序列化:(json 对象转换成Python对象的对照图)
对于json字符串内的各个元素,字符串会被转成pyhon的str,对象会被转成python的dict,数组会被转化为python的list以此类推
参考网址:https://docs.python.org/3/library/json.html
5、下面来举例子吧:
dumps是将dict转化成str格式,loads是将str转化成dict格式。
dump和load也是类似的功能,只是与文件操作结合起来了。
json文件如下:
{
"dataset": {
"train": {
"type": "mnist",
"data_set": "train",
"layout_x": "tensor"
},
"test": {
"type": "mnist",
"data_set": "test",
"layout_x": "tensor"
}
},
"train": {
"keep_model_in_mem": 0,
"random_state": 0,
"data_cache": {
"cache_in_disk": {
"default": 1
},
"keep_in_mem": {
"default": 0
},
"cache_dir": "/mnt/raid/fengji/gcforest/mnist/fg-tree500-depth100-3folds/datas"
}
},
"outputs": ["pool1/7x7/ets", "pool1/7x7/rf", "pool1/10x10/ets", "pool1/10x10/rf", "pool1/13x13/ets", "pool1/13x13/rf"],
"estimators": [
{"n_folds":3,"type":"ExtraTreesClassifier","n_estimators":500,"max_depth":100,"n_jobs":-1,"min_samples_leaf":10},
{"n_folds":3,"type":"RandomForestClassifier","n_estimators":500,"max_depth":100,"n_jobs":-1,"min_samples_leaf":10}
]
}
python对象如下:
dictObject={
"dataset": {
"train": {
"type": "mnist",
"data_set": "train",
"layout_x": "tensor"
},
"test": {
"type": "mnist",
"data_set": "test",
"layout_x": "tensor"
}
},
"train": {
"keep_model_in_mem": 0,
"random_state": 0,
"data_cache": {
"cache_in_disk": {
"default": 1
},
"keep_in_mem": {
"default": 0
},
"cache_dir": "/mnt/raid/fengji/gcforest/mnist/fg-tree500-depth100-3folds/datas"
}
},
"outputs": ["pool1/7x7/ets", "pool1/7x7/rf", "pool1/10x10/ets", "pool1/10x10/rf", "pool1/13x13/ets", "pool1/13x13/rf"],
"estimators": [
{"n_folds":3,"type":"ExtraTreesClassifier","n_estimators":500,"max_depth":100,"n_jobs":-1,"min_samples_leaf":10},
{"n_folds":3,"type":"RandomForestClassifier","n_estimators":500,"max_depth":100,"n_jobs":-1,"min_samples_leaf":10}
]
}
反序列化:json——>python
# 读取json数据的代码:
path="D:\PycharmProjects/untitled\jsonformat"
with open(path) as f:
deserialization=json.load(f)
print(type(deserialization))
print(type(deserialization["train"]))
print(type(deserialization["outputs"]))
print(type(deserialization["estimators"]))
print(type(deserialization["dataset"]["train"]["type"]))
print(type(deserialization["train"]["keep_model_in_mem"]))
print(type(deserialization["train"]["data_cache"]))
# 对应的输出:
<class 'dict'>
<class 'dict'>
<class 'list'>
<class 'list'>
<class 'str'>
<class 'int'>
<class 'dict'>
# 与json--->python的对照图一致
序列化:python——>json
# 读取python对象的代码:
serialization=json.dumps(dictObject)
print(type(serialization))
print(serialization)
# 输出结果:
<class 'str'>
{"dataset": {"train": {"type": "mnist", "layout_x": "tensor", "data_set": "train"}, "test": {"type": "mnist", "layout_x": "tensor", "data_set": "test"}}, "train": {"random_state": 0, "data_cache": {"cache_in_disk": {"default": 1}, "cache_dir": "/mnt/raid/fengji/gcforest/mnist/fg-tree500-depth100-3folds/datas", "keep_in_mem": {"default": 0}}, "keep_model_in_mem": 0}, "outputs": ["pool1/7x7/ets", "pool1/7x7/rf", "pool1/10x10/ets", "pool1/10x10/rf", "pool1/13x13/ets", "pool1/13x13/rf"], "estimators": [{"type": "ExtraTreesClassifier", "n_jobs": -1, "n_estimators": 500, "max_depth": 100, "n_folds": 3, "min_samples_leaf": 10}, {"type": "RandomForestClassifier", "n_jobs": -1, "n_estimators": 500, "max_depth": 100, "n_folds": 3, "min_samples_leaf": 10}]}