Let's see whether the Visual Genome dataset can be turned into a complete knowledge base (screenshots included, hopefully useful; this can be combined with my other articles).



Continuously updated. Goal: extract the relations in relationships.json, build a dictionary of triples, and visualize them as one big graph. (By the time I finished this post I realized the relationships file alone is enough to extract the triples I want. Oops!)

(For now I have only organized things down to the object level. Object-attribute relations, for example, cannot be found in relationships.json; for those you would need to parse region_graph.json. Some other time!)

First, let's look at the JSON contents, following the blog post I referenced:

A single image's record gives the image's width and height, its image_id, coco_id, flickr_id, and URL:

Information of a single image:
 
{
 'width': 800,
 'url': 'https://cs.stanford.edu/people/rak248/VG_100K_2/1.jpg',
 'height': 600,
 'image_id': 1,
 'coco_id': None,  # not clear which COCO release this id refers to; even images that have a coco_id point at URLs under VG_100K, which is unhelpful
 'flickr_id': None
}
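Out of curiosity about that coco_id, here is a quick sketch (assuming image_data.json sits in the working directory) that counts how many records actually carry a coco_id or a flickr_id:

import json

with open('image_data.json', 'r') as f:
	images = json.load(f)  # a list of records like the one above

num_coco = sum(1 for img in images if img['coco_id'] is not None)
num_flickr = sum(1 for img in images if img['flickr_id'] is not None)
print(len(images), num_coco, num_flickr)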

Information in objects.json: objects are parsed per image, in image order. Each image is one element of a list, and each element holds all of that image's objects: the WordNet synsets, the position and size of the box in the image, the object's name label(s), and an object_id that is assigned across all objects in the dataset, so two trees get two different ids. As for 'merged_object_ids', I'm not sure, but presumably these are duplicate annotations that were merged into this object.

type and length of objects json <class 'list'> 108077

{
 'image_id': 1,
 'objects': [
  {'synsets': ['tree.n.01'], 'h': 557, 'object_id': 1058549, 'merged_object_ids': [], 'names': ['trees'], 'w': 799, 'y': 0, 'x': 0},
  {'synsets': ['sidewalk.n.01'], 'h': 290, 'object_id': 1058534, 'merged_object_ids': [5046], 'names': ['sidewalk'], 'w': 722, 'y': 308, 'x': 78},
  {'synsets': ['building.n.01'], 'h': 538, 'object_id': 1058508, 'merged_object_ids': [], 'names': ['building'], 'w': 222, 'y': 0, 'x': 1},
  {'synsets': ['street.n.01'], 'h': 258, 'object_id': 1058539, 'merged_object_ids': [3798578], 'names': ['street'], 'w': 359, 'y': 283, 'x': 439},
  {'synsets': ['wall.n.01'], 'h': 535, 'object_id': 1058543, 'merged_object_ids': [], 'names': ['wall'], 'w': 135, 'y': 1, 'x': 0},
  {'synsets': ['tree.n.01'], 'h': 360, 'object_id': 1058545, 'merged_object_ids': [], 'names': ['tree'], 'w': 476, 'y': 0, 'x': 178},
  {'synsets': ['shade.n.01'], 'h': 189, 'object_id': 5045, 'merged_object_ids': [], 'names': ['shade'], 'w': 274, 'y': 344, 'x': 116},
  {'synsets': ['van.n.05'], 'h': 176, 'object_id': 1058542, 'merged_object_ids': [1058536], 'names': ['van'], 'w': 241, 'y': 278, 'x': 533},
  {'synsets': ['trunk.n.01'], 'h': 348, 'object_id': 5055, 'merged_object_ids': [], 'names': ['tree trunk'], 'w': 78, 'y': 213, 'x': 623},
  {'synsets': ['clock.n.01'], 'h': 363, 'object_id': 1058498, 'merged_object_ids': [], 'names': ['clock'], 'w': 77, 'y': 63, 'x': 422},
  # ...truncated
 ],
 'image_url': 'https://cs.stanford.edu/people/rak248/VG_100K_2/1.jpg'
}
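To sanity-check the x, y, w, h fields, a minimal sketch that draws each box of the first record onto its image (this assumes the Pillow and requests packages are installed; 'vg_1_boxes.jpg' is just a made-up output name):

import json
from io import BytesIO

import requests
from PIL import Image, ImageDraw

with open('objects.json', 'r') as f:
	objects = json.load(f)

rec = objects[0]  # the image_id 1 record shown above
img = Image.open(BytesIO(requests.get(rec['image_url']).content))
draw = ImageDraw.Draw(img)
for obj in rec['objects']:
	x, y, w, h = obj['x'], obj['y'], obj['w'], obj['h']
	# (x, y) is the top-left corner; w and h are the box size
	draw.rectangle([x, y, x + w, y + h], outline='red')
	draw.text((x + 2, y + 2), obj['names'][0], fill='red')
img.save('vg_1_boxes.jpg')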

Parsing relationships.json: just like objects.json, the list has one element per image. Each image holds a set of relationships, and each relationship gives the predicate, the subject, the object, a relationship_id, and the predicate's synsets. The subject and object are both drawn from the object set; their fields are explained in the objects.json notes above.

type of relationships json <class 'list'> 108077

{'relationships':
[{
# the predicate
'predicate': 'ON',

# the object of the relationship
'object':
{'h': 290,
'object_id': 1058534,
'merged_object_ids': [5046],
'synsets': ['sidewalk.n.01'],
'w': 722, 'y': 308, 'x': 78,
'names': ['sidewalk']},

# the relationship's id
'relationship_id': 15927,

# the predicate's synsets
'synsets': ['along.r.01'],

# the subject of the relationship
'subject':
{'name': 'shade',
'h': 192,
'synsets': ['shade.n.01'],
'object_id': 5045,
'w': 274, 'y': 338, 'x': 119}

},

# the remaining entries follow the same pattern

{'predicate': 'wears',
'object': {'h': 28, 'object_id': 1058525, 'merged_object_ids': [5048], 'synsets': ['shoe.n.01'], 'w': 48, 'y': 485, 'x': 388, 'names': ['shoes']},
'relationship_id': 15928,
'synsets': ['wear.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},

{'predicate': 'has',
'object': {'name': 'headlight', 'h': 15, 'synsets': ['headlight.n.01'], 'object_id': 5050, 'w': 23, 'y': 366, 'x': 514},
'relationship_id': 15929,
'synsets': ['have.v.01'],
'subject': {'name': 'car', 'h': 98, 'synsets': ['car.n.01'], 'object_id': 5049, 'w': 74, 'y': 315, 'x': 479}},

{'predicate': 'ON',
'object': {'name': 'building', 'h': 536, 'synsets': ['building.n.01'], 'object_id': 1058508, 'w': 218, 'y': 2, 'x': 1},
'relationship_id': 15930,
'synsets': ['along.r.01'],
'subject': {'name': 'sign', 'h': 182, 'synsets': ['sign.n.02'], 'object_id': 1058507, 'w': 88, 'y': 13, 'x': 118}},

{'predicate': 'ON',
'object': {'name': 'sidewalk', 'h': 266, 'synsets': ['sidewalk.n.01'], 'object_id': 1058534, 'w': 722, 'y': 331, 'x': 77},
'relationship_id': 15931,
'synsets': ['along.r.01'],
'subject': {'name': 'tree trunk', 'h': 327, 'synsets': ['trunk.n.01'], 'object_id': 5055, 'w': 87, 'y': 234, 'x': 622}},

{'predicate': 'has',
'object': {'name': 'shirt', 'h': 101, 'synsets': ['shirt.n.01'], 'object_id': 1058511, 'w': 59, 'y': 289, 'x': 241},
'relationship_id': 15932,
'synsets': ['have.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},

{'predicate': 'next to',
'object': {'name': 'street', 'h': 233, 'synsets': ['street.n.01'], 'object_id': 1058539, 'w': 440, 'y': 283, 'x': 358},
'relationship_id': 15933,
'synsets': ['next.r.01'],
'subject': {'name': 'sidewalk', 'h': 266, 'synsets': ['sidewalk.n.01'], 'object_id': 1058534, 'w': 722, 'y': 331, 'x': 77}},

{'predicate': 'has',
'object': {'name': 'back', 'h': 170, 'synsets': ['back.n.01'], 'object_id': 5060, 'w': 67, 'y': 339, 'x': 721},
'relationship_id': 15934,
'synsets': ['have.v.01'],
'subject': {'name': 'car', 'h': 174, 'synsets': ['car.n.01'], 'object_id': 1058515, 'w': 91, 'y': 342, 'x': 708}},

{'predicate': 'has',
'object': {'name': 'glasses', 'h': 12, 'synsets': ['spectacles.n.01'], 'object_id': 1058518, 'w': 20, 'y': 268, 'x': 271},
'relationship_id': 15935,
'synsets': ['have.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},

{'predicate': 'ON',
'object': {'name': 'sidewalk', 'h': 266, 'synsets': ['sidewalk.n.01'], 'object_id': 1058534, 'w': 722, 'y': 331, 'x': 77},
'relationship_id': 15936,
'synsets': ['along.r.01'],
'subject': {'name': 'parking meter', 'h': 143, 'synsets': ['parking_meter.n.01'], 'object_id': 1058519, 'w': 32, 'y': 327, 'x': 574}},

{'predicate': 'wears',
'object': {'h': 28, 'object_id': 1058525, 'merged_object_ids': [5048], 'synsets': ['shoe.n.01'], 'w': 48, 'y': 485, 'x': 388, 'names': ['shoes']},
'relationship_id': 15937,
'synsets': ['wear.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},

{'predicate': 'has',
'object': {'name': 'shoes', 'h': 34, 'synsets': ['shoe.n.01'], 'object_id': 1058525, 'w': 46, 'y': 481, 'x': 391},
'relationship_id': 15938,
'synsets': ['have.v.01'],
'subject': {'name': 'man', 'h': 251, 'synsets': ['man.n.01'], 'object_id': 1058532, 'w': 75, 'y': 264, 'x': 372}},

{'predicate': 'has',
'object': {'name': 'shirt', 'h': 101, 'synsets': ['shirt.n.01'], 'object_id': 1058511, 'w': 59, 'y': 289, 'x': 241},
'relationship_id': 15939,
'synsets': ['have.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},

{'predicate': 'wears',
'object': {'name': 'pants', 'h': 118, 'synsets': ['trouser.n.01'], 'object_id': 1058528, 'w': 38, 'y': 384, 'x': 245},
'relationship_id': 15940,
'synsets': ['wear.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},

# ...truncated

], 

'image_id': 1}
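So the (subject, predicate, object) triples I want can be read straight out of this one file. A minimal sketch; note that some entries carry a 'names' list instead of a single 'name' (see the 'shoes' object in the dump above), so the code falls back accordingly:

import json

with open('relationships.json', 'r') as f:
	relationships = json.load(f)

triples = []
for image in relationships:
	for rel in image['relationships']:
		subj = rel['subject'].get('name') or rel['subject']['names'][0]
		obj = rel['object'].get('name') or rel['object']['names'][0]
		triples.append((subj, rel['predicate'].lower().strip(), obj))

print(len(triples))
print(triples[:3])  # for image 1: ('shade', 'on', 'sidewalk'), ('man', 'wears', 'shoes'), ...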

Next, a Python script that builds object and predicate vocabularies from these files (loading the JSONs, resolving aliases, and filtering out rare or tiny entries). Running it ends with output like:

Found 46 relationship types with >= 500 training instances
import argparse
import json
from collections import Counter, defaultdict
parser = argparse.ArgumentParser()
parser.add_argument('--min_objects_per_image', default=3, type=int)
parser.add_argument('--max_objects_per_image', default=30, type=int)
parser.add_argument('--max_attributes_per_image', default=30, type=int)
parser.add_argument('--min_relationships_per_image', default=1, type=int)
parser.add_argument('--max_relationships_per_image', default=30, type=int)
args = parser.parse_args()  # note: these per-image limits are parsed but never used below

#load aliases for objects and relationships
def load_aliases(alias_path):
	aliases = {}
	with open(alias_path, 'r') as f:
		for line in f:
			# strip() removes whitespace from both ends of each token
			line = [s.strip() for s in line.split(',')]
			for s in line:
				aliases[s] = line[0]  # map every alias to the first (canonical) name
	return aliases
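# Note: each line of the alias files is comma-separated with the canonical
# name first, so a hypothetical line like "man,men,guy" would map all three
# strings to "man".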

################-------object_idx_to_name-------###############
def create_object_vocab(min_object_instances, image_ids, objects, aliases, vocab):
	image_ids = set(image_ids)

	print('Making object vocab from %d training images' % len(image_ids))

	# count how often each canonical object name occurs
	object_name_counter = Counter()
	for image in objects:
		# if image['image_id'] not in image_ids:
		# 	continue
		for obj in image['objects']:
			names = set()
			for name in obj['names']:
				names.add(aliases.get(name, name))
			object_name_counter.update(names)

	object_names = ['__image__']
	for name, count in object_name_counter.most_common():
		if count >= min_object_instances:
			object_names.append(name)
	print('Found %d object categories with >= %d training instances' %
		(len(object_names), min_object_instances))

	object_name_to_idx = {}
	object_idx_to_name = []
	for idx, name in enumerate(object_names):
		object_name_to_idx[name] = idx
		object_idx_to_name.append(name)

	vocab['object_name_to_idx'] = object_name_to_idx
	vocab['object_idx_to_name'] = object_idx_to_name

################-------object_id_to_obj-------###############
def filter_objects(min_object_size, objects, aliases, vocab, splits):
	all_image_ids = splits  # here splits is already just a collection of image ids
	# all_image_ids = set()
	# for image_ids in splits.values():
	# 	all_image_ids |= set(image_ids)

	object_name_to_idx = vocab['object_name_to_idx']
	object_id_to_obj = {}

	num_too_small = 0
	for image in objects:
		image_id = image['image_id']
		if image_id not in all_image_ids:
			continue
		for obj in image['objects']:
			object_id = obj['object_id']
			final_name = None
			final_name_idx = None
			for name in obj['names']:
				name = aliases.get(name, name)
				if name in object_name_to_idx:
					final_name = name
					final_name_idx = object_name_to_idx[final_name]
					break
			w, h = obj['w'], obj['h']
			too_small = (w < min_object_size) or (h < min_object_size)
			if too_small:
				num_too_small += 1
			if final_name is not None and not too_small:
				object_id_to_obj[object_id] = {
				'name': final_name,
				'name_idx': final_name_idx,
				'synsets':obj['synsets'],
				'box': [obj['x'], obj['y'], obj['w'], obj['h']],
				}
	print('Skipped %d objects with size < %d' % (num_too_small, min_object_size))
	return object_id_to_obj

################-------Relationship-------###############
def create_rel_vocab(min_relationship_instances, image_ids, relationships, object_id_to_obj, rel_aliases, vocab):
	pred_counter = defaultdict(int)
	image_ids_set = set(image_ids)
	for image in relationships:
		image_id = image['image_id']
		if image_id not in image_ids_set:
			continue
		for rel in image['relationships']:
			sid = rel['subject']['object_id']
			oid = rel['object']['object_id']
			found_subject = sid in object_id_to_obj
			found_object = oid in object_id_to_obj
			if not found_subject or not found_object:
				continue
			pred = rel['predicate'].lower().strip()
			pred = rel_aliases.get(pred, pred)
			rel['predicate'] = pred
			pred_counter[pred] += 1

	pred_names = ['__in_image__']
	for pred, count in pred_counter.items():
		if count >= min_relationship_instances:
			pred_names.append(pred)
	print('Found %d relationship types with >= %d training instances'
		% (len(pred_names), min_relationship_instances))

	pred_name_to_idx = {}
	pred_idx_to_name = []
	for idx, name in enumerate(pred_names):
		pred_name_to_idx[name] = idx
		pred_idx_to_name.append(name)

	vocab['pred_name_to_idx'] = pred_name_to_idx
	vocab['pred_idx_to_name'] = pred_idx_to_name

# load image meta-information
with open('image_data.json', 'r') as f:
	# a list of 108,077 image records
	images = json.load(f)

#dict image_id to image instance
image_id_to_image = {i['image_id']: i for i in images}

# load alias maps for object and relationship names
obj_aliases = load_aliases('object_alias.txt')
rel_aliases = load_aliases('relationship_alias.txt')

# create vocab; index 0 is reserved for the special '__image__' / '__in_image__' tokens, so real categories start at 1
vocab = {}
min_object_instances = 2000
min_object_size = 32
min_relationship_instances = 500

#load objects
with open('objects.json', 'r') as f:
	objects = json.load(f)



# use all image ids, since there is no train/val split here
image_ids = image_id_to_image.keys()
# train_ids = splits['train']
create_object_vocab(min_object_instances, image_ids, objects, obj_aliases, vocab)

#split->image_ids
object_id_to_obj = filter_objects(min_object_size, objects, obj_aliases, vocab, image_ids)

#load relationships
with open('relationships.json', 'r') as f:
	relationships = json.load(f)

create_rel_vocab(min_relationship_instances, image_ids, relationships, object_id_to_obj, rel_aliases, vocab)
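
# Finally, the vocab can be dumped for downstream use (a sketch; 'vocab.json'
# is just a made-up output name):
with open('vocab.json', 'w') as f:
	json.dump(vocab, f)
print(len(vocab['object_idx_to_name']), len(vocab['pred_idx_to_name']))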

After some processing, here is a screenshot of the resulting triples:

[screenshot: triple dictionary]

Still quite interesting. Unfortunately the files got far too big: I saved just 1,000 of them and it already came to 25 GB, which made me question my life choices......

Later I found a better way: parse the relationships.json file directly.

Below is a screenshot, filtered to pairs that occur more than 10 times (some are still missing; I was too lazy to chase them down). The <obj> token separates subject from object, and the number is a rough count of how many times the two objects co-occur; a sketch of producing these counts follows the examples below.

[screenshot: subject<obj>object pair counts]

For example, from the screenshot above:

shade on street
man wear sneaker
car has headlight
sign on building
man has shirt
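A minimal sketch of how such pair counts could be produced, reusing the relationships list loaded in the earlier sketch ('pairs.txt' is a made-up output name):

from collections import Counter

pair_counter = Counter()
for image in relationships:
	for rel in image['relationships']:
		subj = rel['subject'].get('name') or rel['subject']['names'][0]
		obj = rel['object'].get('name') or rel['object']['names'][0]
		pair_counter[subj + '<obj>' + obj] += 1

with open('pairs.txt', 'w') as f:
	for pair, count in pair_counter.most_common():
		if count > 10:  # keep only pairs that co-occur more than 10 times
			f.write('%s %d\n' % (pair, count))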

Result screenshot:

[screenshot: results]

To wrap up, here is one image together with its questions, its descriptions, and its answers (the questions in this dataset are annotated as Open-Ended):

{"image_id": 57870, "question": "What are the chairs made off?", "question_id": 57870000},
{"image_id": 57870, "question": "Is this a dinner setting?", "question_id": 57870001}, 
{"image_id": 57870, "question": "Is there exposed brick on the walls?", "question_id": 57870002}
A restaurant has modern wooden tables and chairs .
A long restaurant table with rattan rounded back chairs .
a long table with a plant on top of it surrounded with wooden chairs
A long table with a flower arrangement in the middle for meetings
A table is adorned with wooden chairs with blue accents .
{"question_type": "what are the", "multiple_choice_answer": "wood", 
"answers": [{"answer": "wood and wicker", "answer_confidence": "maybe", "answer_id": 1}, 
{"answer": "rattan", "answer_confidence": "yes", "answer_id": 2}, 
{"answer": "bamboo", "answer_confidence": "maybe", "answer_id": 3},
 {"answer": "wood", "answer_confidence": "yes", "answer_id": 4},
 {"answer": "wood", "answer_confidence": "yes", "answer_id": 5},
 {"answer": "wood", "answer_confidence": "yes", "answer_id": 6},
 {"answer": "wood", "answer_confidence": "yes", "answer_id": 7},
 {"answer": "wicker", "answer_confidence": "yes", "answer_id": 8},
 {"answer": "wood", "answer_confidence": "yes", "answer_id": 9},
 {"answer": "wood", "answer_confidence": "yes", "answer_id": 10}], 
"image_id": 57870, "answer_type": "other", "question_id": 57870000},

 {"question_type": "is this a", "multiple_choice_answer": "yes",
 "answers": [{"answer": "no", "answer_confidence": "yes", "answer_id": 1}, 
{"answer": "yes", "answer_confidence": "yes", "answer_id": 2}, 
{"answer": "no", "answer_confidence": "yes", "answer_id": 3}, 
{"answer": "yes", "answer_confidence": "maybe", "answer_id": 4}, 
{"answer": "yes", "answer_confidence": "yes", "answer_id": 5},
 {"answer": "not sure", "answer_confidence": "maybe", "answer_id": 6}, 
{"answer": "yes", "answer_confidence": "maybe", "answer_id": 7},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 8}, 
{"answer": "yes", "answer_confidence": "yes", "answer_id": 9},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 10}], 
"image_id": 57870, "answer_type": "yes/no", "question_id": 57870001}, 

{"question_type": "is there", "multiple_choice_answer": "yes", 
"answers": [{"answer": "yes", "answer_confidence": "yes", "answer_id": 1}, 
{"answer": "yes", "answer_confidence": "yes", "answer_id": 2}, 
{"answer": "yes", "answer_confidence": "yes", "answer_id": 3}, 
{"answer": "yes", "answer_confidence": "yes", "answer_id": 4},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 5},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 6},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 7},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 8},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 9},
 {"answer": "yes", "answer_confidence": "yes", "answer_id": 10}], 
"image_id": 57870, "answer_type": "yes/no", "question_id": 57870002},

COCO_train2014_000000057870.jpg 
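As far as I can tell, multiple_choice_answer is simply the most frequent of the ten free-form answers; a sketch of recomputing it (annotation stands for one entry of the answer dump above):

from collections import Counter

def majority_answer(annotation):
	counts = Counter(a['answer'] for a in annotation['answers'])
	return counts.most_common(1)[0][0]

# e.g. for question_id 57870000 above this returns 'wood'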
