Let's see whether the Visual Genome dataset can serve as a complete knowledge base (screenshots included, hopefully useful! It can be read together with my other articles).
Continuously updated -> goal -> extract the relations in relationships.json to build a dictionary of triples and visualize them as one big graph. (After finishing this post I realized that relationships.json alone is enough to extract the triples I wanted!)
(For now this only goes down to the object level; object-attribute relations cannot be found in relationships.json, so extracting that layer would require parsing region_graphs.json. Left for later!)
First, let's look at the JSON output, following the blog post linked in my references:
Information of a single image (the list is ordered by image; each record gives the image's width and height, its image_id, coco_id, flickr_id, and url):
{
'width': 800,
'url': 'https://cs.stanford.edu/people/rak248/VG_100K_2/1.jpg',
'height': 600,
'image_id': 1,
'coco_id': None, # not sure which COCO release this id refers to; even images that do have a coco_id keep URLs under VG_100K, which makes it less useful
'flickr_id': None
}
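A minimal sketch of how this record can be inspected (assuming image_data.json has been downloaded into the working directory):
import json

# load the image metadata and look at the first record
with open('image_data.json', 'r') as f:
    images = json.load(f)
print(type(images), len(images))  # <class 'list'> 108077
print(images[0])                  # the record shown above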
Contents of objects.json (the list is ordered by image: each element corresponds to one image and holds all of that image's objects. Per object: the WordNet synsets, the box position (x, y) and size (w, h), the object's name(s), and the object_id, which is assigned over all annotated objects in the dataset, so two trees get two different ids. I'm not sure what 'merged_object_ids' means.)
type and length of objects json <class 'list'> 108077
{
'image_id': 1,
'objects':
[
{'synsets': ['tree.n.01'],
'h': 557,
'object_id': 1058549,
'merged_object_ids': [],
'names': ['trees'],
'w': 799, 'y': 0, 'x': 0},
{'synsets': ['sidewalk.n.01'],
'h': 290,
'object_id': 1058534,
'merged_object_ids': [5046],
'names': ['sidewalk'],
'w': 722, 'y': 308, 'x': 78},
{'synsets': ['building.n.01'],
'h': 538,
'object_id': 1058508,
'merged_object_ids': [],
'names': ['building'],
'w': 222, 'y': 0, 'x': 1},
{'synsets': ['street.n.01'],
'h': 258,
'object_id': 1058539,
'merged_object_ids': [3798578],
'names': ['street'],
'w': 359, 'y': 283, 'x': 439},
{'synsets': ['wall.n.01'],
'h': 535,
'object_id': 1058543,
'merged_object_ids': [],
'names': ['wall'],
'w': 135, 'y': 1, 'x': 0},
{'synsets': ['tree.n.01'],
'h': 360,
'object_id': 1058545,
'merged_object_ids': [],
'names': ['tree'],
'w': 476, 'y': 0, 'x': 178},
{'synsets': ['shade.n.01'],
'h': 189,
'object_id': 5045,
'merged_object_ids': [],
'names': ['shade'],
'w': 274, 'y': 344, 'x': 116},
{'synsets': ['van.n.05'],
'h': 176,
'object_id': 1058542,
'merged_object_ids': [1058536],
'names': ['van'],
'w': 241, 'y': 278, 'x': 533},
{'synsets': ['trunk.n.01'],
'h': 348,
'object_id': 5055,
'merged_object_ids': [],
'names': ['tree trunk'],
'w': 78, 'y': 213, 'x': 623},
{'synsets': ['clock.n.01'],
'h': 363,
'object_id': 1058498,
'merged_object_ids': [],
'names': ['clock'],
'w': 77, 'y': 63, 'x': 422},
# ... (truncated)
],
'image_url': 'https://cs.stanford.edu/people/rak248/VG_100K_2/1.jpg'}
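Since each object carries an (x, y, w, h) box, a quick sanity check is to draw the boxes on the image. A minimal sketch with Pillow, assuming the VG_100K_2 images were unpacked locally (the paths are assumptions):
import json
from PIL import Image, ImageDraw

with open('objects.json', 'r') as f:
    objects = json.load(f)

entry = objects[0]  # image_id 1, the record shown above
img = Image.open('VG_100K_2/1.jpg').convert('RGB')
draw = ImageDraw.Draw(img)
for obj in entry['objects']:
    x, y, w, h = obj['x'], obj['y'], obj['w'], obj['h']
    draw.rectangle([x, y, x + w, y + h], outline='red')
    draw.text((x, y), obj['names'][0], fill='red')
img.save('vg_1_boxes.jpg')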
Parsing relationships.json (like objects.json, this is a list with one element per image. Each image holds a set of relationships, and each relationship records the predicate, the subject, the object, a relationship_id, and the predicate's synsets. Subject and object both come from the object set; see the objects.json notes above for their fields.)
type of relationships json <class 'list'> 108077
{'relationships':
[{
# predicate
'predicate': 'ON',
# object of the predicate
'object':
{'h': 290,
'object_id': 1058534,
'merged_object_ids': [5046],
'synsets': ['sidewalk.n.01'],
'w': 722, 'y': 308, 'x': 78,
'names': ['sidewalk']},
# id of this relationship
'relationship_id': 15927,
# synsets of the predicate
'synsets': ['along.r.01'],
# subject of the predicate
'subject':
{'name': 'shade',
'h': 192,
'synsets': ['shade.n.01'],
'object_id': 5045,
'w': 274, 'y': 338, 'x': 119}
},
# the remaining entries follow the same pattern
{'predicate': 'wears',
'object': {'h': 28, 'object_id': 1058525, 'merged_object_ids': [5048], 'synsets': ['shoe.n.01'], 'w': 48, 'y': 485, 'x': 388, 'names': ['shoes']},
'relationship_id': 15928,
'synsets': ['wear.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},
{'predicate': 'has',
'object': {'name': 'headlight', 'h': 15, 'synsets': ['headlight.n.01'], 'object_id': 5050, 'w': 23, 'y': 366, 'x': 514},
'relationship_id': 15929,
'synsets': ['have.v.01'],
'subject': {'name': 'car', 'h': 98, 'synsets': ['car.n.01'], 'object_id': 5049, 'w': 74, 'y': 315, 'x': 479}},
{'predicate': 'ON',
'object': {'name': 'building', 'h': 536, 'synsets': ['building.n.01'], 'object_id': 1058508, 'w': 218, 'y': 2, 'x': 1},
'relationship_id': 15930,
'synsets': ['along.r.01'],
'subject': {'name': 'sign', 'h': 182, 'synsets': ['sign.n.02'], 'object_id': 1058507, 'w': 88, 'y': 13, 'x': 118}},
{'predicate': 'ON',
'object': {'name': 'sidewalk', 'h': 266, 'synsets': ['sidewalk.n.01'], 'object_id': 1058534, 'w': 722, 'y': 331, 'x': 77},
'relationship_id': 15931,
'synsets': ['along.r.01'],
'subject': {'name': 'tree trunk', 'h': 327, 'synsets': ['trunk.n.01'], 'object_id': 5055, 'w': 87, 'y': 234, 'x': 622}},
{'predicate': 'has',
'object': {'name': 'shirt', 'h': 101, 'synsets': ['shirt.n.01'], 'object_id': 1058511, 'w': 59, 'y': 289, 'x': 241},
'relationship_id': 15932,
'synsets': ['have.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},
{'predicate': 'next to',
'object': {'name': 'street', 'h': 233, 'synsets': ['street.n.01'], 'object_id': 1058539, 'w': 440, 'y': 283, 'x': 358},
'relationship_id': 15933,
'synsets': ['next.r.01'],
'subject': {'name': 'sidewalk', 'h': 266, 'synsets': ['sidewalk.n.01'], 'object_id': 1058534, 'w': 722, 'y': 331, 'x': 77}},
{'predicate': 'has',
'object': {'name': 'back', 'h': 170, 'synsets': ['back.n.01'], 'object_id': 5060, 'w': 67, 'y': 339, 'x': 721},
'relationship_id': 15934,
'synsets': ['have.v.01'],
'subject': {'name': 'car', 'h': 174, 'synsets': ['car.n.01'], 'object_id': 1058515, 'w': 91, 'y': 342, 'x': 708}},
{'predicate': 'has',
'object': {'name': 'glasses', 'h': 12, 'synsets': ['spectacles.n.01'], 'object_id': 1058518, 'w': 20, 'y': 268, 'x': 271},
'relationship_id': 15935,
'synsets': ['have.v.01'], 'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},
{'predicate': 'ON',
'object': {'name': 'sidewalk', 'h': 266, 'synsets': ['sidewalk.n.01'], 'object_id': 1058534, 'w': 722, 'y': 331, 'x': 77},
'relationship_id': 15936,
'synsets': ['along.r.01'],
'subject': {'name': 'parking meter', 'h': 143, 'synsets': ['parking_meter.n.01'], 'object_id': 1058519, 'w': 32, 'y': 327, 'x': 574}},
{'predicate': 'wears',
'object': {'h': 28, 'object_id': 1058525, 'merged_object_ids': [5048], 'synsets': ['shoe.n.01'], 'w': 48, 'y': 485, 'x': 388, 'names': ['shoes']},
'relationship_id': 15937,
'synsets': ['wear.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},
{'predicate': 'has',
'object': {'name': 'shoes', 'h': 34, 'synsets': ['shoe.n.01'], 'object_id': 1058525, 'w': 46, 'y': 481, 'x': 391},
'relationship_id': 15938,
'synsets': ['have.v.01'],
'subject': {'name': 'man', 'h': 251, 'synsets': ['man.n.01'], 'object_id': 1058532, 'w': 75, 'y': 264, 'x': 372}},
{'predicate': 'has',
'object': {'name': 'shirt', 'h': 101, 'synsets': ['shirt.n.01'], 'object_id': 1058511, 'w': 59, 'y': 289, 'x': 241},
'relationship_id': 15939,
'synsets': ['have.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},
{'predicate': 'wears',
'object': {'name': 'pants', 'h': 118, 'synsets': ['trouser.n.01'], 'object_id': 1058528, 'w': 38, 'y': 384, 'x': 245},
'relationship_id': 15940,
'synsets': ['wear.v.01'],
'subject': {'name': 'man', 'h': 262, 'synsets': ['man.n.01'], 'object_id': 1058529, 'w': 60, 'y': 249, 'x': 238}},
# ... (truncated)
],
'image_id': 1}
The script below builds the object and relationship vocabularies; run over the full dataset, it finishes with output like:
Found 46 relationship types with >= 500 training instances
import json
from collections import Counter, defaultdict
#load aliases for objects and relationships
def load_aliases(alias_path):
aliases = {}
with open(alias_path, 'r') as f:
for line in f:
## strip() remove spaces at the beginning and at the end of the string
line = [s.strip() for s in line.split(',')]
for s in line:
aliases[s] = line[0]
return aliases
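# Example (the alias line is hypothetical, not taken from the real alias files):
# a line 'man,men,guy' yields {'man': 'man', 'men': 'man', 'guy': 'man'},
# i.e. every variant is normalized to the first name on its line.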
################-------object_idx_to_name-------###############
def create_object_vocab(min_object_instances, image_ids, objects, aliases, vocab):
image_ids = set(image_ids)
print('Making object vocab from %d training images' % len(image_ids))
    object_name_counter = Counter()
    for image in objects:
        # no train/val split here: count object names over all images
        for obj in image['objects']:
            names = set()
            for name in obj['names']:
                # normalize each raw name through the alias table
                names.add(aliases.get(name, name))
            object_name_counter.update(names)
    # index 0 is reserved for the special __image__ node
    object_names = ['__image__']
for name, count in object_name_counter.most_common():
if count >= min_object_instances:
object_names.append(name)
print('Found %d object categories with >= %d training instances' %
(len(object_names), min_object_instances))
object_name_to_idx = {}
object_idx_to_name = []
for idx, name in enumerate(object_names):
object_name_to_idx[name] = idx
object_idx_to_name.append(name)
vocab['object_name_to_idx'] = object_name_to_idx
vocab['object_idx_to_name'] = object_idx_to_name
################-------object_id_to_obj-------###############
def filter_objects(min_object_size, objects, aliases, vocab, splits):
    # 'splits' is just a flat collection of image ids here (no train/val split)
    all_image_ids = set(splits)
    object_name_to_idx = vocab['object_name_to_idx']
    object_id_to_obj = {}
    num_too_small = 0
    for image in objects:
        image_id = image['image_id']
        if image_id not in all_image_ids:
            continue
for obj in image['objects']:
object_id = obj['object_id']
final_name = None
final_name_idx = None
for name in obj['names']:
name = aliases.get(name, name)
if name in object_name_to_idx:
final_name = name
final_name_idx = object_name_to_idx[final_name]
break
w, h = obj['w'], obj['h']
too_small = (w < min_object_size) or (h < min_object_size)
if too_small:
num_too_small += 1
if final_name is not None and not too_small:
object_id_to_obj[object_id] = {
'name': final_name,
'name_idx': final_name_idx,
'synsets': obj['synsets'],
'box': [obj['x'], obj['y'], obj['w'], obj['h']],
}
print('Skipped %d objects with size < %d' % (num_too_small, min_object_size))
return object_id_to_obj
################-------Relationship-------###############
def create_rel_vocab(min_relationship_instances, image_ids, relationships, object_id_to_obj, rel_aliases, vocab):
pred_counter = defaultdict(int)
image_ids_set = set(image_ids)
for image in relationships:
image_id = image['image_id']
if image_id not in image_ids_set:
continue
for rel in image['relationships']:
sid = rel['subject']['object_id']
oid = rel['object']['object_id']
found_subject = sid in object_id_to_obj
found_object = oid in object_id_to_obj
if not found_subject or not found_object:
continue
pred = rel['predicate'].lower().strip()
pred = rel_aliases.get(pred, pred)
rel['predicate'] = pred
pred_counter[pred] += 1
    # index 0 is reserved for the special __in_image__ predicate
    pred_names = ['__in_image__']
for pred, count in pred_counter.items():
if count >= min_relationship_instances:
pred_names.append(pred)
print('Found %d relationship types with >= %d training instances'
% (len(pred_names), min_relationship_instances))
pred_name_to_idx = {}
pred_idx_to_name = []
for idx, name in enumerate(pred_names):
pred_name_to_idx[name] = idx
pred_idx_to_name.append(name)
vocab['pred_name_to_idx'] = pred_name_to_idx
vocab['pred_idx_to_name'] = pred_idx_to_name
#load image_meta_information
with open('image_data.json', 'r') as f:
#list 108,077
images = json.load(f)
#dict image_id to image instance
image_id_to_image = {i['image_id']: i for i in images}
# load alias tables for objects and relationships
obj_aliases = load_aliases('object_alias.txt')
rel_aliases = load_aliases('relationship_alias.txt')
# build the vocab; index 0 holds the special __image__ / __in_image__ entry, real categories start at 1
vocab = {}
min_object_instances = 2000
min_object_size = 32
min_relationship_instances = 500
#load objects
with open('objects.json', 'r') as f:
objects = json.load(f)
image_ids = image_id_to_image.keys()
create_object_vocab(min_object_instances, image_ids, objects, obj_aliases, vocab)
# pass the full image-id set in place of a train split
object_id_to_obj = filter_objects(min_object_size, objects, obj_aliases, vocab, image_ids)
#load relationships
with open('relationships.json', 'r') as f:
relationships = json.load(f)
create_rel_vocab(min_relationship_instances, image_ids, relationships, object_id_to_obj, rel_aliases, vocab)
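To make the result reusable for the triple extraction below, the vocab can simply be dumped to disk (a minimal sketch; the file name vocab.json is my choice, not from the original script):
# persist the vocab for later use
with open('vocab.json', 'w') as f:
    json.dump(vocab, f, indent=2)
print('%d object names, %d predicates' %
      (len(vocab['object_idx_to_name']), len(vocab['pred_idx_to_name'])))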
After some processing, I ended up with the triple screenshots below:
Quite interesting, but the files were enormous: after saving only 1,000 of them it was already 25 GB, which made me question my life choices...
Later I found a better way: parse relationships.json directly.
Below is a screenshot of the aggregated triples that occur more than 10 times (some are still missing; I couldn't be bothered to chase them all). <obj> separates the subject from the object, and the number is a rough count of how often the two objects co-occur.
For example, from the image above:
shade on street
man wear sneaker
car has headlight
sign on building
man has shirt
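A minimal sketch of that direct parse (assuming relationships.json is local; the <obj> separator and the occurrence threshold follow the description above):
import json
from collections import Counter

# count (subject, predicate, object) triples straight from relationships.json
with open('relationships.json', 'r') as f:
    relationships = json.load(f)

triple_counter = Counter()
for image in relationships:
    for rel in image['relationships']:
        # some entries use 'name', others a 'names' list (see the dump above)
        subj = rel['subject'].get('name', rel['subject'].get('names', [''])[0])
        obj = rel['object'].get('name', rel['object'].get('names', [''])[0])
        pred = rel['predicate'].lower().strip()
        triple_counter[(subj, pred, obj)] += 1

# keep triples seen more than 10 times; '<obj>' separates subject and object
for (subj, pred, obj), n in triple_counter.most_common():
    if n <= 10:
        break
    print('%s %s<obj>%s %d' % (subj, pred, obj, n))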
Screenshot of the results:
To wrap up, here is one image together with its questions, captions, and answers (the dataset's questions are of the Open-Ended type):
{"image_id": 57870, "question": "What are the chairs made off?", "question_id": 57870000},
{"image_id": 57870, "question": "Is this a dinner setting?", "question_id": 57870001},
{"image_id": 57870, "question": "Is there exposed brick on the walls?", "question_id": 57870002}
A restaurant has modern wooden tables and chairs .
A long restaurant table with rattan rounded back chairs .
a long table with a plant on top of it surrounded with wooden chairs
A long table with a flower arrangement in the middle for meetings
A table is adorned with wooden chairs with blue accents .
{"question_type": "what are the", "multiple_choice_answer": "wood",
"answers": [{"answer": "wood and wicker", "answer_confidence": "maybe", "answer_id": 1},
{"answer": "rattan", "answer_confidence": "yes", "answer_id": 2},
{"answer": "bamboo", "answer_confidence": "maybe", "answer_id": 3},
{"answer": "wood", "answer_confidence": "yes", "answer_id": 4},
{"answer": "wood", "answer_confidence": "yes", "answer_id": 5},
{"answer": "wood", "answer_confidence": "yes", "answer_id": 6},
{"answer": "wood", "answer_confidence": "yes", "answer_id": 7},
{"answer": "wicker", "answer_confidence": "yes", "answer_id": 8},
{"answer": "wood", "answer_confidence": "yes", "answer_id": 9},
{"answer": "wood", "answer_confidence": "yes", "answer_id": 10}],
"image_id": 57870, "answer_type": "other", "question_id": 57870000},
{"question_type": "is this a", "multiple_choice_answer": "yes",
"answers": [{"answer": "no", "answer_confidence": "yes", "answer_id": 1},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 2},
{"answer": "no", "answer_confidence": "yes", "answer_id": 3},
{"answer": "yes", "answer_confidence": "maybe", "answer_id": 4},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 5},
{"answer": "not sure", "answer_confidence": "maybe", "answer_id": 6},
{"answer": "yes", "answer_confidence": "maybe", "answer_id": 7},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 8},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 9},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 10}],
"image_id": 57870, "answer_type": "yes/no", "question_id": 57870001},
{"question_type": "is there", "multiple_choice_answer": "yes",
"answers": [{"answer": "yes", "answer_confidence": "yes", "answer_id": 1},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 2},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 3},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 4},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 5},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 6},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 7},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 8},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 9},
{"answer": "yes", "answer_confidence": "yes", "answer_id": 10}],
"image_id": 57870, "answer_type": "yes/no", "question_id": 57870002},
COCO_train2014_000000057870.jpg
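From annotation records like these, per-answer accuracy under the common VQA metric is min(#annotators who gave that answer / 3, 1). A minimal sketch against the first record above (the helper name is mine; the official evaluation additionally averages over subsets of 9 annotators and normalizes answer strings):
# the ten human answers for question_id 57870000, copied from above
answers = ['wood and wicker', 'rattan', 'bamboo', 'wood', 'wood',
           'wood', 'wood', 'wicker', 'wood', 'wood']

def vqa_accuracy(candidate, answers):
    # an answer counts as fully correct once at least 3 annotators gave it
    matches = sum(a == candidate for a in answers)
    return min(matches / 3.0, 1.0)

print(vqa_accuracy('wood', answers))    # 1.0   (6 matches)
print(vqa_accuracy('rattan', answers))  # ~0.33 (1 match)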