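Oracle can store binary files, but testing showed that the tablespace used by a LOB column cannot be reused after the data is deleted. We are therefore considering storing the data in MongoDB, which provides GridFS. This test checks whether GridFS has the same problem.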

Introduction to the mongofiles command

Possible commands include:
list - list all files; 'filename' is an optional prefix which listed filenames must begin with
search - search all files; 'filename' is a substring which listed filenames must contain
put - add a file with filename 'filename'
put_id - add a file with filename 'filename' and a given '_id'
get - get a file with filename 'filename'
get_id - get a file with the given '_id'
delete - delete all files with filename 'filename'
delete_id - delete a file with the given '_id'

Disk usage after the MongoDB database is initialized

[mongo@Mon01 mongodb]$ du -sh *
1.1G data
1.4M log
4.0K mongodb.pid
4.0K mongodb.yaml
4.0K secret

Writing data

Write two files: test1 (1048576000 bytes, about 1 GB) and test.log (5776622690 bytes, a little over 5 GB)

[mongo@Mon01 ~]$ mongofiles -u test -p test --authenticationDatabase test put test1
2022-08-11T14:56:51.882+0800 connected to: localhost
added file: test1
[mongo@Mon01 ~]$ mongofiles -u test -p test --authenticationDatabase test put test.log
2022-08-11T15:00:34.925+0800 connected to: localhost
added file: test.log

Collections used by GridFS

MongoRepl:SECONDARY> show tables;
fs.chunks
fs.files

The fs.chunks collection holds the bulk of the data

MongoRepl:PRIMARY> db.fs.chunks.stats()
{
"ns" : "test.fs.chunks",
"size" : 3802809920,
"count" : 14560,
"avgObjSize" : 261182,
"storageSize" : 594558976,
"capped" : false,
"wiredTiger" : {
"metadata" : {
"formatVersion" : 1
},
"creationString" :
"type" : "file",
"uri" : "statistics:table:collection-0-1779480531914116713",
},
}
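
The numbers in this output can be cross-checked with a little arithmetic. The sketch below assumes the GridFS default chunk size of 255 KiB (261120 bytes), which is consistent with the reported avgObjSize of 261182 once a few dozen bytes of per-document overhead are added. The file sizes are the ones reported by mongofiles list further down. Under that assumption the two files need 26139 chunks in total, so a count of 14560 suggests the stats were captured while test.log was still being written. The low storageSize relative to size is probably an effect of WiredTiger block compression, though that is an inference and not something the output states.

```python
import math

# Values copied from the db.fs.chunks.stats() output above.
size = 3802809920         # logical (uncompressed) size of all chunk documents, bytes
count = 14560             # number of chunk documents
storage_size = 594558976  # bytes allocated on disk for the collection

# Assumption: GridFS default chunk size of 255 KiB.
CHUNK_SIZE = 255 * 1024

# File sizes in bytes, as reported by `mongofiles list`.
file_sizes = {"test1": 1048576000, "test.log": 5776622690}


def chunks_needed(file_size, chunk_size=CHUNK_SIZE):
    """Number of fs.chunks documents needed to hold one file."""
    return math.ceil(file_size / chunk_size)


avg_obj_size = size // count
on_disk_ratio = storage_size / size
expected_chunks = sum(chunks_needed(s) for s in file_sizes.values())

print("avgObjSize:", avg_obj_size)
print("storageSize / size: {:.1%}".format(on_disk_ratio))
print("chunks needed for both files:", expected_chunks)
print("share written so far: {:.1%}".format(count / expected_chunks))
```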

While the write is in progress, observe how the directory sizes change

[mongo@Mon01 mongodb]$ du -sh *
2.2G data
1.6M log
4.0K mongodb.pid
4.0K mongodb.yaml
4.0K secret

Observe how the size of the data file backing the fs.chunks collection changes

[mongo@Mon01 data]$ du -ch *
865M collection-0-1779480531914116713.wt
... ...
200M diagnostic.data
... ...
301M journal
... ...
230M WiredTigerLAS.wt
... ...
2.4G total
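
If the measurement needs to be repeated automatically during a test, the same information can be collected from a script. This is a minimal sketch using only the Python standard library. The function names and the "data" path are hypothetical, and it sums apparent file sizes, whereas du reports allocated disk blocks, so the numbers can differ slightly from the listing above.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def wt_file_sizes(dbpath):
    """Map each top-level *.wt file in dbpath to its size in bytes."""
    return {
        name: os.path.getsize(os.path.join(dbpath, name))
        for name in sorted(os.listdir(dbpath))
        if name.endswith(".wt")
    }


if __name__ == "__main__":
    dbpath = "data"  # hypothetical: the data directory listed above
    if os.path.isdir(dbpath):
        print("total bytes:", dir_size_bytes(dbpath))
        for name, size in wt_file_sizes(dbpath).items():
            print(name, size)
```

Calling wt_file_sizes before the write, during it, and after the delete gives the three snapshots that this post compares by hand.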

List the files that have been written

[mongo@Mon01 ~]$ mongofiles -u test -p test --authenticationDatabase test list
2022-08-11T15:09:52.043+0800 connected to: localhost
test1 1048576000
test.log 5776622690
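
The sizes in this listing are in bytes. As a small illustration, the sketch below parses the output shown above and converts the sizes to GiB. The function name parse_mongofiles_list is hypothetical, and the parser assumes each file line ends with an integer size, which is what the listing above shows.

```python
def parse_mongofiles_list(output):
    """Parse `mongofiles list` output into {filename: size_in_bytes}.

    Lines that do not end in an integer (such as the "connected to"
    log line) are skipped.
    """
    files = {}
    for line in output.splitlines():
        parts = line.rsplit(None, 1)
        if len(parts) == 2 and parts[1].isdigit():
            files[parts[0]] = int(parts[1])
    return files


sample = """2022-08-11T15:09:52.043+0800 connected to: localhost
test1 1048576000
test.log 5776622690"""

for name, size in parse_mongofiles_list(sample).items():
    print("{}: {} bytes = {:.2f} GiB".format(name, size, size / 1024**3))
```

With these inputs, test1 comes out at about 0.98 GiB and test.log at about 5.38 GiB.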

Deleting data

Delete the files

[mongo@Mon01 ~]$ mongofiles -u test -p test --authenticationDatabase test delete test1
2022-08-11T15:11:21.229+0800 connected to: localhost
successfully deleted all instances of 'test1' from GridFS
[mongo@Mon01 ~]$ mongofiles -u test -p test --authenticationDatabase test delete test.log
2022-08-11T15:11:31.479+0800 connected to: localhost
successfully deleted all instances of 'test.log' from GridFS

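After the files are deleted, observe the file sizes again. After some time the space is reclaimed automatically and the data file for the collection shrinks (collection-0-1779480531914116713.wt drops from 865M to 12K)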

[mongo@Mon01 data]$ du -ch *
12K collection-0-1779480531914116713.wt
... ...
796M collection-8--2049863378525224046.wt
... ...
200M diagnostic.data
... ...
301M journal
... ...
432M WiredTigerLAS.wt
... ...
144K WiredTiger.wt
1.7G total
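
To make the before/after comparison explicit, the sketch below converts the du -h figures from the two listings into bytes and prints the change per entry. The helper parse_du_size is hypothetical and assumes the binary units that du -h uses (K = 1024 bytes). Only entries that appear in both listings are compared. collection-8--2049863378525224046.wt shows up only in the second listing, and the source does not say what it belongs to, so it is left out.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_du_size(text):
    """Convert a `du -h` style size such as '865M' or '1.7G' to bytes."""
    text = text.strip()
    if text and text[-1].upper() in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1].upper()])
    return int(float(text))


# Figures copied from the two `du -ch *` listings in this post.
before = {
    "collection-0-1779480531914116713.wt": "865M",
    "diagnostic.data": "200M",
    "journal": "301M",
    "WiredTigerLAS.wt": "230M",
}
after = {
    "collection-0-1779480531914116713.wt": "12K",
    "diagnostic.data": "200M",
    "journal": "301M",
    "WiredTigerLAS.wt": "432M",
}

for name in before:
    delta = parse_du_size(after[name]) - parse_du_size(before[name])
    print("{:38s} {:>5s} -> {:>5s} ({:+.0f} MiB)".format(
        name, before[name], after[name], delta / 1024**2))
```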

Summary:

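MongoDB can store binary files through GridFS, and files can be added and deleted normally. In this test, the space was reclaimed automatically after the data was deleted.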

Tips:

The journal is similar to the redo log in Oracle or MySQL
WiredTiger is MongoDB's storage engine and is worth studying further