PostgreSQL substring extraction

Reposted

mob6454cc6ccc8a 2024-09-13 06:51:01

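This section describes functions and operators for examining and manipulating string values. Strings in this context include all values of the types character, character varying, and text. Unless otherwise noted, all of the functions listed below work on all of these types, but be wary of the effects of automatic space padding when using the character type.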
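
To make the padding caveat concrete, here is a minimal sketch comparing a `character(5)` value with `text`. The literals are made up for illustration, and the commented results reflect the behavior of current PostgreSQL releases as I understand it.

```sql
-- character(n) values are blank-padded to the declared length,
-- but the padding is treated as insignificant when counting characters.
SELECT char_length('ab'::character(5)) AS len_char,  -- 2: trailing padding is not counted
       char_length('ab   '::text)      AS len_text;  -- 5: spaces in text are significant

-- Converting character(n) to text drops the trailing padding,
-- so concatenation does not keep the pad spaces.
SELECT 'ab'::character(5) || 'x' AS joined;          -- abx
```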

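SQL defines some string functions with a special syntax in which certain key words, rather than commas, separate the arguments. Details are in Table 9-5. These functions are also implemented using the regular function-call syntax (see Table 9-6).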
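
As a quick illustration of the two syntaxes, the sketch below pairs each keyword form with an ordinary function call that should return the same result. The pairings are my own selection from the functions in this section.

```sql
-- SQL-standard keyword syntax next to the equivalent ordinary function calls.
SELECT substring('Thomas' from 2 for 3) AS sql_form,   -- hom
       substr('Thomas', 2, 3)           AS call_form;  -- hom

SELECT position('om' in 'Thomas') AS sql_form,         -- 3
       strpos('Thomas', 'om')     AS call_form;        -- 3 (note the reversed argument order)

SELECT trim(both 'x' from 'xTomxx') AS sql_form,       -- Tom
       btrim('xTomxx', 'x')         AS call_form;      -- Tom
```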

Table 9-5. SQL string functions and operators

Function	Return type	Description	Example	Result
string || string	text	String concatenation	'Post' || 'greSQL'	PostgreSQL
`bit_length`(string)	int	Number of bits in string	bit_length('jose')	32
`char_length`(string) or `character_length`(string)	int	Number of characters in string	char_length('jose')	4
`convert`(string using conversion_name)	text	Change encoding using the specified conversion name. Conversions can be defined with CREATE CONVERSION, and some predefined conversion names are also available. See Table 9-7 for the available conversion names.	convert('PostgreSQL' using iso_8859_1_to_utf8)	'PostgreSQL' in UTF8 encoding
`lower`(string)	text	Convert string to lower case	lower('TOM')	tom
`octet_length`(string)	int	Number of bytes in string	octet_length('jose')	4
`overlay`(string placing string from int [for int])	text	Replace substring	overlay('Txxxxas' placing 'hom' from 2 for 4)	Thomas
`position`(substring in string)	int	Location of the specified substring	position('om' in 'Thomas')	3
`substring`(string [from int] [for int])	text	Extract substring	substring('Thomas' from 2 for 3)	hom
`substring`(string from pattern)	text	Extract the substring matching a POSIX regular expression. See Section 9.7 for more information on pattern matching.	substring('Thomas' from '...$')	mas
`substring`(string from pattern for escape)	text	Extract the substring matching an SQL regular expression. See Section 9.7 for more information on pattern matching.	substring('Thomas' from '%#"o_a#"_' for '#')	oma
`trim`([leading | trailing | both] [characters] from string)	text	Remove the longest string containing only the characters (a space by default) from the start, the end, or both ends of string	trim(both 'x' from 'xTomxx')	Tom
`upper`(string)	text	Convert string to upper case	upper('tom')	TOM
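
Since substring extraction is the main topic here, the following sketch walks through the three `substring` forms from Table 9-5. The input strings are invented for illustration.

```sql
-- By position: characters are counted from 1; omitting "for" takes the rest.
SELECT substring('PostgreSQL' from 5)       AS tail,  -- greSQL
       substring('PostgreSQL' from 1 for 4) AS head;  -- Post

-- By POSIX regular expression: returns the first match, or the text matched
-- by the first parenthesized group when the pattern contains one.
SELECT substring('order-2024-0913' from '[0-9]+')     AS first_digits,  -- 2024
       substring('order-2024-0913' from '-([0-9]+)$') AS last_group;    -- 0913

-- By SQL regular expression: the part between the two escape-quote markers
-- (#" here) is returned; the whole pattern must match the whole string.
SELECT substring('foobar' from '%#"o_b#"%' for '#') AS middle;          -- oob
```

When no match is found, the pattern forms return NULL rather than an empty string, which is worth keeping in mind when the result feeds into concatenation.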

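Additional string manipulation functions are available and are listed in Table 9-6. Some of them are used internally to implement the SQL-standard string functions listed in Table 9-5.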

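Table 9-6. Other string functions

Function	Return type	Description	Example	Result
`ascii`(string)	int	ASCII code of the first character of the argument	ascii('x')	120
`btrim`(string text [, characters text])	text	Remove the longest string containing only characters from characters (a space by default) from the start and end of string	btrim('xyxtrimyyx', 'xy')	trim
`chr`(int)	text	Character with the given ASCII code	chr(65)	A
`convert`(string text, [src_encoding name,] dest_encoding name)	text	Convert string from src_encoding to dest_encoding. If src_encoding is omitted, the database encoding is assumed.	convert('text_in_utf8', 'UTF8', 'LATIN1')	text_in_utf8 represented in ISO 8859-1 encoding
`decode`(string text, type text)	bytea	Decode binary data from string previously encoded with `encode`. The type parameter is the same as in `encode`.	decode('MTIzAAE=', 'base64')	123\000\001
`encode`(data bytea, type text)	text	Encode binary data into an ASCII-only representation. Supported types are: base64, hex, escape.	encode(E'123\\000\\001', 'base64')	MTIzAAE=
`initcap`(string)	text	Convert the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters.	initcap('hi THOMAS')	Hi Thomas
`length`(string)	int	Number of characters in string	length('jose')	4
`lpad`(string text, length int [, fill text])	text	Fill up string to length length by prepending the characters fill (a space by default). If string is already longer than length, it is truncated (on the right).	lpad('hi', 5, 'xy')	xyxhi
`ltrim`(string text [, characters text])	text	Remove the longest string containing only characters from characters (a space by default) from the start of string	ltrim('zzzytrim', 'xyz')	trim
`md5`(string)	text	Calculate the MD5 hash of string, returning the result in hexadecimal	md5('abc')	900150983cd24fb0d6963f7d28e17f72
`pg_client_encoding`()	name	Current client encoding name	pg_client_encoding()	SQL_ASCII
`quote_ident`(string)	text	Return the given string suitably quoted to be used as an identifier in an SQL statement. Quotes are added only if necessary (that is, if the string contains non-identifier characters or would be case-folded). Embedded quotes are properly doubled.	quote_ident('Foo bar')	"Foo bar"
`quote_literal`(string)	text	Return the given string suitably quoted to be used as a string literal in an SQL statement. Embedded quotes and backslashes are properly doubled.	quote_literal(E'O\'Reilly')	'O''Reilly'
`regexp_replace`(string text, pattern text, replacement text [, flags text])	text	Replace the substring matching a POSIX regular expression. See Section 9.7 for more information on pattern matching.	regexp_replace('Thomas', '.[mN]a.', 'M')	ThM
`repeat`(string text, number int)	text	Repeat string number times	repeat('Pg', 4)	PgPgPgPg
`replace`(string text, from text, to text)	text	Replace all occurrences in string of the substring from with the substring to	replace('abcdefabcdef', 'cd', 'XX')	abXXefabXXef
`rpad`(string text, length int [, fill text])	text	Fill up string to length length by appending the characters fill (a space by default). If string is already longer than length, it is truncated.	rpad('hi', 5, 'xy')	hixyx
`rtrim`(string text [, characters text])	text	Remove the longest string containing only characters from characters (a space by default) from the end of string	rtrim('trimxxxx', 'x')	trim
`split_part`(string text, delimiter text, field int)	text	Split string on delimiter and return the given field (counting from one)	split_part('abc~@~def~@~ghi', '~@~', 2)	def
`strpos`(string, substring)	int	Location of the specified substring. Same as position(substring in string), but note the reversed argument order.	strpos('high', 'ig')	2
`substr`(string, from [, count])	text	Extract substring. Same as substring(string from from for count).	substr('alphabet', 3, 2)	ph
`to_ascii`(string text [, encoding text])	text	Convert string to ASCII from another encoding (only conversions from LATIN1, LATIN2, LATIN9, and WIN1250 are supported)	to_ascii('Karel')	Karel
`to_hex`(number int or bigint)	text	Convert number to its equivalent hexadecimal representation	to_hex(2147483647)	7fffffff
`translate`(string text, from text, to text)	text	Replace any character in string that matches a character in from with the corresponding character in to	translate('12345', '14', 'ax')	a23x5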
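
A few of the functions in Table 9-6 have edge cases that the one-line examples do not show. The sketch below uses invented inputs to illustrate them; the table name `Foo bar` in the last query is purely hypothetical.

```sql
-- split_part: fields are counted from 1; a field past the end yields an empty string.
SELECT split_part('2024-09-13', '-', 1) AS year,     -- 2024
       split_part('2024-09-13', '-', 3) AS day,      -- 13
       split_part('2024-09-13', '-', 4) AS missing;  -- (empty string)

-- lpad/rpad repeat the fill string as needed, and both truncate on the right
-- when the input is already longer than the requested length.
SELECT lpad('42', 6, '0') AS zero_padded,     -- 000042
       rpad('abcdef', 3)  AS truncated,       -- abc
       lpad('abcdef', 3)  AS also_truncated;  -- abc

-- translate maps characters one to one; characters in "from" that have no
-- counterpart in "to" are removed.
SELECT translate('2024-09-13', '-', '/') AS slashes,   -- 2024/09/13
       translate('a-b-c', '-', '')       AS stripped;  -- abc

-- quote_ident and quote_literal help when assembling SQL text by hand.
SELECT 'SELECT * FROM ' || quote_ident('Foo bar')
       || ' WHERE name = ' || quote_literal('O''Reilly') AS stmt;
-- SELECT * FROM "Foo bar" WHERE name = 'O''Reilly'
```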

Table 9-7. Built-in conversions

Conversion name [a]	Source encoding	Destination encoding
ascii_to_mic	SQL_ASCII	MULE_INTERNAL
ascii_to_utf8	SQL_ASCII	UTF8
big5_to_euc_tw	BIG5	EUC_TW
big5_to_mic	BIG5	MULE_INTERNAL
big5_to_utf8	BIG5	UTF8
euc_cn_to_mic	EUC_CN	MULE_INTERNAL
euc_cn_to_utf8	EUC_CN	UTF8
euc_jp_to_mic	EUC_JP	MULE_INTERNAL
euc_jp_to_sjis	EUC_JP	SJIS
euc_jp_to_utf8	EUC_JP	UTF8
euc_kr_to_mic	EUC_KR	MULE_INTERNAL
euc_kr_to_utf8	EUC_KR	UTF8
euc_tw_to_big5	EUC_TW	BIG5
euc_tw_to_mic	EUC_TW	MULE_INTERNAL
euc_tw_to_utf8	EUC_TW	UTF8
gb18030_to_utf8	GB18030	UTF8
gbk_to_utf8	GBK	UTF8
iso_8859_10_to_utf8	LATIN6	UTF8
iso_8859_13_to_utf8	LATIN7	UTF8
iso_8859_14_to_utf8	LATIN8	UTF8
iso_8859_15_to_utf8	LATIN9	UTF8
iso_8859_16_to_utf8	LATIN10	UTF8
iso_8859_1_to_mic	LATIN1	MULE_INTERNAL
iso_8859_1_to_utf8	LATIN1	UTF8
iso_8859_2_to_mic	LATIN2	MULE_INTERNAL
iso_8859_2_to_utf8	LATIN2	UTF8
iso_8859_2_to_windows_1250	LATIN2	WIN1250
iso_8859_3_to_mic	LATIN3	MULE_INTERNAL
iso_8859_3_to_utf8	LATIN3	UTF8
iso_8859_4_to_mic	LATIN4	MULE_INTERNAL
iso_8859_4_to_utf8	LATIN4	UTF8
iso_8859_5_to_koi8_r	ISO_8859_5	KOI8
iso_8859_5_to_mic	ISO_8859_5	MULE_INTERNAL
iso_8859_5_to_utf8	ISO_8859_5	UTF8
iso_8859_5_to_windows_1251	ISO_8859_5	WIN1251
iso_8859_5_to_windows_866	ISO_8859_5	WIN866
iso_8859_6_to_utf8	ISO_8859_6	UTF8
iso_8859_7_to_utf8	ISO_8859_7	UTF8
iso_8859_8_to_utf8	ISO_8859_8	UTF8
iso_8859_9_to_utf8	LATIN5	UTF8
johab_to_utf8	JOHAB	UTF8
koi8_r_to_iso_8859_5	KOI8	ISO_8859_5
koi8_r_to_mic	KOI8	MULE_INTERNAL
koi8_r_to_utf8	KOI8	UTF8
koi8_r_to_windows_1251	KOI8	WIN1251
koi8_r_to_windows_866	KOI8	WIN866
mic_to_ascii	MULE_INTERNAL	SQL_ASCII
mic_to_big5	MULE_INTERNAL	BIG5
mic_to_euc_cn	MULE_INTERNAL	EUC_CN
mic_to_euc_jp	MULE_INTERNAL	EUC_JP
mic_to_euc_kr	MULE_INTERNAL	EUC_KR
mic_to_euc_tw	MULE_INTERNAL	EUC_TW
mic_to_iso_8859_1	MULE_INTERNAL	LATIN1
mic_to_iso_8859_2	MULE_INTERNAL	LATIN2
mic_to_iso_8859_3	MULE_INTERNAL	LATIN3
mic_to_iso_8859_4	MULE_INTERNAL	LATIN4
mic_to_iso_8859_5	MULE_INTERNAL	ISO_8859_5
mic_to_koi8_r	MULE_INTERNAL	KOI8
mic_to_sjis	MULE_INTERNAL	SJIS
mic_to_windows_1250	MULE_INTERNAL	WIN1250
mic_to_windows_1251	MULE_INTERNAL	WIN1251
mic_to_windows_866	MULE_INTERNAL	WIN866
sjis_to_euc_jp	SJIS	EUC_JP
sjis_to_mic	SJIS	MULE_INTERNAL
sjis_to_utf8	SJIS	UTF8
tcvn_to_utf8	WIN1258	UTF8
uhc_to_utf8	UHC	UTF8
utf8_to_ascii	UTF8	SQL_ASCII
utf8_to_big5	UTF8	BIG5
utf8_to_euc_cn	UTF8	EUC_CN
utf8_to_euc_jp	UTF8	EUC_JP
utf8_to_euc_kr	UTF8	EUC_KR
utf8_to_euc_tw	UTF8	EUC_TW
utf8_to_gb18030	UTF8	GB18030
utf8_to_gbk	UTF8	GBK
utf8_to_iso_8859_1	UTF8	LATIN1
utf8_to_iso_8859_10	UTF8	LATIN6
utf8_to_iso_8859_13	UTF8	LATIN7
utf8_to_iso_8859_14	UTF8	LATIN8
utf8_to_iso_8859_15	UTF8	LATIN9
utf8_to_iso_8859_16	UTF8	LATIN10
utf8_to_iso_8859_2	UTF8	LATIN2
utf8_to_iso_8859_3	UTF8	LATIN3
utf8_to_iso_8859_4	UTF8	LATIN4
utf8_to_iso_8859_5	UTF8	ISO_8859_5
utf8_to_iso_8859_6	UTF8	ISO_8859_6
utf8_to_iso_8859_7	UTF8	ISO_8859_7
utf8_to_iso_8859_8	UTF8	ISO_8859_8
utf8_to_iso_8859_9	UTF8	LATIN5
utf8_to_johab	UTF8	JOHAB
utf8_to_koi8_r	UTF8	KOI8
utf8_to_sjis	UTF8	SJIS
utf8_to_tcvn	UTF8	WIN1258
utf8_to_uhc	UTF8	UHC
utf8_to_windows_1250	UTF8	WIN1250
utf8_to_windows_1251	UTF8	WIN1251
utf8_to_windows_1252	UTF8	WIN1252
utf8_to_windows_1253	UTF8	WIN1253
utf8_to_windows_1254	UTF8	WIN1254
utf8_to_windows_1255	UTF8	WIN1255
utf8_to_windows_1256	UTF8	WIN1256
utf8_to_windows_1257	UTF8	WIN1257
utf8_to_windows_866	UTF8	WIN866
utf8_to_windows_874	UTF8	WIN874
windows_1250_to_iso_8859_2	WIN1250	LATIN2
windows_1250_to_mic	WIN1250	MULE_INTERNAL
windows_1250_to_utf8	WIN1250	UTF8
windows_1251_to_iso_8859_5	WIN1251	ISO_8859_5
windows_1251_to_koi8_r	WIN1251	KOI8
windows_1251_to_mic	WIN1251	MULE_INTERNAL
windows_1251_to_utf8	WIN1251	UTF8
windows_1251_to_windows_866	WIN1251	WIN866
windows_1252_to_utf8	WIN1252	UTF8
windows_1256_to_utf8	WIN1256	UTF8
windows_866_to_iso_8859_5	WIN866	ISO_8859_5
windows_866_to_koi8_r	WIN866	KOI8
windows_866_to_mic	WIN866	MULE_INTERNAL
windows_866_to_utf8	WIN866	UTF8
windows_866_to_windows_1251	WIN866	WIN1251
windows_874_to_utf8	WIN874	UTF8
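Note: a. In Table 9-7 the first column is the conversion name, followed by the source encoding and the destination encoding. Conversion names follow a standard naming scheme: the official name of the source encoding with all non-alphanumeric characters replaced by underscores, followed by _to_, followed by the destination encoding name processed in the same way.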
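
To check which conversions a particular server actually provides, the system catalog can be queried directly. The sketch below assumes the `pg_conversion` catalog and the `pg_encoding_to_char` function as found in current releases. The second query rests on a further assumption: to the best of my knowledge, the `convert(string using conversion_name)` form was removed in PostgreSQL 8.3, and `convert_to` / `convert_from` are the usual replacements on newer servers.

```sql
-- List the conversions known to the server together with their encodings;
-- the names should follow the naming scheme described in the note.
SELECT conname,
       pg_encoding_to_char(conforencoding) AS source_encoding,
       pg_encoding_to_char(contoencoding)  AS destination_encoding
FROM pg_conversion
ORDER BY conname;

-- Assumption: a UTF8 database on a release that provides convert_to/convert_from.
-- convert_to returns bytea in the target encoding; convert_from turns it back into text.
SELECT convert_to('PostgreSQL', 'LATIN1')                         AS as_latin1_bytes,
       convert_from(convert_to('PostgreSQL', 'LATIN1'), 'LATIN1') AS round_trip;
```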