9.1 Introduction to regular expressions and grep (part 1) 9.2 grep (part 2) 9.3 grep (part 3)

wg_dFdhjadB 2018-07-02 22:03:51

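© Copyright belongs to the author. This is an original work by 51CTO blog author wg_dFdhjadB; contact the author for permission to repost. Unauthorized reposting may lead to legal action.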

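Introduction to regular expressions

A regular expression is a single string that describes or matches a set of strings conforming to some syntactic rule. Many text editors and other tools use regular expressions to search for and replace text that fits a pattern, and many programming languages support them for string manipulation. Commonly used tools include grep, sed, and awk, all of which operate on text line by line.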

grep (part 1)

Using the grep/egrep tools

The command format is: grep [-cinvABC] 'word' filename. The common options are:

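-c: print the number of matching lines.
-i: ignore case.
-n: print the matching lines together with their line numbers.
-v: print the lines that do not match.
-A: followed by a number (with or without a space); for example, -A2 prints each matching line and the two lines after it.
-B: followed by a number; for example, -B2 prints each matching line and the two lines before it.
-C: followed by a number; for example, -C2 prints each matching line and the two lines before and after it.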
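The -c, -i, -n, and -v options can be tried on a small throwaway file. The following is a minimal sketch; the file contents and variable names are assumptions made up for illustration and are not from the original session.

```shell
# Hypothetical sample data written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'Root login' 'root shell' 'guest' > "$sample"

count=$(grep -c 'root' "$sample")      # number of lines containing "root"
count_i=$(grep -ci 'root' "$sample")   # same count, ignoring case
numbered=$(grep -n 'root' "$sample")   # matching lines prefixed with line numbers
inverted=$(grep -v 'root' "$sample")   # lines that do NOT contain "root"

echo "count=$count count_i=$count_i"
echo "$numbered"
echo "$inverted"
rm -f "$sample"
```

Options can be combined, so -ci here counts case-insensitive matches in one call.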

First, look at how the -A, -B, and -C options work.

-A2 prints the line containing halt and the two lines after it:

[root@localhost ~]# grep -A2 'halt' /etc/passwd
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin

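The matched string is highlighted in red here. The highlighting comes from grep's --color=auto option, which many distributions enable through a shell alias.

-B2 prints the line containing halt and the two lines before it: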

[root@localhost ~]# grep -B2 'halt' /etc/passwd
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt

-C2 prints the line containing halt and the two lines above and below it:

[root@localhost ~]# grep -C2 'halt' /etc/passwd
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
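The three context options can be compared side by side without touching /etc/passwd. This sketch assumes the seq utility is available and uses the numbers 1 to 9 as stand-in input; the variable names are illustrative only.

```shell
# seq 1 9 prints the numbers 1..9, one per line, as stand-in input.
after=$(seq 1 9 | grep -A2 '5')    # the line with 5 plus the two lines after it
before=$(seq 1 9 | grep -B2 '5')   # the line with 5 plus the two lines before it
around=$(seq 1 9 | grep -C2 '5')   # the line with 5 plus two lines on each side

echo "$after"
echo "--"
echo "$before"
echo "--"
echo "$around"
```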

9.2 grep (part 2)

Print the lines that contain a keyword, along with their line numbers.

Example command:

[root@localhost ~]# grep -n 'root' /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
10:operator:x:11:0:operator:/root:/sbin/nologin

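The number at the start of each line, shown in green, is the line number.

Print the lines that do not contain a keyword, along with their line numbers.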

[root@localhost ~]# grep -nv 'nologin' /etc/passwd
6:sync:x:5:0:sync:/sbin:/bin/sync
7:shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
8:halt:x:7:0:halt:/sbin:/sbin/halt

Print all lines that contain a digit. Example command:

[root@localhost ~]# grep '[0-9]' /etc/inittab
# multi-user.target: analogous to runlevel 3
# graphical.target: analogous to runlevel 5

A single digit anywhere in the line is enough for it to match.

Print all lines that contain no digits. Example command:

[root@localhost ~]# grep -v '[0-9]' /etc/inittab
# inittab is no longer used when using systemd.
#
# ADDING CONFIGURATION HERE WILL HAVE NO EFFECT ON YOUR SYSTEM.
#
# Ctrl-Alt-Delete is handled by /usr/lib/systemd/system/ctrl-alt-del.target
#
# systemd uses 'targets' instead of runlevels. By default, there are two main targets:
#
#
# To view current default target, run:
# systemctl get-default
#
# To set a default target, run:
# systemctl set-default TARGET.target
#

This is the exact opposite of the previous example: any line that contains a digit is left out.
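Because -v selects exactly the lines that the plain pattern rejects, the two counts should always add up to the total number of lines. A small sketch with made-up file contents (an assumption, not the real /etc/inittab):

```shell
# Hypothetical four-line file: two lines with digits, two without (one is empty).
f=$(mktemp)
printf '%s\n' 'runlevel 3' 'no digits here' '' 'target 5' > "$f"

with_digit=$(grep -c '[0-9]' "$f")       # lines containing at least one digit
without_digit=$(grep -vc '[0-9]' "$f")   # lines containing no digit at all
total=$(wc -l < "$f" | tr -d ' ')        # total line count, spaces stripped

echo "$with_digit + $without_digit = $total"
rm -f "$f"
```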

Filter out all lines that start with #

Example command:

[root@bogon ~]# cat /etc/sos.conf
[plugins]

#disable = rpm, selinux, dovecot

[tunables]

#rpm.rpmva = off
#general.syslogsize = 15

[root@bogon ~]# grep -v '^#' /etc/sos.conf
[plugins]


[tunables]

Filter out all blank lines and all lines that start with #

Example command:

[root@bogon ~]# grep -v '^#' /etc/sos.conf |grep -v '^$'
[plugins]
[tunables]
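The two-stage pipeline above can also be written as a single grep call by repeating the -e option, which supplies several patterns at once. This is an alternative sketch rather than the author's method; the file contents below are a shortened, made-up stand-in for /etc/sos.conf.

```shell
# Hypothetical config file with comments and blank lines.
conf=$(mktemp)
printf '%s\n' '[plugins]' '' '#disable = rpm' '' '[tunables]' '#rpm.rpmva = off' > "$conf"

# With -v, a line is dropped if it matches either pattern:
# '^#' (starts with #) or '^$' (empty line).
kept=$(grep -v -e '^#' -e '^$' "$conf")

echo "$kept"
rm -f "$conf"
```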

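In a regular expression, ^ marks the start of a line and $ marks the end, so an empty line can be written as ^$. How do you print the lines that do not start with an English letter? First create a test file, as shown below: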

[root@bogon ~]# mkdir /tmp/1
[root@bogon ~]# cd /tmp/1
[root@bogon 1]# vim test.txt
[root@bogon 1]# cat test.txt
123
abc
456


abc2323
#laksdjf

With these lines in test.txt as test data, run the following:

[root@bogon 1]# grep '^[^a-zA-Z]' test.txt
123
456
#laksdjf
[root@bogon 1]# grep '[^a-zA-Z]' test.txt
123
456
abc2323
#laksdjf

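Digits are matched with a form like [0-9] (a form like [15] matches only a 1 or a 5). To match digits as well as upper- and lowercase letters, write something like [0-9a-zA-Z]. [^chars] matches any character other than those listed inside the brackets. Note that ^ means different things inside and outside the square brackets: outside, it anchors the match to the start of the line; as the first character inside, it negates the set.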
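The difference between the two positions of ^ is easiest to see when the same file is run through both forms. A minimal sketch, with sample lines that are assumptions chosen to resemble the test file:

```shell
# Hypothetical test lines.
t=$(mktemp)
printf '%s\n' '123' 'abc' 'abc2323' '#note' > "$t"

starts_nonletter=$(grep '^[^a-zA-Z]' "$t")  # first character is not a letter
has_nonletter=$(grep '[^a-zA-Z]' "$t")      # a non-letter appears anywhere
starts_letter=$(grep '^[a-zA-Z]' "$t")      # first character is a letter

echo "$starts_nonletter"
echo "--"
echo "$has_nonletter"
echo "--"
echo "$starts_letter"
rm -f "$t"
```

The line abc2323 shows the distinction: it starts with a letter, so the anchored negated form skips it, but it contains digits, so the unanchored form selects it.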

Match any single character and repeated characters

Example command:

[root@bogon 1]# grep 'r.o' /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

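. matches any single character. In the example above, r.o selects the lines in which r and o are separated by exactly one arbitrary character.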

[root@bogon 1]# grep 'ooo*' /etc/passwd
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
setroubleshoot:x:991:985::/var/lib/setroubleshoot:/sbin/nologin

* matches zero or more occurrences of the character before it. In the example above, ooo* matches oo, ooo, oooo, and any longer run of o.

[root@bogon 1]# grep '.*' /etc/passwd |wc -l
44
[root@bogon 1]# wc -l /etc/passwd
44 /etc/passwd

In the example above, .* matches zero or more arbitrary characters, which includes empty lines, so it matches every line in /etc/passwd.
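The three patterns from this part (r.o, ooo*, and .*) can be checked against a small controlled input. The lines below are invented for illustration; note that the last one is deliberately empty.

```shell
# Hypothetical input: six lines, the last of which is empty.
w=$(mktemp)
printf '%s\n' 'rto' 'ro' 'o' 'oo' 'fooo' '' > "$w"

any_one=$(grep 'r.o' "$w")       # r, exactly one arbitrary character, then o
two_plus=$(grep 'ooo*' "$w")     # oo followed by zero or more further o
all_lines=$(grep -c '.*' "$w")   # counts every line, including the empty one

echo "$any_one"
echo "--"
echo "$two_plus"
echo "--"
echo "$all_lines"
rm -f "$w"
```

The line ro is not selected by r.o because nothing sits between r and o, and the single o is not selected by ooo* because the pattern needs at least two.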

Specify how many times a character must occur

Example:

[root@bogon 1]# grep 'o\{2\}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
setroubleshoot:x:991:985::/var/lib/setroubleshoot:/sbin/nologin

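This uses the {} notation: the number inside is how many times the preceding character must repeat. Note that in plain grep both braces must be escaped with a backslash (\{ and \}). {} can also express a range in the form {n1,n2}, where n1<n2, meaning the preceding character repeats n1 to n2 times; n2 may be omitted, which means n1 or more times. Because the pattern is not anchored, o\{2\} matches any line that contains at least two consecutive o characters, which is why lines such as root and spool appear in the output.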
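To see what the counts really restrict, it helps to anchor the pattern with ^ and $ so the whole line must consist of the repeated character. A short sketch with invented input lines:

```shell
# Hypothetical input: runs of one to four o characters.
i=$(mktemp)
printf '%s\n' 'o' 'oo' 'ooo' 'oooo' > "$i"

exactly_two=$(grep -c '^o\{2\}$' "$i")     # only "oo"
two_to_three=$(grep -c '^o\{2,3\}$' "$i")  # "oo" and "ooo"
two_or_more=$(grep -c '^o\{2,\}$' "$i")    # "oo", "ooo", "oooo"
unanchored=$(grep -c 'o\{2\}' "$i")        # any line containing "oo"

echo "$exactly_two $two_to_three $two_or_more $unanchored"
rm -f "$i"
```

Without the anchors, o\{2\} behaves like a search for the substring oo, so longer runs match as well.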

9.3 grep (part 3)

To make experimenting easier, edit test.txt so that it contains the following:

rot:x:o:o:/rot:bin/bash
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/root:/sbin/nologin
roooot:x:o:o:/rooooot:/bin/bash
1111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Match one or more occurrences of a character

Example command:

[root@bogon 1]# egrep 'o+' test.txt
rot:x:o:o:/rot:bin/bash
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/root:/sbin/nologin
roooot:x:o:o:/rooooot:/bin/bash
[root@bogon 1]# egrep 'oo+' test.txt
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/root:/sbin/nologin
roooot:x:o:o:/rooooot:/bin/bash
[root@bogon 1]# egrep 'ooo+' test.txt
roooot:x:o:o:/rooooot:/bin/bash

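Unlike grep, egrep uses the + symbol here, which matches one or more occurrences of the character before it; plain grep does not treat an unescaped + this way. The {} notation described earlier also works in egrep, without the \ escapes. For example: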

[root@bogon 1]# egrep 'o{2}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
setroubleshoot:x:991:985::/var/lib/setroubleshoot:/sbin/nologin
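The difference in how + is treated can be observed directly by running the same pattern with and without extended syntax. The sketch below uses grep -E, which the grep documentation describes as equivalent to egrep; the input lines are assumptions made up for the comparison.

```shell
# Hypothetical input, including one line with a literal plus sign.
p=$(mktemp)
printf '%s\n' 'rot' 'root' 'o+k' > "$p"

basic=$(grep 'o+' "$p")         # basic syntax: + is an ordinary character
extended=$(grep -E 'o+' "$p")   # extended syntax: one or more o

echo "$basic"
echo "--"
echo "$extended"
rm -f "$p"
```

In basic syntax only the line containing the literal text o+ is selected, while the extended form selects every line that has at least one o.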

Match zero or one occurrence of a character

Example command:

[root@bogon 1]# egrep 'o?' test.txt
rot:x:o:o:/rot:bin/bash
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/root:/sbin/nologin
roooot:x:o:o:/rooooot:/bin/bash
1111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

[root@bogon 1]# egrep 'ooo?' test.txt
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/root:/sbin/nologin
roooot:x:o:o:/rooooot:/bin/bash
[root@bogon 1]# egrep 'oooo?' test.txt
roooot:x:o:o:/rooooot:/bin/bash
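Since ? allows zero occurrences, a pattern made of nothing but o? matches every line, even lines with no o at all; that is why the first command above printed the whole file. A minimal sketch with invented input, again using grep -E in place of egrep:

```shell
# Hypothetical input: the last line is empty.
q=$(mktemp)
printf '%s\n' 'o' 'oo' 'ooo' 'xyz' '' > "$q"

optional=$(grep -Ec 'o?' "$q")       # zero o is acceptable, so all lines count
two_or_three=$(grep -E 'ooo?' "$q")  # oo, optionally followed by one more o

echo "$optional"
echo "$two_or_three"
rm -f "$q"
```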

Match string 1 or string 2

Example command:

[root@bogon 1]# egrep 'aaa|111|ooo' test.txt
roooot:x:o:o:/rooooot:/bin/bash
1111111111111111111111111111111
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
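As with +, the | operator only acts as "or" in extended syntax; in basic syntax an unescaped | is an ordinary character. The following sketch contrasts the two on made-up input (the file contents and variable names are assumptions):

```shell
# Hypothetical input, including one line with a literal vertical bar.
a=$(mktemp)
printf '%s\n' 'aaa' '111' 'zzz' 'aaa|111' > "$a"

extended=$(grep -E 'aaa|111' "$a")  # lines containing aaa OR 111
basic=$(grep 'aaa|111' "$a")        # lines containing the literal text aaa|111

echo "$extended"
echo "--"
echo "$basic"
rm -f "$a"
```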

Using () in egrep

Example command:

[root@bogon 1]# egrep 'r(oo|at)o' test.txt
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/root:/sbin/nologin
roooot:x:o:o:/rooooot:/bin/bash
[root@bogon 1]#

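Here () groups its contents into a single unit; the example above selects the lines containing rooo or rato (operator contains rato). () can also be combined with other symbols: for example, (oo)+ means one or more repetitions of oo. Example command: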

[root@bogon 1]# egrep '(oo)+' test.txt
operator:x:11:o:operator:/root:/sbin/nologin
operator:x:11:o:operator:/root:/sbin/nologin
roooot:x:o:o:/rooooot:/bin/bash
[root@bogon 1]#
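What grouping changes is the unit that a following + applies to. Anchoring the pattern makes this visible; the sketch below uses grep -E in place of egrep, and the input lines are invented for illustration.

```shell
# Hypothetical input lines.
g=$(mktemp)
printf '%s\n' 'ab' 'abab' 'aab' 'ba' > "$g"

grouped=$(grep -E '^(ab)+$' "$g")   # the whole line is "ab" repeated
ungrouped=$(grep -E '^ab+$' "$g")   # a single a followed by one or more b

echo "$grouped"
echo "--"
echo "$ungrouped"
rm -f "$g"
```

With the parentheses, + repeats the two-character unit ab; without them, it repeats only the b.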