This article explores T-SQL RegEx commands in SQL Server for performing data search using various conditions.


(Introduction)

We store data in multiple formats and data types in SQL Server tables. Suppose you have a column that contains alphanumeric string data. We use the LIKE logical operator to search for specific characters in the string and retrieve the matching rows. For example, in an Employee table, we might want to filter the results and return only the employees whose names start with the character A.

We define patterns in the T-SQL LIKE operator and filter results based on specific conditions. Strictly speaking, LIKE does not support full regular expressions; it supports a small wildcard syntax (%, _, [ ] and [^ ]) that covers many common searches. These patterns are often called T-SQL RegEx functions, and this article uses that term.

The examples in this article cover several types of patterns: character sets, character ranges, excluded characters, case-sensitive searches, and digits.
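As a quick orientation, the sketch below exercises the four LIKE wildcards side by side. The table variable and its sample values are made up for illustration and are not AdventureWorks data; the results assume a case-insensitive collation.

```sql
-- Hypothetical sample values, not taken from AdventureWorks.
DECLARE @Samples TABLE (Val VARCHAR(20));
INSERT INTO @Samples (Val) VALUES ('Alloy'), ('Bike'), ('Lock'), ('7-speed');

SELECT Val,
       -- % matches any string of zero or more characters
       CASE WHEN Val LIKE 'A%'      THEN 1 ELSE 0 END AS StartsWithA,
       -- _ matches exactly one character
       CASE WHEN Val LIKE '_i%'     THEN 1 ELSE 0 END AS SecondCharIsI,
       -- [ ] matches one character from a set or range
       CASE WHEN Val LIKE '[AL]%'   THEN 1 ELSE 0 END AS StartsWithAOrL,
       -- [^ ] matches one character that is not in the set or range
       CASE WHEN Val LIKE '[^A-Z]%' THEN 1 ELSE 0 END AS StartsWithNonLetter
FROM @Samples;
```

Each flag column shows whether the value matches that pattern, which makes it easy to compare the wildcards on the same data.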






(Pre-requisite)

In this article, we will use the AdventureWorks sample database. Execute the following query to get all product descriptions:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription];






Let’s explore T-SQL RegEx in the following examples.


(Example 1: Filter results for description starting with character A or L)

Suppose we want product descriptions that start with the character A or L. We can use the format [XY]% in the LIKE operator.

Execute the following query and observe that every row in the output starts with A or L:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[AL]%'
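A bracketed set matches exactly one character, so [AL]% behaves like two single-letter patterns joined with OR. The following sketch illustrates this with a table variable holding made-up rows (an assumption, so that it runs without AdventureWorks):

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Aluminum frame'), (N'Lightweight fork'), (N'Carbon seat post');

-- Character set: first character is A or L.
SELECT [Description]
FROM @Descriptions
WHERE [Description] LIKE '[AL]%';

-- Same filter written as two separate patterns.
SELECT [Description]
FROM @Descriptions
WHERE [Description] LIKE 'A%'
   OR [Description] LIKE 'L%';
```

The set form stays compact as more starting characters are added, which is the main reason to prefer it.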




(Example 2: Filter results for description with first character A and second character L)

In the previous example, we filtered for descriptions starting with A or L. Suppose we instead want descriptions that start with the two characters AL. We can use the T-SQL RegEx [X][Y]% in the LIKE operator. Because each bracket holds a single character, this pattern is equivalent to 'AL%'.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A][L]%'

In the output, we get only records whose first character is A and whose second character is L.
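Brackets around a single character are optional for ordinary letters, but they become useful when the character is itself a wildcard. The sketch below, which uses made-up rows and assumes a case-insensitive collation, compares the bracketed and plain forms and then searches for a literal percent sign:

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Alloy rims'), (N'All-weather tire'), (N'Save 50% on alloy rims');

-- Both predicates select the rows that start with AL.
SELECT [Description] FROM @Descriptions WHERE [Description] LIKE '[A][L]%';
SELECT [Description] FROM @Descriptions WHERE [Description] LIKE 'AL%';

-- [%] matches a literal percent sign instead of acting as a wildcard.
SELECT [Description] FROM @Descriptions WHERE [Description] LIKE '%[%]%';
```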




We can specify more characters as well. The following query returns descriptions that start with the three characters ALL:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A][L][L]%'




(Example 3: Filter results for description and starting character between A and D)

In the previous example, we specified particular starting characters to filter the results. We can specify a character range using the format [X-Z]%.

The following query returns descriptions whose first character is between A and D, inclusive:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A-D]%'
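Under a case-insensitive collation, the range [A-D] is shorthand for the set [ABCD]. The sketch below shows both forms as flag columns over made-up rows; with case-sensitive, non-binary collations, ranges follow the collation's sort order and may behave differently, so treat this as an illustration rather than a guarantee.

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Aluminum frame'), (N'Durable chain'), (N'Entry-level saddle');

SELECT [Description],
       CASE WHEN [Description] LIKE '[A-D]%'  THEN 1 ELSE 0 END AS RangeMatch,
       CASE WHEN [Description] LIKE '[ABCD]%' THEN 1 ELSE 0 END AS SetMatch
FROM @Descriptions;
```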




Similarly, we can specify a condition for each character position. For example, the query below requires the first character to be between A and D and the second character to be between F and I:



SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A-D][F-I]%'

In the output, every row satisfies both conditions.
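Each bracket in the pattern applies to one character position. One way to see this is to test the positions separately with SUBSTRING, as in this sketch over made-up rows (assuming a case-insensitive collation):

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Chromoly steel'), (N'Aluminum frame'), (N'Affordable tire');

-- Combined pattern: position 1 in A-D, position 2 in F-I.
SELECT [Description]
FROM @Descriptions
WHERE [Description] LIKE '[A-D][F-I]%';

-- The same two conditions checked one position at a time.
SELECT [Description]
FROM @Descriptions
WHERE SUBSTRING([Description], 1, 1) LIKE '[A-D]'
  AND SUBSTRING([Description], 2, 1) LIKE '[F-I]';
```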




(Example 4: Filter results for description and ending character between A and D)

In the previous examples, we filtered the data for the starting characters. We might want to filter for the end position character as well.


In the previous examples, note the position of the percent (%) wildcard. We placed it at the end of the search pattern, as in this query:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A-D][F-I]%'

In the following query, we move the percent wildcard to the beginning of the search pattern. It returns descriptions whose last character is between G and S:


SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '%[G-S]'

In the output, every description ends with a character that satisfies our search condition.
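Note that the pattern tests the very last character of the string, so a description that ends with punctuation such as a period does not match '%[G-S]'. The sketch below, using made-up rows, shows one possible way to account for a trailing period:

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Replacement mountain wheel'),
       (N'Replacement mountain wheel.'),
       (N'Top-of-the-line frame');

SELECT [Description],
       -- Last character itself is between G and S.
       CASE WHEN [Description] LIKE '%[G-S]'  THEN 1 ELSE 0 END AS EndsWithGToS,
       -- A character between G and S followed by a closing period.
       CASE WHEN [Description] LIKE '%[G-S].' THEN 1 ELSE 0 END AS EndsWithGToSThenPeriod
FROM @Descriptions;
```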




(Example 5: Filter results for description starting with letters AF and ending with character S)

Let’s make it a bit more complex. We want to search using the following conditions: the first character is A, the second character is F, and the last character is S.



Execute the following query. In the output, every row satisfies our requirements:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A][F]%[S]'
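For readers who find the bracket syntax dense, the same filter can be sketched with LEFT and RIGHT. The rows below are made up, and the two queries are only expected to agree for ordinary strings like these under a case-insensitive collation (edge cases such as trailing spaces can differ):

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Affordable gears'), (N'Affordable frame'), (N'Aluminum pedals');

-- Pattern form: starts with AF, ends with S.
SELECT [Description]
FROM @Descriptions
WHERE [Description] LIKE '[A][F]%[S]';

-- Function form: compare the first two characters and the last character.
SELECT [Description]
FROM @Descriptions
WHERE LEFT([Description], 2) = 'AF'
  AND RIGHT([Description], 1) = 'S';
```

The LIKE form is usually preferable for prefix searches, since a pattern with a fixed prefix can use an index on the column, while wrapping the column in a function generally prevents that.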




(Example 6: Filter results for description starting letters excluding A to T)

In the following example, we do not want rows whose first character is between A and T. We can exclude characters using the [^X-Y] format in the LIKE operator.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[^A-T]%'

In the output, no description starts with a character from A to T.
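The excluded set [^A-T] still has to match one character, so it also accepts digits and punctuation but rejects an empty string. Negating the whole predicate with NOT LIKE behaves differently in that corner case, as this sketch with made-up rows suggests:

```sql
-- Hypothetical rows for illustration only; the last row is an empty string.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Universal fit'), (N'Aluminum frame'), (N'700c wheel'), (N'');

SELECT [Description],
       -- Requires a first character, and that character must not be in A-T.
       CASE WHEN [Description] LIKE '[^A-T]%'    THEN 1 ELSE 0 END AS ExcludedSet,
       -- True for any string that does not start with A-T, including the empty string.
       CASE WHEN [Description] NOT LIKE '[A-T]%' THEN 1 ELSE 0 END AS NegatedLike
FROM @Descriptions;
```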




(Example 7: Filter results for description with a specific pattern)

In the example below, we want to filter records using the following conditions: the first character is R or S, and the characters PI appear together somewhere later in the description:






SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[R-S]%[P][I]%'
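To see how the parts of this pattern combine, the sketch below evaluates each part separately as a flag column. The rows are made up, and the results assume a case-insensitive collation so that the lowercase "pi" in the data matches [P][I]:

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'Rugged pipe cutter'),
       (N'Stiff seat post with pivot clamp'),
       (N'Sturdy alloy frame'),
       (N'Top pick for touring');

SELECT [Description],
       CASE WHEN [Description] LIKE '[R-S]%'          THEN 1 ELSE 0 END AS StartsWithROrS,
       CASE WHEN [Description] LIKE '%[P][I]%'        THEN 1 ELSE 0 END AS ContainsPI,
       CASE WHEN [Description] LIKE '[R-S]%[P][I]%'   THEN 1 ELSE 0 END AS FullPattern
FROM @Descriptions;
```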




(Example 8: Case sensitive search using T-SQL RegEx functions)

With a case-insensitive collation, which is a common default, LIKE searches are not case sensitive. For example, the following queries return the same result set:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[r-s]%[P][i]%'
 
  SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[R-S]%[P][I]%'
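Whether two patterns that differ only in case return the same rows depends on the collation in effect. One way to check it, sketched here with the built-in SERVERPROPERTY and DATABASEPROPERTYEX functions, is:

```sql
-- Show the collation of the server instance and of the current database.
-- A name containing _CI_ is case insensitive; _CS_ or _BIN is case sensitive.
SELECT SERVERPROPERTY('Collation')                 AS ServerCollation,
       DATABASEPROPERTYEX(DB_NAME(), 'Collation')  AS DatabaseCollation;
```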




We can perform a case-sensitive search in the following two ways:

  1. Database collation setting: Each database in SQL Server has a collation. Right-click on the database and open the properties page to see the collation.

    Here, the database uses SQL_Latin1_General_CP1_CI_AS, which is case insensitive (CI). We can change it to a case-sensitive collation, but this is not a simple solution: it might create issues for your existing queries. It is not recommended unless you explicitly require a case-sensitive collation.
  2. Column collation: We can use a column collation with T-SQL RegEx functions to perform a case-sensitive search. First, create a sample table:
Create table Characters
  (Alphabet char(1)
  )
  Go
  Insert into Characters values ('A')
  Insert into Characters values ('a')
  Go

The table holds the letter A in both uppercase and lowercase. If we run the following select statement, it returns both rows:

SELECT * from Characters 
  where Alphabet like '[A]%'



Suppose we want only the uppercase letter in the result. We can apply a collation to the column, as in the following query:

select * from Characters 
  where Alphabet COLLATE Latin1_General_BIN  like '[A]%'

It returns only the uppercase letter A.



Similarly, the following query returns only the lowercase letter:

select * from Characters 
  where Alphabet COLLATE Latin1_General_BIN  like '[a]%'
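If a column should always be compared case sensitively, an alternative is to declare the collation on the column itself rather than in every query. The sketch below uses a hypothetical table variable (not the Characters table created above) to illustrate the idea:

```sql
-- Hypothetical table variable; the column is declared with a binary collation.
DECLARE @Characters TABLE (Alphabet CHAR(1) COLLATE Latin1_General_BIN);
INSERT INTO @Characters (Alphabet) VALUES ('A'), ('a');

-- No COLLATE clause is needed here: the comparison uses the column's own collation.
SELECT Alphabet
FROM @Characters
WHERE Alphabet LIKE '[A]%';
```

This keeps the queries shorter, at the cost of making every comparison on that column case sensitive.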



We can also mix uppercase and lowercase characters in a case-sensitive pattern. In the following query, the first character must be an uppercase C and the second character a lowercase h:
SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] COLLATE Latin1_General_BIN  like '[C][h]%'



(Example 9: Use T-SQL Regex to Find Text Rows that Contain a Number)

We can also find rows whose text contains a number. For example, we want rows that start with a digit from 0 to 9.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like  '[0-9]%'

Similar to the characters, we can also specify digits for different positions. In the following example, the first digit must be between 1 and 5, and the second digit must be between 0 and 9.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like  '[1-5][0-9]%'
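The patterns above only match digits at the start of the text. To find a digit anywhere in the string, the percent wildcard can be placed on both sides, and PATINDEX can report where the first digit occurs. A minimal sketch with made-up rows:

```sql
-- Hypothetical rows for illustration only.
DECLARE @Descriptions TABLE ([Description] NVARCHAR(100));
INSERT INTO @Descriptions ([Description])
VALUES (N'10-speed rear derailleur'), (N'Fits 26 inch wheels'), (N'Padded saddle');

SELECT [Description],
       CASE WHEN [Description] LIKE '[0-9]%'  THEN 1 ELSE 0 END AS StartsWithDigit,
       CASE WHEN [Description] LIKE '%[0-9]%' THEN 1 ELSE 0 END AS ContainsDigit,
       -- Position of the first digit, or 0 when the text has no digit.
       PATINDEX('%[0-9]%', [Description]) AS FirstDigitPosition
FROM @Descriptions;
```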




(Example 10: Use T-SQL Regex to Find valid email IDs)

Let’s explore a practical scenario for the RegEx function. We have a customer table that holds customer email addresses, and we want to identify the valid email addresses in the user data. Sometimes, users make a typo and enter @@ instead of the @ character.

First, create the sample table and insert some email addresses into it in different formats.

CREATE TABLE TSQLREGEX(
     Email VARCHAR(1000)
  )
 
  Insert into TSQLREGEX values('raj@gmail.com')
  Insert into TSQLREGEX values('HSDFX@gmail.com')
  Insert into TSQLREGEX values('JHKHKO.PVS@gmail.com')
  Insert into TSQLREGEX values('ABC@@gmail.com')
  Insert into TSQLREGEX values('ABC.DFG.LKF#@gmail.com')

Execute the following select statement with the T-SQL RegEx function to eliminate the invalid email addresses.

Select * from TSQLREGEX where email
  LIKE '%[A-Z0-9][@][A-Z0-9]%[.][A-Z0-9]%'

The following invalid email addresses do not appear in the result: ABC@@gmail.com and ABC.DFG.LKF#@gmail.com.
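The pattern only checks the characters around one @ sign, so an address with a second @ elsewhere can still pass. The sketch below repeats the article's sample addresses in a table variable, adds one hypothetical address (a@b@example.com) to show that gap, and combines the pattern with an extra NOT LIKE check. It assumes a case-insensitive collation and is a rough filter, not a full email validator.

```sql
-- Sample addresses from the article plus one hypothetical address with two @ signs.
DECLARE @Emails TABLE (Email VARCHAR(1000));
INSERT INTO @Emails (Email)
VALUES ('raj@gmail.com'), ('HSDFX@gmail.com'), ('JHKHKO.PVS@gmail.com'),
       ('ABC@@gmail.com'), ('ABC.DFG.LKF#@gmail.com'), ('a@b@example.com');

SELECT Email,
       -- The article's pattern on its own.
       CASE WHEN Email LIKE '%[A-Z0-9][@][A-Z0-9]%[.][A-Z0-9]%'
            THEN 1 ELSE 0 END AS MatchesPattern,
       -- The same pattern, additionally rejecting any address with more than one @.
       CASE WHEN Email LIKE '%[A-Z0-9][@][A-Z0-9]%[.][A-Z0-9]%'
             AND Email NOT LIKE '%@%@%'
            THEN 1 ELSE 0 END AS LooksValid
FROM @Emails;
```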






(Conclusion)

In this article, we explored T-SQL RegEx functions for searching data using various conditions. Knowing these patterns helps you write searches that meet specific requirements.

Translated from: https://www.sqlshack.com/t-sql-regex-commands-in-sql-server/