Table of Contents
- String Matching
- 1. The BF Algorithm
- 2. Getting a Substring
- 3. Code and Test Data
- Summary
String Matching
1. The BF Algorithm
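In a data structures course, pattern matching is one of the most important problems concerning strings. The textbook presents two pattern-matching algorithms, BF and KMP; this section uses BF. The BF (Brute Force) algorithm is the plain pattern-matching algorithm: it relies on no tricks and simply compares the characters of one string against those of the other, one by one, until it reaches a result.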
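For example, checking whether string A ("abcac") is a substring of string B ("ababcabcacbab") with the plain pattern-matching algorithm proceeds as follows:
First, align the first character of A with the first character of B, then compare the corresponding characters one by one, as shown in Figure 1:
Figure 1: First matching attempt
In Figure 1, the match fails at the third character of A and B, so A is shifted one character to the right and matched against B again, as shown in Figure 2:
Figure 2: Second matching attempt
In Figure 2 the match fails again, this time at the very first character, so A is shifted one more character to the right, as shown in Figure 3:
Figure 3: Third matching attempt
In Figure 3 the match fails once more, so A keeps shifting until it reaches the position in Figure 4, where the match finally succeeds:
Figure 4: Pattern matching succeeds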
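The shift that follows each failed match is implemented with two nested for loops: the outer loop moves the alignment, and the inner loop compares characters. The code is as follows: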
public int locate(MyString paraMyString) {
    boolean tempMatch = false;
    for (int i = 0; i < length - paraMyString.length + 1; i++) {
        // Initialize.
        tempMatch = true;
        for (int j = 0; j < paraMyString.length; j++) {
            if (data[i + j] != paraMyString.data[j]) {
                tempMatch = false;
                break;
            } // Of if
        } // Of for j
        if (tempMatch) {
            return i;
        } // Of if
    } // Of for i
    return -1;
} // Of locate
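When a match succeeds, the method returns the 0-based starting position of the substring in the main string; if no alignment matches, it returns -1.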
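To make the walkthrough of Figures 1 to 4 concrete, here is a minimal sketch that runs the same brute-force idea on the example strings and records what happens at each shift. The class `BruteForceTrace`, its `log` parameter, and the use of `java.lang.String` instead of `MyString` are assumptions made for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not from the original post) that traces brute-force matching. */
class BruteForceTrace {

    /**
     * Returns the 0-based index of the first occurrence of pattern in text, or -1.
     * Every alignment that is tried is appended to log (which may be null).
     */
    static int locate(String text, String pattern, List<String> log) {
        for (int i = 0; i <= text.length() - pattern.length(); i++) {
            int j = 0;
            // Compare character by character until a mismatch or the end of the pattern.
            while (j < pattern.length() && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            boolean matched = (j == pattern.length());
            if (log != null) {
                log.add("shift " + i + ": " + (matched ? "match" : "mismatch at pattern index " + j));
            }
            if (matched) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        int position = locate("ababcabcacbab", "abcac", log);
        log.forEach(System.out::println);
        System.out.println("First occurrence at index " + position);
    }
}
```

For A = "abcac" and B = "ababcabcacbab", the trace shows six alignments: shift 0 fails at pattern index 2 (the third character, as in Figure 1), shift 1 fails immediately, shift 2 fails at the last pattern character, shifts 3 and 4 fail immediately, and shift 5 matches, so the method returns 5.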
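2. Getting a Substring
Getting a substring means, informally, cutting a piece out of the main string. The substring function takes the start position and the length of the substring as parameters. A bounds check is needed here: if the start position plus the substring length is greater than the length of the main string, the request is out of bounds. The code is as follows: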
if (paraStartPosition + paraLength > length) {
    System.out.println("The bound is exceeded.");
    return null;
} // Of if
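The check above guards only the upper bound. As a hedged sketch of a slightly stricter variant, the standalone method below also rejects negative arguments and writes the comparison so that `start + length` cannot overflow. The class `SubstringSketch` and its `char[]`-based signature are illustrative assumptions, not the `MyString` API.

```java
/** Hypothetical standalone sketch; the names here are illustrative, not from MyString. */
class SubstringSketch {

    /**
     * Copies length characters of source, starting at index start.
     * Returns null when the requested range falls outside source,
     * mirroring the null-on-failure convention used in the post.
     */
    static char[] substring(char[] source, int start, int length) {
        // Assumption: negative arguments are treated as out of bounds too.
        // "start > source.length - length" is equivalent to
        // "start + length > source.length" but cannot overflow.
        if (start < 0 || length < 0 || start > source.length - length) {
            return null;
        }
        char[] result = new char[length];
        System.arraycopy(source, start, result, 0, length);
        return result;
    }

    public static void main(String[] args) {
        char[] text = "I like ik.".toCharArray();
        // The range ends exactly at the last character, so it is accepted.
        System.out.println(new String(substring(text, 5, 5)));
        // One character too many: rejected.
        System.out.println(substring(text, 5, 6) == null);
        // Negative start: rejected.
        System.out.println(substring(text, -1, 2) == null);
    }
}
```

Returning null keeps the sketch close to the original; throwing an exception such as `IndexOutOfBoundsException` would be a reasonable alternative design if callers should not have to check for null.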
3. Code and Test Data
package datastructure;

/**
 * My string. String is a class provided by the language, so I use another name.
 * It is essentially a sequential list with char type elements.
 *
 * @author Weijie Pu weijiepu@163.com.
 */
public class MyString {
    /**
     * The maximal length.
     */
    public static final int MAX_LENGTH = 10;

    /**
     * The actual length.
     */
    int length;

    /**
     * The data.
     */
    char[] data;

    /**
     *********************
     * Construct an empty char array.
     *********************
     */
    public MyString() {
        length = 0;
        data = new char[MAX_LENGTH];
    } // Of the first constructor.

    /**
     *********************
     * Construct using a system defined string.
     *
     * @param paraString
     *            The given string. Its length should not exceed MAX_LENGTH.
     *********************
     */
    public MyString(String paraString) {
        data = new char[MAX_LENGTH];
        length = paraString.length();
        // Copy data.
        for (int i = 0; i < length; i++) {
            data[i] = paraString.charAt(i);
        } // Of for i
    } // Of the second constructor

    /**
     *********************
     * Overrides the method claimed in Object, the superclass of any class.
     *********************
     */
    @Override
    public String toString() {
        String resultString = "";
        for (int i = 0; i < length; i++) {
            resultString += data[i];
        } // Of for i
        return resultString;
    } // Of toString

    /**
     ********************
     * Locate the position of a substring.
     *
     * @param paraMyString
     *            The given substring.
     * @return The first position. -1 for no matching.
     ********************
     */
    public int locate(MyString paraMyString) {
        boolean tempMatch = false;
        for (int i = 0; i < length - paraMyString.length + 1; i++) {
            // Initialize.
            tempMatch = true;
            for (int j = 0; j < paraMyString.length; j++) {
                if (data[i + j] != paraMyString.data[j]) {
                    tempMatch = false;
                    break;
                } // Of if
            } // Of for j
            if (tempMatch) {
                return i;
            } // Of if
        } // Of for i
        return -1;
    } // Of locate

    /**
     ********************
     * Get a substring.
     *
     * @param paraStartPosition
     *            The start position in the original string.
     * @param paraLength
     *            The length of the new string.
     * @return The new string, or null if the bound is exceeded.
     ********************
     */
    public MyString substring(int paraStartPosition, int paraLength) {
        if (paraStartPosition + paraLength > length) {
            System.out.println("The bound is exceeded.");
            return null;
        } // Of if
        MyString resultMyString = new MyString();
        resultMyString.length = paraLength;
        for (int i = 0; i < paraLength; i++) {
            resultMyString.data[i] = data[paraStartPosition + i];
        } // Of for i
        return resultMyString;
    } // Of substring

    /**
     ********************
     * The entrance of the program.
     *
     * @param args
     *            Not used now.
     ********************
     */
    public static void main(String[] args) {
        MyString tempFirstString = new MyString("I like ik.");
        MyString tempSecondString = new MyString("ik");
        int tempPosition = tempFirstString.locate(tempSecondString);
        System.out.println(
                "The position of \"" + tempSecondString + "\" in \"" + tempFirstString + "\" is: " + tempPosition);

        MyString tempThirdString = new MyString("ki");
        tempPosition = tempFirstString.locate(tempThirdString);
        System.out.println(
                "The position of \"" + tempThirdString + "\" in \"" + tempFirstString + "\" is: " + tempPosition);

        tempThirdString = tempFirstString.substring(1, 2);
        System.out.println("The substring is: \"" + tempThirdString + "\"");

        // Ends exactly at the last character, so it is still within the bound.
        tempThirdString = tempFirstString.substring(5, 5);
        System.out.println("The substring is: \"" + tempThirdString + "\"");

        // Exceeds the bound: substring prints a warning and returns null.
        tempThirdString = tempFirstString.substring(5, 6);
        System.out.println("The substring is: \"" + tempThirdString + "\"");
    } // Of main
} // Of class MyString
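Output (as produced by main above):
The position of "ik" in "I like ik." is: 3
The position of "ki" in "I like ik." is: -1
The substring is: " l"
The substring is: "e ik."
The bound is exceeded.
The substring is: "null"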
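Summary
Although the BF algorithm is simple and easy to implement, its time complexity is not low: O(m*n) in the worst case, where n is the length of the main string and m is the length of the pattern. The KMP algorithm mentioned above is an efficient string-matching algorithm with time complexity O(m+n).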
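To see where the O(m*n) bound comes from, the following sketch counts the character comparisons that brute-force matching performs. The class `BruteForceCost` and its helper methods are hypothetical names introduced for this illustration, and the test strings are chosen only to trigger the worst case.

```java
/** Hypothetical helper (not from the original post) that counts comparisons. */
class BruteForceCost {

    /** Counts the character comparisons made until the first match or until all shifts fail. */
    static long countComparisons(String text, String pattern) {
        long comparisons = 0;
        for (int i = 0; i <= text.length() - pattern.length(); i++) {
            int j = 0;
            while (j < pattern.length()) {
                comparisons++;
                if (text.charAt(i + j) != pattern.charAt(j)) {
                    break; // Mismatch: give up on this shift.
                }
                j++;
            }
            if (j == pattern.length()) {
                return comparisons; // Full match found.
            }
        }
        return comparisons;
    }

    /** Builds a string made of count copies of c. */
    static String repeat(char c, int count) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++) {
            builder.append(c);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String pattern = "aaab";
        // Worst case: every shift matches m - 1 characters and fails on the last one.
        System.out.println(countComparisons(repeat('a', 12), pattern));
        // Friendly case: every shift fails on its first character.
        System.out.println(countComparisons(repeat('b', 12), pattern));
    }
}
```

With n = 12 and m = 4 there are n - m + 1 = 9 shifts. In the worst case each shift costs m = 4 comparisons, for 36 in total, while the friendly case costs only 9. The worst-case count (n - m + 1) * m is what grows as O(m*n).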