1. Match all tags

regex:
\<.[^<>]*\>

source:
<external_network_location_id>20130401_TXNONC100FFS3101TAUSNPN1733590048828A_0048828</external_network_location_id>

result:
<external_network_location_id>
</external_network_location_id>
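
A minimal sketch of the same match, assuming Python's re module (the notes do not name a regex engine). The backslashes before < and > are dropped here: they are harmless in Python, but in some tools (GNU grep, Vim) \< and \> mean word boundaries instead.

```python
import re

# Pattern from the notes without the redundant backslashes:
# "<", any one character, then everything up to the next ">".
TAG_PATTERN = re.compile(r"<.[^<>]*>")

source = (
    "<external_network_location_id>"
    "20130401_TXNONC100FFS3101TAUSNPN1733590048828A_0048828"
    "</external_network_location_id>"
)

# findall returns every non-overlapping match, here the opening and closing tag.
tags = TAG_PATTERN.findall(source)
for tag in tags:
    print(tag)
```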

2. Match a specific tag, e.g. only div tags

regex:
\<\bdiv.*\<\/div\b\>

source:
<div>23dd</div>
<div1>23dd</div1>
<div>23dd33ff</div>
 
result:
<div>23dd</div>
<div>23dd33ff</div>
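
A short sketch of this pattern in Python (an assumed engine, not named in the notes). It checks the three sample lines, then shows what the greedy .* does when several div elements share one line. The one_line string is a made-up input for illustration.

```python
import re

# Pattern from the notes; "<", ">" and "/" need no backslash in Python.
DIV_PATTERN = re.compile(r"<\bdiv.*</div\b>")

lines = ["<div>23dd</div>", "<div1>23dd</div1>", "<div>23dd33ff</div>"]
# "</div\b>" rejects "</div1>" because there is no word boundary between "v" and "1".
matched = [line for line in lines if DIV_PATTERN.search(line)]
print(matched)

# Hypothetical input with two div elements on one line.
one_line = "<div>a</div><p>x</p><div>b</div>"
# Greedy ".*" runs to the last "</div>"; lazy ".*?" stops at the first one.
greedy = DIV_PATTERN.findall(one_line)
lazy = re.findall(r"<\bdiv.*?</div\b>", one_line)
print(greedy)
print(lazy)
```

Note that the opening side has no \b after div, so a line such as <divx>1</div> would still match; writing <div\b at the start tightens that.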

3. Match strings of a specific format

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
regex 1:
>.[^<>]+<

regex 2:
<li><a href="(http://[^\s]+)".+?span.+?\[(.+?)\].+?>(.+?)<

source:
<li><a href="http://www.wea.com/blog/a.html"   title="怎样在百度空间添加友情链接"><span class="article-date">[2014/11/13]</span>怎样在百链接</a></li>
<li><a href="http://www.a.com/blog/b.html2"   title="怎样在百度空间添加友情链接2"><span class="article-date">[2014/11/12]</span>怎样在百度链接2</a></li>

result (groups 1-3 of regex 2):
http://www.wea.com/blog/a.html 2014/11/13 怎样在百链接
http://www.a.com/blog/b.html2 2014/11/12 怎样在百度链接2

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
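
The link-list extraction above can be sketched in Python (an assumed engine) as follows. This variant places http:// inside the first group so that the captured URL includes the scheme, as in the listed result. The names LINK_PATTERN and rows are illustrative only.

```python
import re

# Group 1: URL, group 2: date inside the square brackets, group 3: link text.
LINK_PATTERN = re.compile(
    r'<li><a href="(http://[^\s]+)".+?span.+?\[(.+?)\].+?>(.+?)<'
)

source = (
    '<li><a href="http://www.wea.com/blog/a.html"   title="怎样在百度空间添加友情链接">'
    '<span class="article-date">[2014/11/13]</span>怎样在百链接</a></li>\n'
    '<li><a href="http://www.a.com/blog/b.html2"   title="怎样在百度空间添加友情链接2">'
    '<span class="article-date">[2014/11/12]</span>怎样在百度链接2</a></li>'
)

# With several groups, findall returns one tuple per match.
rows = LINK_PATTERN.findall(source)
for url, date, title in rows:
    print(url, date, title)
```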

regex:
<external_network_location_id>(.*?)</external_network_location_id>

source:
<external_network_location_id>20130401_TXNONC100FFS3101TAUSNPN1733590048828A_0048828</external_network_location_id>
<external_network_location_id>abcd1234004488877</external_network_location_id>

result:
20130401_TXNONC100FFS3101TAUSNPN1733590048828A_0048828
abcd1234004488877
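
One way to reuse this pattern for any tag name is a small helper, sketched here in Python. The function extract_tag_text is hypothetical and not part of the notes; it only builds the same lazy capture-group pattern from a tag name.

```python
import re


def extract_tag_text(tag_name, text):
    """Return the text inside every <tag_name>...</tag_name> pair (hypothetical helper)."""
    name = re.escape(tag_name)  # guard against regex metacharacters in the name
    pattern = re.compile(rf"<{name}>(.*?)</{name}>")
    return pattern.findall(text)


source = (
    "<external_network_location_id>"
    "20130401_TXNONC100FFS3101TAUSNPN1733590048828A_0048828"
    "</external_network_location_id>\n"
    "<external_network_location_id>abcd1234004488877</external_network_location_id>"
)

values = extract_tag_text("external_network_location_id", source)
for value in values:
    print(value)
```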

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

regex:
<requserid>([^<]+)</requserid>

source:
<Request><Action>getuser</Action><UserLogin></UserLogin><Password></Password><Signature></Signature><VerifyText></VerifyText><requserid>535</requserid><requserid>5335</requserid></Request>

result:
535
5335
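
The two capture styles used so far, ([^<]+) and (.*?), differ on empty elements. A minimal Python sketch (assumed engine) using the sample request above:

```python
import re

source = (
    "<Request><Action>getuser</Action><UserLogin></UserLogin><Password></Password>"
    "<Signature></Signature><VerifyText></VerifyText>"
    "<requserid>535</requserid><requserid>5335</requserid></Request>"
)

# [^<]+ needs at least one character that is not "<".
ids = re.findall(r"<requserid>([^<]+)</requserid>", source)

# On an empty element, [^<]+ finds nothing, while lazy .*? captures an empty string.
strict = re.findall(r"<UserLogin>([^<]+)</UserLogin>", source)
loose = re.findall(r"<UserLogin>(.*?)</UserLogin>", source)

print(ids, strict, loose)
```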

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

5. Extract the content of all tags

regex 1:
<.+?>(.+?)<.+?>

regex 2:
(?is)(?<=>)[^<>]+(?=<)

source:
<span style=''>内容1</span><img src=".."/>内容2<p><input .../>内容3</p><p>内容4</p><b>内容5</b><i>内容6</i>

result:
内容1
内容2
内容3
内容4
内容5
内容6
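
Both patterns can be compared in Python (an assumed engine). They agree on the sample above, but they are not equivalent: regex 1 consumes the tags on both sides of each text run, so text that follows a closing tag directly can be skipped. The second input below is made up to show that difference.

```python
import re

source = """<span style=''>内容1</span><img src=".."/>内容2<p><input .../>内容3</p><p>内容4</p><b>内容5</b><i>内容6</i>"""

# regex 1: capture group between two tags (the tags are part of the match).
by_group = re.findall(r"<.+?>(.+?)<.+?>", source)
# regex 2: lookarounds, so only the text itself is matched.
by_lookaround = re.findall(r"(?is)(?<=>)[^<>]+(?=<)", source)
print(by_group)
print(by_lookaround)

# Hypothetical input: "y" sits between a closing tag and an opening tag.
tricky = "<b>x</b>y<i>z</i>"
print(re.findall(r"<.+?>(.+?)<.+?>", tricky))         # misses "y"
print(re.findall(r"(?is)(?<=>)[^<>]+(?=<)", tricky))  # keeps "y"
```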

6. Extract attribute values from all img tags (the same approach can be adapted to other tags)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
regex:
(?is)<img\s*((?<key>[^=]+)="(?<value>[^"]+)")+?\s*/?>

source:
<img src="acbdd"/><img src="33ff"/><img src="gggggeeee"/><a>33333</a>

result:
key=src value=acbdd
key=src value=33ff
key=src value=gggggeeee

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

regex 1:
(?is)<img\s*((?<key>[^=]+)=(["'])(?<value>[^'"]+)\2)+?\s*/?>([^<>]*?</img>)?
regex 2:
(?is)<img\s+((?<key>[^=]+)=(["']?)(?<value>[^'"]+)\2\s*)+?\s*/?>([^<>]*?</img>)?

source:
<img src="acbdd"/><img src="33ff"/><img src="gggggeeee"/><img src="bb"></img><a>33333</a>

result:
key=src value=acbdd
key=src value=33ff
key=src value=gggggeeee
key=src value=bb
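
The patterns above use the (?<key>...) named-group syntax and repeat the group inside one match, which suits an engine that keeps every capture of a repeated group (.NET does, which is probably what these notes target). Python's re writes named groups as (?P<key>...) and keeps only the last capture, so the sketch below is an adaptation, not a port: it finds each img tag first, then scans its attribute text. IMG_TAG and ATTRIBUTE are illustrative names.

```python
import re

# Step 1: grab the attribute text of each <img ...> or <img .../> tag.
IMG_TAG = re.compile(r"<img\b([^<>]*?)/?>", re.IGNORECASE | re.DOTALL)
# Step 2: key="value" or key='value'; the closing quote must equal the opening one.
ATTRIBUTE = re.compile(
    r"""(?P<key>[^\s=]+)\s*=\s*(?P<quote>["'])(?P<value>.*?)(?P=quote)""",
    re.DOTALL,
)

source = (
    '<img src="acbdd"/><img src="33ff"/><img src="gggggeeee"/>'
    '<img src="bb"></img><a>33333</a>'
)

pairs = []
for tag in IMG_TAG.finditer(source):
    for attr in ATTRIBUTE.finditer(tag.group(1)):
        pairs.append((attr.group("key"), attr.group("value")))

for key, value in pairs:
    print(f"key={key} value={value}")
```

This sketch assumes attribute values contain no ">" character; real-world HTML is usually better handled by an HTML parser.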

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

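(?<=^<external_provider_group_id>).*(?=</external_provider_group_id>)
Greedy match; returns only the first value, 23dd0078243d4323, because ^ anchors the lookbehind to the start of the input (without multiline mode).
<external_provider_group_id>23dd0078243d4323</external_provider_group_id>

(?<=^<external_provider_group_id>).*0078243.*(?=</external_provider_group_id>)
Greedy match; likewise returns only the first value, 23dd0078243d4323.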


<[A-Za-z_-]+>\w+0078243\w+</[A-Za-z_-]+>
Matches the following 4 lines
\w+ means one or more word characters
<external_provider_group_id>23dd0078243d4323</external_provider_group_id>
<a>dd0078243dsd</a>
<b_b>dd0078243dsd33</b_b>
<c-c>dd0078243dsd44</c-c>

<[A-Za-z_-]+>\w{0,}0078243\w{0,}</[A-Za-z_-]+>
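Matches 6 of the following 7 lines (all except the 442232323 line, which does not contain 0078243)
\w{0,} means zero or more word characters (equivalent to \w*)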
<external_provider_group_id>23dd0078243d4323</external_provider_group_id>
<external_provider_group_id>442232323</external_provider_group_id>
<external_provider_group_id>23dd0078243d432344</external_provider_group_id>
<a>dd0078243dsd</a>
<b_b>dd0078243dsd33</b_b>
<c-c>dd0078243dsd44</c-c>
<d-d>0078243</d-d>
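
The difference between \w+ and \w{0,} can be checked with a short Python sketch (assumed engine) over the seven sample lines above. Only the line that is exactly 0078243 changes outcome.

```python
import re

lines = [
    "<external_provider_group_id>23dd0078243d4323</external_provider_group_id>",
    "<external_provider_group_id>442232323</external_provider_group_id>",
    "<external_provider_group_id>23dd0078243d432344</external_provider_group_id>",
    "<a>dd0078243dsd</a>",
    "<b_b>dd0078243dsd33</b_b>",
    "<c-c>dd0078243dsd44</c-c>",
    "<d-d>0078243</d-d>",
]

# \w+ requires at least one word character on each side of 0078243.
one_or_more = re.compile(r"<[A-Za-z_-]+>\w+0078243\w+</[A-Za-z_-]+>")
# \w{0,} also accepts nothing on either side.
zero_or_more = re.compile(r"<[A-Za-z_-]+>\w{0,}0078243\w{0,}</[A-Za-z_-]+>")

strict_hits = [line for line in lines if one_or_more.fullmatch(line)]
loose_hits = [line for line in lines if zero_or_more.fullmatch(line)]

# Lines accepted only by the \w{0,} form.
only_loose = [line for line in loose_hits if line not in strict_hits]
print(len(strict_hits), len(loose_hits), only_loose)
```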

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Other:

(?<=^<[A-Za-z_-]+>).*(?=</[A-Za-z_-]+>)
Matches only the first element
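
A sketch of why a ^-anchored pattern returns only the first element, and how multiline mode changes that. It is written for Python and uses a capture group in place of the lookbehind, because Python's re accepts only fixed-width lookbehind and (?<=^<[A-Za-z_-]+>) is variable-width. The three-line input is taken from the samples earlier in these notes.

```python
import re

source = "<a>dd0078243dsd</a>\n<b_b>dd0078243dsd33</b_b>\n<c-c>dd0078243dsd44</c-c>"

# Capture-group stand-in for the lookbehind/lookahead pair.
pattern = r"^<[A-Za-z_-]+>(.*)</[A-Za-z_-]+>"

# Default mode: ^ matches only at the very start of the input.
first_only = re.findall(pattern, source)
# MULTILINE: ^ also matches right after each newline.
every_line = re.findall(pattern, source, flags=re.MULTILINE)

print(first_only)
print(every_line)
```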

.*(?<=<\w+>.*</\w+>)*

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

(the end)