Sample content:
<ul class="panel_body">
<li>
<a href="/zhaoyangjian724/article/category/1756569" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">Oracle dump解析</a><span>(20)</span>
</li>
<li>
<a href="/zhaoyangjian724/article/category/1756685" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">sql 查询优化</a><span>(159)</span>
</li>
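Methods provided by Perl (HTML::Element):
find_by_tag_name
@elements = $h->find_by_tag_name('tag', ...);
$first_match = $h->find_by_tag_name('tag', ...);
In list context, returns the list of elements at or under $h that have any of the specified tag names.
In scalar context, returns the first such element found.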
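The difference between the two contexts can be seen in a minimal sketch. It assumes HTML::TreeBuilder is installed and uses a trimmed, hypothetical inline copy of the sample markup (ASCII link text) instead of csdn.html; the variable names are illustrative only.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical inline sample standing in for csdn.html
my $html = <<'HTML';
<ul class="panel_body">
<li><a href="/zhaoyangjian724/article/category/1756569">Oracle dump</a><span>(20)</span></li>
<li><a href="/zhaoyangjian724/article/category/1756685">sql</a><span>(159)</span></li>
</ul>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# List context: every <a> element at or under $tree
my @elements = $tree->find_by_tag_name('a');

# Scalar context: only the first <a> element found
my $first_match = $tree->find_by_tag_name('a');

my $count      = scalar @elements;
my $first_href = $first_match->attr('href');

print "links found: $count\n";
print "first href:  $first_href\n";
```

Assigning to an array forces list context, while assigning to a scalar forces scalar context, so the same call yields either all matches or just the first one.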
node2:/root/pachong#cat test.pl
use LWP::UserAgent;
use POSIX;
use HTML::TreeBuilder::XPath;
use Encode;
use HTML::TreeBuilder;
use Data::Dumper;
my $ua = LWP::UserAgent->new;
$ua->timeout(10);
$ua->env_proxy;
$ua->agent("Mozilla/8.0");
use HTML::TreeBuilder::XPath;
my $tree= HTML::TreeBuilder::XPath->new;
$tree->parse_file( "csdn.html");
## Get the blog category URLs: find the <a> tags and read their href attribute
@Links = $tree->find_by_tag_name('a');
print %{$Links[0]};
print "\n";
node2:/root/pachong#perl test.pl
HTML::Element=HASH(0x24ad1f8)
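The element is returned as an object (a blessed hash reference). Dereferencing it as a hash, as print %{$Links[0]} does, shows its keys and values: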
node2:/root/pachong#perl test.pl
onclick_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); _taga_contentARRAY(0x179c718)href/zhaoyangjian724/article/category/1756569_parentHTML::Element=HASH(0x1b2e2d0)
node2:/root/pachong#
@Links = $tree->find_by_tag_name('a'); returns the list of all <a> elements at or under $tree.
attr
$value = $h->attr('attr');
$old_value = $h->attr('attr', $new_value);
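Returns (and optionally sets) the value of the given attribute of $h. The attribute name (but not the value, if one is provided) is forced to lowercase.
If the attribute being read does not exist on this element, the return value is undef. When a new value is set, the old value is returned.
If a method is provided for accessing an attribute (such as $h->tag for "_tag", $h->content_list, etc.),
use that method instead of $h->attr, whether for reading or for setting.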
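The behaviour described above can be sketched as follows. This is a minimal example that assumes HTML::TreeBuilder is installed; the one-line markup and the replacement value '/new/path' are hypothetical and only serve the illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical one-link document
my $tree = HTML::TreeBuilder->new_from_content(
    '<a href="/zhaoyangjian724/article/category/1756569">Oracle dump</a>'
);
my $link = $tree->find_by_tag_name('a');

# Read an attribute that exists
my $href = $link->attr('href');

# Read an attribute that does not exist: undef
my $missing = $link->attr('title');

# Set a new value; the old value is returned ('/new/path' is illustrative)
my $old = $link->attr('href', '/new/path');

# The attribute name is lowercased, so 'HREF' reads the same attribute
my $upper = $link->attr('HREF');

# Prefer the dedicated accessors over attr('_tag') / attr('_content')
my $tag     = $link->tag;            # tag name
my @content = $link->content_list;   # child nodes (text or elements)
my $text    = $link->as_text;        # concatenated text content

print "href=$href old=$old now=$upper tag=$tag text=$text\n";
print "title is undefined\n" unless defined $missing;
```

Using tag, content_list and as_text keeps the code independent of the internal "_tag" and "_content" keys.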
$href = $_->attr('href');
This reads the value of the corresponding attribute. Data::Dumper output of the element (truncated):
$VAR1 = bless( {
'onclick' => '_gaq.push([\'_trackEvent\',\'function\', \'onclick\', \'blog_articles_wenzhangfenlei\']); ',
'href' => '/zhaoyangjian724/article/category/1756569',
'_content' => [
'Oracle dump解析'
],
Here '_content' is an array reference.
node2:/root/pachong#cat test.pl
use LWP::UserAgent;
use POSIX;
use HTML::TreeBuilder::XPath;
use Encode;
use HTML::TreeBuilder;
use Data::Dumper;
my $ua = LWP::UserAgent->new;
$ua->timeout(10);
$ua->env_proxy;
$ua->agent("Mozilla/8.0");
use HTML::TreeBuilder::XPath;
my $tree= HTML::TreeBuilder::XPath->new;
$tree->parse_file( "csdn.html");
## Get the blog category URLs: find the <a> tags and read their href attribute
@Links = $tree->find_by_tag_name('a');
#print Dumper($Links[0]);
print "\n";
print "--------------------\n";
print @{$Links[0]->{'_content'}};
#print "\n";
print "--------------------\n";
node2:/root/pachong#perl test.pl
--------------------
Oracle dump解析--------------------
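To read the value stored under '_content' through the API rather than by reaching into the hash,
use $href = $_->attr('_content'); it returns the same array reference (the documentation recommends $h->content_list for this purpose).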
use HTML::TreeBuilder::XPath;
my $tree= HTML::TreeBuilder::XPath->new;
$tree->parse_file( "csdn.html");
## Get the blog category URLs: find the <a> tags and read their href attribute
@Links = $tree->find_by_tag_name('a');
#print Dumper($Links[0]);
print "--------------------\n";
print $Links[0]->attr('_content');
print "\n";
print @{$Links[0]->attr('_content')};
print "\n";
print "--------------------\n";
node2:/root/pachong#perl test.pl
--------------------
ARRAY(0x20276b8)
Oracle dump解析
--------------------
<ul class="panel_body">
<li>
<a href="/zhaoyangjian724/article/category/1756569" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">Oracle dump解析</a><span>(20)</span>
</li>
<li>
<a href="/zhaoyangjian724/article/category/1756685" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">sql 查询优化</a><span>(159)</span>
</li>
Look up the value of an attribute by finding the <a> tags.
The href attribute of an <a> tag specifies the URL of the link target.
Reading href:
use HTML::TreeBuilder::XPath;
my $tree= HTML::TreeBuilder::XPath->new;
$tree->parse_file( "csdn.html");
## Get the blog category URLs: find the <a> tags and read their href attribute
@Links = $tree->find_by_tag_name('a');
#print Dumper($Links[0]);
print "--------------------\n";
print $Links[0]->{'href'};
print "\n";
print $Links[0]->attr('href');
print "\n";
print "--------------------\n";
node2:/root/pachong#perl test.pl
--------------------
/zhaoyangjian724/article/category/1756569
/zhaoyangjian724/article/category/1756569
--------------------
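The scripts above only inspect $Links[0]. A minimal sketch of collecting every category link might look like the following; it assumes HTML::TreeBuilder is installed, uses a trimmed, hypothetical inline copy of the sample (ASCII link text plus one anchor without href), and the @categories structure is an illustrative choice, not part of the original script.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical inline sample standing in for csdn.html
my $html = <<'HTML';
<a name="top"></a>
<ul class="panel_body">
<li><a href="/zhaoyangjian724/article/category/1756569">Oracle dump</a><span>(20)</span></li>
<li><a href="/zhaoyangjian724/article/category/1756685">sql</a><span>(159)</span></li>
</ul>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

my @categories;
for my $link ($tree->find_by_tag_name('a')) {
    my $href = $link->attr('href');
    next unless defined $href;    # skip anchors that have no href
    push @categories, { url => $href, name => $link->as_text };
}

# One line per category: name, then URL
print "$_->{name}\t$_->{url}\n" for @categories;
```

Checking defined $href matters because attr returns undef for elements that lack the attribute, such as named anchors.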
Perl: reading the value of the <a> tag's href attribute, which specifies the URL of the link target