DOMIT! RSS parser
Official site: http://www.phpclasses.org/browse/package/1767.html It supports RSS only, but it has a caching feature and is used by quite a few other systems.
PHP Universal Feed Parser
Official site: http://www.phpclasses.org/browse/package/4548.html Author's post: http://www.ajaxray.com/blog/2008/05/02/php-universal-feed-parser-lightweight-php-class-for-parsing-rss-and-atom-feeds/
It is based on Orchid Framework (http://code.google.com/p/orchidframework/) and XML Parser (http://www.php.net/xml) and supports RSS 1.0, RSS 2.0, and Atom. It is very simple to use, much like MagpieRSS, and you can use it to build your own backend database. I plan to use it to replace MagpieRSS and to build my own CMS.
MagpieRSS
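Official site: http://magpierss.sourceforge.net/
The site has detailed documentation and examples, and the library is easy to use.
Unfortunately, it has not been updated since 2005, and I ran into quite a few problems when using it, such as: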
Notice: Undefined property: MagpieRSS::$last_modified in
Undefined property: MagpieRSS::$etag
*************************************************************************************************
Some time ago, I wrote a blog portal in PHP that used MagpieRSS 0.72. Unfortunately, there are plenty of bugs in MagpieRSS, and it has no support for the newer Atom 1.0 format. MagpieRSS hasn’t been updated since November 2005, so I improved on it myself. I emailed my patches to Kellan, the maintainer of MagpieRSS, but I got no reply.
NOTE: I don’t use this code anymore, so don’t ask me for further fixes or improvements. I’m only publishing the code here, so that someone else may create a new project or take over the SourceForge project for MagpieRSS or something. It’s all open source, baby.
My improvements to the fetcher:
- Changed the name of the HTTP header from “If-Last-Modified” to the correct “If-Modified-Since” (this is very important, since your client will fetch feeds unnecessarily often otherwise)
- Made sure that the constant MAGPIE_CACHE_FRESH_ONLY actually works (it doesn’t in the original version)
- Allows you to fetch feeds with the POST method as well as GET
- Trimmed whitespace from the values of the “ETag” and “Last-Modified” headers
- Removed “Undefined property” warning messages for MagpieRSS::$etag and MagpieRSS::$last_modified
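
To illustrate why the header name and the trimming matter, here is a minimal sketch of how a fetcher might build a conditional GET from cached validators. It is written in Python for brevity and is not the MagpieRSS code; the function names `conditional_headers` and `is_not_modified` are hypothetical.

```python
def conditional_headers(etag=None, last_modified=None):
    """Build request headers for a conditional GET from cached validators.

    Both arguments are the raw values saved from an earlier response's
    "ETag" and "Last-Modified" headers (hypothetical cache fields).
    """
    headers = {}
    # Trim stray whitespace or line endings left over from header parsing.
    if etag and etag.strip():
        headers["If-None-Match"] = etag.strip()
    if last_modified and last_modified.strip():
        # The correct request header is If-Modified-Since;
        # "If-Last-Modified" is not a standard HTTP header.
        headers["If-Modified-Since"] = last_modified.strip()
    return headers


def is_not_modified(status_code):
    """A 304 response means the cached copy of the feed is still current."""
    return status_code == 304
```

If the server answers with status 304, the client can keep serving its cached copy instead of downloading and parsing the feed again.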
My improvements to the parser:
- Assumes version 1.0 if the format is Atom with no version
- Handles Atom links with different “rel” types better
- Reads the “subtitle” tag if Atom 1.0
- Reads the “published” and “updated” tags if Atom 1.0
- More stable when encountering illegal characters (I actually don’t remember what I meant by this!)
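
The Atom handling listed above can be sketched roughly as follows. This is an illustration in Python using the standard library's ElementTree, not the patched MagpieRSS parser; `parse_atom`, `pick_link`, and the sample feed are all made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def pick_link(element):
    """Return the href of the 'alternate' link, else the first link found."""
    links = element.findall(ATOM_NS + "link")
    for link in links:
        # In Atom 1.0 a link without a rel attribute counts as "alternate".
        if link.get("rel", "alternate") == "alternate":
            return link.get("href")
    return links[0].get("href") if links else None


def parse_atom(xml_text):
    """Parse an Atom document into a plain dict (illustrative only)."""
    root = ET.fromstring(xml_text)
    feed = {
        # Atom 1.0 feeds carry no version attribute, so default to "1.0".
        "version": root.get("version", "1.0"),
        "title": root.findtext(ATOM_NS + "title"),
        "subtitle": root.findtext(ATOM_NS + "subtitle"),
        "updated": root.findtext(ATOM_NS + "updated"),
        "link": pick_link(root),
        "entries": [],
    }
    for entry in root.findall(ATOM_NS + "entry"):
        feed["entries"].append({
            "title": entry.findtext(ATOM_NS + "title"),
            "link": pick_link(entry),
            "published": entry.findtext(ATOM_NS + "published"),
            "updated": entry.findtext(ATOM_NS + "updated"),
        })
    return feed


SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <subtitle>An example subtitle</subtitle>
  <updated>2005-11-05T19:43:31Z</updated>
  <link rel="self" href="http://example.org/feed.atom"/>
  <link rel="alternate" href="http://example.org/"/>
  <entry>
    <title>First post</title>
    <link href="http://example.org/first"/>
    <published>2005-11-01T08:00:00Z</published>
    <updated>2005-11-02T09:30:00Z</updated>
  </entry>
</feed>"""

feed = parse_atom(SAMPLE)
print(feed["link"], feed["entries"][0]["published"])
```

Picking the link by its “rel” value keeps the feed's own “self” URL from being mistaken for the site's home page.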
My improvements to the utils:
- Fixed bad regular expression to correctly match seconds in W3CDTF timestamps
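
For reference, a pattern that treats the seconds field as optional could look like the sketch below. It is in Python and is not the expression used in MagpieRSS; the name `parse_w3cdtf` is hypothetical, and the sketch assumes only the full date-and-time forms of W3CDTF need to be handled.

```python
import re
from datetime import datetime, timedelta, timezone

# YYYY-MM-DDThh:mm[:ss[.fraction]]TZD, where TZD is "Z" or +hh:mm / -hh:mm.
W3CDTF = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})"   # date
    r"T(\d{2}):(\d{2})"           # hours and minutes
    r"(?::(\d{2})(?:\.\d+)?)?"    # optional seconds, optional fraction
    r"(Z|[+-]\d{2}:\d{2})$"       # time zone designator
)


def parse_w3cdtf(text):
    """Return a timezone-aware datetime, or None if the text does not match."""
    match = W3CDTF.match(text.strip())
    if match is None:
        return None
    year, month, day, hour, minute = (int(g) for g in match.groups()[:5])
    second = int(match.group(6) or 0)  # seconds default to 0 when omitted
    tzd = match.group(7)
    if tzd == "Z":
        offset = timedelta(0)
    else:
        sign = 1 if tzd[0] == "+" else -1
        offset = sign * timedelta(hours=int(tzd[1:3]), minutes=int(tzd[4:6]))
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))
```

Fractional seconds are accepted by the pattern but discarded, which is usually enough for comparing feed timestamps.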
I also replaced the bundled version of Snoopy (1.0) with a newer version (1.2.3). There is in fact an even newer version out, so you should probably download that one. Put the file Snoopy.class.php in the extlib directory and rename it Snoopy.class.inc.
However, I don’t think I changed the version numbers or the changelog. But at least I fixed some bugs, right?