惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

S
Secure Thoughts
Security Latest
Security Latest
Simon Willison's Weblog
Simon Willison's Weblog
O
OpenAI News
GbyAI
GbyAI
L
LINUX DO - 最新话题
A
Arctic Wolf
T
Tor Project blog
G
GRAHAM CLULEY
I
InfoQ
博客园_首页
IT之家
IT之家
The Register - Security
The Register - Security
Exploit-DB.com RSS Feed
Exploit-DB.com RSS Feed
P
Proofpoint News Feed
The GitHub Blog
The GitHub Blog
Blog — PlanetScale
Blog — PlanetScale
N
Netflix TechBlog - Medium
K
Kaspersky official blog
博客园 - 三生石上(FineUI控件)
S
SegmentFault 最新的问题
U
Unit 42
PCI Perspectives
PCI Perspectives
量子位
P
Palo Alto Networks Blog
S
Securelist
T
Troy Hunt's Blog
博客园 - 【当耐特】
Recorded Future
Recorded Future
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
S
Security Affairs
Engineering at Meta
Engineering at Meta
T
The Blog of Author Tim Ferriss
博客园 - 聂微东
罗磊的独立博客
N
News and Events Feed by Topic
人人都是产品经理
人人都是产品经理
B
Blog RSS Feed
NISL@THU
NISL@THU
C
Cisco Blogs
T
Threatpost
有赞技术团队
有赞技术团队
Forbes - Security
Forbes - Security
Hugging Face - Blog
Hugging Face - Blog
Last Week in AI
Last Week in AI
T
The Exploit Database - CXSecurity.com
Cloudbric
Cloudbric
Cyberwarzone
Cyberwarzone
Google DeepMind News
Google DeepMind News
C
Cyber Attacks, Cyber Crime and Cyber Security

RSS Advisory Board

The RSS Advisory Board Just Turned 20 Downloading 50,000 Podcast Feeds to Analyze Their RSS Tara Calishain Explains: What is RSS? Be Unique And Use RSS Guid Like Everybody Else Has the RSS Advisory Board Followed the Roadmap? RSS Enclosure Support in Micro.Blog Atom Feed Format Was Born 20 Years Ago How to Read an RSS Feed with PHP Using SimplePie Every Mastodon User Has an RSS Feed RSS Enclosure Support in WordPress Where to Find the RSS Specification Should Feed Readers Count Unread Items? WordPress Uses RSS as Blog Export Format Yahoo Groups Dropped RSS Feed Support
How to Read an RSS Feed with Java Using XOM
2023-08-02 · via RSS Advisory Board
Cover of the O'Reilly book XML in a Nutshell, 3rd Edition, by Elliotte Rusty Harold and W. Scott Means. The cover features a black-and-white illustration of the neck and head of a peafowl bird with a crest of tipped feathers on his head.

There are a lot of libraries for processing XML data with Java that can be used to read RSS feeds. One of the best is the open source library XOM created by the computer book author Elliotte Rusty Harold.

As he wrote one of his 20 books about Java and XML, Harold got so frustrated with the available Java libraries for XML that he created his own. XOM, which stands for XML Object Model, was designed to be easy to learn while still being strict about XML, requiring documents that are well-formed and utilize namespaces in complete adherence to the specification. (At the RSS Advisory Board, talk of following a spec is our love language.)

XOM was introduced in 2002 and is currently up to version 1.3.9, though all versions have remained compatible since 1.0. To use XOM, download the class library in one of the packages available on the XOM homepage. You can avoid needing any further configuration by choosing one of the options that includes third-party JAR files in the download. This allows XOM to use an included SAX parser under the hood to process XML.

Here's Java code that loads items from The Guardian's RSS 2.0 feed containing articles by Ben Hammersley, displaying them as HTML output:

// create an XML builder and load the feed using a URL
Builder bob = new Builder();
Document doc = bob.build("https://www.theguardian.com/profile/benhammersley/rss");
// load the root element and channel
Element rss = doc.getRootElement();
Element channel = rss.getFirstChildElement("channel");
// load all items in the channel
Elements items = channel.getChildElements("item");
for (Element item : items) {
  // load elements of the item
  String title = item.getFirstChildElement("title").getValue();
  String author = item.getFirstChildElement("creator",
    "http://purl.org/dc/elements/1.1/").getValue();
  String description = item.getFirstChildElement("description").getValue();
  // display the output
  System.out.println("<h2>" + title + "</h2>");
  System.out.println("<p><b>By " + author + "</b></p>");
  System.out.println("<p>" + description + "</p>");
}

All of the classes used in this code are in the top-level package nu.xom, which has comprehensive JavaDoc describing their use. Like all Java code this is a little long-winded, but Harold's class names do a good job of explaining what they do. A Builder uses its build() method with a URL as the argument to load a feed into a Document over the web. There are also other build methods to load a feed from a file, reader, input stream, or string.

Elements can be retrieved by their names such as "title", "link" or "description". An element with only one child of a specific type can be retrieved using the getFirstChildElement() method with the name as the argument:

Element linkElement = item.getFirstChildElement("link");

An element containing multiple children of the same type uses getChildElements() instead:

Elements enclosures = item.getChildElements("enclosure"); if (enclosures.size() > 1) {
  System.out.println("I'm pretty sure an item should only include one enclosure");
}

If an element is in a namespace, there must be a second argument providing the namespace URI. Like many RSS feeds, the ones from The Guardian use a dc:creator element from Dublin Core to credit the item's author. That namespace has the URI "http://purl.org/dc/elements/1.1/".

If the element specified in getFirstChildElement() or getChild Elements() is not present, those methods return null. You may need to check for this when adapting the code to load other RSS feeds.

If the name Ben Hammersley sounds familiar, he coined the term "podcasting" in his February 2004 article for The Guardian about the new phenomenon of delivering audio files in RSS feeds.

No comments yet.

All comments are moderated before publication. These HTML tags are permitted: <p>, <b>, <i>, <a>, and <blockquote>.