OPML in FreshRSS
FreshRSS supports the OPML format to export and import lists of RSS/Atom feeds in a standard way, compatible with several other RSS aggregators.
However, FreshRSS also supports several additional features not covered by the basic OPML specification. Luckily, the OPML specification allows extensions:
An OPML file may contain elements and attributes not described on this page, only if those elements are defined in a namespace.
and:
OPML can also be extended by the addition of new values for the type attribute.
FreshRSS OPML extension
FreshRSS uses the XML namespace https://freshrss.org/opml to export/import extended information not covered by the basic OPML specification.
The list of the custom FreshRSS attributes can be seen in the source code, and here is an overview:
HTML+XPath or XML+XPath
<outline type="HTML+XPath" ...
: Additional type of source, which is not RSS/Atom, but HTML Web Scraping using XPath 1.0.
ℹ️ XPath 1.0 is a standard query language, which FreshRSS supports to enable Web scraping.
<outline type="XML+XPath" ...
: Same as HTML+XPath but using an XML parser.
The following attributes use naming conventions similar to those of RSS-Bridge.
frss:xPathItem
: XPath expression for extracting the feed items from the source page.
- Example: //div[@class="news-item"]
frss:xPathItemTitle
: XPath expression for extracting the item’s title from the item context.
- Example: descendant::h2
frss:xPathItemContent
: XPath expression for extracting an item’s content from the item context.
- Example: .
frss:xPathItemUri
: XPath expression for extracting an item link from the item context.
- Example: descendant::a/@href
frss:xPathItemAuthor
: XPath expression for extracting an item author from the item context.
- Example: "Anonymous"
frss:xPathItemTimestamp
: XPath expression for extracting an item timestamp from the item context. The result will be parsed by strtotime().
frss:xPathItemTimeFormat
: Date/Time format to parse the timestamp, according to DateTime::createFromFormat().
frss:xPathItemThumbnail
: XPath expression for extracting an item’s thumbnail (image) URL from the item context.
- Example: descendant::img/@src
frss:xPathItemCategories
: XPath expression for extracting a list of categories (tags) from the item context.
frss:xPathItemUid
: XPath expression for extracting an item’s unique ID from the item context. If left empty, a hash is computed automatically.
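The split between the item expression and the per-item expressions can be sketched in Python. This is only an illustration, not how FreshRSS evaluates XPath. The standard library's ElementTree supports a limited XPath subset (no `descendant::` axis, no attribute steps such as `/@href`), so the expressions are adapted. The sample page and the `scrape()` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (an assumption for this sketch).
PAGE = """
<html><body>
  <div class="news-item"><h2>First</h2><a href="/a">Read</a><img src="/a.png"/></div>
  <div class="news-item"><h2>Second</h2><a href="/b">Read</a><img src="/b.png"/></div>
</body></html>
"""

def scrape(page):
    """Return one dict per item found in the page."""
    root = ET.fromstring(page)
    items = []
    # Counterpart of frss:xPathItem = //div[@class="news-item"]
    for node in root.findall(".//div[@class='news-item']"):
        # Every lookup below is relative to the item node (the "item context").
        items.append({
            # Counterpart of frss:xPathItemTitle = descendant::h2
            "title": node.findtext(".//h2"),
            # Counterpart of frss:xPathItemUri = descendant::a/@href
            "uri": node.find(".//a").get("href"),
            # Counterpart of frss:xPathItemThumbnail = descendant::img/@src
            "thumbnail": node.find(".//img").get("src"),
        })
    return items

items = scrape(PAGE)
for item in items:
    print(item)
```

The point of the sketch is that only the first expression is evaluated against the whole page; all others are evaluated once per matched item.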
JSON+DotNotation
<outline type="JSON+DotNotation" ...
: Similar to HTML+XPath but for JSON and using a dot/bracket syntax such as object.object.array[2].property.
frss:jsonItem
: JSON dot notation for extracting the feed items from the source page.
- Example: data.items
frss:jsonItemTitle
: JSON dot notation for extracting the item’s title from the item context.
- Example: meta.title
frss:jsonItemContent
: JSON dot notation for extracting an item’s content from the item context.
- Example: content
frss:jsonItemUri
: JSON dot notation for extracting an item link from the item context.
- Example: meta.links[0]
frss:jsonItemAuthor
: JSON dot notation for extracting an item author from the item context.
frss:jsonItemTimestamp
: JSON dot notation for extracting an item timestamp from the item context. The result will be parsed by strtotime().
frss:jsonItemTimeFormat
: Date/Time format to parse the timestamp, according to DateTime::createFromFormat().
frss:jsonItemThumbnail
: JSON dot notation for extracting an item’s thumbnail (image) URL from the item context.
frss:jsonItemCategories
: JSON dot notation for extracting a list of categories (tags) from the item context.
frss:jsonItemUid
: JSON dot notation for extracting an item’s unique ID from the item context. If left empty, a hash is computed automatically.
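To make the dot/bracket syntax concrete, here is a minimal Python sketch of a path resolver. It is an assumption about how such paths can be interpreted, not FreshRSS's actual parser. The `resolve()` function and the sample document are hypothetical.

```python
import re

# A path segment is either a key (no dots or brackets) or a bracketed index.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")

def resolve(data, path):
    """Follow a path such as 'meta.links[0]'; return None if any step is missing."""
    current = data
    for key, index in _TOKEN.findall(path):
        try:
            current = current[int(index)] if index else current[key]
        except (KeyError, IndexError, TypeError):
            return None
    return current

# Hypothetical JSON document, already decoded.
DOC = {"data": {"items": [
    {"meta": {"title": "Hello", "links": ["https://example.net/1"]},
     "content": "<p>Hi</p>"},
]}}

items = resolve(DOC, "data.items")           # counterpart of frss:jsonItem
first = items[0]
title = resolve(first, "meta.title")         # counterpart of frss:jsonItemTitle
link = resolve(first, "meta.links[0]")       # counterpart of frss:jsonItemUri
missing = resolve(first, "meta.missing")     # unknown key gives None
```

As with XPath, the item path is resolved against the whole document, and the other paths are resolved against each item.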
JSON Feed
<outline type="JSONFeed" ...
: Uses JSON+DotNotation behind the scenes to parse a JSON Feed.
HTML+XPath+JSON
<outline type="HTML+XPath+JSON+DotNotation" frss:xPathToJson="..." ...
: Same as JSON+DotNotation, but first extracting the JSON string from an HTML document using an XPath expression.
- Example: //script[@type='application/json']
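The two-step approach (locate the JSON string with XPath, then decode it) can be sketched as follows. This is an illustration under assumptions: the page is hypothetical and well-formed enough for Python's ElementTree, which needs the relative form `.//script[...]` instead of `//script[...]`. The `extract_json()` helper is not part of FreshRSS.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical page embedding its data as JSON inside a <script> element.
PAGE = """<html><head>
<script type="application/json">{"data": {"items": [{"title": "Hello"}]}}</script>
</head><body/></html>"""

def extract_json(page, xpath=".//script[@type='application/json']"):
    """Find the first element matching xpath and decode its text as JSON."""
    node = ET.fromstring(page).find(xpath)
    if node is None or node.text is None:
        return None
    return json.loads(node.text)

payload = extract_json(PAGE)
print(payload["data"]["items"][0]["title"])
```

Once the JSON is decoded, the frss:jsonItem* paths apply to it exactly as in the JSON+DotNotation case.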
cURL
A number of cURL options are supported:
frss:CURLOPT_COOKIE
frss:CURLOPT_COOKIEFILE
frss:CURLOPT_FOLLOWLOCATION
frss:CURLOPT_HTTPHEADER
frss:CURLOPT_MAXREDIRS
frss:CURLOPT_POST
frss:CURLOPT_POSTFIELDS
frss:CURLOPT_PROXY
frss:CURLOPT_PROXYTYPE
frss:CURLOPT_USERAGENT
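As a rough sketch of how such attributes sit on an outline, the snippet below builds one with Python's ElementTree. The feed URL and the attribute values are purely illustrative assumptions, and the exact value format FreshRSS expects for each option is not covered here.

```python
import xml.etree.ElementTree as ET

FRSS = "https://freshrss.org/opml"
# Ask ElementTree to serialise this namespace with the "frss" prefix.
ET.register_namespace("frss", FRSS)

outline = ET.Element("outline", {
    "text": "Example",
    "type": "rss",
    "xmlUrl": "https://www.example.net/feed.xml",      # hypothetical feed
    f"{{{FRSS}}}CURLOPT_USERAGENT": "ExampleReader/1.0",  # illustrative value
    f"{{{FRSS}}}CURLOPT_COOKIE": "session=abc123",        # illustrative value
})

xml_text = ET.tostring(outline, encoding="unicode")
print(xml_text)
```

The namespace declaration is emitted on the serialised element, so the fragment stays valid when embedded in an OPML body.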
Miscellaneous
frss:cssFullContent
: CSS Selector to enable the download and extraction of the matching HTML section of each article’s Web address.
- Example: div.main, .summary
frss:cssFullContentFilter
: CSS Selector to remove the matching HTML elements from the full content retrieved by frss:cssFullContent.
- Example: .footer, .aside
frss:filtersActionRead
: List (separated by a new line) of search queries to automatically mark a new article as read.
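Because frss:filtersActionRead is newline-separated, the newline has to survive XML parsing. The sketch below (Python, with hypothetical queries and a hypothetical `read_filters()` helper) shows that a newline written as the character reference `&#10;` is preserved, whereas a literal line break inside an attribute is normalised to a space by XML parsers.

```python
import xml.etree.ElementTree as ET

FRSS = "https://freshrss.org/opml"

# Two search queries, one per line, with the newline written as &#10;.
ESCAPED = ('<outline xmlns:frss="https://freshrss.org/opml" '
           'frss:filtersActionRead="intitle:sponsored&#10;intitle:advertisement"/>')
# Same attribute, but with a literal line break inside the value.
LITERAL = ESCAPED.replace("&#10;", "\n")

def read_filters(xml_text):
    """Return the list of non-empty queries stored in frss:filtersActionRead."""
    value = ET.fromstring(xml_text).get(f"{{{FRSS}}}filtersActionRead", "")
    return [line for line in value.split("\n") if line.strip()]

escaped_filters = read_filters(ESCAPED)   # two separate queries
literal_filters = read_filters(LITERAL)   # collapsed into a single query
```

When writing OPML by hand, using `&#10;` between queries avoids the two lines being merged into one search query.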
Dynamic OPML (reading lists)
frss:opmlUrl
: If non-empty, indicates that this outline (category) should be dynamically populated from a remote OPML at the specified URL.
Example
<?xml version="1.0" encoding="UTF-8"?>
<opml version="2.0">
  <head>
    <title>FreshRSS OPML extension example</title>
  </head>
  <body>
    <outline xmlns:frss="https://freshrss.org/opml"
      text="Example"
      type="HTML+XPath"
      xmlUrl="https://www.example.net/page.html"
      htmlUrl="https://www.example.net/page.html"
      description="Example of Web scraping"
      frss:xPathItem="//a[contains(@href, '/interesting/')]/ancestor::article"
      frss:xPathItemTitle="descendant::h2"
      frss:xPathItemContent="."
      frss:xPathItemUri="descendant::a[string-length(@href)>0]/@href"
      frss:xPathItemThumbnail="descendant::img/@src"
      frss:cssFullContent="article"
      frss:filtersActionRead="intitle:⚡️ OR intitle:🔥 something"
    />
  </body>
</opml>
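For readers who want to inspect such a file from another tool, here is a minimal Python sketch that reads the namespaced attributes back. It uses a shortened, hypothetical OPML string, and the `frss_attributes()` helper is an assumption of this sketch, not a FreshRSS API.

```python
import xml.etree.ElementTree as ET

FRSS = "https://freshrss.org/opml"

# Shortened, hypothetical OPML document using the FreshRSS namespace.
OPML = """<opml version="2.0">
  <body>
    <outline xmlns:frss="https://freshrss.org/opml" text="Example" type="HTML+XPath"
      xmlUrl="https://www.example.net/page.html"
      frss:xPathItem="//article" frss:xPathItemTitle="descendant::h2"/>
  </body>
</opml>"""

def frss_attributes(outline):
    """Return the FreshRSS-namespaced attributes of an outline, without the prefix."""
    prefix = f"{{{FRSS}}}"  # ElementTree stores qualified names as {namespace}local
    return {name[len(prefix):]: value
            for name, value in outline.attrib.items()
            if name.startswith(prefix)}

root = ET.fromstring(OPML)
for outline in root.iter("outline"):
    print(outline.get("type"), outline.get("xmlUrl"), frss_attributes(outline))
```

Matching on the namespace URI instead of the `frss:` prefix keeps the lookup working even if a file declares the namespace under a different prefix.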