text-to-10
----------
This parser (TextSAXParser.java) and stylesheet (text-to-10.xsl) transform
a very simple (non-XML) text format into RSS 1.0.
The reverse transformation (10-to-text.xsl) allows round trips, as shown by the following example:
xml_com.in -> xml_com.txt -> xml_com.out3.
An example of such a file (see example1.txt):
---------------
# Example1.txt
channel= /news/
title= 4xt
description= News 4xt
image= /images/4xt.gif
title= 4xt
link=
textinput= /subrss.xml
title= Subscribe
description= Stay tuned ! Enter your email address to receive news4xt.
name= email
item= /news/000815-0001.xml
title= RSS 1.0 support
description= A set of stylesheets is available on 4xt to translate RSS 0.9 and 0.9.1
documents into the recently announced RSS 1.0 format, together with Java
classes aimed at facilitating the transformation.
item= /news/000730-0001.xml
title= 4xt internals part 1
description= XHTML layout documents, a technique widely used to build 4xt, explained on XML.com
under the title Style-free XSLT.
item= /news/000703-0001.xml
title= UX brings new Unix-like features to XT
description= Paul Tchistopolskii announced the release 0.2 of his UX environment, with cleaner
APIs, a string to node set conversion and the implementation of a grep command.
item= /news/000601-0001.xml
title= Maintenance 4xt
description= Moving ahead, 4xt is organizing the maintenance of XT.
item= /news/000529-0001.xml
title= HDML Output Handler
description= Khun Yee Fung has developped a HDML Output Handler.
--------------
The format follows very simple rules and reads almost like plain English.
A new element starts with a token followed by an equal sign (=), placed at
the very beginning of a line.
A new element closes the previous one (there is no hierarchy).
White space is normalized.
Comments can be added: they start with a # sign at the beginning of a line
and run to the end of that line.
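The rules above can be illustrated with a minimal Python sketch. It is not the TextSAXParser.java implementation, only a reading of the rules as stated here. The function name parse_flat_text is hypothetical, and the sketch assumes that a token is a run of word characters directly followed by "=", and that any other non-comment line continues the value of the previous element.

```python
import re

# Assumption: a token is a run of word characters directly followed by "=",
# with no leading white space (the rules say "at the very beginning of a line").
ELEMENT_START = re.compile(r"^(\w+)=(.*)$")


def parse_flat_text(text):
    """Return a flat list of (token, value) pairs; there is no hierarchy."""
    elements = []
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment: runs to the end of the line
        match = ELEMENT_START.match(line)
        if match:
            # A new element implicitly closes the previous one.
            elements.append([match.group(1), match.group(2)])
        elif elements:
            # Assumption: other lines continue the previous element's value.
            elements[-1][1] += " " + line
    # Normalize white space: trim and collapse runs into single spaces.
    return [(token, " ".join(value.split())) for token, value in elements]


sample = """# Example1.txt
channel= /news/
title= 4xt
description= Moving ahead, 4xt is organizing
the maintenance of XT.
link=
"""

for token, value in parse_flat_text(sample):
    print(token, "->", repr(value))
```

The result is a flat sequence of pairs. Grouping the pairs into channel, image, textinput and item blocks is left to a later step, which in this package is the job of the stylesheet.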
The TextSAXParser converts example1.txt into (example1.asxml):
--------------
Example1.txt/news/4xtNews 4xt/images/4xt.gif4xt/subrss.xmlSubscribeStay tuned ! Enter your email address to receive news4xt.email/news/000815-0001.xmlRSS 1.0 supportA set of stylesheets is available on 4xt to translate RSS 0.9 and 0.9.1 documents into the recently announced RSS 1.0 format, together with Java classes aimed at facilitating the transformation./news/000730-0001.xml4xt internals part 1XHTML layout documents, a technique widely used to build 4xt, explained on XML.com under the title Style-free XSLT./news/000703-0001.xmlUX brings new Unix-like features to XTPaul Tchistopolskii announced the release 0.2 of his UX environment, with cleaner APIs, a string to node set conversion and the implementation of a grep command./news/000601-0001.xmlMaintenance 4xtMoving ahead, 4xt is organizing the maintenance of XT./news/000529-0001.xmlHDML Output HandlerKhun Yee Fung has developped a HDML Output Handler.
----------------
A stylesheet then does the remaining work to produce (example1.rss01):
----------------
/news/
4xtNews 4xt/images/4xt.gif4xt
/subrss.xml
SubscribeStay tuned ! Enter your email address to receive news4xt.email
/news/000815-0001.xml
RSS 1.0 supportA set of stylesheets is available on 4xt to translate RSS 0.9 and 0.9.1 documents into the recently announced RSS 1.0 format, together with Java classes aimed at facilitating the transformation.
/news/000730-0001.xml
4xt internals part 1XHTML layout documents, a technique widely used to build 4xt, explained on XML.com under the title Style-free XSLT.
/news/000703-0001.xml
UX brings new Unix-like features to XTPaul Tchistopolskii announced the release 0.2 of his UX environment, with cleaner APIs, a string to node set conversion and the implementation of a grep command.
/news/000601-0001.xml
Maintenance 4xtMoving ahead, 4xt is organizing the maintenance of XT.
/news/000529-0001.xml
HDML Output HandlerKhun Yee Fung has developped a HDML Output Handler.
------------------
Extensions:
----------
For simple applications, the text file could be further simplified to:
------------------
channel= /news/
item= /news/000815-0001.xml
item= /news/000730-0001.xml
item= /news/000703-0001.xml
item= /news/000601-0001.xml
item= /news/000529-0001.xml
------------------
provided that we accept retrieving the title and description from the URLs themselves
(using the HTML META tags or extracting them from the page itself).
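As a rough illustration of this extension, the sketch below pulls a title and a description out of an HTML page using Python's standard html.parser module. It is not part of this package. The names TitleDescriptionExtractor and extract_title_and_description are hypothetical, the page is assumed to be already downloaded into a string, and the description is assumed to live in a META tag named "description".

```python
from html.parser import HTMLParser


class TitleDescriptionExtractor(HTMLParser):
    """Collects the <title> text and the META description of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            # Assumption: the description is carried by the "content" attribute.
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title_and_description(html):
    """Return (title, description) with white space normalized."""
    extractor = TitleDescriptionExtractor()
    extractor.feed(html)
    extractor.close()
    return (" ".join(extractor.title.split()),
            " ".join(extractor.description.split()))


page = """<html><head>
<title>RSS 1.0 support</title>
<meta name="description" content="Stylesheets to translate RSS 0.9 into RSS 1.0.">
</head><body></body></html>"""

print(extract_title_and_description(page))
```

Either value comes back as an empty string when the page does not provide it, so a caller can fall back to extracting text from the page body.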