RSS feeds from any site

I was reading, on the very good site downloadblog.it, about the new service FeedYes, which lets you “grab” information from any web page and organize it into a feed that follows the RSS specification, even if the site doesn’t offer a feed natively.

If you don’t know what an RSS feed is: it’s a page written to a common standard, which lets the data be displayed in a practically unlimited number of ways. These range from “Outlook”-style views (one news item per row, with the new ones in bold, and so on; the free program FeedReader does this well and also announces new items with small on-screen notifications, which is really useful) to playful “news ticker” displays, like the headlines scrolling along the bottom of the screen on CNN (RSSNewsTicker is a good example of this).
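
To give an idea of what such a page looks like inside, here is a minimal sketch in Python that builds a tiny RSS 2.0 document using only the standard library. The function name `build_rss` and the example.com addresses are made up for illustration; the element names (`channel`, `item`, `title`, `link`, `description`) are the ones RSS 2.0 uses.

```python
import xml.etree.ElementTree as ET


def build_rss(feed_title, feed_link, feed_description, items):
    """Return a minimal RSS 2.0 document as a string.

    `items` is a list of dicts with "title" and "link" keys and an
    optional "description" key (hypothetical structure, for illustration).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Every channel needs a title, a link and a description.
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    # One <item> per news entry.
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "description").text = entry.get("description", "")
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_rss(
        "Example forum",
        "http://example.com/forum",
        "Latest topics",
        [{"title": "First topic", "link": "http://example.com/forum/1"}],
    ))
```

Any feed reader that understands RSS can then decide on its own how to show those items, which is why the same feed works both in a mail-like list and in a ticker.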

I tried FeedYes. It’s really simple to use and semi-automatic: you just enter the address of the web page you want to grab information from, and the service captures every link, excluding those that aren’t “relevant” (e.g. if we give it a forum page, it will exclude the links to the individual pages of each topic, which are repetitions). It then asks whether we want to exclude more, such as menu links, other “fixed” elements, or simply parts of the page that don’t interest us. The final result is a bit questionable: FeedYes often makes mistakes and treats as news parts of the page that are outside the section we want. When I tried it with a forum, it often picked up links to author profiles, while I wanted only the links to the topics. We can say that it’s good, but precisely because it’s so automatic, it’s too imprecise.
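
I don’t know how FeedYes works internally, but the general idea of “take every link, then throw away the ones you don’t want” can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the names `LinkCollector` and `grab_links`, the sample HTML, and the `showtopic` / `showuser` / `st` URL parameters used as exclusion rules.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> we are currently inside, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def grab_links(html, exclude_substrings=()):
    """Return all links whose href contains none of `exclude_substrings`."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [
        (href, text)
        for href, text in parser.links
        if not any(part in href for part in exclude_substrings)
    ]


# Hypothetical forum snippet: a topic, a "page 2" link and an author profile.
SAMPLE = """
<a href="index.php?showtopic=10">First topic</a>
<a href="index.php?showtopic=10&amp;st=20">2</a>
<a href="index.php?showuser=7">alice</a>
"""

if __name__ == "__main__":
    # Drop author profiles and the extra pages of the same topic.
    for href, text in grab_links(SAMPLE, exclude_substrings=("showuser", "&st=")):
        print(href, text)
```

A filter this crude has the same weakness I ran into: unless the exclusion rules are chosen by hand, profile links and topic links look equally “interesting” to the program.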

If you want better results, have 15 minutes to spare for the configuration, and know the basics of HTML, I suggest Feed43, which is less automatic. It asks for the address to fetch the news from and then asks you to define a “pattern”, a search template applied to the page’s code to recognize the news items and to delimit the different fields within each one: the link, the title and the description (in other words, the subtitle). So, with a simple pattern like:

[Image: the Feed43 pattern]

I got a perfect RSS feed of the topic-list page of a forum based on Invision Board, in which every item takes the thread’s title as its title, links directly to the page with the messages, and uses the subtitle (if present) as its description.
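
To show roughly how this kind of pattern matching works, here is a small Python sketch. It is not Feed43’s code and not the pattern I actually used: the pattern, the sample HTML and the function names are all invented for illustration. It assumes a placeholder syntax where `{%}` marks a piece of text to keep and `{*}` marks a piece to skip, which is how I understand Feed43’s patterns to work.

```python
import re


def pattern_to_regex(pattern):
    """Turn a placeholder pattern into a compiled regular expression.

    Assumed syntax: "{%}" captures a field, "{*}" skips arbitrary text,
    everything else must match literally.
    """
    pieces = re.split(r"(\{%\}|\{\*\})", pattern)
    regex_parts = []
    for piece in pieces:
        if piece == "{%}":
            regex_parts.append("(.*?)")   # keep this text (shortest match)
        elif piece == "{*}":
            regex_parts.append(".*?")     # skip this text (shortest match)
        else:
            regex_parts.append(re.escape(piece))
    return re.compile("".join(regex_parts), re.DOTALL)


def extract_items(html, pattern, fields=("link", "title", "description")):
    """Apply the pattern repeatedly and return one dict per match."""
    regex = pattern_to_regex(pattern)
    return [
        dict(zip(fields, (group.strip() for group in match.groups())))
        for match in regex.finditer(html)
    ]


# Hypothetical pattern and page fragment, not taken from a real forum.
PATTERN = '<a href="{%}" class="topic">{%}</a>{*}<span class="desc">{%}</span>'
HTML = """
<td><a href="index.php?showtopic=1" class="topic">Hello</a><br /><span class="desc">First thread</span></td>
<td><a href="index.php?showtopic=2" class="topic">News</a><br /><span class="desc"></span></td>
"""

if __name__ == "__main__":
    for item in extract_items(HTML, PATTERN):
        print(item)
```

The second row has an empty subtitle, so its description simply comes out empty, which matches the “if present” behaviour described above. The price of this precision is that the pattern is tied to the page’s markup: if the forum changes its template, the pattern has to be rewritten.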

EDIT: Feed43 is a free service, although they plan to launch a “premium” version in the future. The free version updates feeds every 6 hours.
