#137: Improved Inbound Feed Syndication

Contents
  1. Definitions
  2. Motivation
  3. Proposal
by Jonah Bossewitch last modified March 21, 2007 - 00:58
Consolidate on a single, intuitive, inbound feed syndication UI and paradigm
Proposed by
Jonah Bossewitch
Seconded by
Nate Aune
Proposal type
User interface
State
being-discussed

Definitions

Existing inbound syndication Products:

Motivation

In the mashed-up, distributed, decoupled web 2.0 world, RSS is arguably functioning as the duct tape of the web. More and more services are exposing their data over RSS, and we should have better tools to integrate this content into Plone sites.

While RSS is only used explicitly by < 4% of internet users, it is becoming a very important tool for developers, perhaps as important as CGI was in its day.

We need look no further than Planet Plone to see how improved inbound feed syndication could serve the Plone platform. In fact, multiple inbound feed solutions exist in the Plone world - we need to consolidate these tools and offer a single, powerful inbound aggregation story.


In an environment where live searches and REST APIs are exposing their results as RSS, entire applications can now be stitched together from a hodge-podge of backends. This is more than automatically updated news feeds. We can imagine backing a photo album with Flickr itself, or distributed research with a tool like delicious. We can imagine collaborative content contribution models more similar to darcs than svn.


Other CMS platforms serving similar constituencies have powerful inbound RSS features built into the core - not in itself a good reason to follow, but something to track and consider: the Drupal aggregator.


See the "Subscribing to RSS feeds from Plone" section from the Snow Sprint's topic page for more details http://plone.org/events/sprints/snow-sprint3/syndication

Proposal

In many respects the hardest issue to contend with when it comes to inbound RSS syndication is a UI one.

I propose to extend the 'smart folder' metaphor and create 'remote smart folders' (in spite of my recent questioning of the entire folder metaphor, I imagine it will still be around and useful for a while).

These folders can be backed by any of our syndication tools. They will allow content creators to define inbound RSS feeds within the Plone UI - not as portlets, but as first-class content.

Remote smart folders can work like Bloglines folders. They can aggregate their sub-folders' feeds into a single feed. The top-level folders can define aggregation and filtering policies.
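
The aggregation idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not an existing Plone API: the names FeedItem, RemoteSmartFolder, accept and aggregate are hypothetical, and the items are held in memory here, whereas a real folder would fetch them from its remote feeds.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class FeedItem:
    """Hypothetical minimal feed entry."""
    title: str
    link: str
    published: datetime


@dataclass
class RemoteSmartFolder:
    """Hypothetical folder: its own feed items plus nested sub-folders."""
    name: str
    items: List[FeedItem] = field(default_factory=list)
    subfolders: List["RemoteSmartFolder"] = field(default_factory=list)
    # Filtering policy for this folder; None means "accept everything".
    accept: Optional[Callable[[FeedItem], bool]] = None

    def aggregate(self) -> List[FeedItem]:
        """Merge this folder's items with all sub-folder items, newest first."""
        merged = list(self.items)
        for sub in self.subfolders:
            merged.extend(sub.aggregate())
        if self.accept is not None:
            merged = [item for item in merged if self.accept(item)]
        # Sort newest first, then drop items whose link was already seen.
        seen = set()
        unique = []
        for item in sorted(merged, key=lambda i: i.published, reverse=True):
            if item.link not in seen:
                seen.add(item.link)
                unique.append(item)
        return unique
```

A top-level folder with a filter such as accept=lambda item: "spam" not in item.title would then expose one merged, de-duplicated feed for everything beneath it, without any of the items being stored as content.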

I think that one of the issues that has held up RSS syndication is the question of whether or not syndicated content ever needs to end up in the ZODB. I don't think it does, if you have good enough control over filters so that moderation can happen differently. Also consider that some of the applications powering these feeds might also allow for control of this content, so that workflows can be accomplished by controlling an account on the 'provider' end.

Search integration is something that needs to be considered, but in my mind what needs to be resolved is a firm commitment to the importance of inbound RSS syndication and a good UI metaphor for pulling it into Plone.


+1

Posted by Jon Stahl at April 10, 2006 - 02:24
I couldn't possibly agree more on the importance of Plone supporting inbound RSS better.

I really like the idea of an "RSS Smart Folder" that doesn't store content in the ZODB.

However, I also think that it is important to have good tools that DO store content in the ZODB for editing/moderation. (PloneRSS is the best thing we currently have.) Most of the use-cases I find are ones where the Plone site doesn't have control over or necessarily completely trust the RSS source.

For example, I have clients that would like to auto-populate their press rooms with the results of a Google or Yahoo news search, but those searches are not 100% reliable, and so the groups would need the ability to delete irrelevant items.

While you can argue that folks could always use an external "reblogging" tool, it seems to me that RSS items are content and that Plone is about managing content. Besides, PloneRSS already gets us about 90% of the way there in terms of RSS items-as-content-objects.

+1

Posted by Rob Oliver at April 10, 2006 - 23:48

For me this is a basic requirement of a modern CMS nowadays. Before adding new features to Plone I think it's important that Plone brings its existing infrastructure up to date. An RSS aggregator (together with more up-to-date RSS syndication, i.e. RSS2/Atom support out of the box) is sorely missing and could also provide a simple but powerful means to share/export content between Plone sites for site admins.

Why wait?

Posted by Martin Aspeli at April 11, 2006 - 11:14
Why wait for this? The aggregate folder idea could be built today using Archetypes and whatever Python RSS parsing infrastructure is most commonly accepted (I don't see the need to go down a Zope/CMF/Plone specific route, all you need is to parse the inbound RSS and push it to a page template). I don't see much of a reason why this would need to be in the core, though if it matures in time we may consider shipping with it.

If you can see additional points of integration beyond what would go in a specific content type, it'd be good to bring them up here.
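
The parse-and-render approach described above needs very little code. A minimal sketch, assuming RSS 2.0 input and using only the Python standard library rather than any particular Plone product; the function names and the sample feed are hypothetical, and the string formatting stands in for a real page template:

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample feed; in practice this text would be fetched over HTTP.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.org/1</link></item>
    <item><title>Second &amp; last</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return a list of {'title', 'link'} dicts from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


def render_items(items):
    """Render parsed items as an HTML list, escaping all feed-supplied text."""
    lines = ["<ul>"]
    for entry in items:
        lines.append('  <li><a href="%s">%s</a></li>' % (
            escape(entry["link"], quote=True),
            escape(entry["title"]),
        ))
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_items(parse_rss_items(SAMPLE_RSS)))
```

Escaping matters here because the feed comes from a source the site does not control.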

RE why wait?

Posted by Rob Oliver at April 12, 2006 - 00:19

Personally I think inbound RSS aggregation is just as much core as RSS syndication. As mentioned above this is content management.

To core or not to core

Posted by Martin Aspeli at April 12, 2006 - 07:24
People get so obsessed by things having to be "core". It's irrelevant. There is a technical decision about whether we bloat CMFPlone-the-product with a lot of features, or whether we make things more modular, and therefore more re-usable, configurable and flexible.

What you need to do is to talk to the people who have done RSS products already. There was work started around basesyndication and fatsyndication, and some work by Nate et al. at the snow sprint. These are the guys who know the most about this. You need to come up with a strategy for consolidation. That doesn't mean that the release manager or the guy that normally codes up the member data implementation needs to push that.

When you have a design that works, by all means, discuss it on plone-dev, bring some excitement, let it be known that you intend to solve this once and for all. But make it a *product*. If it *needs* to be in Plone core to integrate properly, that probably means we need to make better UI hooks in Plone itself, so that not only your product can use them, but other products too. In this regard, scrunching things into the core and hard-wiring them in is detrimental to the quality of Plone as a whole, as a constellation of products.

Now, here's the point: If your product is good, and if you stay in touch with the rest of the team, manage the release cycle and show commitment, then we may just ship with your product. To the user, there's no difference, but technically, it's better, and organisationally it makes it easier for you to get started and less likely that the rest of Plone will get dragged down if/when the people interested in this suddenly lose interest.

So make it work - experiment, consolidate, communicate. This is the way open source works: if your product is good, it will survive on its own merits. If it's unmaintained, buggy or messy, it will die.

thoughts

Posted by Justin Ryan at March 20, 2007 - 07:19
I agree somewhat that it is important to go out and stake some ground and try to get things done, but it often is important that things become core. It's good for the community when a problem is sorta-kinda solved several times and product x works with plone version y and product z works with plone version y.5 and you have to redo things and yadda yadda.

There comes a point, I think, when it's a burden on the Plone community and a barrier to new users that something isn't in core. And, perhaps there needs to be an extended core, or a few addons like p4a which are developed more in-step.

I don't mean to be an anti-darwinist by any means, but some software may die because of the circumstance of funding, the coincidence even. We are fond in the community of saying that X or Y will happen, yanno, as soon as someone wants it enough to fund it.

But, sometimes it's difficult for momentum to build properly. Look at Calendaring products - they are so fragmented that most people want at least one distinctive feature each from more than one package and it just hasn't been practical to combine them. A lot of them should split out to more focused responsibilities, but for lack of that it could have been nice at some point for someone to say okay, damnit, this is becoming part of ATCT and we're all going to agree on how it works.

Re: Why wait?

Posted by Jonah Bossewitch at April 12, 2006 - 00:32

The issue is there are too many options. I remember when I first encountered Zope there were always at least 3 different products to solve any one task, and rarely were any of them complete. This particular problem is important enough that it just needs a little bit of leadership and TLC to land, and land well.

BTW - RSS-to-HTML transformation is a great candidate for microapp treatment. People might be amused to consider how you can even nowadays do something like this in pure JS (against a server-side web service): feedsplitter

The advantage of tight Plone integration is with things like search, TTW control over caching and flushing, perhaps administrative filtering, or even moderation. Doing this well could be a fun little project.

/jsb

Options

Posted by Martin Aspeli at April 12, 2006 - 07:29
Jonah - the "too many options" thing is a good thing and a bad thing, a byproduct of the open source development process. There is a need to make it easier to discover the best-of-breed products, which we are (slowly) working on. You're absolutely right, it needs leadership. But putting something in CMFPlone itself is not a magic bullet that guarantees it will get led. All that will happen is that the core becomes so big all the parts will suffer equally. You seem to know a lot more about this than me, for example, why don't you lead it? :)

Search, TTW control, moderation ... all of that can be done with a separate product, which, as I say, could very well be "blessed" and bundled with Plone. But I'd rather see running code that works than vague ideas for something completely different that partially overlaps with existing implementations labeled "implement this in Plone core please".

Martin

GData API

Posted by Rob Oliver at June 8, 2006 - 12:53

Support for the Google GData API would be very worthwhile in this area. One of Drupal's SOC projects is to integrate GData with Drupal.

Drupal SOC - http://drupal.org/node/60490
GData API - http://code.google.com/apis/gdata/overview.html

Quick excerpt from Google & Drupal -

The Google data APIs ("GData" for short) provide a simple standard protocol for reading and writing data on the web. GData combines common XML-based syndication formats (Atom and RSS) with a feed-publishing system based on the Atom publishing protocol, plus some extensions for handling queries.

This API will generate syndication similar to the RSS and Atom feeds but will add functionality for queries, updates, optimistic concurrency (versioning), and authentication. This will enable outside clients to access information on a Drupal site and modify it.

rob

GData / S3 / Lucene / XPath / etc..

Posted by Justin Ryan at March 20, 2007 - 07:12
Whenever people bring up GData I try to mention also that similar efforts are important. For instance, the Lucene catalog is able to outperform ZCatalog running over XML-RPC for millions of objects, though I don't know if it is faster for smaller datasets. Amazon S3 can be amazingly powerful to support because it would enable redundant, geo-load-balanced clusters of Plone to run at very low cost.

Then we can even dip into things like pure XML databases, FourSuite, anything that talks XPath, etc. There's a world of external data storage protocols. :)

Maybe this will also have some relation to the RDF plip? I'm starting to really think so.

two new products: ClearRSS and feedfeeder

Posted by Nate Aune at March 21, 2007 - 01:02
Since this PLIP was written, there have been two products released for Plone which provide inbound RSS. ClearRSS by Andy McKay is an RSS news feed parser written in Ajax that uses the AjaxProxy to pull RSS feeds from remote sites and display them in your site. http://www.agmweb.ca/blog/andy/1841/

feedfeeder by Rocky Burt pulls in remote Atom or RSS feeds and creates first-class Plone content. This means that the content pulled in via RSS is searchable using Plone's search tool, and the content can be workflowed. http://plone.org/products/feedfeeder
