Since I have managed to end up on Planet PHP thanks to Henri Bergius, I think it is a good time to make another post on some of the features that are required by robust CMS frameworks. So, here we go:
NewsML Integration: Any wire service worth its name sends data these days using NewsML. The articles from the services should be accessible in the framework for review/edit/publishing within the CMS interface. This is one place where most proprietary/paid solutions score over the free implementations. Of course, they do have the luxury of getting paid for parsing archaic and unpredictable formats, but it is a big plus point for them.
Speed & Reliability: Nobody really cares whether the interface for your CMS is browser-based or a custom-built client application, as long as it works fast and works well 100 times out of 100. Most of the newfangled frameworks have interfaces that are as slow as a limping pregnant walrus. If you can't push the news out fast enough, you are out of business. If your application breaks 20 times out of 100 because it can work only on 'X' or 'Y' browser, you lose again. It is 100 out of 100 or nothing. Simple.
Collaboration & Versioning: Article elements and objects should have the capability to exist in the system as multiple instances when they are being worked on by different users. The changes should be rolled into user-specific versions, which can be flushed at fixed periods, while the actual published elements should have their own version tree. The framework should also have the capability to display diffs between versions or highlight/track the changes through versions.
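To make the diff requirement concrete, here is a minimal sketch using Python's standard difflib module. It assumes article bodies are stored as plain text; the function name and the "published"/"draft" labels are made up for illustration and do not come from any particular CMS.

```python
import difflib


def version_diff(old: str, new: str) -> list[str]:
    """Return unified-diff lines between two versions of an article body."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="published",  # hypothetical label for the live version
            tofile="draft",        # hypothetical label for a user's working copy
            lineterm="",
        )
    )


published = "Markets fell today.\nAnalysts blame oil prices."
draft = "Markets fell sharply today.\nAnalysts blame oil prices."

diff_lines = version_diff(published, draft)
print("\n".join(diff_lines))
```

Removed lines come back prefixed with "-" and added lines with "+", which is enough for an editor-facing "what changed" view. A real framework would probably want word-level highlighting on top of this.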
Authentication: LDAP authentication for an existing user base is a must for serious CMS frameworks, but it should not be the only available option. Leave it to the individual deployments to figure out whether they want to use it, but having the ability to seamlessly integrate your existing user directory with roles and permissions scores very highly with the corporates.
Editions: A lot of media websites are primarily newspaper companies or magazines. While the workaround of keeping the articles in a separate section is always available to solve this problem, it is not a robust solution. Most of the frameworks only allow for the 'browse by date' or 'browse by section' feature while editions are a mix of both. This is basically a presentation layer issue, which can be easily sorted by giving different and flexible options for controlling the presentation.
Templating: Don't give users PHP/Python/pick your favourite scripting language to deploy templates. Don't give them advanced Smarty style funky templating either. Give them either a set of tags that already exist and are extensible (through the conditional route) or give them some sort of simplistic metalanguage with which they can define things from the ground up. Most CMS solutions assume that they know the end-user's requirements, which differ vastly from organisation to organisation.
Object manager: Everything other than actual articles should be made into objects (pictures, video, audio, HTML or anything else). Objects should be extensible and reusable. For example, the base object of a score card for a sports page can be extended to make a one-off card that requires additional fields. Objects should have metadata (article, internal and general) attached to them, which can be called into any page.
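The score card example might look something like the sketch below. All class and field names here are hypothetical; the only point is that the one-off card reuses the base object and its metadata and simply adds the extra fields it needs.

```python
from dataclasses import dataclass, field


@dataclass
class ContentObject:
    """Hypothetical base for every non-article object in the system."""
    object_id: str
    metadata: dict = field(default_factory=dict)


@dataclass
class ScoreCard(ContentObject):
    """Reusable base score card for a sports page."""
    home_team: str = ""
    away_team: str = ""
    home_score: int = 0
    away_score: int = 0


@dataclass
class PenaltyShootoutCard(ScoreCard):
    """One-off card that needs additional fields on top of the base card."""
    home_penalties: int = 0
    away_penalties: int = 0


card = PenaltyShootoutCard(
    object_id="card-1",
    metadata={"section": "sports"},
    home_team="Home",
    away_team="Away",
    home_score=1,
    away_score=1,
    home_penalties=4,
    away_penalties=3,
)
print(card)
```

Any page template that knows how to render a ScoreCard can still render the extended card, because the base fields and metadata are all there.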
Content Clusters: Content clusters can be root aggregation points like actual sections (sports, news, etc.) or derived (virtual) aggregation points like subsections/subcategories of existing sections/categories. Articles on the item level should exist as lone rangers that are associated with a single section/category, with the option for multiple parenting. Granular control should be available to multiple parent items on the basis of 'AND'/'OR' logic.
Internal Metadata: This is to enable extensive cross-linking of articles. For objects that are used across the system, the metadata should be extracted/inserted using existing conventions like ID tags. All metadata should be stored in such a way that endpoints are provided to query and access the objects and articles. This would ensure the reusability of existing content and components.
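One possible reading of the 'AND'/'OR' control is sketched below: a virtual cluster is defined by a set of parent clusters plus a mode, and an article belongs to it if it is parented to all of them ('AND') or to at least one ('OR'). The function and cluster names are assumptions made for this example only.

```python
def in_virtual_cluster(article_parents: set[str], required: set[str], mode: str = "AND") -> bool:
    """Decide whether an article belongs to a derived (virtual) cluster."""
    if mode == "AND":
        # Article must be parented to every required cluster.
        return required <= article_parents
    if mode == "OR":
        # Article must be parented to at least one required cluster.
        return bool(required & article_parents)
    raise ValueError(f"unknown mode: {mode!r}")


article_parents = {"sports", "business"}  # an article with multiple parents

print(in_virtual_cluster(article_parents, {"sports", "business"}, "AND"))
print(in_virtual_cluster(article_parents, {"sports", "politics"}, "AND"))
print(in_virtual_cluster(article_parents, {"sports", "politics"}, "OR"))
```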
User Tracking: Track every click and track every finger that the user wiggles when he/she is on the framework. Corporates are sold only on one thing - metrics. Show them the user trail, show them where people are jumping off their website, show them how a 5% slowdown on a particular component is causing a loss of 25% of their top-spending visitors. For the editorial team, metrics can be used to fine-tune content and cross-linking to better retain users and deliver more targeted advertising. I do not think any CMS has this feature yet.
SMS/Mobile Modules: Very few frameworks today take into account the fact that news is served to an ever-growing number of mobile users. The data for this, at the base level, is truncated at 160 characters and in the higher end deals we have strict WML and strict XHTML compliance. They also ignore the fact that the content needs of the mobile audience are of a different nature. There are no default endpoints for carriers to source the WAP pages, nor is there any value add through metadata for mobile content.
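As a rough illustration of the base-level case, the sketch below squeezes a headline and summary into a single 160-character message, cutting at a word boundary instead of mid-word. It assumes that counting characters is good enough (that is, plain text in the default SMS alphabet); the function name and the "..." marker are choices made for this example.

```python
SMS_LIMIT = 160  # single-message limit mentioned above


def to_sms(headline: str, summary: str, limit: int = SMS_LIMIT) -> str:
    """Build a one-message SMS body, truncating at a word boundary if needed."""
    text = f"{headline}: {summary}"
    if len(text) <= limit:
        return text
    # Leave room for the three-character truncation marker.
    cut = text[: limit - 3]
    if " " in cut:
        # Drop the last, possibly partial, word.
        cut = cut.rsplit(" ", 1)[0]
    return cut + "..."


message = to_sms("Cup final", "late goal decides the match " * 10)
print(len(message), message)
```

Truncation is only the crude fallback. The point of the paragraph above is that mobile content ideally gets its own summary field and metadata, not a chopped-off web article.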
KISS It: Don't give users cryptic messages and interfaces that make sense only to a geek. The primary use of a CMS in a news publication is to get the news out there first, which means it should offer the shortest possible turnaround from rough draft to display on the site. Don't give them screen after screen of buttons and options to select. Show dependencies everywhere. A parent object should never be deletable while child objects exist in the database.
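The dependency rule can be sketched with an in-memory SQLite database. The table and column names are invented for the example. The delete helper checks for children first so that the user sees a plain-language message, while the foreign key constraint acts as a safety net at the database level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("CREATE TABLE sections (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE articles (
           id INTEGER PRIMARY KEY,
           title TEXT NOT NULL,
           section_id INTEGER NOT NULL REFERENCES sections(id) ON DELETE RESTRICT
       )"""
)
conn.execute("INSERT INTO sections (id, name) VALUES (1, 'sports')")
conn.execute("INSERT INTO articles (title, section_id) VALUES ('Cup final report', 1)")


def delete_section(conn: sqlite3.Connection, section_id: int) -> str:
    """Delete a section only if no articles depend on it; return a readable message."""
    children = conn.execute(
        "SELECT COUNT(*) FROM articles WHERE section_id = ?", (section_id,)
    ).fetchone()[0]
    if children:
        return f"Cannot delete: {children} article(s) still belong to this section."
    conn.execute("DELETE FROM sections WHERE id = ?", (section_id,))
    return "Section deleted."


print(delete_section(conn, 1))
```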
Granular Caching: Caching needs vary from site to site. Don't dump a monolithic solution on anybody’s head. Make it as granular as possible. Business sites abhor having market data cached anywhere, while news sites want high-performance caching for their homepages. Do not assume your end-user's needs. Give them caching heaven, earth and everything in-between and let them choose their own existence.
Snap-on Components: Keep the backend separate from the front end. Any decent framework should have the following separate modules: frontend site, backend CMS, site management module. All three should be independently scalable, and you should ALWAYS separate the frontend from the CMS. Under no circumstances should a backend CMS be unavailable due to heavy load on the frontend. This is a cardinal sin. All components (polls, utility boxes, message boards, blogs) should be 'snap-on' enabled.
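A minimal sketch of what "granular" could mean in practice: each component gets its own time-to-live, and a TTL of zero means "never cache". The class, the component names and the TTL values are all assumptions for illustration, not a description of any existing framework.

```python
import time


class ComponentCache:
    """Tiny in-process cache with a separate TTL (in seconds) per component."""

    def __init__(self, ttl_by_component: dict[str, float], clock=time.monotonic):
        self.ttl_by_component = ttl_by_component
        self.clock = clock
        self.store: dict[tuple[str, str], tuple[float, object]] = {}

    def get(self, component: str, key: str, loader):
        ttl = self.ttl_by_component.get(component, 0)
        if ttl <= 0:
            # Components without a positive TTL are always loaded fresh.
            return loader()
        now = self.clock()
        entry = self.store.get((component, key))
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self.store[(component, key)] = (now + ttl, value)
        return value


calls = {"homepage": 0, "market_data": 0}


def load(component: str) -> str:
    calls[component] += 1
    return f"{component} render #{calls[component]}"


# Hypothetical policy: cache the homepage for a minute, never cache market data.
cache = ComponentCache({"homepage": 60, "market_data": 0})
for _ in range(2):
    cache.get("homepage", "front", lambda: load("homepage"))
    cache.get("market_data", "ticker", lambda: load("market_data"))

print(calls)
```

The policy lives in a plain dictionary, so each deployment can choose its own values without touching the caching code.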