01.01.2015 Views

EMC Documentum Architecture: Delivering the Foundations and ...


EMC Documentum Architecture: Delivering the Foundations and Services for Managing Content Across the Enterprise

A Detailed Review

Abstract

EMC® Documentum® is a comprehensive enterprise content management platform for ordering the flow and delivery of unstructured business information across an extended enterprise. Based on an extensible, open, scalable, and secure architecture that meets the needs of global, distributed organizations, Documentum comprises a set of integrated products and services that work together. From creation or capture, organization, and electronic storage through just-in-time delivery and archiving, the Documentum end-to-end content management solution solves a range of strategic business issues.

January 2008


Copyright © 2008 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners.

Part Number H3411



Table of Contents

Executive summary
Introduction
  Audience
Bringing order to unstructured business information
  Business benefits: Beyond information silos
  What EMC Documentum delivers
EMC Documentum: A layered architecture
The kernel group: Storing, accessing, and securing content in a unified content infrastructure
  Content objects
  Storing content objects
  Anatomy of the repository
  Connecting to an underlying storage infrastructure
  Security services
The application services group: Managing content as interrelated modules
  Compliance Services
  Core Content Services
  Process Services
The tools group: Creating content applications
  Enterprise Content Services and the Documentum API
  Documentum Foundation Services
  EMC Documentum Foundation Classes
  Business Objects Framework
The experiences group: Managing the end user’s interactions
  The Web Development Kit framework
  Application Connectors
  A Webtop extension
  Portlets for enterprise portals
Conclusion



Executive summary

In today’s digitally driven economy, business information comes in many forms: text documents, spreadsheets, pictures, XML files, web pages, full-motion video, streaming audio, e-mail messages, instant messages, and fixed content such as reports, records, and scanned images. From engineering drawings and manufacturing procedures to marketing collateral and sales presentations, this unstructured content is critical to the smooth and efficient functioning of a firm. Like the financial data that drive accounting systems, it needs to be managed in a systematic way.

An enterprise content management system provides just such a systematic solution for capturing, organizing, storing, and delivering unstructured content within an enterprise and beyond. With an enterprise content management system, unstructured information is managed according to predefined business rules, policies, and procedures; relationships are established among pieces of content so the same items can be used in different contexts and renditions. The system adds intelligence to content collections by creating categorization schemas and metadata that make search and retrieval faster and more efficient. The system facilitates the publication of content through multiple channels; for example, the same set of words and pictures can be published on a website, broadcast as a fax, printed as a hard copy document, and sent to a handheld wireless device. The system ensures archiving and long-term retention to meet compliance requirements. In short, enterprise content management systems automate the lifecycle processing of content.

Introduction

The EMC® Documentum® content management platform is the foundation on which content-based applications and solutions are built, from managing business documents to publishing content across multilingual websites to enabling collaboration with interactive tools. This white paper describes in detail the architecture of EMC Documentum and identifies the four primary groups of capabilities that form the foundation of an enterprise content management strategy. It also explains how Documentum fits into a service-oriented approach to content-based applications.

Audience

This white paper is intended for application developers and IT executives who are looking to unite their vertical information silos by standardizing on a service-oriented platform with a solid architecture that can manage an organization’s content assets while providing superior scalability and ease of use.

Bringing order to unstructured business information

Business benefits: Beyond information silos

Enterprise content management systems help integrate departments and other groups that previously functioned within separate information silos. In fact, information can be shared with business partners and all other members of the extended enterprise.

Why is this necessary, and powerful? To be sure, the research and development department will continue to produce product specifications and patents, while the marketing department generates collateral and press releases, and the customer service organization responds to customers’ queries. Yet more and more, employees and business partners need to access and share information across departmental boundaries, such as when they launch a new product or create an innovative customer experience.



What EMC Documentum delivers

EMC Documentum is a comprehensive enterprise content management platform for ordering the flow and delivery of unstructured business information across an extended enterprise.

Based on an extensible, open, scalable, and secure architecture that meets the needs of global, distributed organizations, Documentum comprises a set of integrated products and services that work together in varying combinations. From creation or capture, categorization, and electronic storage through just-in-time delivery and archiving, the Documentum end-to-end content management solution solves a range of strategic business issues.

• Global and distributed. For enterprises with sites and customers around the world, Documentum handles users and content regardless of physical location. It includes content caching capabilities for high-performance content delivery to any place around the world. The architecture stores multilingual content and metadata in shared repositories to accommodate local languages and currencies, forming a single virtual repository that spans geographical boundaries and languages.

• Extensible. Documentum can be extended to meet unique operational needs by embedding business rules or custom-designed content objects. Documentum incorporates a service-oriented architecture (SOA) that exploits the capabilities of web services for integrating with disparate enterprise applications. Customized plug-ins can be developed and deployed in key areas including user authentication, rich media handling, and legacy storage support.

• Open. Because Documentum is standards-based, it easily integrates with existing IT infrastructures. There are standard Documentum APIs for WebDAV, FTP, SMB, JDBC, and the web services standard, WSDL. The architecture is fully J2EE compliant (for web-based applications) and completely supports the Microsoft .NET environment and XML processing. Moreover, Documentum integrates “out of the box” with enterprise applications and systems, including directory services using the LDAP standard.

• Scalable. As the content management needs of an organization grow larger and more complex, the Documentum solution efficiently manages ever-increasing volumes of content, high traffic loads, more users, and complex workflow processes, and it does so cost-effectively with continued high performance. Documentum addresses the network latency and large-scale distribution issues that global enterprises face. The Documentum architecture takes full advantage of the underlying hardware platform’s scalability by utilizing multiprocessor systems as well as caching and clustering environments (vertical and horizontal scalability).

• Secure. Documentum enforces appropriate levels of security as organizations make their repository content available to a wide range of contributors and users. Access control lists define the users, groups, and roles that can access the repository or the discrete objects that it contains, as well as the operations that can be performed. Sensitive information in the repository file stores can be encrypted. Network communications among servers and with desktop clients can be secured through the Secure Sockets Layer (SSL). Documentum also supports electronic signatures and offers extensive auditing of all system activities. Finally, Documentum secures roving content: documents and other objects that are moving around the network and beyond the purview of the repository.
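The access control model described in the last bullet can be illustrated with a minimal Python sketch. The class, method, and operation names (`AccessControlList`, `grant`, `is_allowed`, "read", "write", "delete") are hypothetical and chosen only for illustration; they are not Documentum APIs, and roles are treated here the same way as groups.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AccessControlList:
    """Hypothetical ACL: maps a user or group name to permitted operations."""
    entries: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, principal: str, *operations: str) -> None:
        # Add operations to whatever the principal already holds.
        self.entries.setdefault(principal, set()).update(operations)

    def is_allowed(self, user: str, groups: Set[str], operation: str) -> bool:
        # Allowed if the user, or any group the user belongs to,
        # has been granted the requested operation.
        principals = {user} | groups
        return any(operation in self.entries.get(p, set()) for p in principals)


acl = AccessControlList()
acl.grant("engineering", "read", "write")   # group entry
acl.grant("alice", "delete")                # individual entry

print(acl.is_allowed("alice", {"engineering"}, "write"))   # True (via group)
print(acl.is_allowed("bob", {"marketing"}, "read"))        # False
```

In this sketch one such list would be attached to the repository or to an individual object, so the same check covers both levels of access mentioned above.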

EMC Documentum: A layered architecture

The Documentum platform provides a unified environment for capturing, storing, accessing, organizing, controlling, retrieving, delivering, and archiving any type of unstructured information within an enterprise. It also supports the resources for managing that content across an extended enterprise as well as for publishing content on the Internet.

The Documentum platform consists of four conceptual groups:

• The kernel is a unified environment where content is stored, accessed, and secured.

• The application services provide various application-level services for organizing, controlling, sequencing, and delivering content to, and from, the repository.

• The tools provide capabilities for developing and deploying content applications (enterprise-scale applications that use content within the context of business processes). This group also provides the web services for integrating content-related objects with external enterprise applications.

• The experiences provide the framework and interfaces enabling users to process and use content management functionality in desktop- or browser-based applications.

Each of these groups comprises a series of components, which together form a unified, consistent, and extensible architecture, as shown in Figure 1.

Figure 1. The Documentum platform consists of four groups: kernel (bottom purple), application services (middle gold), experiences (top gray), and tools (right blue).

Let’s examine the capabilities of these four groups and identify how they interrelate to provide a comprehensive environment for managing content across an enterprise.

The kernel group: Storing, accessing, and securing content in a unified content infrastructure

The Documentum platform is based on an enterprisewide repository in which the logical services for accessing content are separated from the underlying systems for storing it. To an application, the Documentum repository appears as a unified environment, even though content may reside on multiple servers and physical storage devices and be distributed throughout an organization. Put another way, the operation of the repository is independent of the network topology.



The Documentum repository stores content in a consistent manner, regardless of content type, file size or complexity, and file format. Content types include but are not limited to the following:

• Ordinary text documents
• Compound documents (containing interlinked and highly formatted text and graphics)
• Web pages
• XML files and XML-file hierarchies
• Scanned images
• Digitized photographs
• Multimedia digital assets (such as music, sounds, and full-motion video)
• Medical images
• Fixed documents (such as the outputs and reports from enterprise applications)
• E-mail and instant messages
• Collaborative content such as threaded discussions, chats, wikis, votes, and notes
• Computer-aided design (CAD) drawings
• Documents and data records from enterprise resource planning (ERP) applications
• Virtual reality environments

Content objects

The Documentum platform defines repository content as objects. (Content objects may consist of a collection of objects.) Objects comprise three parts: content assets or source data; content attributes or metadata; and methods or operations.

• Content assets or source data represent the core information stored in its native format.

• Content attributes or metadata describe the content assets with descriptors such as keywords, owner, version, links, and creation date.

• Methods or operations are the instructions to be performed on the content assets, such as transform, notify, and display.

A content object’s set of attributes and set of methods are configurable and extensible. Using Documentum development tools, developers can create new object types that behave exactly as dictated by specific business needs.

Furthermore, content attributes characterize the relationships among the stored content objects. The repository organizes content around its metadata; users and applications use the metadata to retrieve relevant content.
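The three-part object model above (asset, attributes, methods) and metadata-driven retrieval can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ContentObject`, `invoke`, `display`, and `find` are hypothetical names, not part of any Documentum interface, and the sample attribute values are invented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ContentObject:
    """Hypothetical model of a content object: asset + attributes + methods."""
    asset: bytes                                              # source data, native format
    attributes: Dict[str, str] = field(default_factory=dict)  # metadata
    methods: Dict[str, Callable[["ContentObject"], str]] = field(default_factory=dict)

    def invoke(self, name: str) -> str:
        # Look up a registered operation by name and apply it to this object.
        return self.methods[name](self)


def display(obj: ContentObject) -> str:
    # Illustrative operation: render the asset as text next to its owner.
    return f"{obj.attributes['owner']}: {obj.asset.decode('utf-8')}"


def find(objects: List[ContentObject], **criteria: str) -> List[ContentObject]:
    # Retrieve objects whose metadata matches every given attribute value.
    return [o for o in objects
            if all(o.attributes.get(k) == v for k, v in criteria.items())]


spec = ContentObject(
    asset=b"Torque to 12 Nm",
    attributes={"owner": "r_and_d", "version": "1.0", "keywords": "assembly"},
    methods={"display": display},
)
print(spec.invoke("display"))                    # r_and_d: Torque to 12 Nm
print(len(find([spec], owner="marketing")))      # 0
```

Because attributes and methods are plain dictionaries in this sketch, adding a new attribute or registering a new operation does not require changing the class, which mirrors the configurable and extensible behavior the text describes.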

Storing content objects

The Documentum repository serves as a unified environment for storing content objects. These objects are stored in their native formats and can be encrypted as business requirements dictate. Applications therefore rely on a single set of services and programming interfaces to access content, regardless of where and how the content objects themselves are stored. The repository enforces security measures to ensure that only authorized users and applications can access the content assets and indexes of content attributes.

The Documentum repository is responsive to the business needs of the organization. This adaptability and flexibility are particularly important for large organizations that operate in multiple locations and require a distributed repository for storing, caching, fetching, and updating content, while also managing rapid access across the enterprise. The virtual reach of the Documentum repository makes it possible to implement distributed environments in various ways that ensure enterprisewide access, enhance system performance, and maintain underlying security and compliance requirements. An enterprise has many options for designing and deploying the virtual repository in ways that best meet its operational objectives.

For example, a global company might host a Documentum content repository in multiple geographical regions, storing content locally to meet corporate quality-of-service guarantees. This company can also support a series of branch offices in remote locations to further enhance end-user productivity and business objectives. Important documents, large multimedia files, and other types of mission-critical content can be predictively distributed and cached at the branch offices, where they are immediately available to local users (with none of the performance degradation that comes from accessing files across low-bandwidth connections). Users at branch offices can access and modify the content as their jobs require; the overall security and access controls extend across the entire enterprise environment in a seamless fashion. Updates made by branch office users can be synchronized with the regional repositories in a predictable manner, optimized to ensure the responsiveness of the user’s experience and the currency of the revised content. The end result is a distributed virtual repository where content is managed, regardless of geography or network bandwidth, to meet strategic business goals and objectives.
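The branch-office scenario combines three ideas: prefetching content before it is requested, serving reads locally, and pushing local updates back to the regional repository. The Python sketch below illustrates that pattern only. `BranchCache` and its methods are hypothetical names, a plain dictionary stands in for the regional repository, and security, conflict handling, and scheduling are deliberately left out.

```python
from typing import Dict, Iterable, Set


class BranchCache:
    """Hypothetical branch-office cache in front of a regional repository."""

    def __init__(self, regional: Dict[str, bytes]) -> None:
        self.regional = regional            # stands in for the regional repository
        self.local: Dict[str, bytes] = {}   # content held at the branch office
        self.dirty: Set[str] = set()        # ids changed locally, not yet synchronized

    def prefetch(self, object_ids: Iterable[str]) -> None:
        # Predictive distribution: copy content to the branch before it is requested.
        for object_id in object_ids:
            self.local[object_id] = self.regional[object_id]

    def read(self, object_id: str) -> bytes:
        # Serve from the local cache; fall back to the regional repository on a miss.
        if object_id not in self.local:
            self.local[object_id] = self.regional[object_id]
        return self.local[object_id]

    def write(self, object_id: str, data: bytes) -> None:
        # Apply the update locally and remember it for the next synchronization.
        self.local[object_id] = data
        self.dirty.add(object_id)

    def synchronize(self) -> int:
        # Push pending updates to the regional repository; return how many were sent.
        count = len(self.dirty)
        for object_id in self.dirty:
            self.regional[object_id] = self.local[object_id]
        self.dirty.clear()
        return count


regional = {"video": b"v1", "manual": b"m1"}
branch = BranchCache(regional)
branch.prefetch(["video"])          # large file staged ahead of time
branch.write("manual", b"m2")       # local edit at the branch office
print(branch.synchronize())         # 1
print(regional["manual"])           # b'm2'
```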

Anatomy of the repository

The Documentum repository consists of three main components, which behave as a single entity from an application point of view: a file store containing the content assets; attribute tables within a relational database; and full-text indexes (see Figure 2).

Figure 2. The Documentum repository is composed of four components: a file store containing the content assets; attribute tables within a relational database; full-text indexes; and directory services. All components behave as a single entity from an application point of view.



File store and RDBMS

Typically, content attributes are stored in a relational database for rapid query and retrieval; content assets are stored as files in the file store. The file store can be a file system of the host operating system or a content-addressed storage (CAS) system such as EMC Centera®. The file system-based stores can be hosted within different types of storage environments.

For example, full-motion video files can reside on a high-performance streaming server while text-oriented files are hosted on a file server tuned to rapidly look up filenames. If, for operational, performance, or security reasons, an enterprise manages all of its content in a relational database management system (RDBMS), then the content assets can also be stored as binary large objects (BLOBs) adjacent to the attribute tables.
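The split between attribute tables and a file store can be shown with a small Python sketch that uses the standard-library `sqlite3` module as the relational database and a temporary directory as the file store. The table layout, the column names, and the `store` and `fetch_by_owner` helpers are assumptions made for this illustration; they do not reflect the actual Documentum schema.

```python
import sqlite3
import tempfile
from pathlib import Path


def store(db: sqlite3.Connection, file_store: Path,
          name: str, owner: str, data: bytes) -> int:
    """Write the asset to the file store and its attributes to the database."""
    cur = db.execute(
        "INSERT INTO attributes (name, owner, path) VALUES (?, ?, '')",
        (name, owner),
    )
    object_id = cur.lastrowid
    path = file_store / f"{object_id}.bin"   # asset kept as an ordinary file
    path.write_bytes(data)
    db.execute("UPDATE attributes SET path = ? WHERE id = ?", (str(path), object_id))
    return object_id


def fetch_by_owner(db: sqlite3.Connection, owner: str) -> list:
    """Query the attribute table, then read each matching asset from the file store."""
    rows = db.execute(
        "SELECT name, path FROM attributes WHERE owner = ? ORDER BY id", (owner,)
    )
    return [(name, Path(path).read_bytes()) for name, path in rows]


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE attributes (id INTEGER PRIMARY KEY, name TEXT, owner TEXT, path TEXT)"
)
file_store = Path(tempfile.mkdtemp())

store(db, file_store, "spec.txt", "r_and_d", b"product specification")
store(db, file_store, "press.txt", "marketing", b"press release")
print(fetch_by_owner(db, "marketing"))   # [('press.txt', b'press release')]
```

The BLOB alternative mentioned above would replace the `path` column with a binary column holding the asset itself, so that a single database query returns both attributes and content.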

Full-text indexes

The Documentum platform maintains a full-text index of all text-based content assets stored within the Documentum repository so it can rapidly search through large collections of unstructured information. The indexed content assets include documents, text files, XML components, HTML files, and closed-caption tracks of video files.

The FAST Index Server, an industry-leading enterprise search technology, is embedded in the Documentum platform. The search capability is modular, with alternate engines for market-specific Documentum offerings. For instance, the Documentum OEM edition, built for software vendors that embed the Documentum platform in their products, offers the open-source alternative Lucene as the default engine. For all standard enterprise customer offerings, however, a FAST search engine is built into the repository.

The full-text index, which is automatically created by an indexing process when content is added to the repository, contains:

• All words of the content assets stored within the repository

• All keywords and other content attributes (or metadata) that describe the content assets

The indexing process is typically hosted on a separate server. As part of the content ingestion process, an index agent forwards content to an index server, which maintains the full-text index database. The Documentum platform ensures that query performance and scalability are not affected by repository size. To scale up for high-speed content ingestion, the indexing process can run on multiple indexing pipelines deployed on multiple CPUs. This is particularly important in content archiving applications for e-mail, enterprise reports, and SAP data. Figure 3 shows the indexing and query process flows.



[Figure 3 diagram labels: Query, Content Server, Query Plug-in SPI, FAST Query Plug-in, FAST Query API, DFC, FAST Index Server, Index Agent (Java), DFTXML, Index Plug-in SPI, FAST Index Plug-in, FAST Index API, Index]

Figure 3. The Documentum platform maintains a full-text index of all text-based content assets stored within the Documentum repository. The integration is accomplished through a set of plug-ins and APIs for querying and indexing functions.

In addition to searching the text within the content assets, the full-text engine also searches all content attributes. So within a single query, the search engine analyzes content on two levels (the content assets and the content attributes) and returns a unified results list. As part of its query algorithms, the search engine analyzes and normalizes text, and identifies synonyms based on a thesaurus of related terms. The search engine can store and support multiple languages within a single index, eliminating the need for multiple, language-specific indexes. More than 70 languages are now supported.

Connecting to an underlying storage infrastructure

The Documentum repository transparently connects with the underlying storage infrastructure, which consists of multiple disk drives and other types of mass storage devices. The storage infrastructure can be designed to meet the specific reliability, security, policy, cost, and operational needs of various organizations. The Documentum platform makes no distinction among content stored in different types of environments; rather, it relies on the file system APIs to communicate with the file system interface of the underlying file store.

Documentum supports any type of storage system, from a server’s local hard drives and network-accessible RAID arrays to network-attached storage (NAS) or complex storage area networks (SAN), from any storage manufacturer. The storage system is transparent to the Documentum platform.

The Documentum platform also provides two storage-specific services that enable system designers to enhance content storage capabilities: Content Storage Services and Content Services for EMC Centera.



Content Storage Services

Content Storage Services add a storage policy engine to the Documentum repository that enables event-triggered, ad hoc, and batch execution of storage allocation and migration policies. Storage administrators can define, manage, and update the content storage policies to store “live” or frequently updated content on one set of devices, and archived content on another. Content Storage Services include audit events and migration logs, which enable easy reporting and chargeback capabilities.

For example, when content is initially created it can be automatically stored in an online storage device. Frequently accessed content can remain within a high-performance storage environment, while rarely accessed content can migrate on a scheduled basis to a near-line, more economical storage environment. Valuable content that needs to be preserved for a predetermined period of time, such as the final versions of business documents, can be automatically stored in a highly secure storage environment. Transitory content, such as successive drafts of business documents or other work-in-progress items, can be securely stored and rapidly accessed as needed, and then routinely purged when the project ends.
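
The tiering example above can be sketched as a small policy function. This is only an illustration of the idea, not the Content Storage Services policy engine or its API: the store names, the `ContentItem` fields, and the 90-day threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ContentItem:
    # Hypothetical fields; a real repository object carries far more metadata.
    name: str
    last_accessed: date
    is_final_version: bool = False


def choose_store(item: ContentItem, today: date, nearline_after_days: int = 90) -> str:
    """Pick a (hypothetical) storage tier for an item."""
    if item.is_final_version:
        # Final versions that must be preserved go to the secure tier.
        return "secure_store"
    if today - item.last_accessed > timedelta(days=nearline_after_days):
        # Rarely accessed content migrates to cheaper near-line storage.
        return "nearline_store"
    # Everything else stays on high-performance online storage.
    return "online_store"
```

A scheduled job could call `choose_store` for each item and migrate any item whose current store differs from the result.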

Content Services for EMC Centera

Content Services for EMC Centera is the bridge between the Documentum repository and Centera, an EMC content-addressed storage (CAS) system that ensures fast, easy, online access with assured content authenticity and petabyte scalability. The enterprise content management capabilities of the Documentum platform function seamlessly with the EMC Centera CAS architecture to deliver an extensible and scalable kernel layer for fixed content assets. By providing these valuable capabilities on the storage level, EMC Centera complements the software-level security and compliance Documentum provides for fixed content assets.

EMC Centera provides a scalable, secure storage environment for cost-effective retention, protection, and disposition of fixed content (including electronic records, e-mail archives, and scanned images) within an enterprise environment. EMC Centera is optimized to store long-lived and archival content.

Content Services for EMC Centera relies on the plug-in architecture of the Documentum platform. The content is stored directly within EMC Centera, which serves as the file store in place of a file system of the underlying operating system. The content objects contain Centera-issued “claim checks” that are stored as properties of the content objects in the Documentum repository.

EMC Centera ensures there are no duplicate or redundant versions, which improves overall storage efficiency and performance.
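
To make the “claim check” and de-duplication ideas concrete, here is a minimal in-memory sketch of content-addressed storage. It does not use or imitate the Centera API; the class name is hypothetical, and the choice of SHA-256 as the addressing function is an assumption made for illustration.

```python
import hashlib


class ContentAddressedStore:
    """Toy content-addressed store: the address is derived from the bytes."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address plays the role of the "claim check" that the
        # repository would keep as a property of the content object.
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]

    def __len__(self) -> int:
        return len(self._blobs)
```

Because the address depends only on the content, storing the same file twice yields the same claim check and consumes no additional space, and any change to the bytes produces a different address.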

Security services

Core security is provided by EMC Documentum Content Server; additional security can be added via Trusted Content Services and Information Rights Management Services. The core security services include:

• Authentication

• Authorization

• Auditing

Each fulfills a unique function within an organization’s security architecture. First, the Documentum platform builds on the underlying enterprisewide security infrastructure to authenticate access to the repository. Next, the platform manages access control lists (ACLs) to authorize access to content stored within the repository. Any activity can be audited using flexible auditing tools, with an audit trail stored in the repository. The platform can then encrypt all communications between the content server and other systems such as clients, web-based applications, and directory servers.



Let’s examine each in turn.

Authentication

The Documentum platform relies, initially, on the authentication mechanisms of the underlying operating system or database, such as a username/password challenge, to manage access to the repository. The platform supports token-based authentication for application-level access, ensuring that client applications have valid tokens to connect to the repository and gain access to the content. The platform includes RSA Access Manager connections for single sign-on. The authentication mechanisms are extendable to include Kerberos validation and authentication plug-ins from CA Netegrity.

Enterprise identity management

The Documentum platform is designed to integrate seamlessly within an enterprise security architecture; where an enterprise directory service exists, the platform relies on it for enterprise identity management. The Documentum platform supports connections to multiple directory services and can be integrated with many popular directory servers, including Microsoft Active Directory, Sun ONE Directory Server, Oracle Internet Directory, IBM Tivoli Directory Server, and Novell eDirectory. The platform also supports the Microsoft Active Directory Application Mode (ADAM) service. The platform uses Lightweight Directory Access Protocol (LDAP) to synchronize user and group identities enterprisewide, ensuring user identities are managed as an enterprisewide resource without adding extra administrative burden.

Authorization

Once a user or application authenticates an identity, the person or program can access the stored content based on the privileges associated with that identity. The authorization rules (also called access controls or permissions) then determine what content can be accessed or modified.

The Documentum platform assigns authorization rules through access control lists (ACLs), which are automatically applied to all repository objects when the objects are created. The ACLs can be modified manually by users as well as automatically via lifecycle changes, through business processes, and through other applications.

The Documentum platform applies authorization at the object level. So every content object, version, and rendition, as well as every container (ranging from folders to storage servers) and any other object (business process, policy, audit trail, and so on) is secured by an ACL throughout its lifecycle.

Three criteria for ACLs

The Documentum platform authorizes access based on one of three criteria:

• Explicit assignment to an individual user

• Membership in a user group

• Assignment to a predefined role

Individuals, groups, and roles can own a content object managed by the Documentum platform. For example, when developing a press release, anybody with the role of “PR Manager” might be authorized to create a new press release, and any member of the “PR Group” can have privileges to edit it. The tasks can be shared and coordinated by managing role definitions across a workgroup, so managing a press release is no longer limited to a predefined named individual.



Basic permissions

The Documentum platform provides seven levels of basic permissions, or access privileges:

• None: Content objects in the repository cannot be seen, reducing complexity by hiding content irrelevant to predefined users. It is also an effective way to screen sensitive documents or projects, and ensure that only those people and processes with proper privileges can find object references in the repository.

• Browse: Content attributes (or metadata) for content objects can be viewed, but the content assets cannot be opened and read.

• Read: Content assets can be opened and read, but not changed.

• Relate: A user can create relationships between a given content object and other objects within the repository. This permission is used by tools such as annotations where each annotation is a new object that relates to an existing content object.

• Version: A user can make changes to a content asset but cannot overwrite an existing version; changes are saved in a new version, which can include a modified file, modified metadata, or both.

• Write: A user can make changes to a content object (both the content asset and the associated metadata) and save those changes without creating a new version. This level of access control is usually restricted to the content owner.

• Delete: A user can delete a content object.

This set of permissions is cumulative: each level automatically grants all access rights of the levels below it. For example, a user with “write” privileges can also “version,” “relate,” “read,” and “browse” the contents. “Delete” is a special case, discussed next.

Object-level delete privileges

The “delete object” permission grants deletion privileges while denying other levels of access; that is, a user or process can delete a content object without having permission to write, version, read, or relate. This capability enables a corporate archivist, librarian, or records manager to dispose of objects from the repository according to retention policies, without being able to access any aspect of their contents.
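
The cumulative levels and the separate “delete object” privilege can be modeled in a few lines. This is a sketch of the behavior as described in this section, not Documentum code: the `Permit` enum, the helper names, and the numeric ordering are illustrative assumptions.

```python
from enum import IntEnum


class Permit(IntEnum):
    # Ordered so that a higher level implies every lower one (illustrative values).
    NONE = 1
    BROWSE = 2
    READ = 3
    RELATE = 4
    VERSION = 5
    WRITE = 6
    DELETE = 7


def allows(granted: Permit, required: Permit) -> bool:
    """Cumulative check: a granted level covers all levels below it."""
    return granted >= required


def can_delete(granted: Permit, delete_object: bool = False) -> bool:
    """Deletion is allowed by the top basic level or by the standalone
    'delete object' privilege, which implies no other access."""
    return delete_object or granted >= Permit.DELETE
```

For example, a records manager modeled with `Permit.NONE` and `delete_object=True` can dispose of an object, yet `allows(Permit.NONE, Permit.READ)` is still false, so the contents stay unreadable to that user.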

Extended permissions

The Documentum platform supports multiple extended permissions for managing the content objects within the repository.

• Change location: A user can change the location of a content asset from one folder to another. By default, a user with “browse” permission or greater has “change location” privileges.

• Change permission: A user other than the content owner can change a content asset’s standard permissions.

• Change owner: A user other than the content owner can change the owner of a content asset. This is important when content ownership is to be reassigned, and the original content owner is unavailable.

• Execute procedure: A user can execute an external procedure on content assets, such as creating a rendition. By default, a user with “browse” permission or greater inherits “execute procedure” privileges.

• Change state: A user can change a content asset’s lifecycle state.

The Documentum platform controls access to the content objects and secures how they are organized and categorized in the repository. As a result, the Documentum platform provides the core security services that determine what actions can be performed on a content object.



Auditing

Every operation performed by the Documentum repository can be recorded in an auditable record. The audit trail can be fully configured in Documentum Administrator (where it can also be viewed) and is secured in the repository by strong encryption.

The audit trail meets the rigorous requirements of the FDA 21 CFR Part 11 regulation, considered a benchmark for auditing. But the audit trail can go further into the scope and granularity of audited events, and can be used to trace possible security breaches and optimize system utilization.

Each auditable record lists both the new and the previous values associated with an event (such as the time and username when a document is checked out of the repository), enabling quick determination of what has changed. End users and administrators can also view the history of documents and other objects stored in the repository, so that they can determine how and when the information changes.
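
A record that carries both previous and new values makes change detection a simple comparison. The sketch below is a hypothetical data structure for illustration only; the field names are assumptions and do not reflect the actual audit trail schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AuditRecord:
    # Hypothetical fields chosen for the example.
    event: str
    object_id: str
    user: str
    timestamp: datetime
    old_values: dict
    new_values: dict

    def changed(self) -> dict:
        """Return {attribute: (old, new)} for every attribute that differs."""
        keys = self.old_values.keys() | self.new_values.keys()
        return {
            key: (self.old_values.get(key), self.new_values.get(key))
            for key in keys
            if self.old_values.get(key) != self.new_values.get(key)
        }
```

Calling `changed()` on a checkout event whose only difference is the lock owner returns just that one attribute, which is the “quick determination of what has changed” described above.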

Encrypted communication

All communications involving Content Server (such as between Content Server and an application server, Content Server and Desktop clients, and Content Server and a directory server) use standard SSL encryption to prevent security breaches by eavesdropping.

Trusted Content Services

The Documentum platform adds Trusted Content Services to tackle application-specific security situations beyond the authentication and authorization mechanisms provided by the core security services of the content platform.

Trusted Content Services include:

• Encrypted file stores. Repository content files can be encrypted to prevent system-level intrusion and to secure content files stored on backup media. The encryption can be done selectively per file store, so each repository can combine encrypted and unencrypted content.

• Digital shredding of deleted items. Shredding irrevocably destroys content at an operating system level by overwriting the data on the storage device. The Documentum platform shreds content stored on both file systems and CAS devices.

• Support for electronic signatures. Users can sign electronic documents in a manner that meets established industry standards to verify the integrity of the signed document.

In addition, Trusted Content Services can enrich the underlying security model and extend authorization mechanisms through mandatory access control (MAC). This mechanism provides an additional level of security before granting an authenticated user access to a content object.

Specifically, MAC can:

• Enforce membership rules. Ensures a user is a member of an externally defined group before verifying authorization privileges.

• Enforce restriction rules. Restricts a user’s access privileges to a specific level even if the ACL provides for a higher level of access.

• Apply application-level access control. Augments an ACL with an application-specific security setting.
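
The three MAC rules above can be read as filters applied on top of whatever level the ACL grants. The following sketch shows one way to combine them; the function, its parameters, and the numeric permission levels are assumptions for illustration, not the Trusted Content Services implementation.

```python
from typing import Optional, Set

# Illustrative cumulative levels, lowest to highest.
NONE, BROWSE, READ, RELATE, VERSION, WRITE, DELETE = range(1, 8)


def effective_permit(
    acl_permit: int,
    user_groups: Set[str],
    required_group: Optional[str] = None,
    restricted_to: Optional[int] = None,
    application_ok: bool = True,
) -> int:
    """Apply hypothetical MAC rules to the level granted by the ACL."""
    # Membership rule: outside the required group, the ACL is never consulted.
    if required_group is not None and required_group not in user_groups:
        return NONE
    # Application-level control: an application-specific setting can veto access.
    if not application_ok:
        return NONE
    # Restriction rule: cap the level even if the ACL grants more.
    if restricted_to is not None:
        return min(acl_permit, restricted_to)
    return acl_permit
```

Note that every rule can only lower the level granted by the ACL, never raise it, which matches the description of MAC as an additional check before access is granted.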



Information Rights Management Services

Information Rights Management (IRM) Services extend the security and access controls on documents and other types of content beyond the boundaries of the content platform. IRM Services secure “roving content” that requires persistent protection across the network and wherever the content is located and stored.

IRM Services augment the Documentum platform by adding an IRM policy server to the enterprise environment, as shown in Figure 4. This server establishes the policies by which documents, e-mail messages, or other types of objects can be opened, displayed, printed, and further distributed outside the repository. Before leaving the Documentum Content Server, the content together with the usage policy is secured via encryption. Only the encrypted file (containing the content) is transferred from the repository and is available outside the security perimeter. IRM Services support Microsoft Office applications (Word, PowerPoint, Excel, and Outlook), Adobe PostScript, HTML, RIM BlackBerry, and Lotus Notes applications, and can be customized to support other types of file formats.

Figure 4. IRM Services add an IRM Policy Server to the enterprise information environment to secure the content that is no longer managed by Documentum Content Server.

IRM Services then control the process by which the content is decrypted and made accessible to recipients. An end user needs to obtain a key to decrypt the content by accessing a policy server over the network. This policy server verifies the user’s identity through its own authentication mechanism. Once the user is authenticated, the policy server provides the key to decrypt the content. Once the content is decrypted, the user’s use of it is limited by the predefined usage policy. For example, there might be limits on the number of times the content can be viewed, whether recipients can print or copy the document into another file, whether recipients can forward the document to third parties, or other operational constraints.
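
The key-release step can be sketched as follows. This models only the policy decision (who may obtain the key, and how many times); it performs no real encryption and does not represent the IRM Services API. The `PolicyServer` and `UsagePolicy` classes and all of their fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UsagePolicy:
    # Hypothetical policy attributes for the example.
    allowed_users: set
    max_views: int
    can_print: bool = False
    views_used: dict = field(default_factory=dict)


class PolicyServer:
    """Toy policy server that releases a decryption key only when the
    usage policy permits it."""

    def __init__(self) -> None:
        self._policies = {}
        self._keys = {}

    def protect(self, content_id: str, key: bytes, policy: UsagePolicy) -> None:
        self._keys[content_id] = key
        self._policies[content_id] = policy

    def request_key(self, user: str, content_id: str) -> bytes:
        policy = self._policies[content_id]
        # Assumes the caller's identity has already been authenticated.
        if user not in policy.allowed_users:
            raise PermissionError("user is not authorized for this content")
        used = policy.views_used.get(user, 0)
        if used >= policy.max_views:
            raise PermissionError("view limit reached")
        policy.views_used[user] = used + 1
        return self._keys[content_id]
```

Because the key lives on the server rather than in the file, revoking a user or lowering `max_views` takes effect on copies that have already left the repository.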



The application services group: Managing content as interrelated modules

The Documentum platform leverages the capabilities of EMC Documentum Content Server by providing a comprehensive suite of application services for managing content. These services function as interrelated modules: one service calls another to obtain needed information or functionality.

The Documentum platform incorporates three sets of application services: Compliance Services, Core Content Services, and Process Services.

Compliance Services

Compliance Services provide capabilities for retaining content and managing content as records. These are Retention Policy Services and Records Manager, respectively.

Retention Policy Services

Retention Policy Services (RPS) specify and enforce the retention of objects in the Documentum repository by attaching one or more retention policies to those objects. The retained objects, or records, are immutable: they cannot be changed or deleted for the duration of the retention policy. An additional “hold” capability retains documents according to ad hoc events such as an audit or litigation.

By applying policies to containers (such as folders) or processes (such as workflows or lifecycles), document retention is enforced programmatically, with little to no human involvement. The policies and automation tools can also be used for content disposition (permanent archiving or destruction), ensuring that files are appropriately disposed of and helping to limit content accumulation.
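
The interaction between a retention period, a hold, and disposition can be illustrated with a small model. This is a sketch under assumed names (`RetainedObject`, `retain_until`, `on_hold`), not the RPS object model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RetainedObject:
    # Hypothetical fields for the example.
    name: str
    retain_until: date
    on_hold: bool = False

    def is_immutable(self, today: date) -> bool:
        """Under retention or hold, the object cannot be changed or deleted."""
        return self.on_hold or today < self.retain_until

    def can_dispose(self, today: date) -> bool:
        """Disposition is possible only after retention ends and no hold applies."""
        return not self.is_immutable(today)
```

A hold placed for an audit or litigation therefore blocks disposition even after the scheduled retention date has passed, and releasing the hold makes the object eligible again.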

RPS enhances the standard Documentum controls along three important dimensions:

• Notifications: Notifies owners or authorities based on trigger events such as entry into, or completion of, a retention phase.

• Auditing: Audits and records the “before” and “after” of metadata changes during a recordkeeping action.

• Reporting: Provides report query engines with standard recordkeeping criteria and predefined recordkeeping reports.

Using RPS, organizations can meet compliance regulations, legal requirements, and best practices. RPS can be added independently to any supported Documentum environment. RPS is the retention engine powering the EMC Documentum Records Manager application.

Records Manager

EMC Documentum Records Manager extends the Documentum core content management capabilities by adding features and functionality such as corporate file plans, classification, and file-level and field-level security.

The Records Manager architecture provides recordkeeping functionality as services that can be used for electronic and physical records alike (see Figure 5). Like functionality is aggregated into discrete modules. By selecting the appropriate Records Manager modules, customers can deploy a records solution that meets their unique requirements. Customers can also add modules if and when their requirements change.



[Figure 5: diagram. Core Components of Records Manager: Security & Access Control, File Plan, Reporting & Auditing, Notification, Storage Mgmt., Physical/Paper. Complementary Offerings: Trusted Content Services, Info. Rights Management Services, Content Transformation Services, Content Intelligence Services, BPM Workflow, ECI Services. Underlying layers: Retention Policy Services, Documentum Platform.]

Figure 5. The Documentum records management capabilities support electronic documents, e-mail, and paper-based documents as managed records. These capabilities leverage the complementary offerings of the overall Documentum platform.

Records Manager leverages Retention Policy Services and the capabilities of the Documentum platform to provide records management capabilities in a modular fashion. The modules and their capabilities are described in Table 1.

Table 1. Records Manager modules and features

Records Manager module: Containment policies
Capabilities: Controls the number of tiers within the folder or file plan hierarchy and the actions that are permitted within each tier, such as a check-in or records declaration. Containment policies also allow or block recordkeeping governance by document type and limit the number of a record’s classifications, which is architecturally equivalent to the number of links associated with an object.

Records Manager module: File plan
Capabilities: Provides a permanent, systemwide classification schema for records, defining record naming, organization, and descriptive metadata, specified and managed by a records administrator. A document is overtly declared as a record by storing it in a location managed by the file plan, and classified using metadata specified by the file plan. Retention is defined by the classification.

Records Manager module: Naming policies
Capabilities: Configures the naming conventions for records and the file plan by controlling what attributes are used, what date format is enforced, whether human entries get validated, how names are dynamically generated, and more.

Records Manager module: Security policies
Capabilities: Extends existing Documentum security by adding document-level permissions that are discrete rather than cumulative. For example, “browse” permission can be granted to a certain user, group, or role for a specific document type such as invoices.

Records Manager module: Retention policies
Capabilities: Determines the length of time a document, folder, or cabinet is retained, based on operational, legal, regulatory, fiscal, or internal requirements. For the duration of its applied retention policy, the managed object cannot be deleted, nor can it be revised in any way, although a new version of the object may be checked in.

Records Manager module: Supplemental markings / Shared markings
Capabilities: Extends access controls by adding permissions based on participation in a designated group, and restricting permissions to users who are members of all designated groups.

The Records Manager modularity and service-oriented architecture make it easy to incorporate Documentum recordkeeping functionality into other systems, including external applications. Table 2 outlines these and other architectural principles.

Table 2. Architectural principles of Records Manager

Architectural principle: Modular architecture. Aggregates similar recordkeeping functionality within discrete, plug-and-play modules.
Why it matters: Simplifies and speeds deployment, enabling sites to install the functionality without complicating the configuration, administration, or user interface.
Example: Align the recordkeeping controls with your regulatory environment. Or start simple and add functionality when it becomes relevant.

Architectural principle: Multi-tier architecture. Separates business intelligence from the user interface.
Why it matters: Simplifies sharing and the incorporation of records functionality into external applications by relying exclusively on business logic and not the user interface. Provides greater efficiency when using or extending the provided APIs because changes only need to be made in one place; provides the business logic layer.
Example: Automate record declarations within line-of-business systems such as a contracts management application.

<strong>EMC</strong> <strong>Documentum</strong> <strong>Architecture</strong><br />

A Detailed Review 18


Architectural principles Why it matters Example<br />

Policy frameworks<br />

Tailor or enhance system<br />

behavior by adding business<br />

logic through <strong>the</strong> applied<br />

policy manager according to<br />

clear, st<strong>and</strong>ardized<br />

framework guidelines.<br />

Open interface<br />

Java-service <strong>and</strong> web-service<br />

based interfaces that extend<br />

existing <strong>Documentum</strong><br />

functionality while adhering<br />

to st<strong>and</strong>ard <strong>Documentum</strong><br />

practices.<br />

Simplifies extensions <strong>and</strong><br />

customizations; no developer needed.<br />

Enables customization based on<br />

multiple varied attributes, including<br />

policy qualifiers.<br />

Enables integration via web services<br />

without supporting Java.<br />

Adds Records Manager functionality<br />

to a <strong>Documentum</strong> environment<br />

without discarding or duplicating<br />

prior work.<br />

Add different notification<br />

recipients simply by adding a<br />

policy to <strong>the</strong> existing “action<br />

framework.”<br />

Apply policies by object type or<br />

o<strong>the</strong>r conditions. For example,<br />

applying different naming rules<br />

for different levels in <strong>the</strong> file plan.<br />

Or automating <strong>the</strong> appropriate<br />

record classification by document<br />

type, such as invoices or<br />

contracts.<br />

Enable partner applications or<br />

internal business systems to<br />

incorporate records declaration as<br />

a web service within <strong>the</strong><br />

application.<br />

Add records functionality, such as<br />

DOD security clearance levels, to<br />

current <strong>Documentum</strong> users.<br />

Core Content Services<br />

The Core Content Services provide the fundamental capabilities for accessing and storing repository content. These include library services, workflow services, lifecycle services, XML services, Enterprise Content Integration Services, Content Transformation Services, Content Intelligence Services, and Content Delivery Services.

Library services<br />

Library services manage content in three critical ways:<br />

• Check-in/check-out (or locking) capabilities ensure users with editing privileges do not overwrite one another’s versions or make incompatible updates. For example, when one person is editing a document, another person cannot overwrite their edits.

• Versioning capabilities track the multiple versions of documents or other content objects, and provide the ability to revert to prior versions as required. For example, the repository can maintain multiple versions of a set of web pages, and revert to a version from a prior date when needed.

• Basic renditioning capabilities maintain alternative representations of documents or other content objects in their different formats, resolutions, or natural languages. The Documentum platform can automatically generate renditions through embedded converters and maintain the relationship between the original object and its renditions, ensuring the object’s integrity and enabling users to manage renditions individually or collectively. For example, content initially authored as a Microsoft Word document can be rendered as a fixed-format Adobe Acrobat PDF file, or as an HTML-formatted web page with associated embedded image files.

Library services, in turn, rely on an extensive set of security services to determine how users or applications are authenticated and authorized to access repository content.
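The locking and versioning behavior described in the list above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Documentum code: the `Document` class, the `LockedError` exception, and the choice to record a revert as a new version are all hypothetical.

```python
class LockedError(Exception):
    """Raised when a user acts on a document locked by someone else."""


class Document:
    def __init__(self, content):
        self.versions = [content]      # index 0 holds the first version
        self.locked_by = None          # user currently holding the lock

    def check_out(self, user):
        if self.locked_by not in (None, user):
            raise LockedError(f"locked by {self.locked_by}")
        self.locked_by = user
        return self.versions[-1]

    def check_in(self, user, content):
        if self.locked_by != user:
            raise LockedError("check out the document before checking in")
        self.versions.append(content)  # earlier versions are kept, not overwritten
        self.locked_by = None
        return len(self.versions)      # new version number (1-based)

    def revert(self, version):
        # Assumption: a revert is stored as a new current version.
        self.versions.append(self.versions[version - 1])
```

While one user holds the lock, a second `check_out` call raises `LockedError`, which is the "cannot overwrite their edits" guarantee in miniature.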

Workflow services<br />

The Documentum workflow automates business activities and policies for repository content. A workflow is defined by a model, the sequence of steps that comprise the process, and the actions that must occur at each step. A workflow can describe a simple or complex process; it can be serial, with activities occurring one after another, or parallel, with all activities occurring simultaneously; and it can combine serial and parallel activities. Because an object’s workflow state is defined by a set of content attributes attached to the object, it travels with the object.

For example, a press release workflow might require an approval process involving five people and seven serial steps.

The Documentum platform persistently manages the state of multiple instances of each workflow (often hundreds or thousands of instances) by storing workflow objects in the Documentum repository. Similarly, workflow templates (definitions) are stored as repository objects so various services, such as security, versioning, and retention, can be applied.
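To make the serial and parallel distinction concrete, the following Python sketch models a workflow as an ordered list of stages, where each stage holds one activity (serial) or several activities that must all finish (parallel). The `WorkflowInstance` class and its method names are hypothetical, and the sketch assumes every stage contains at least one activity.

```python
class WorkflowInstance:
    """Hypothetical workflow state; each stage is a set of activities.

    A one-activity stage is a serial step. A multi-activity stage models
    parallel activities that must all finish before the next stage starts.
    """

    def __init__(self, stages):
        self.stages = [set(stage) for stage in stages]
        self.current = 0       # index of the active stage
        self.done = set()      # activities finished in the active stage

    def pending(self):
        if self.current >= len(self.stages):
            return set()
        return self.stages[self.current] - self.done

    def complete(self, activity):
        if activity not in self.pending():
            raise ValueError(f"{activity!r} is not pending")
        self.done.add(activity)
        if not self.pending():  # stage finished: advance to the next one
            self.current += 1
            self.done = set()

    @property
    def finished(self):
        return self.current >= len(self.stages)
```

A template such as `[["draft"], ["legal review", "marketing review"], ["publish"]]` then combines a serial step, a parallel review, and a final serial step.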

Lifecycle services<br />

The Documentum platform defines, maps, and implements flexible content lifecycle rules according to the business policies established by the enterprise.

Like workflow, an object’s lifecycle state is defined by a set of content attributes attached to the object, so it also travels with the object. But instead of being defined by a flexible workflow model, lifecycle services are defined by a set of business policies or business rules. While a workflow routes a document among various users and automatic tasks, lifecycles define the business rules for changes that apply to content as it moves through predefined stages (such as “draft,” “in review,” “active,” and “obsolete”). As you might expect, unlike workflow, each content object has only a single lifecycle.

Lifecycle services automate the lifecycle policies of repository content. These services assign a lifecycle stage to the content object, and then manage the object’s transition from one stage to another. An organization can extend the lifecycle stages to encompass its own operating policies (see Figure 6).



Figure 6. Lifecycle services assign a lifecycle stage to a content object and then manage the object's transition from one stage to another.

Lifecycle services are a powerful content management capability. Policies instigating changes in access control, logical and physical location, retention rules, labeling, naming, versioning, renditioning, and workflow and business processes can be mapped to the lifecycle stages. Different object types can have different lifecycle definitions.

For example, consider the lifecycles for press releases and patent applications.

• When a company develops a press release, any member of the corporate communications department may edit it prior to approval. Only marketing managers and product managers responsible for products mentioned in the press release may read the drafts. Once the press release is approved, all senior managers in the company can read it, but only the director of corporate communications can change it. When the final version is published on the company’s website, all prior (or draft) versions are automatically deleted from the repository after 30 days. These access policies are distinct from a workflow that routes the press release to the company managers who have to approve it before it can be promoted to the final stage.

• When a company creates a patent application, only designated researchers and staff attorneys can edit the content, while research directors and the corporate counsel can read it. Once the application is finalized and submitted to an external patent authority, other company researchers and managers can then read the application. All drafts of the application are automatically archived for seven years. The submitted version is automatically classified as a record and submitted to the company archives for perpetual storage in a secure storage environment.
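The press release example can be read as a small state machine in which each stage carries its own access policy. The Python sketch below is illustrative only: the stage names, group names, and the `ContentObject` class are assumptions made for this example, and the readers of the "published" stage are a guess, since the text does not specify them.

```python
# Hypothetical lifecycle definition for a press release.
# stage -> (next stage, groups allowed to edit, groups allowed to read)
LIFECYCLE = {
    "draft": ("approved", {"corp_comms"}, {"corp_comms", "product_managers"}),
    "approved": ("published", {"comms_director"}, {"comms_director", "senior_managers"}),
    "published": (None, set(), {"public"}),  # assumed readers, not from the text
}


class ContentObject:
    def __init__(self, name):
        self.name = name
        self.stage = "draft"   # an object is in exactly one stage at a time

    def promote(self):
        next_stage = LIFECYCLE[self.stage][0]
        if next_stage is None:
            raise ValueError(f"{self.stage!r} is the final stage")
        self.stage = next_stage

    def can_edit(self, group):
        return group in LIFECYCLE[self.stage][1]

    def can_read(self, group):
        return group in LIFECYCLE[self.stage][2]
```

Promoting the object changes who may edit and read it without any change to the object itself, which is the sense in which policies are mapped to lifecycle stages.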

XML services<br />

The Documentum platform provides a core set of XML services for managing XML documents in their native format.

XML documents are a special type of content: text files encompassing predefined sets of XML elements, where the elements are identified by XML-formatted tags. As an industry standard, XML itself is a platform-neutral, structured markup language that separates content from format. Content tagging, and separating content from formatting, have many benefits for content management, including enhanced content intelligence and content reuse; for instance, content can be queried by predefined tags and values to enhance searching precision. Content can also be stored as a single source, and then repurposed and rendered in multiple formats on various types of display devices. In recent years, XML has emerged as the lingua franca for automatically exchanging content among disparate applications running within web-based environments.



The Documentum platform preserves the hierarchical structure and links among XML components and documents. It provides the ability to automatically parse, validate, transform, map, and store incoming XML documents. It also supports XML applications that directly store XML-tagged content to (and manage the content within) the Documentum repository.

XML services provide two features that are essential for managing XML documents in their native format: XML content validation and XML chunking.

XML content validation<br />

XML content validation ensures that the XML elements within an XML document are well formed and conform to a predefined definition. The Documentum platform can validate XML documents at any time, including during ingestion into the repository.

An XML document can be validated against a Document Type Definition (DTD) or an XML schema. Extended validation is also available through the SAX2 and DOM interfaces. The validation process ensures that the components, attributes, structure, types, and values correspond to the specified format.

The Documentum platform also manages the DTDs and schemas as Documentum repository objects that can be versioned, secured, or retained as records.
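The two checks described above (well-formedness, then conformance to a definition) can be sketched with the Python standard library. Note the substitution: the standard library does not validate against a DTD or XML schema, so this sketch uses a hand-written table of required child elements as a simplified stand-in. The `REQUIRED_CHILDREN` table and the `validate` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: element name -> child elements it must contain.
REQUIRED_CHILDREN = {
    "pressrelease": ["headline", "body"],
}


def validate(xml_text):
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(xml_text)   # raises if the XML is not well formed
    except ET.ParseError as err:
        return [f"not well formed: {err}"]
    problems = []
    for element in root.iter():
        for child in REQUIRED_CHILDREN.get(element.tag, []):
            if element.find(child) is None:
                problems.append(f"<{element.tag}> is missing <{child}>")
    return problems
```

A production system would hand the second step to a validating parser driven by the DTD or schema stored in the repository; the control flow, however, stays the same.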

XML chunking<br />

When an XML document is segmented (or chunked) into its elements, the resulting chunks are managed separately as discrete content objects. These chunks are just like other content objects: each has its own predefined security levels and content attributes, as shown in Figure 7.

Figure 7. XML chunks are managed as discrete objects, just like any other content object in the Documentum repository.

Chunking facilitates reuse. A set of discrete content objects can be combined and rendered in different contexts to meet various business situations. The chunks are components of virtual documents. For instance, a set of news-related headlines can be displayed as a news summary, while each headline can be paired with the relevant news-related paragraphs to produce a press release.
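As a rough illustration of chunking and reuse, the Python sketch below splits a small XML document into standalone chunks and then reassembles only the headlines into a summary. The sample document, the `chunk` function, and the use of an `id` attribute as the chunk key are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SOURCE = """<news>
  <item id="a"><headline>Q1 results</headline><para>Revenue grew.</para></item>
  <item id="b"><headline>New office</headline><para>Opening in May.</para></item>
</news>"""


def chunk(xml_text, tag):
    """Split a document into standalone XML strings, one per <tag> element."""
    root = ET.fromstring(xml_text)
    return {el.get("id"): ET.tostring(el, encoding="unicode").strip()
            for el in root.iter(tag)}


# Each chunk could now be stored and secured as its own content object.
chunks = chunk(SOURCE, "item")

# Reuse: build a headline-only summary from the stored chunks.
summary = [ET.fromstring(c).findtext("headline") for c in chunks.values()]
```

The same chunks could equally be recombined with their paragraphs to render the full articles, which is the reuse pattern the text describes.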



ECI Services for federated search<br />

The Documentum platform includes technologies and services to integrate, access, and query content beyond the information stored within a Documentum repository. These federated search services are based on Enterprise Content Integration (ECI) technology, which uses a framework of adapters for various internal and external repositories. Federated search is useful when interacting with information stored in third-party (non-Documentum) repositories and external websites.

The Documentum platform relies on federated search for cross-repository searches as well as to query and retrieve content from external information sources, including:

• FileNet, Open Text, Microsoft SharePoint, IBM Lotus Notes, and content stores from other vendors

• SAP, Oracle, and other enterprise application vendors

• Lexis/Nexis and Factiva infobases and other dynamic web-based content environments

• Static intranets accessed by third-party search environments such as the Autonomy search engine and the Google enterprise search appliance

• Desktop search engines provided by Google and any online search engine such as Google, Yahoo, and Voila

ECI Services use an adapter framework and a query-brokering environment to enable these federated search capabilities (see Figure 8). Each information source gets a unique adapter that maps the content-related metadata defined within the external information source into a schema supported by the Documentum platform.

Figure 8. ECI Services are based on an adapter framework to enable federated search capabilities.

ECI Services function through a two-step process. First, the ECI query broker maps a query into a format supported by an external information source and then submits the query to the source. Then the query processor receives the requested information from the external source, extracts the metadata, filters the response, and returns the results.



Users can simultaneously submit a single query to multiple information sources via any client, receive the results from the multiple query processors interacting with the external sources, and merge the results into a single set based on predefined criteria (such as relevance or date published).
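The adapter and query-broker pattern described above can be sketched as follows. This is a toy model, not the ECI Services interface: the `Adapter` class, the in-memory `records` lists that stand in for remote sources, and the `federated_search` function are hypothetical. Each adapter maps its source-specific field names into one common schema, and the broker queries all adapters concurrently and merges the results by date.

```python
from concurrent.futures import ThreadPoolExecutor


class Adapter:
    """Hypothetical adapter: runs a query and normalizes the results."""

    def __init__(self, source, records, title_field):
        self.source = source
        self.records = records          # stand-in for a remote repository
        self.title_field = title_field  # source-specific metadata name

    def search(self, term):
        hits = [r for r in self.records
                if term.lower() in r[self.title_field].lower()]
        # Map source-specific metadata into one common schema.
        return [{"source": self.source, "title": r[self.title_field], "date": r["date"]}
                for r in hits]


def federated_search(adapters, term):
    with ThreadPoolExecutor() as pool:  # query every source concurrently
        result_sets = list(pool.map(lambda a: a.search(term), adapters))
    merged = [hit for hits in result_sets for hit in hits]
    # Predefined merge criterion (assumed here): newest first.
    return sorted(merged, key=lambda h: h["date"], reverse=True)
```

Sorting by relevance instead of date would only change the final `sorted` key, which is why the merge criterion is kept separate from the adapters.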

Content Transformation Services<br />

Documentum provides a suite of Content Transformation Services (CTS) for changing various kinds of content (such as documents, photos, video, and medical images) into different formats and resolutions. CTS also provides content analysis, metadata extraction, and thumbnailing for rich media content types.

Content Transformation Services (see Figure 9) are built as self-contained modules for accomplishing specific tasks. Some of the modules include:

• Document Transformation Services (DTS): Supports document transformations, such as rendering MS Office documents as PDF and HTML files. DTS runs as a separate server-side process without requiring user authentication. The transformation can be triggered by users from the user interface or automatically by a business process or lifecycle stage change.

• Advanced Document Transformation Services (ADTS): Extends DTS by adding support for additional document formats: Microsoft Project, Microsoft Visio, AutoCAD, and multi-page TIFF. ADTS creates bookmarks and preserves links within documents and supports many advanced options for controlling PDF output formats. ADTS includes an active storyboard capability for browsing directly through PDF documents stored in the Documentum repository.

• XML Transformation Services (XTS): Features extensive XML format transformations, an eXtensible Stylesheet Language Transformations (XSLT) engine with full XSL-FO support, a style sheet tool kit, and XML schema transformation support. XTS transforms XML to popular web formats (such as HTML), mobile formats (such as WML, cHTML, and XHTML Basic), Portable Document Format (PDF), help file formats (such as JavaHelp, Microsoft WinHelp, and Microsoft Compiled HTML Help), Rich Text Format (RTF), and PostScript. The tool kit provides support for Darwin Information Typing Architecture (DITA) and DocBook standards. XTS can convert XML from one schema to another, invoked by workflows, lifecycles, user-based actions, or other applications.

• Documentum Regulatory Publishing Transformation Services: Delivers enhanced PDF transformation capabilities in support of the electronic common technical document (eCTD) specification submission process. These services expose advanced transformation options for the creation of PDF files.

• Media Transformation Services (MTS): Provides rich media transformations and analysis for static digital assets, including photos, scanned images, and Microsoft PowerPoint slide decks. MTS can read and write metadata associated with digital assets, such as Adobe XMP tagging technology. MTS includes capabilities for automatically managing PowerPoint slides as discrete objects, as well as extracting thumbnails and low-resolution images from high-resolution assets. As a result, digital assets can be centrally managed and reused in different contexts. MTS configuration capabilities can integrate the Documentum platform’s support of rich media repositories with the underlying content storage infrastructure.

• Audio/Video Transformation Services: Extends the capabilities of MTS to support multiple audio, video, and animation formats. These services also integrate streaming media storage and delivery into the content storage infrastructure.

• Medical Imaging Transformation Services: Extends MTS to support the metadata access, management, and storage of medical images. This service supports the Digital Imaging and Communications in Medicine (DICOM) standard, a predefined set of metadata for storing medical images within the Documentum repository.



Figure 9. Application developers can use the Content Transformation Services modular, plug-in architecture to develop and deploy new transformation services.
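A modular, plug-in transformation architecture of this kind is often built around a registry keyed by source and target format. The Python sketch below shows the general idea only; it does not reflect the actual CTS plug-in interface. The `TRANSFORMERS` registry, the `transformer` decorator, and the `"txt"` and `"html"` format labels are hypothetical.

```python
from html import escape

TRANSFORMERS = {}   # (source format, target format) -> conversion function


def transformer(source, target):
    """Decorator that registers a plug-in for one format conversion."""
    def register(func):
        TRANSFORMERS[(source, target)] = func
        return func
    return register


@transformer("txt", "html")
def text_to_html(content):
    # One <p> element per blank-line-separated paragraph.
    paragraphs = [p.strip() for p in content.split("\n\n") if p.strip()]
    return "".join(f"<p>{escape(p)}</p>" for p in paragraphs)


def transform(content, source, target):
    func = TRANSFORMERS.get((source, target))
    if func is None:
        raise ValueError(f"no transformer registered for {source} -> {target}")
    return func(content)
```

Adding support for a new format pair means registering one more function; the `transform` dispatcher and its callers stay unchanged.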

Content Intelligence Services<br />

Content Intelligence Services (CIS) analyze the text within documents and other content objects, automatically classifying the content assets; put another way, CIS determines what the text is about. The results of the classification can be used to automatically populate the content metadata or to map the content assets into a taxonomy.

CIS uses linguistic algorithms to analyze content, utilizing content-related terms, keywords, and attributes related to the information domain of an enterprise. CIS aggregates content from disparate sources, runs it through a parser, and uses three engines to analyze the resulting text, as shown in Figure 10.



Figure 10. Content Intelligence Services analyze the text within documents and other content objects, and automatically classify the content assets.

The three analysis engines are:

• Information Extraction Engine: Extracts tags, content properties, and text from the parsed content and generates metadata; is itself further refined by the other two engines.

• Conceptual Classification Engine: Relates the parsed content to predetermined categories or conceptual taxonomies.

• Semantic Analysis Engine: Analyzes the content based on enterprise-specific taxonomies or other semantic considerations.

CIS produces a list of concepts contained within the set of documents or other content objects. These concepts can improve search accuracy as well as provide the ability to automatically categorize the repository.
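The idea of mapping content into a taxonomy can be illustrated with a deliberately simple keyword counter. This is far cruder than the linguistic algorithms the text attributes to CIS and is not how CIS is implemented; the `TAXONOMY` table, the `classify` function, and the threshold value are assumptions for this sketch.

```python
import re
from collections import Counter

# Hypothetical taxonomy: category -> terms that signal it.
TAXONOMY = {
    "legal": {"patent", "contract", "counsel"},
    "finance": {"invoice", "revenue", "budget"},
}


def classify(text, threshold=2):
    """Return categories whose terms occur at least `threshold` times."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {category: sum(words[term] for term in terms)
              for category, terms in TAXONOMY.items()}
    return sorted(c for c, score in scores.items() if score >= threshold)
```

The returned category names could then populate a metadata attribute on the content object, which is the "automatically populate the content metadata" step in the text.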

Content Delivery Services<br />

The Documentum platform provides sophisticated content deployment and delivery services to supply content to web server farms, enterprise portals, and application servers. Distribution can be based on sets of business rules or queries, which define the frequency of updates and the content to be distributed. The platform can support discrete sets of distribution rules for each environment.

The Documentum platform can be integrated with (and supply content to) a wide variety of network-accessible application, personalization, portal, and e-commerce servers from enterprise vendors such as BEA, IBM, Microsoft, Oracle, Sun, and SAP.

Site Caching Services<br />

The Documentum platform includes Site Caching Services, which add flexibility for distributing content to disparate delivery environments. Managers of these external environments can rely on the Documentum platform’s versioning, workflow, lifecycle, and other content management capabilities to maintain the content within their applications.

Site Deployment Services<br />

Documentum Site Deployment Services complement Site Caching Services by automatically delivering content to multiple external web servers or web server farms. If the content cannot be delivered as scheduled, these services also support rollback with self-repair (see Figure 11).

Figure 11. The Documentum platform includes Site Caching Services and Site Deployment Services for content delivery to web and application servers.

The Documentum platform can cache predefined sets of documents or other content objects (including both the content assets and the content attributes, or metadata) on intermediate servers in a high-speed, optimized cached repository. Applications can then access these attributes and assets to automatically personalize and incorporate enterprise-managed content.
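The rollback behavior mentioned for Site Deployment Services can be pictured as an all-or-nothing push to a set of targets. The Python sketch below is a conceptual model only; the `deploy` function, the dictionary of targets, and the `fail_on` parameter (used purely to simulate an unreachable server) are hypothetical.

```python
def deploy(targets, new_content, fail_on=()):
    """Push content to every target; restore all targets if any push fails.

    `targets` maps a server name to its currently published content.
    `fail_on` names servers whose push should fail (demonstration only).
    """
    snapshot = dict(targets)          # remember the pre-deployment state
    try:
        for name in targets:
            if name in fail_on:
                raise ConnectionError(f"cannot reach {name}")
            targets[name] = new_content
    except ConnectionError:
        targets.clear()
        targets.update(snapshot)      # roll back every target
        return False
    return True
```

After a failed deployment every server still serves the previous content, so the web farm never exposes a mix of old and new versions.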

Process Services<br />

The Process Services capabilities of the Documentum platform include Collaborative Services (capabilities for managing shared workspaces) and business process management (a set of products for managing business processes across the enterprise).

Collaborative Services<br />

The Documentum platform provides Collaborative Services based on a set of six collaborative objects: rooms, discussion threads, contextual folders, notes, calendars, and data tables.

• Rooms are shared, ad hoc workspaces that have their own membership lists and ownership. Only users listed as members can access a room and the content stored within. Rooms support internal and external users. Members can be external to the organization and not otherwise authenticated to access the Documentum platform.

• Discussion threads are a collection of messages organized around a predefined topic. A discussion thread can be attached to any other object stored within the Documentum repository, such as a document or a collection of documents stored within a folder.

• Contextual folders collect and organize content within a collaborative environment, providing additional information about the purpose of a folder. This descriptive information can appear as a banner headline or as a “mini-help” environment within the context of a folder display.

• Notes are web-based text files, stored in the repository, that maintain the context (and links) to related objects. For example, a note can be a comment on a paragraph within a document, an annotation for a document as a whole, or a summary for a set of documents stored within a folder.

• Calendars provide the capabilities for members to organize, track, and schedule events for their teams.

• Data tables are an easy way to collect information via a form, and then organize the resulting fielded entries in a tabular form. Each row in the data table is an object within the repository, and can be routed through a workflow for review and approval. Notes and discussion threads can also be attached to the row.

These collaborative objects are stored just like other content objects within the Documentum repository. They are managed with various repository services including check-in/out, search, workflow, retention, security, and lifecycle.

Collaborative Services support subscriptions. Members can subscribe to any object of interest within a room (such as all the items in a folder or a particular discussion thread) and then automatically receive notifications when information related to the object changes.
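Subscriptions of this kind follow the familiar observer pattern. The Python sketch below is a minimal model under stated assumptions: the `Room` class, its in-memory `inbox`, and the choice not to notify the author of a change are all hypothetical and are not taken from the Collaborative Services interfaces.

```python
from collections import defaultdict


class Room:
    """Hypothetical collaborative room with per-object subscriptions."""

    def __init__(self, members):
        self.members = set(members)
        self.subscribers = defaultdict(set)   # object id -> subscribed members
        self.inbox = defaultdict(list)        # member -> notifications received

    def subscribe(self, member, object_id):
        # Only room members may subscribe to objects in the room.
        if member not in self.members:
            raise PermissionError(f"{member} is not a member of this room")
        self.subscribers[object_id].add(member)

    def update(self, object_id, change, author):
        # Assumption: notify every subscriber except the author of the change.
        for member in sorted(self.subscribers[object_id] - {author}):
            self.inbox[member].append(f"{object_id}: {change}")
```

Because subscriptions are keyed by object identifier, the same mechanism covers a folder, a discussion thread, or any other object in the room.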

Collaborative Services provide the services-oriented interfaces to call the collaborative objects. In turn, Collaborative Services can be combined with related platform services. For instance, a discussion thread accompanying the authoring and editing of a patent application can automatically be managed as a record and be subject to the identical retention policies as the draft patent documents themselves.

Business process management

The Documentum platform provides a complete suite of BPM products, “the Documentum Process Suite,” that manages the complete lifecycle of business processes across the enterprise (see Figure 12). The suite supports continuous business performance improvement methodologies. It orchestrates processes that extend beyond Documentum to external systems, data sources, and applications.

Process Suite combines a process engine and a business activity monitoring (BAM) engine, in addition to the core content repository, to deliver extensive BPM capabilities. Since the suite is based on the unified architecture of the Documentum platform, it easily handles any type of content (from e-forms and XML documents to compound documents and rich media) as the process payload.



[Figure 12 diagram. Labeled components: Process Analyzer (modeling, simulation, analysis); Process/Forms Builder (configure services, workflow, tasks, forms); BAM dashboards and analytical reports; BPM Definition API; TaskSpace, Forms, and Admin Portal as Documentum WDK web apps on a web application server; BPM Execution API; BAM Engine on a J2EE app server (automated alerts and actions, monitoring, reports); Documentum Server (library services, object management) with RDBMS and Documentum repository; event pipe carrying operational and data events; Process Engine on a J2EE app server (process orchestration, work queue management, timers and deadline management); Process Integrator (SOA, adaptors, event correlation, messaging; integration events and data flows) connecting to web services, ESB, and data source adaptors for SAP, Siebel, IBM, ….]

Figure 12. The Documentum platform provides a suite of BPM products to manage content-intensive business processes across the enterprise. The business activity monitoring (BAM) engine monitors critical aspects of the business processes and provides up-to-date reports. The Business Process Engine runs and manages the end-to-end processes, and integrates with external applications through a SOA framework. All of the content is stored and managed within the repository.

Process Suite supports a graphical, object-oriented business process design environment. The Process Builder specifies the flow of content from activity to activity, as well as the logic that determines the sequence of activities. The processes and activities are reusable and fully distributed. The Process Builder supports global structured data types as part of its underlying data model. Consequently, structured data can be incorporated as a lightweight data type into the operation of the process models, and exposed by the reporting tools.

At runtime, the Business Process Engine interacts with repository content, following the steps in a business process as defined by the Process Suite. The Business Process Engine collects information from a browser-based form or a Simple Object Access Protocol (SOAP) service, then runs a set of process-oriented services. The Process Engine includes persistent state management, queue management services, an automated task framework, timers/deadline services, and audit tracking, data collection, and aggregation services to structure the predefined sequence of actions and activities that constitute the business process.
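The idea of a process as a set of activities plus the logic that chooses the next activity can be illustrated with a few lines of code. This is a conceptual sketch only, not the Process Engine's implementation: the function `run_process`, the expense-approval activities, and the 1000 threshold are all assumptions made up for the example.

```python
def run_process(activities, transitions, start, payload):
    """Run activities in order; each transition picks the next activity from the payload."""
    audit_trail = []
    current = start
    while current is not None:
        activities[current](payload)             # perform the activity on the payload
        audit_trail.append(current)              # simple audit tracking
        current = transitions[current](payload)  # logic that determines the sequence
    return audit_trail


# Hypothetical expense-approval process.
activities = {
    "submit": lambda p: p.update(status="submitted"),
    "review": lambda p: p.update(status="reviewed"),
    "approve": lambda p: p.update(status="approved"),
    "reject": lambda p: p.update(status="rejected"),
}
transitions = {
    "submit": lambda p: "review",
    "review": lambda p: "approve" if p["amount"] <= 1000 else "reject",
    "approve": lambda p: None,
    "reject": lambda p: None,
}

claim = {"amount": 250}
trail = run_process(activities, transitions, "submit", claim)
```

A production engine would additionally persist the state between activities, manage work queues, and enforce timers and deadlines, as the paragraph above describes.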

The Process Suite supports an extensible business process management environment, in which third-party tools (such as the ILOG Rules engine, the Cognos analytics engine, and the IDS Scheer optimizer/simulator) can be added.

The result is a robust business process management environment that leverages managed content and structures the flow of content across an enterprise.



The tools group: Creating content applications

The Documentum platform includes a tools group that provides access to the repository content and to all platform-level services. This group consists of predefined components and associated application programming interfaces (APIs) for enabling customizations, integrations, and application development. In addition, the APIs are abstracted and exposed as loosely coupled interactive components within a service-oriented architecture (SOA). The ECM capabilities are exposed as a comprehensive catalog of shared services and web services.

This group provides a consistent set of APIs, and a unified object and programming model. Application developers can use these components and APIs to develop client-side and server-based applications that interact with repository content. They can leverage composite objects that aggregate content-related functions to rapidly develop integrated enterprise applications. Application developers can combine content management services and objects with other enterprise application functions to exploit the flexibility of a SOA development framework.

Enterprise Content Services and the Documentum API

The Tools group encompasses the Enterprise Content Services (ECS), standards-based APIs, and an extensible set of business objects for developing and deploying content applications (see Figure 13).

[Figure 13 diagram. Programming interfaces: web services-based Enterprise Content Services (Documentum Compliance Services**, Documentum Collaboration Services**, and Documentum Foundation Services (DFS)) and standards-based API protocols (WebDAV, FTP, ODBC, JDBC), shown with the application layer and customization layer above DFC/BOF. ** Future services bundle within the ECS family.]

Figure 13. The Tools group encompasses the Documentum Foundation Classes (DFC), standards-based APIs, and an extensible set of business objects for developing and deploying content applications. Documentum Foundation Services (DFS) expose Documentum content management functionality as web services. DFS is the first set of business objects within Enterprise Content Services, the Documentum service-oriented architecture for integrating with external applications in a standards-compliant manner.

ECS encapsulates the core content management functions of the Documentum platform as a set of discrete web services, and exposes these functions as business objects. ECS is designed to make content applications easier to develop and support. ECS promotes reuse and reduces the learning curve for developers. By collecting common services into business-related objects, ECS is designed to provide greater agility for application developers to meet the demands of rapidly evolving business environments.

The Tools group delivers the Documentum Foundation Services (DFS) as the initial set of ECS objects. DFS is a set of SOA-compliant objects and services for developing content applications within a web services framework. Documentum Compliance Services and Documentum Collaboration Services are forthcoming sets of ECS objects to be delivered within the Documentum platform.

Documentum Foundation Services

Documentum Foundation Services (DFS) is an SOA development framework and API. This framework replaces and significantly enhances the previous web services framework. DFS delivers a set of out-of-the-box business objects and services, designed from the ground up, to expose key content management functionality as standards-compliant web services. DFS ensures that the Documentum platform can function as an integral part of an organization’s information infrastructure, developed using web services.

Loosely coupled components

DFS provides content-related services that are loosely coupled and can be dynamically assembled to meet business needs. These are self-contained services: modifying or enhancing functions within one service does not affect others.

DFS is based on web services, a standards-based software environment recognized by the World Wide Web Consortium (W3C) and designed to support interoperable machine-to-machine interaction over a network. DFS components are registered and discovered through a central registry or directory (such as the Universal Description, Discovery, and Integration Directory). DFS components are described in terms of Web Services Description Language (WSDL). Each DFS component offers a small range of simple services to other components.

DFS provides:

• A set of core and extended services, implemented as web services, that expose Documentum content management functionality.

• A Java SDK to enable development of service consumers using client runtime support, and development of custom services based on Plain Old Java Objects (POJOs), or Service-based Business Objects (SBOs) using service runtime support.

• A WSDL service interface to enable development of service consumers using development platforms that support SOAP messaging, including .NET.
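Whatever platform a consumer is built on, a SOAP-based call ultimately travels as an XML envelope. The sketch below builds a generic SOAP 1.1 envelope with the Python standard library to show the shape of such a message. The service namespace, the operation name `get`, and the `objectId` parameter are placeholders chosen for illustration; the actual message structure for any DFS service is defined by that service's WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:content-service"             # hypothetical service namespace


def build_soap_request(operation, params):
    """Return a SOAP 1.1 envelope (bytes) that calls `operation` with simple parameters."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


request_xml = build_soap_request("get", {"objectId": "hypothetical-id-001"})
```

In practice a consumer would rarely assemble envelopes by hand; tooling generated from the WSDL produces typed client stubs that do this work.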

The design of the DFS services and data model simplifies the process of enterprise application development by reducing the overall complexity of the API and aligning the semantics of both services and data objects to the needs of ECM business logic. This supports rapid, agile application development using business process orchestration tools (such as BPM), and facilitates integration of enterprise content management into a service-oriented enterprise (SOE).

DFS services also honor BOF objects. Thus the services can call and invoke predefined objects when integrating with the Documentum repository.

Out-of-the-box services

Documentum delivers a core set of services that represent essential functions of a generic ECM system. Each service provides a set of independent operations. The object service, for example, provides basic content management functionality in operations such as “create,” “get,” “update,” and “delete.” The current DFS services and their related functions are listed in Table 3.



Table 3. DFS services and functions

Service: Description

Object: Fundamental ECM operations for creating, getting, updating, and deleting repository objects, as well as copy and move operations

Version control: Operations that produce and control versions within the repository, such as check-in and check-out

Query: Operations for obtaining data from repositories using ad hoc queries, such as pass-through, cache query, results, and query builder

Schema: Operations that examine repository metadata

Search: Operations that concern full-text and property-based searches against both the enterprise repository and external information sources

Workflow: Operations that obtain data about workflow process templates stored in repositories, and an operation that starts a workflow process instance
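To make the object service's four basic operations concrete, the following sketch models them against an in-memory dictionary. It is a teaching stand-in, not the DFS object service: the class `ObjectService`, its method signatures, the `obj-N` identifiers, and the property names are all assumptions for illustration.

```python
import itertools


class ObjectService:
    """Hypothetical stand-in for an object service; names are illustrative only."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def create(self, properties):
        object_id = f"obj-{next(self._ids)}"
        self._objects[object_id] = dict(properties)
        return object_id

    def get(self, object_id):
        return dict(self._objects[object_id])  # return a copy of the properties

    def update(self, object_id, changes):
        self._objects[object_id].update(changes)

    def delete(self, object_id):
        del self._objects[object_id]


service = ObjectService()
doc_id = service.create({"object_name": "Draft patent", "status": "draft"})
service.update(doc_id, {"status": "in review"})
snapshot = service.get(doc_id)
service.delete(doc_id)
```

Each operation is independent of the others, which mirrors the description of services as sets of independent operations.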

Application developers can develop rich Internet applications by linking the content services, provided by DFS, with web services from external applications and frameworks, and thus provide content-enabled solutions that leverage enterprise content in new ways.

Documentum Composer

Application developers can use EMC Documentum Composer, an Eclipse-based integrated development environment (IDE), for developing, deploying, and configuring applications running on the Documentum platform. By leveraging a standards-based IDE, developers increase their productivity while reducing the cost of application development. Eclipse enables an ecosystem of customers, partners, and business analysts.

Documentum Composer supports a series of mechanisms for rapid application development. It includes a well-defined plug-in model for adding functionality to the application environment. As an Eclipse-based IDE, Composer integrates with the broad range of application resources (and their varied editors) available to developers within their enterprise computing environments. Composer enables multiple tools to share a common set of information resources. It provides an open environment, with well-defined interfaces and extension points. As a result, application developers can leverage their investments in DFS and ECS; they can easily develop web services-oriented applications that integrate content-related objects with resources and services of external enterprise applications.

EMC Documentum Foundation Classes

Documentum Foundation Classes (DFC) is the published and supported programming interface for accessing the functionality of the Documentum platform. DFC exposes the Documentum object model as an object-oriented library for other applications to use. DFC provides Java and component object model (COM) class libraries that expose the functions that drive the Documentum platform.

Application developers can use programming languages and development tools, including Java, Visual Basic, C#, and C++, to build customized applications.



Standards-based application programming interfaces

EMC Documentum provides a unified application development environment. In addition to the DFC, the Documentum platform supports a number of standards-based APIs through which authoring applications, capture devices, third-party databases, application servers, and other enterprise components can interact with the Documentum repository. These standards-based APIs interact seamlessly with the DFC, both calling it and being called by it. Application developers can choose the APIs that best suit their applications.

The standard APIs include JDBC, WebDAV, file transfer protocol (FTP), and operating system-level, network file services (File Share Services). They are described as follows:

• JDBC: Many server-based applications use standard Java data access protocols to access repository content through Documentum JDBC Services. These services act as a driver that makes the Documentum repository “look like” a JDBC database.

• WebDAV: The Documentum platform supports a WebDAV server that enables WebDAV-aware desktop applications, such as Adobe Photoshop and Adobe InDesign, to use this protocol to interact with the Documentum repository.

• FTP: The Documentum platform provides an FTP server that enables third-party tools, such as Macromedia Dreamweaver, to integrate with the Documentum repository.

• File Share Services: The Documentum platform supports network-level file sharing services, enabling the Documentum repository to “look like” a shared network drive to disparate desktop applications. These applications can then use their own file system access mechanisms to access and add content to the Documentum repository.
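As an illustration of what a WebDAV-aware client does on the wire, the sketch below constructs a standard WebDAV `PROPFIND` request using the Python standard library. `PROPFIND`, the `Depth` header, and the `DAV:` namespace come from the WebDAV specification; the host name and folder path are placeholders, and nothing here is specific to how a Documentum WebDAV server lays out its content. The request is built but not sent.

```python
import urllib.request

# Hypothetical WebDAV endpoint; the host and path are placeholders.
FOLDER_URL = "http://dav.example.com/repository/Marketing/"

PROPFIND_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:prop>'
    b"<D:displayname/><D:getlastmodified/>"
    b"</D:prop></D:propfind>"
)


def build_propfind(url):
    """Build (but do not send) a PROPFIND request that lists a folder's direct children."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        headers={"Depth": "1", "Content-Type": "application/xml"},
    )


request = build_propfind(FOLDER_URL)
# urllib.request.urlopen(request) would send it to a real server.
```

Because the protocol is standard, any WebDAV client can issue the same request without knowing anything about the repository behind the server.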

Business Objects Framework

The Documentum platform includes a Business Objects Framework (BOF), a structured environment for developing content applications. The BOF shields application developers from the implementation details of the platform’s granular DFC and the underlying object model on which the DFC is based. Thus BOF enables application developers to easily develop highly reusable components that can be shared by multiple applications.

BOF functions by abstracting the Documentum APIs and aggregating sets of these APIs into a business logic layer. BOF provides a way to develop reusable business logic components, called business objects. (Business objects are entities with predefined classes and properties, or attributes, and can have unstructured content associated with them.) The BOF can implement business logic as reusable components that can be plugged into middle-tier network applications or client applications. These business objects combine presentation and business logic with direct access to all Content Services.

Types of business objects

The Documentum platform supports several types of business objects.

• A type-based business object is tightly linked to an object type stored in the Documentum repository. Application developers can add additional methods to the built-in or configured object type. Examples include “catalog,” “product,” “contract,” and “customer.”

• A service-based business object provides methods that perform more generalized procedures not usually bound to a specific object type or repository. Rather, such objects represent a collection of functions that may operate on other kinds of business objects. Examples include “mailbox alert,” “catalog export,” and “syndicate” service.

A method associated with a business object can be called by other DFC-based applications. JSP, ASP, Visual Basic, and other languages can access both types of business objects.
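The difference between the two kinds of business object can be sketched in a few lines. The example below is a conceptual model, not BOF code: the `Contract` and `CatalogExportService` classes, their methods, and the review threshold are hypothetical, chosen to echo the “contract” and “catalog export” examples above.

```python
class Contract:
    """Type-based business object: behavior bound to one object type (hypothetical)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def requires_legal_review(self):
        # Type-specific business rule; the threshold is an arbitrary example.
        return self.value > 100_000


class CatalogExportService:
    """Service-based business object: a general procedure not bound to one type."""

    def export(self, business_objects):
        # Works on any object that exposes a `name` attribute.
        return [obj.name for obj in business_objects]


contracts = [Contract("Supply agreement", 250_000), Contract("NDA", 0)]
needs_review = [c.name for c in contracts if c.requires_legal_review()]
exported = CatalogExportService().export(contracts)
```

The type-based object carries rules that only make sense for contracts, while the service-based object operates on whatever business objects it is handed.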



Aspects

The Documentum platform supports aspects, an additional framework for extending object behavior and attributes. Aspects are a type of BOF entity that can be dynamically attached to object instances, to provide fields and methods beyond the standard ones for the object type. The extended behavior can include functionality that applies to types across the object hierarchy.

For example, an aspect can label objects as retainable or web-viewable. This single aspect can then be applied to multiple distinct object types. Aspects speed application development and improve code reuse, because the extended attributes and behavior do not alter the underlying type definitions.

Aspects can be associated with either an individual object or an object type. When associated with an object type, the aspect is automatically associated with each new object of the specified object type. Aspects can also have properties defined for them. Properties defined for an aspect appear to users as if they are defined for the object type of the object to which the aspect is attached.
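The following sketch models the aspect idea in plain Python: extra properties and methods are attached to an individual object at run time and then read as if they belonged to the object itself. The classes `Aspect` and `RepositoryObject`, the `retainable` aspect, and its `retention_years` property are hypothetical illustrations and do not reflect how the platform implements aspects.

```python
class Aspect:
    """Hypothetical aspect: a named bundle of extra properties and methods."""

    def __init__(self, name, properties, methods):
        self.name = name
        self.properties = properties
        self.methods = methods


class RepositoryObject:
    def __init__(self, type_name):
        self.type_name = type_name
        self._aspects = {}
        self._aspect_values = {}

    def attach(self, aspect):
        self._aspects[aspect.name] = aspect
        self._aspect_values.update(aspect.properties)  # per-instance copies of defaults

    def __getattr__(self, name):
        # Called only when normal lookup fails: fall back to attached aspects.
        values = self.__dict__.get("_aspect_values", {})
        if name in values:
            return values[name]
        for aspect in self.__dict__.get("_aspects", {}).values():
            if name in aspect.methods:
                return lambda *args: aspect.methods[name](self, *args)
        raise AttributeError(name)


retainable = Aspect(
    "retainable",
    properties={"retention_years": 7},
    methods={"is_expired": lambda obj, age_years: age_years > obj.retention_years},
)

invoice = RepositoryObject("invoice")
photo = RepositoryObject("photo")
invoice.attach(retainable)  # attached per instance; the type definitions are untouched
```

Here `invoice` gains the aspect's property and method while `photo`, which has no aspect attached, is unaffected. The same `retainable` aspect could be attached to objects of any other type.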

The experiences group: Managing the end user’s interactions

The Web Development Kit framework

The Documentum platform includes a Web Development Kit (WDK), an application development framework for developing web-based clients and user applications. The Documentum platform also uses the WDK to provide a series of Application Connectors for integrating Documentum functionality within Word, Excel, PowerPoint, and Documentum Client for Outlook, as well as portlets for exposing Documentum functionality from within a portal.

The WDK framework provides application developers with a consistent and unified environment for creating web-based applications accessing the Documentum repository. The WDK framework relies on a form-control-event approach, consistent with .NET WebForms and the JavaServer Faces standard (JSR 127).
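A form-control-event model can be summarized as: a form holds named controls, each control registers handlers for events, and the framework routes an incoming user action to the matching handler. The sketch below shows that general pattern only. The `Form` and `Control` classes, the `checkin_button` control, and the handler are invented for illustration and are not WDK classes.

```python
class Control:
    """Hypothetical UI control that raises named events to registered handlers."""

    def __init__(self, name):
        self.name = name
        self._handlers = {}

    def on(self, event, handler):
        self._handlers[event] = handler

    def fire(self, event, **args):
        return self._handlers[event](self, **args)


class Form:
    """Hypothetical form that owns controls and routes events to them."""

    def __init__(self):
        self.controls = {}

    def add(self, control):
        self.controls[control.name] = control
        return control

    def dispatch(self, control_name, event, **args):
        # The framework routes a user action to the handler on the named control.
        return self.controls[control_name].fire(event, **args)


form = Form()
button = form.add(Control("checkin_button"))
button.on("click", lambda control, document: f"{control.name}: checked in {document}")
result = form.dispatch("checkin_button", "click", document="report.doc")
```

Keeping handlers attached to controls, rather than to pages, is what lets components built this way be reused across applications.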

The WDK provides hundreds of pre-packaged JSR 168-compliant components (JSR 168 is the Java community standard for developing portlets) that enable Documentum developers to easily build and customize web-based content applications. In fact, all Documentum clients and applications, including Webtop, Web Publisher, and Compliance Manager, are built using WDK.

The WDK framework provides a set of WDK services that run locally on a client-side device, either within a browser or a desktop application, and interact with server-side business objects (developed using the BOF) or with DFC functions (see Figure 14).



Figure 14. The Documentum platform includes a Web Development Kit for developing browser-based, web-centric applications and Windows-based desktop applications.

Within Windows-based desktop applications, the WDK framework provides COM objects for sending and receiving HTTP messages to and from a web application server. The messages are exchanged as XML documents.

Application Connectors

Application Connectors are WDK components that provide access to the Documentum repository and content services from within desktop applications such as Microsoft Office. Application Connectors are built on an open framework that enables application developers to add connectors as plug-ins. Because Application Connectors function consistently within various desktop applications, a single set meets all of an application developer’s needs.

Application Connectors appear as menu items within a desktop application’s pull-down menu. From Microsoft Word, Excel, and PowerPoint, the Application Connectors directly call the server-side components within the Documentum platform, perform the action, and return the results to the calling Office application.

For example, a Microsoft Word user could use the Documentum menu to query and access documents stored within the Documentum repository. The Application Connector first authenticates the user and then checks the user’s access rights, enabling the user to easily access the documents within Word. Meanwhile, the server-side content is managed by the business policies of the Documentum platform.

Application developers can use the Application Connector SDK to develop additional application connectors for the desktop applications of their choice.



A Webtop extension

The Documentum platform provides extensions to EMC Documentum Webtop for additional functionality, such as collaboration and records management, which can be added to any Documentum client including Webtop, Digital Asset Manager, and Web Publisher. These extensions are browser-based WDK components that can access the records services provided by the platform extensions. The platform extensions add functionality and new object types managed within the repository such as rooms, discussion threads, contextual folders, and notes.

Portlets for enterprise portals

Finally, the WDK framework supports JSR 168 for developing portlets, which are pluggable components managed and displayed within an enterprise portal. The WDK provides native access to the underlying content management capabilities of the Documentum platform within the context of an enterprise portal.

EMC offers a set of pre-packaged WDK-based JSR 168 portlets with common functionality such as “inbox,” “my folders,” and “search.” Developers can, however, use the WDK to build any other kind of portlet based on the WDK components.

Note: EMC also provides portlets for the SAP Portal, called iViews. As SAP Portal does not support the JSR 168 standard, the SAP portlets are built using native SAP technology.

Conclusion

The EMC Documentum architecture provides a strategy for solving today’s needs to manage unstructured content, and for investing in tomorrow’s opportunities to profit from content-centric applications. Documentum delivers the services for managing unstructured business information within an enterprise and beyond. Using the Documentum platform, companies can ensure that unstructured content is stored, secured, delivered, and archived in a systematic manner that follows predefined business rules and conforms to established policies and procedures.

The Documentum platform enables companies to develop robust content applications that solve mission-critical business problems. For example, marketers and external business partners can always have easy access to the latest product information, while engineers and scientists follow established business processes when documenting new technologies. Companies can archive and retain content to meet compliance requirements, while enabling multiple departments and external business partners to easily work together and share any type of content over the network.

Finally, the Documentum platform provides application-level components for developing enterprise-scale applications that use content within the context of business processes, and delivers a broad range of application experiences to desktop- and browser-based applications. These capabilities form the foundation for tomorrow’s solutions: managed content that disparate applications can access and consume, as needed, as flexible web services based on a SOA environment.
