
Implementation of a Distributed Document-Based System



David Leigh Dellinger
Computer Science Department
Wichita State University
Wichita, KS

Abstract

The World Wide Web and commercial software products such as Lotus Notes have demonstrated the value and popularity of distributed document-based systems. The user knows what information he wants, but does not know where the data is located, how the data is accessed or that other users can access the data at the same time, three aspects of transparency in a distributed system. The user may not even know how to generate a similar document. This concept is also in use in business today through company intranets, using Product Lifecycle Management (PLM) applications such as UGS’ Teamcenter and Dassault Systemes’ ENOVIA or document storage systems such as Microsoft’s Windows SharePoint Services [4]. A user-level application (web-based or otherwise) provides the mechanism for specifying the information desired and hides the implementation and process(es) of retrieval. This reduces the level of training for the user and provides a consistent, repeatable, modular method of extracting information upon request.

Keywords

Distributed, document-based, transparency, Product Lifecycle Management

1 Introduction

As businesses progress from a paper-based environment to an electronic document system, the need to store and retrieve documents upon demand grows. Documents are used internally between departments and may also be exchanged with suppliers to ensure that the interface between parts is precisely defined. Certification documents may also be exchanged between business partners to satisfy regulatory agency audits.

The challenge for any document management system is to provide simple, consistent access to documents with a minimum of training for each user, while maintaining proper security measures. When a product lasts for more than ten years, such as aircraft, Product Lifecycle Management (PLM), the industry term for managing the data for the product over such an extended period of time, becomes the concern.

"Without some way to manage all this information, we waste time and sometimes people aren't looking at the latest version of what they need," explains Jim McKenzie, Hendrick Motorsports' applications manager. "If we have a problem at the race track, we have three days at most to fix it. If you spend a few hours of that time searching for data, it can really handicap you." [1]

"To enhance our end-to-end engine lifecycle, we have decided to implement a Dassault Systèmes and IBM PLM solution based on CATIA V5, ENOVIA VPM, ENOVIA Portal and DELMIA," says Amal Girgis, CIO, Pratt & Whitney Canada [2].


2 Build, Buy or Both

2.1 One Size Does Not Fit All

Even when a business has an electronic PLM system, providing access to legacy data systems can add a level of complexity, particularly as the system ages and falls far behind the leading edge of technology. The PLM system may work extremely well with documents that fall within its product line, but may lack the flexibility to work with legacy applications. The business has three choices: contract with the PLM supplier to integrate the legacy applications into its system, contract with a third-party software company or perform the work in-house. Unless the integration project has widespread appeal to an industry, the PLM supplier may not be interested in modifying their software. A third-party company may need education and training on both products, delaying the implementation. Assuming the PLM system provides programming hooks for user-defined functions, the in-house solution may be worth considering. The company’s Information Technology staff already knows the intricacies of the legacy applications and may be able to provide a cost-efficient integration solution.

2.2 The In-House Solution

This programming project is a proof-of-concept technology demonstration. It represents aspects of an in-house solution where the different data repositories are linked through a single user interface. The project implements a program that retrieves information stored in different repositories throughout a network of computers using Remote Method Invocation (RMI) and the Common Object Request Broker Architecture (CORBA) to find, retrieve and display documents to the user. The project demonstrates the ability to provide a modular solution regardless of the means of data access, data location, data replication, local data caching and other features of a distributed system.

3 Project Scope

3.1 Project Limitations

As a technology demonstrator, this project lacks features that would be part of an enterprise application. For instance, at the enterprise level, maintaining a list of files is impractical, as is supporting only text and Adobe Portable Document Format (PDF) files. The true solution would include a search engine and touch all data repositories in the company including the PLM system, databases, flat filesystems, and the legacy systems mentioned above. The application could be deployed as a web-based application, Java applet or stand-alone application, depending on the deployment method preferred.

3.2 Application configuration

The application revolves around a single configuration file maintained by an administrator. The configuration file lists the server hostnames, server type, RMI and CORBA


3.4 Server communications

The servers use two types of communication with the client, RMI and CORBA. Which type the server uses is determined by the administrator and defined in the configuration file. The server startup script queries the configuration file, establishes the appropriate communications method and port (rmiregistry for RMI and tnameserv for CORBA) and starts the server for that type. New communication methods, such as sockets or Remote Procedure Call (RPC), can be implemented as well.

3.5 User Guide and Source Code

Appendix A gives the user guide for the application.

4 Project Improvements

This technology demonstrator would require several refinements before it could be ready for use in a business. The configuration file would be replaced with a database so that optimized queries could be executed against the file records. An administration tool would decrease the level of effort for the administrator and reduce the chance of data entry errors. The number and type of documents supported would have to be increased to be valuable to a user community. Security and data integrity are not addressed in this project; access for groups of users could be defined in the database. The local caching of files could be improved by a better method of comparison than the file size and the last modified date. The last modified date of a file is particularly vulnerable to inconsistent data problems because it relies on the physical clocks of the machines. The caching method would benefit from a means of guaranteeing event order processing such as Lamport timestamps [3].

5 Conclusions

A company probably cannot simply use a commercial PLM system and may not be interested in developing and supporting a total in-house solution. A choice that many aerospace companies have made is to place the documents critical to the life of the business (such as Engineering design data) in a PLM system and develop a user interface which includes the PLM system and their legacy document systems.

References

[1] UGS Teamcenter, http://www.ugs.com/about_us/success/hendrick.shtml
[2] Dassault Systemes ENOVIA, http://www.3ds.com/your-strategy/customers-stories/story-description/story/32/1/
[3] Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems: Principles and Paradigms, Prentice Hall, 2002, ISBN: 0-13-088893-1
[4] Microsoft Windows SharePoint Services, http://www.microsoft.com/windowsserver2003/technologies/sharepoint/default.mspx
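Section 4 proposes Lamport timestamps [3] as a way to order events without depending on the machines' physical clocks. As an illustration only, a Lamport logical clock could be sketched in Java as follows. This is not part of the project source; the class and method names (LamportClock, tick, onReceive) are hypothetical, and how such a clock would be wired into the file servers and the client cache is left open.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Minimal Lamport logical clock (illustrative sketch, hypothetical names). */
final class LamportClock {
    private final AtomicLong counter = new AtomicLong(0);

    /** Call for a local event or before sending a message; returns the new timestamp. */
    long tick() {
        return counter.incrementAndGet();
    }

    /** Call when a message stamped with `received` arrives: max(local, received) + 1. */
    long onReceive(long received) {
        while (true) {
            long current = counter.get();
            long next = Math.max(current, received) + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    /** Current value without advancing the clock. */
    long current() {
        return counter.get();
    }

    public static void main(String[] args) {
        LamportClock server = new LamportClock();
        LamportClock client = new LamportClock();
        long sent = server.tick();          // server records a file change: 1
        long seen = client.onReceive(sent); // client learns of it: max(0, 1) + 1 = 2
        System.out.println("server=" + sent + " client=" + seen);
    }
}
```

Under this assumption, a server would stamp each file change with tick() and a client would compare the stamp it cached against the server's current stamp, instead of comparing last modified dates taken from two different physical clocks.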


Appendix A
Application User Guide


Compiling the program

To compile the program, log into any Linux workstation and change to the source code directory:

    kira> cd cs843/project

Run the compile_pgm script to generate program and helper classes. The script should start a bash shell and set the environment to use Java 1.5. It will then generate CORBA helper classes using idlj, compile the CORBA code, then compile the RMI code. It should leave the user at a bash shell prompt. Please check the environment to ensure that JAVA_HOME points to Java 1.5, otherwise the application will not work.

    kira> compile_pgm
    Note: DLDCorbaApp/DLDCorbaPOA.java uses unchecked or unsafe operations.
    Note: Recompile with -Xlint:unchecked for details.
    Note: DLDClient.java uses unchecked or unsafe operations.
    Note: Recompile with -Xlint:unchecked for details.
    Done with compiler script
    bash-2.05a$

If the environment still isn’t correct, execute the following commands:

    bash-2.05a$ . ~cs843/bin/jdk.sh
    bash-2.05a$ export CLASSPATH=$CLASSPATH:.

Server Setup and Starting the Servers

Use rsh or ssh in separate windows to log into sisko, kirk and spock. These three servers are listed in the configuration file, CS843Project.cfg. Let sisko be server 1, kirk server 2 and spock server 3. It does not matter which server is which number, but for the next part of the setup, consistency is required. The server_copy scripts are for project testing convenience.

    sisko> cd cs843/project (or source directory)
    sisko> server_copy1
    sisko> server_start
    DLDRMIServer Listening on port 50168
    DLDRMI server running on sisko.cs.wichita.edu

    kirk> cd cs843/project (or source directory)
    kirk> server_copy2
    kirk> server_start
    Initial Naming Context:
    IOR:000000000000002b49444c3a6f6d672e6f72672f436f734e616d696e672f4e616d696e67436f6e746578744578743a312e30000000000001000000000000009a000102000000000e3135362e32362e31302e32333900c3f900000045afabcb00000000201f0de98a00000001000000000000000200000008526f6f74504f41000000000d544e616d65536572766963650000000000000008000000010000000114000000000000020000000100000020000000000001000100000002050100010001002000010109000000010001010000000026000000020002
    TransientNameServer: setting port for initial object references to: 50169
    Ready.
    DLDCorbaServer ready and waiting ...

    spock> cd cs843/project (or source directory)
    spock> server_copy3
    spock> server_start
    DLDRMIServer Listening on port 50168
    DLDRMI server running on spock.cs.wichita.edu

Server Closure and Cleanup

When the client application has exited, change to each of the server windows and stop the server process (CTRL-C). Find and kill the rmiregistry process (RMI servers) or the tnameserv process (CORBA) associated with each server. Execute the following server_del scripts (for project testing convenience) to delete the project files from the servers:

    sisko> server_del1
    kirk> server_del2
    spock> server_del3

Starting the Client Application

Once the server programs are running, execute the client application. The application has a graphical interface using Java Swing. The compile_pgm script should have started a bash shell and set the environment. Please check the environment to ensure that JAVA_HOME points to Java 1.5, otherwise the application will not work.

If the environment still isn’t correct, execute the following commands:

    kira> /bin/bash
    bash-2.05a$ . ~cs843/bin/jdk.sh
    bash-2.05a$ export CLASSPATH=$CLASSPATH:.
    bash-2.05a$ java -classpath . DLDClient

The application will display the graphical user interface as shown in Figure A1.


Figure A1. Graphical user interface

When the user selects a text file, it is retrieved from a server, written to a local filesystem and displayed in the text area to the right of the file list (see Figure A2). The “Clear” button can be used to clear the text area at any time. The text area will be cleared every time a new file is selected.

Figure A2. Text file selection

When the user selects an Adobe PDF file, it also is retrieved from a server and written to a local filesystem. The application calls a PDF viewer, xpdf, as a separate process which displays beside the application window as shown in Figure A3.


Figure A3. PDF file selection
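The user guide states that xpdf is started as a separate process once the PDF has been written to the local filesystem. One way this might be done in Java 1.5 is with ProcessBuilder. The sketch here is an assumption about the approach, not the project's code: PdfViewerLauncher and its methods are hypothetical names, and xpdf is assumed to be on the PATH.

```java
import java.io.File;
import java.io.IOException;

/** Illustrative sketch (hypothetical names): open a locally cached PDF in xpdf. */
final class PdfViewerLauncher {
    private PdfViewerLauncher() {}

    /** Builds the external command without running it, so it can be inspected. */
    static ProcessBuilder viewerCommand(File pdf) {
        return new ProcessBuilder("xpdf", pdf.getAbsolutePath());
    }

    /**
     * Starts xpdf as a child process and returns immediately, so the Swing
     * window stays responsive while the viewer is open.
     */
    static Process launch(File pdf) throws IOException {
        return viewerCommand(pdf).start();
    }
}
```

A client could call PdfViewerLauncher.launch(new File("/tmp/2003_Final_Exam.pdf")) after the file has been retrieved. Keeping the command construction separate from the launch makes it possible to substitute another viewer without touching the retrieval code.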


As the client retrieves files from the server, the console displays the results of the search. These messages can be easily removed from the client source code or enabled through a debug switch in the client’s invocation. A set of typical console results is displayed in Figure A4. Note the result shown in red in Figure A4 (the line "No changes in /tmp/2005_assign1.txt using local cached copy"). This message is printed when the client workstation’s locally cached copy is the same as the server’s copy.

    bash-2.05a$ java -classpath . DLDClient
    Selected 2005_assign1.txt
    Checking for 2005_assign1.txt on RMI server: sisko
    New copy of /tmp/2005_assign1.txt returned from server
    Selected 2005_hwk1.txt
    Checking for 2005_hwk1.txt on RMI server: sisko
    Checking for 2005_hwk1.txt on CORBA server: kirk
    Checking for 2005_hwk1.txt on RMI server: spock
    New copy of /tmp/2005_hwk1.txt returned from server
    Selected 2003_Final_Exam.pdf
    Checking for 2003_Final_Exam.pdf on RMI server: sisko
    Checking for 2003_Final_Exam.pdf on CORBA server: kirk
    New copy of /tmp/2003_Final_Exam.pdf returned from server
    Selected 2005_assign1.txt
    Checking for 2005_assign1.txt on RMI server: sisko
    No changes in /tmp/2005_assign1.txt using local cached copy
    Selected CS843_Syllabus.pdf
    Checking for CS843_Syllabus.pdf on RMI server: sisko
    Checking for CS843_Syllabus.pdf on CORBA server: kirk
    New copy of /tmp/CS843_Syllabus.pdf returned from server

Figure A4. Sample Client Console Output

Ending the Client Application

The “Exit” button is used to end the client application. The window decoration (“X”) in the upper right corner can also be used to end the client application. Ending the client application does not terminate the servers as another user may be executing the client application from another workstation.
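The cached-copy message in Figure A4 relies on the comparison described in Section 4, which uses the file size and the last modified date. A minimal sketch of such a check is given here for illustration. It assumes the server reports both values for the requested file and that the client stamps its cached copy with the server's last modified time after each download. CacheCheck and isCacheCurrent are hypothetical names, not identifiers from the project source.

```java
import java.io.File;

/** Illustrative sketch (hypothetical names): decide whether a cached file can be reused. */
final class CacheCheck {
    private CacheCheck() {}

    /**
     * Returns true when the local copy exists and both its size and its
     * last-modified time match the values reported by the server.
     * Assumption: serverSize and serverLastModified come from a remote call,
     * and the cached file was stamped with the server's time when downloaded.
     */
    static boolean isCacheCurrent(File localCopy, long serverSize, long serverLastModified) {
        if (!localCopy.isFile()) {
            return false; // nothing cached yet, so a new copy must be fetched
        }
        return localCopy.length() == serverSize
                && localCopy.lastModified() == serverLastModified;
    }
}
```

When isCacheCurrent returns true the client can display the file already in /tmp; otherwise it requests a new copy. As Section 4 notes, the last modified date depends on physical clocks, so this test can give wrong answers when the clocks or timestamps are inconsistent.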
