11.07.2015 Views

Encyclopedia of Computer Science and Technology


Further Reading

Arimura, Mitsuharu. “Mitsuharu Arimura’s Bookmarks on Source Coding/Data Compression.” Available online. URL: http://www.hn.is.uec.ac.jp/~arimura/compression_links.html. Accessed July 8, 2007.

“Data Compression Reference Center.” Available online. URL: http://www.rasip.fer.hr/_research/compress/index.html. Accessed July 8, 2007.

Salomon, David. Data Compression: The Complete Reference. 4th ed. New York: Springer-Verlag, 2006.

Sayood, Khalid. Introduction to Data Compression. 3rd ed. San Francisco: Morgan Kaufmann, 2005.

data conversion

The developer of each application program that writes data files must define a format for the data. The format must be able to preserve all the features that are supported by the program. For example, a word processing program will include special codes for font selection, typestyles (such as bold or italic), margin settings, and so on.

In most markets there is more than one vendor, so users may need to convert files such as word processing documents from one vendor’s format to another. For example, a Microsoft Word user may need to send a document to a user who has WordPerfect, or to another user who also has Microsoft Word, but a later version.

There are some ways in which vendors can relieve some of their users’ file conversion issues (and thus potential customer dissatisfaction). Vendors often include facilities to read files created by their major rivals’ products, and to save files back into those formats. This enables users to exchange files. Sometimes the converted document will look exactly like the original, but in some cases there is no equivalence between a feature (and thus a code) in one application and a feature in the other application.
In that case the formatting or other feature may not carry over into the converted version, or may carry over only partially.

Vendors generally make a new version of an application downwardly compatible with previous versions (see also compatibility and portability). This means that the new version can read files created with the earlier versions. (After all, users would not be happy if none of their existing documents were accessible to their new software!) Similarly, there is usually a way to save a file from the later version in the format of an earlier version, though features added in the later version will not be available in the earlier format.

Another strategy for exchanging otherwise incompatible files is to find some third format that both applications can read. Thus Rich Text Format (RTF), a format that includes most generic document features, is supported by most modern word processors. A user can thus export a file as RTF and the user of a different program will be able to read it (see rtf). Similarly, many database and other programs can export files as a series of data values separated by commas (comma-delimited files), and the files can then be read by a different program and converted to its “native” format.

A variety of format conversion utilities are available as either commercial software or shareware. There are also businesses that specialize in data conversion. While their services can be expensive, using them may be the best way to convert large numbers of files, rather than having to individually load and save them.
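
The comma-delimited exchange described above can be illustrated with a minimal Python sketch. It is not taken from the encyclopedia entry: the record fields (name, city, note) and the function names export_csv and import_csv are hypothetical, and Python's standard csv module stands in for whatever export and import facilities two real programs would provide.

```python
import csv
import io

# Hypothetical records held by the "exporting" program.
records = [
    {"name": "Ada Lovelace", "city": "London", "note": 'Said "hello", then left'},
    {"name": "Alan Turing", "city": "Wilmslow", "note": "plain text"},
]

FIELDS = ["name", "city", "note"]


def export_csv(rows):
    """Write rows as comma-delimited text.

    The csv module quotes values that contain commas or quote marks,
    so those values survive the round trip.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def import_csv(text):
    """Read comma-delimited text into the importing program's own structure."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]


exported = export_csv(records)
imported = import_csv(exported)
print(exported)
print(imported == records)
```

Comma-delimited text carries only the values, not formatting or data types, so every value arrives as a string and the importing program has to reinterpret it in its own format.
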
Data conversion services can also handle many “ancient” data files from the 1970s or even early 1980s whose formats are no longer supported by current software.

Further Reading

Heuser, Werner. “Data Conversion and Migration Tools.” Available online. URL: http://dataconv.org/. Accessed July 8, 2007.

“Media Conversion: Online File Conversions.” Available online. URL: http://www.iconv.com/. Accessed July 8, 2007.

data dictionary

A modern enterprise database system can contain hundreds of separate data items, each with important characteristics such as field types and lengths, rules for validating the data, and links to various databases that use that item (see database management system). There can also be many different views or ways of organizing subsets of the data, and stored procedures (program code modules) used to perform various data processing functions. A developer who is creating or modifying applications that deal with such a vast database will often need to check on the relationships between data elements, views, procedures, and other aspects of the system.

One fortunate characteristic of computer science is that many tools can be applied to themselves, often because the contents of a program are themselves a collection of data. Thus, it is possible to create a database that keeps track of the elements of another database. Such a database is sometimes called a data dictionary.
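
As a rough illustration of a database that keeps track of the elements of another database, the following Python sketch uses SQLite through the standard sqlite3 module. It is only one possible arrangement and not the method of any particular product: the tables customer and invoice, the view big_invoice, the dictionary table data_item, and the function refresh_dictionary are all hypothetical names chosen for the example. The sketch reads SQLite's own catalog (the sqlite_master table and PRAGMA table_info) and copies what it finds into a second database.

```python
import sqlite3

# The "application" database; its table and view names are hypothetical.
app_db = sqlite3.connect(":memory:")
app_db.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, total REAL);
    CREATE VIEW big_invoice AS
        SELECT id, customer_id, total FROM invoice WHERE total > 1000;
""")

# A second database that records what the first one contains.
dict_db = sqlite3.connect(":memory:")
dict_db.execute("""
    CREATE TABLE data_item (
        object_type TEXT, object_name TEXT,
        column_name TEXT, column_type TEXT, required INTEGER
    )
""")


def refresh_dictionary(source, dictionary):
    """Rebuild data_item from SQLite's own catalog of the source database."""
    dictionary.execute("DELETE FROM data_item")
    objects = source.execute(
        "SELECT type, name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()
    for object_type, object_name in objects:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        # for each column of the named table or view.
        columns = source.execute(f'PRAGMA table_info("{object_name}")').fetchall()
        for _cid, col_name, col_type, notnull, _default, _pk in columns:
            dictionary.execute(
                "INSERT INTO data_item VALUES (?, ?, ?, ?, ?)",
                (object_type, object_name, col_name, col_type, notnull),
            )
    dictionary.commit()


refresh_dictionary(app_db, dict_db)

# Example lookup: which tables or views have a column named customer_id?
users = dict_db.execute(
    "SELECT object_type, object_name FROM data_item "
    "WHERE column_name = 'customer_id' ORDER BY object_name"
).fetchall()
print(users)
```

A developer could query data_item in the same way to see where a field is used before changing its type or length, which is the kind of relationship check the entry describes.
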
A data dictionary system can be developed in the same way as any other database, but many database development systems now contain built-in facilities for generating data dictionary entries as new data items are defined, and updating definitions as items are linked together and new views or stored procedures are defined. (A similar approach can be seen in some software development systems that create a database of objects defined within programs, in order to preserve information that can be useful during debugging.)

Data dictionaries are particularly important for creating data warehouses (see data warehouse), which are large collections of data items that are stored together with the procedures for manipulating and analyzing them.

Further Reading

Kreines, David. Oracle Data Dictionary Pocket Reference. Sebastopol, Calif.: O’Reilly Media, 2003.

Pelzer, Trudy. “MySQL 5.0 New Features: Data Dictionary.” Available online. URL: http://dev.mysql.com/tech-resources/articles/mysql-datadictionary.html. Accessed July 8, 2007.

data glove See haptics.

data mining

The process of analyzing existing databases in order to find useful information is called data mining. Generally, a database, whether scientific or commercial, is designed for a
