28.10.2014 Views

Creating Databases – Importing a Delimited ASCII Text ... - LexisNexis


LexisNexis® Concordance® 2007

Creating Databases –
Importing a Delimited ASCII Text (DAT) File

Document Overview

• Before You Begin
• Creating a New Database File
• Configuring Fields for Your Data
• Importing Your Data
• Additional Resources


Concordance® 2007 Quick Help

Concordance is a registered trademark of Applied Discovery, Inc. © 2007 Concordance. All rights reserved.

LexisNexis and the Knowledge Burst logo are registered trademarks of Reed Elsevier Properties Inc., used under license. Concordance is a registered trademark and FYI is a trademark of Applied Discovery, Inc. Other products or services may be trademarks or registered trademarks of their respective companies.

© 2007 Concordance. All rights reserved.

Concordance®
Concordance® Image
Concordance® FYI

Copyright © 2007 Concordance. All rights reserved.


Before You Begin

Delimited ASCII text files store 2-dimensional arrays of data by separating the values in each row with specific delimiter characters. Most database and spreadsheet programs are able to read or save data in a delimited format. Delimited text files may have extensions such as .DAT, .ASC, .CSV or even .TXT, as long as the file is structured properly with text qualifiers, field delimiters and line breaks.
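
As a rough illustration of how text qualifiers and field delimiters structure a row, the following Python sketch splits one delimited line into its values. The delimiter characters used here (ASCII 20 as the field delimiter and þ as the text qualifier) are assumptions based on commonly seen Concordance-style exports; this document does not specify them, so check them against your own file. The function name split_record is hypothetical.

```python
FIELD_DELIM = "\x14"   # assumed field delimiter (ASCII 20)
QUALIFIER = "\u00fe"   # assumed text qualifier (the character þ)


def split_record(line, delim=FIELD_DELIM, qualifier=QUALIFIER):
    """Split one delimited line into a list of field values.

    A delimiter that appears between a pair of qualifiers is treated as
    data, not as a field boundary.
    """
    fields = []
    current = []
    inside = False  # True while we are between two qualifiers
    for ch in line.rstrip("\r\n"):
        if ch == qualifier:
            inside = not inside
        elif ch == delim and not inside:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


row = "\u00feDOC001\u00fe\x14\u00feSmith, John\u00fe\x14\u00fe10/28/2014\u00fe\n"
print(split_record(row))

# The same logic applies to a comma-delimited file with double-quote qualifiers.
print(split_record('"Smith, John",DOC001', delim=",", qualifier='"'))
```

The qualifier is what allows a value such as "Smith, John" to contain the delimiter character without being split into two fields.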

For many Concordance databases, the files will also include optical character recognition (OCR) text and scanned document images. DAT files, which contain the metadata for each document, often accompany the OCR text and image files.

The procedure outlined in this document describes how to import a delimited ASCII text (.DAT) file.

You will need…

• Concordance
• Text editor program (TextPad, UltraEdit or similar)
• Delimited ASCII Text (DAT) file

Creating a New Database File

1 Open Concordance.

2 In the File menu select New.

Figure 1: Concordance Menu – File

3 In the Create database from template dialog (see figure 2), select the Blank database type.

Figure 2: Create database from template – General tab

4 Click OK.

5 When prompted, choose a file name and directory (choose to store your database locally or on a network drive).

NOTE – You must have full access to the directory.

6 Click Open to save the database and begin creating and customizing your fields.

Configuring Fields for Your Data

Selecting the Blank database template creates an empty database containing no fields. It is the best choice when you are creating a custom structure for a delimited ASCII text (.DAT) file.

Plan your database structure

Open your DAT file with a text editor.

Note the following:

• Delimiters used in the file (text qualifier, field, and new line delimiters)
• Field headers (the first line will usually contain the field headers)
• Type, format, and length of data
• Date values (8 digits maximum; the month, day and year may appear in any order with slashes, or in the universal “true date” format without slashes)
• Field(s) database users need to search and sort
• Field (if any) to be linked to an image
• OCR content (if any) to be imported
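
For large files, part of this review can be scripted. The sketch below reads the header line and reports the longest value found in each column, which may help when choosing field lengths. It relies on Python's standard csv module; the delimiter and qualifier defaults (ASCII 20 and þ) and the sample header names are assumptions for illustration, and the function name profile_columns is hypothetical.

```python
import csv


def profile_columns(lines, delimiter="\x14", quotechar="\u00fe"):
    """Return (header, max_lengths) for an iterable of delimited lines.

    header is the list of values on the first line; max_lengths[i] is the
    longest value seen in column i across the remaining lines.
    """
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quotechar)
    header = next(reader)
    max_lengths = [0] * len(header)
    for row in reader:
        # Ignore any extra values beyond the number of header columns.
        for i, value in enumerate(row[:len(header)]):
            max_lengths[i] = max(max_lengths[i], len(value))
    return header, max_lengths


# Hypothetical sample data: a header line followed by two records.
sample = [
    "\u00feBEGDOC\u00fe\x14\u00feAUTHOR\u00fe\x14\u00feDOCDATE\u00fe",
    "\u00feDOC001\u00fe\x14\u00feSmith, John\u00fe\x14\u00fe10/28/2014\u00fe",
    "\u00feDOC002\u00fe\x14\u00feLee\u00fe\x14\u00fe01/05/2013\u00fe",
]
header, widths = profile_columns(sample)
for name, width in zip(header, widths):
    print(f"{name}: longest value is {width} characters")
```

To run it against a real file, pass an open file object instead of the sample list, for example open(path, newline="", encoding="latin-1"); the encoding is an assumption and should match how your DAT file was produced.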

Tip - While you have the DAT file open, scroll to the bottom of the file and ensure that the last record (the last line) ends with a new line delimiter (created by pressing Enter on your keyboard). Without the final return, the last record will not be imported into your database.
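
The same check can be done without opening the file in an editor. This is a minimal sketch, assuming the line break is an ordinary carriage return or line feed byte at the end of the file; the function names are hypothetical, and the Windows-style "\r\n" appended by default is an assumption you may need to change.

```python
import os
import tempfile


def ends_with_newline(path):
    """Return True if the file's last byte is a line feed or carriage return."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False  # empty file: there is no last record
        f.seek(-1, os.SEEK_END)
        return f.read(1) in (b"\n", b"\r")


def ensure_trailing_newline(path, newline=b"\r\n"):
    """Append a line break if a non-empty file lacks one.

    Returns True if the file was modified, False otherwise.
    """
    if os.path.getsize(path) == 0 or ends_with_newline(path):
        return False
    with open(path, "ab") as f:
        f.write(newline)
    return True


# Demonstration on a throwaway file whose last record has no final return.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.dat")
    with open(demo, "wb") as f:
        f.write(b"record one\r\nrecord two")
    print(ensure_trailing_newline(demo))
    print(ends_with_newline(demo))
```

Work on a copy of the DAT file if you are unsure, since the second function modifies the file in place.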

Immediately upon creating a blank database, the New field dialog will open, prompting you to begin creating and configuring your fields.

1 Type the name of your first field in the Name field (see figure 3).

NOTE – Field names do not need to match the field headers specified in the DAT file. They may be up to 12 characters long and entered in upper or lower case letters; the system converts all characters to upper case. They must begin with a letter and may contain only alphanumeric characters and the underscore.

2 Select the field type in the Type drop-down, and select the appropriate attributes for the field.

Types and Attributes - To successfully import your DAT file, you must create fields to match the data type and size of your data. Refer to Tables 1 and 2 below for information about Field Types and Attributes.

Field Order - Create your fields in the order in which you will want to view them in Table and Browse views. Use the Insert and Delete buttons (similar to Paste and Cut, respectively, in MS Office products) to arrange fields into the desired order as necessary.

3 Click New to confirm your choices and to create the next field.

NOTE – If you accidentally click OK instead of New, the New field definition dialog will close. To access this panel again, select Modify in the File menu.

Figure 3: New field definition dialog

Field Types

Type: Text*
Capacity: 1-60 alpha or numeric characters, keyed by default
Notes: Use for numeric values that are not used mathematically (e.g. phone numbers, social security numbers, and other serial numbers). Note - If you intend to sort records based on this field, zero fill any numeric values stored in it to ensure they sort correctly.

Type: Numeric*
Capacity: 1-20 digits long (including the decimal place, negative sign, and all digits following the decimal place), keyed by default
Notes: Display options available: Currency, Commas, Zero filled, Plain

Type: Date* (MMDDYYYY, YYYYMMDD or DDMMYYYY)
Capacity: 8 bytes in length
Notes: The date format selected here will control how the data appears after it is imported into the database. It does not need to match the date format in the DAT file.

Type: Paragraph
Capacity: 12,000,000 characters (12 MB), indexed by default
Notes: Most flexible and variable in size, not ideal for sorting or searching by comparison. Supports rich text formatting.

*Fixed-length field

Table 1: Field Types
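
The zero-fill advice for Text fields follows from how text sorts: character by character, not by numeric value. The short Python sketch below illustrates the effect with plain string sorting; it assumes that Text fields are compared as strings, and it is only an illustration, not a description of Concordance's sort routine.

```python
values = ["2", "10", "1", "100"]

# Sorted as text, "10" and "100" come before "2".
print(sorted(values))


def zero_fill(value, width):
    """Left-pad a string of digits with zeros to a fixed width."""
    return value.zfill(width)


# Pad every value to the width of the longest one, then sort again.
width = max(len(v) for v in values)
padded = [zero_fill(v, width) for v in values]
print(sorted(padded))
```

After padding, the text order and the numeric order agree, which is why zero-filled values sort correctly.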

Field Attributes

Attribute*: Key
Use: Most commonly applied to fixed-length fields; however, it may be applied to any field (including paragraph fields) to make relational searches faster.
Notes: Keying a field creates a .KEY file. As KEY files grow in size, their efficiency decreases, which may slow relational searches. All keyed fields will appear in the default table view.

Attribute*: Image
Use: Used to link Concordance with an image viewer. It indicates which field contains the image name or alias.
Notes: Select only one field per database as an Image field. Identifying multiple fields in a database as an image field will interfere with the linkage between Concordance and the viewer.

Attribute*: Indexed
Use: Enables full text searching.
Notes: Places every word in the field into a dictionary file (.NDX and .DCT) for fast retrieval.

Attribute*: System
Use: Special field that is hidden with no read or write access to end-users.
Notes: System fields should never be indexed, added, deleted or modified by users. Concordance will create these fields for replication and synchronization information.

Attribute*: Accession
Use: Unique serial numbers internally assigned to each record, managed entirely by Concordance.
Notes: Accession numbers may not be edited or modified. Helpful as a load order identifier. Note – As records are edited, exported or removed, gaps in numbering may occur.

Attribute*: Optical Character Recognition (OCR) Indexing
Use: Will not index text that is not contained in a defined dictionary.
Notes: Not recommended for any fields. Causes increased indexing times, limits the indexing to Webster’s dictionary, and includes only English words. Use Synonyms instead.

*Not every Attribute is available for every field type

Table 2: Field Attributes

4 Repeat steps 1 through 3 as necessary to create a structure to match your DAT file.

5 When you have completed creating all your fields, click OK.

Your database structure is ready for the data import.
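
When planning many fields, the field-name rules given in the NOTE under step 1 (up to 12 characters, beginning with a letter, only alphanumeric characters and the underscore, converted to upper case) can be checked ahead of time. The following Python sketch encodes only those stated rules; the function name and the sample names are hypothetical, and Concordance itself may apply further restrictions not covered here.

```python
import re

# One leading letter followed by up to 11 letters, digits or underscores.
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,11}")


def normalize_field_name(name):
    """Return the upper-case field name, or raise ValueError if invalid."""
    if not FIELD_NAME.fullmatch(name):
        raise ValueError(f"invalid field name: {name!r}")
    return name.upper()


# Hypothetical candidate names taken from a DAT header line.
for candidate in ["BegDoc", "doc_date", "1stField", "ATTACHMENT_RANGE"]:
    try:
        print(candidate, "->", normalize_field_name(candidate))
    except ValueError as err:
        print(err)
```

In this sample, "1stField" is rejected because it begins with a digit and "ATTACHMENT_RANGE" because it is longer than 12 characters.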

Additional Considerations

Embedded Punctuation

Embedded punctuation is provided so that hyphenated words, dates, decimal numbers, and contractions are not split into two or more words. You may add or delete punctuation as needed. By default, Concordance includes the ' . , / characters as embedded punctuation for all fields.
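
To make the effect concrete, here is a minimal Python sketch of a word splitter that honors embedded punctuation. It is not Concordance's indexing code; it assumes that a punctuation character keeps a word together only when it sits directly between two letters or digits, and the function name tokenize is hypothetical.

```python
import re

DEFAULT_EMBEDDED = "'.,/"  # the default embedded punctuation listed above


def tokenize(text, embedded=DEFAULT_EMBEDDED):
    """Split text into words, keeping embedded punctuation inside a word.

    A punctuation character counts as embedded only when it sits directly
    between two alphanumeric characters.
    """
    if not embedded:
        return re.findall(r"[A-Za-z0-9]+", text)
    punct = re.escape(embedded)
    pattern = rf"[A-Za-z0-9]+(?:[{punct}][A-Za-z0-9]+)*"
    return re.findall(pattern, text)


sentence = "Don't pay $3.50 on 10/28/2014, okay."
print(tokenize(sentence))               # contraction, decimal and date stay whole
print(tokenize(sentence, embedded=""))  # without embedded punctuation they split
```

Under the same assumption, adding a hyphen to the list (embedded="'.,/-") would keep a hyphenated word such as "well-known" together.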

If you will be importing OCR…

Create your OCR fields now, in addition to the fields for your DAT file import.

As a best practice, create at least two OCR fields labeled with ascending numbers (example: OCR1 & OCR2). When you use ReadOCR.cpl to import your OCR text, the CPL automatically overflows text from the first OCR field, if it exceeds 12 million characters, into the next sequentially named OCR field.

NOTE – If you do not create a second sequentially named OCR field, you run the risk of losing overflow data. You will not receive an error on the import if your content exceeds the 12 million character limit.
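
The overflow behavior and the risk of silent loss can be pictured with a small Python sketch. This is not how ReadOCR.cpl is implemented; it simply fills each named field up to a limit and reports how many characters had nowhere to go. The function name is hypothetical, and splitting at an exact character count (rather than at a word boundary) is an assumption.

```python
FIELD_LIMIT = 12_000_000  # paragraph field capacity stated in this document


def split_across_fields(text, field_names, limit=FIELD_LIMIT):
    """Distribute text over the named fields, filling each up to limit.

    Returns (assignments, lost): assignments maps each used field name to
    its chunk, and lost is the number of characters that did not fit.
    """
    assignments = {}
    position = 0
    for name in field_names:
        chunk = text[position:position + limit]
        if not chunk:
            break  # all text has been placed
        assignments[name] = chunk
        position += len(chunk)
    return assignments, len(text) - position


# A tiny limit is used here so the behavior is easy to see.
print(split_across_fields("abcdefghij", ["OCR1", "OCR2"], limit=6))
print(split_across_fields("abcdefghij", ["OCR1"], limit=6))
```

With two fields the text is split between OCR1 and OCR2 and nothing is lost; with only one field, the characters beyond the limit are counted as lost, which mirrors the risk described in the NOTE.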

Importing your data

1 In the Documents menu, select Import then Delimited text.

Figure 4: Documents Menu – Import > Delimited text…

2 Select the Import/Overlay Wizard in the Import method dialog, and then click OK.

Figure 5: Import Method

3 Accept the default Load option for your initial import of data, and then click Next.

Figure 6: Import Wizard dialog – Load Method

4 Select the delimited format that matches the one used in your DAT file, then click Next.

NOTE – The Import Wizard defaults to the standard Concordance delimiters, but you may also select Comma Delimited (CSV), Tab Delimited, or choose the Custom format and specify your unique ASCII character delimiters in the drop-down menu shown in figure 7.

Figure 7: Import Wizard dialog – Format

5 In the Date format window, select a date format that matches the dates in your DAT file, and then click Next.

NOTE – The date format selected here does not affect how dates display in Table and Browse views. That preference was set when the date field was created in the New field definition dialog.

Figure 8: Import Wizard – Date format

6 By default, all of the fields you created will appear in the Selected Fields box. Make sure the order of the fields matches the order in your DAT file.

Figure 9: Import Wizard – Fields

If you need to change the order of the fields

• Move all the Selected Fields to the Available fields list by clicking on the button.

Or

• Click on a field to reorder and use the Up and Down buttons as needed to correct the order.

NOTE – If the first line of the DAT file contains the field headers, select the Skip first line checkbox so that the header line is not imported as a record and the imported data lines up with the fields in the Selected Fields window.

7 Click Next to confirm the Selected Fields and their order.

8 Click Browse to navigate to and select your DAT file (delimited ASCII), and then click Next.

Figure 10: Import Wizard – Open

9 Confirm the location of your DAT file in the File field and click Finish to import your data.

Figure 11: Import Wizard – Finish

10 When the import is complete, the dialog will close. Select the Browse view to verify that your data import was successful.
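
One way to support the verification in step 10 is to count the records in the DAT file and compare that number with the record count shown in Concordance. The Python sketch below also flags lines whose number of values differs from the number of fields selected in step 6. It assumes one record per line (no line breaks inside values) and the same assumed delimiters as elsewhere (ASCII 20 and þ); the function names and sample data are hypothetical.

```python
import csv


def count_records(lines, skip_first_line=True):
    """Count non-empty lines, optionally ignoring a header line."""
    count = 0
    for index, line in enumerate(lines):
        if index == 0 and skip_first_line:
            continue
        if line.strip():
            count += 1
    return count


def rows_with_wrong_width(lines, expected, delimiter="\x14", quotechar="\u00fe"):
    """Return 1-based line numbers whose value count differs from expected."""
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quotechar)
    return [n for n, row in enumerate(reader, start=1)
            if row and len(row) != expected]


# Hypothetical sample: a header, one complete record, one short record.
sample = [
    "\u00feBEGDOC\u00fe\x14\u00feAUTHOR\u00fe\x14\u00feDOCDATE\u00fe",
    "\u00feDOC001\u00fe\x14\u00feSmith\u00fe\x14\u00fe10/28/2014\u00fe",
    "\u00feDOC002\u00fe\x14\u00feLee\u00fe",
    "",
]
print(count_records(sample))
print(rows_with_wrong_width(sample, expected=3))
```

A mismatch between the counted records and the database, or any flagged line numbers, is a prompt to re-inspect the DAT file in a text editor rather than proof of a failed import.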

If you are not linking to images or loading OCR, you are ready to index your database and get started searching, tagging, and working with your records.
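
Step 5 distinguishes the date format found in the DAT file from the display format chosen when the date field was created. As an illustration of that distinction, the Python sketch below parses a date in one of the three listed formats and renders it in another. It assumes the dates are written with slashes, and the function and dictionary names are hypothetical; it does not describe how Concordance stores dates internally.

```python
from datetime import datetime

# strptime/strftime patterns for the three formats named in this document,
# assuming slash-separated values.
PATTERNS = {
    "MMDDYYYY": "%m/%d/%Y",
    "YYYYMMDD": "%Y/%m/%d",
    "DDMMYYYY": "%d/%m/%Y",
}


def convert_date(value, source_format, display_format):
    """Parse value using source_format and render it in display_format.

    Raises ValueError if value does not fit source_format, which is what
    happens when the wrong import format is chosen for the data.
    """
    parsed = datetime.strptime(value, PATTERNS[source_format])
    return parsed.strftime(PATTERNS[display_format])


print(convert_date("28/10/2014", "DDMMYYYY", "MMDDYYYY"))
print(convert_date("10/28/2014", "MMDDYYYY", "YYYYMMDD"))
```

Reading "28/10/2014" as MMDDYYYY raises a ValueError because there is no month 28, whereas an ambiguous value such as "05/06/2014" would parse under either order, which is why the import format needs to match the file.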

Additional Resources

General Product Information
http://law.lexisnexis.com/concordance

Concordance Technical Support
Phone: 866-495-2397
Email: concordancesupport@lexisnexis.com

Concordance Training
Phone: 425-463-3503
Email: concordancetraining@lexisnexis.com
