16.01.2013 Views

Microsoft SharePoint Products and Technologies Resource Kit eBook

Part VII: Information Management in SharePoint Products and Technologies

Therefore, crawled information is held in two places: the SPS.EDB ESE (Extensible Storage Engine) database and the index files or catalogs on the NTFS file system. If you encounter errors in the gatherer crawling process, these errors are recorded in the gatherer logs, which are discussed later in this chapter.

Adding File Types to the Indexing Process

The extension, or file type, of a file tells SharePoint Portal Server which filter (IFilter) to use to convert the text in the file to a Unicode character string during indexing. Administrators can add new file types so that they are included in the indexing process. If a file's extension does not appear in the File Type list, the gatherer does not crawl that file. A new file type can be associated with an existing IFilter; however, each IFilter is designed to filter documents in only one format. If the new file extension belongs to an application that does not have an IFilter, its files cannot be properly indexed. SharePoint Portal Server also accepts third-party IFilters for custom file types.

When a file type is added, it applies only to content that is stored outside the portal site and included in the content index through content sources. Therefore, as long as you have a protocol handler to connect to an external content source, an IFilter to extract the data from the documents in that content source, and the file type added to the portal site, those documents should be available for crawling, indexing, and searching.
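The three prerequisites above can be read as a simple checklist. The following Python sketch is only a conceptual model, not SharePoint Portal Server code: the ContentSource class, its fields, and the missing_prerequisites function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContentSource:
    """Hypothetical description of one external content source."""
    has_protocol_handler: bool  # a protocol handler can connect to the source
    has_ifilter: bool           # an IFilter exists for the documents' format
    file_type_added: bool       # the extension was added to the portal site


def missing_prerequisites(source: ContentSource) -> List[str]:
    """Return the prerequisites still missing; an empty list means ready."""
    missing = []
    if not source.has_protocol_handler:
        missing.append("protocol handler")
    if not source.has_ifilter:
        missing.append("IFilter")
    if not source.file_type_added:
        missing.append("file type")
    return missing
```

An empty result corresponds to the case in which the documents should be available for crawling, indexing, and searching; any other result names what still has to be put in place.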

You can add file types to be included in the content index. All files with a file extension in the list are included in the index. When you add a file type to SharePoint Portal Server 2003 and register its IFilter, files of that type can be read and searched by the portal site. After you add a new file type, it appears on the Include File Types page.

If you delete a file type, files in that format are no longer indexed by or searchable from SharePoint Portal Server 2003. Changes in file types do not take effect in the index until you have run a full update on all your content sources. Plan your file types carefully, and allow time for new file types to be included in the index as you run new full updates on your content sources.

Note: You must register the IFilter for new file types. If you add a new file type but no IFilter is registered, only the file properties are included in the index.
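The rules in this section and in the preceding note can be summarized as a small decision function. This is a hedged sketch of the behavior the text describes, not an API of SharePoint Portal Server: index_outcome, included_types, and registered_ifilters are hypothetical names, and the returned labels are illustrative.

```python
def index_outcome(filename, included_types, registered_ifilters):
    """Classify what the text says happens to a file during a crawl.

    included_types      -- extensions on the Include File Types list (assumed set)
    registered_ifilters -- extensions with a registered IFilter (assumed set)
    """
    # Take the text after the last dot as the extension, lowercased.
    _, dot, ext = filename.rpartition(".")
    ext = ext.lower() if dot else ""

    if ext not in included_types:
        return "not crawled"           # extension is not in the file type list
    if ext not in registered_ifilters:
        return "properties only"       # no IFilter: only file properties are indexed
    return "content and properties"    # IFilter converts the document text for the index
```

For example, under these assumptions a .pdf file whose extension was added to the list but has no registered IFilter falls into the "properties only" case.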

To add a file type

1. On the Site Settings page, in the Search Settings and indexed content section, click Configure search and indexing.
