13.07.2015 Views

Extensions of the UNIX File Command and Magic File for File Type ...


Figure 5. Display of File Types Identified by the File Type Identifier.

The reason that only two PUIDs are shown in Fig. 5 is that the File Type Identifier recognizes many more types of archive files than are registered in the PRONOM registry.

The parameters of the file command are used to exclude the file command's procedural tests. These include tests for the character set of text files, the language of a text file, tar files, EMX application type, compound document files and ELF files. There are several reasons for temporarily excluding these procedural tests. First, many of these tests result not in file type identification but in additional description or metadata extraction.

Secondly, the recognition of character sets should be possible with a finite state acceptor. Finite state acceptors are equivalent in recognition power to regular expressions, and the magic language supports tests of the content of lines of text using regular expressions. An extension of the regular expression tests to multiple lines might accomplish the recognition of character sets.
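To illustrate the idea that character set recognition needs no more than a finite state acceptor, the following minimal sketch applies regular expressions to a whole byte buffer, so that line breaks are treated as ordinary bytes. It is not the paper's implementation or part of the magic language: the function name `classify_charset`, the returned labels, and the choice of recognizing only US-ASCII and well-formed UTF-8 are assumptions made for illustration.

```python
import re

# One well-formed UTF-8 encoded character, written as alternatives over byte
# ranges. Overlong forms, surrogates, and values above U+10FFFF are excluded.
_UTF8_CHAR = (
    rb"[\x00-\x7F]"                         # 1 byte (ASCII)
    rb"|[\xC2-\xDF][\x80-\xBF]"             # 2 bytes
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"         # 3 bytes, no overlong forms
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"  # 3 bytes, general case
    rb"|\xED[\x80-\x9F][\x80-\xBF]"         # 3 bytes, no surrogates
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"      # 4 bytes, no overlong forms
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"          # 4 bytes, general case
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2}"      # 4 bytes, up to U+10FFFF
)

# Both patterns are plain regular expressions (no backreferences or
# lookaround), so each corresponds to a finite state acceptor.
ASCII_RE = re.compile(rb"[\x00-\x7F]*")
UTF8_RE = re.compile(rb"(?:" + _UTF8_CHAR + rb")*")


def classify_charset(data: bytes) -> str:
    """Return a hypothetical charset label for the whole buffer.

    The patterns are matched against the entire input rather than line by
    line; newline bytes are accepted like any other ASCII byte.
    """
    if ASCII_RE.fullmatch(data):
        return "us-ascii"
    if UTF8_RE.fullmatch(data):
        return "utf-8"
    return "unknown"
```

For example, `classify_charset("héllo\nwörld\n".encode("utf-8"))` yields the `"utf-8"` label, while the same text encoded as Latin-1 falls through to `"unknown"`. A fuller treatment would add acceptors for other encodings and would likely reject control bytes that do not occur in text files; both are left out of this sketch.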
