Download - Academy Publisher
Research and Realization about Conversion<br />
Algorithm of PDF Format into PS Format<br />
Xingfu Wang, Lei Qian, Fuyou Mao, and Zhaosheng Zhu<br />
School of Computer Science & Technology, University of Science and Technology of China (USTC),<br />
Hefei, Anhui, P.R. China, postal code: 230027<br />
wangxfu@ustc.edu.cn, mfy@ustc.edu.cn<br />
Abstract—This paper first introduces the characteristics of<br />
PostScript and PDF documents and, on that basis, argues for<br />
the necessity and feasibility of converting the PDF document<br />
format into a PostScript language program. It then studies<br />
the main algorithms and techniques of the conversion<br />
process, implements information extraction from PDF<br />
documents, and finally realizes, on this basis, a software<br />
algorithm that converts the PDF document format into PS<br />
format.<br />
Index Terms—PDF, PS (PostScript), format conversion,<br />
object-oriented, interpreter<br />
I. INTRODUCTION<br />
PDF (Portable Document Format), developed by<br />
Adobe, is an open electronic file format suited to<br />
transmitting and sharing files between different computer<br />
systems. Its advantages include cross-platform support,<br />
high compression, suitability for screen viewing and<br />
network transmission, document protection, electronic<br />
review, and high-quality print output. The PS language<br />
(PostScript, a page description language, PDL), also<br />
owned by Adobe, is a de facto standard in the printing<br />
industry; it can describe exquisite layouts and currently<br />
holds the dominant position in printing. Although PDF<br />
was developed on the basis of PostScript, it is not a<br />
programming language. Hence, for output on a common<br />
PostScript printer, a PDF document must be converted to<br />
a PostScript data stream, or converted to a PS document<br />
before output.<br />
II. CHARACTERISTICS AND STRUCTURES OF PS LANGUAGE<br />
PostScript, a common page description language in<br />
modern printing technology, is an interpreted<br />
programming language with strong graphics capabilities.<br />
It takes Adobe's imaging model as the basis for describing<br />
pages, and its main application is describing text, graphics,<br />
shapes and sampled images on printed or displayed pages.<br />
Programs written in PostScript can carry, in the form of<br />
documents, the communication from a layout system to a<br />
printing system, or control how objects appear on a<br />
display. The description produced by PostScript is<br />
high-level and independent of the device, so it has become<br />
an important and indispensable component of high-quality<br />
printing and output. At present, many printers, film<br />
phototypesetters, printing phototypesetters, digital printers<br />
and other equipment have a PostScript interpreter<br />
installed; much RIP processing software takes a PostScript<br />
interpreter as its core technology; and many<br />
image-processing, graphic design and typesetting programs<br />
also support PostScript, the best known being Photoshop,<br />
CorelDraw, Illustrator, Freehand, QuarkXPress and<br />
PageMaker.<br />
This work is supported by the National Natural Science Foundation of<br />
China (No. 60773037 and 60970128) and by the Teaching Research Project of<br />
Anhui Province (2007)<br />
© 2010 ACADEMY PUBLISHER<br />
AP-PROC-CS-10CN006<br />
ISBN 978-952-5726-09-1 (Print)<br />
Proceedings of the Second International Symposium on Networking and Network Security (ISNNS ’10)<br />
Jinggangshan, P. R. China, 2-4, April. 2010, pp. 031-034<br />
A. The basic characteristics of PostScript language<br />
Compared with other file formats, the PostScript language<br />
offers page description capability and can handle text and<br />
images together. PostScript documents have further<br />
advantages. One is that a PostScript document is<br />
independent of both the device and the operating system<br />
platform. The graphical environment of UNIX itself treats<br />
PostScript support as a core part, so whether Windows or<br />
UNIX is used, a PostScript document can be read and<br />
printed well, which makes it convenient to exchange.<br />
Moreover, although PostScript documents can also be<br />
saved in binary encoding, they generally express and store<br />
information as ASCII text, which makes them easier to<br />
read and edit.<br />
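The point about ASCII storage can be illustrated with a small Python sketch. It is not part of the paper's converter. The function name `looks_like_ascii_postscript` is hypothetical, and the check for a leading `%!` relies on the usual PostScript file-signature convention, which the text above does not state.

```python
def looks_like_ascii_postscript(data: bytes) -> bool:
    """Heuristic check: the data starts with '%!' and holds only 7-bit ASCII."""
    # Assumption: PostScript files conventionally begin with the comment '%!'.
    if not data.startswith(b"%!"):
        return False
    # bytes.isascii() is True only if every byte is in the range 0x00-0x7F.
    return data.isascii()


# A tiny page description stored as plain, human-readable ASCII text.
sample = (
    b"%!PS\n"
    b"/Helvetica findfont 12 scalefont setfont\n"
    b"72 720 moveto (Hello) show\n"
    b"showpage\n"
)

if __name__ == "__main__":
    print(looks_like_ascii_postscript(sample))
    print(sample.decode("ascii"))
```

Because such a file is ordinary text, it can be opened and edited in any text editor, which is the readability advantage described above.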
B. The composition of a PS document and the characteristics of procedures<br />
The PostScript language is a high-level interpreted<br />
language with a wealth of data types and control<br />
statements. It also introduces the concept of the<br />
procedure and, like other programming languages (such as<br />
C), adopts the structured programming method of<br />
top-down, stepwise refinement. A well-structured<br />
PostScript file is usually composed of two parts: a preface<br />
part and a description part.<br />
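The two-part structure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes that the preface part corresponds to the prolog and the description part to the page script of Adobe's Document Structuring Conventions. The names `build_postscript` and `showline` are hypothetical.

```python
def build_postscript(text: str, x: int = 72, y: int = 720) -> str:
    """Return a minimal two-part PostScript program as ASCII text."""
    # Escape the characters that are special inside a PostScript string literal.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Header comments (assumed DSC conventions).
    header = ["%!PS-Adobe-3.0", "%%Pages: 1", "%%EndComments"]

    # Preface part (prolog): procedure definitions only, nothing is drawn here.
    prolog = [
        "%%BeginProlog",
        "/showline {   % x y string -> (nothing)",
        "  3 1 roll moveto show",
        "} bind def",
        "%%EndProlog",
    ]

    # Description part (script): uses the procedure to describe one page.
    script = [
        "%%Page: 1 1",
        "/Helvetica findfont 12 scalefont setfont",
        f"{x} {y} ({escaped}) showline",
        "showpage",
    ]

    return "\n".join(header + prolog + script + ["%%EOF"]) + "\n"


if __name__ == "__main__":
    print(build_postscript("Hello (PS)"))
```

The sketch handles ASCII text only. Keeping the procedure definitions in the preface part and the page content in the description part means the same definitions can serve every page that follows.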
Syntax, data types and execution semantics are the three<br />
basic aspects of any PostScript program. Coupled with its<br />
outstanding capabilities for graphics, images and text, the<br />
PostScript language is well suited to controlling the tasks<br />
of page processing and printing. The grammar of<br />
PostScript is quite simple, yet the language is very<br />
powerful, because its features can be combined in any<br />
way without restriction. Using these functions of PS<br />