Give me my drawing back! - The Document Foundation Wiki
Give me my drawing back!
Dragging your Visio, Publisher and CorelDraw files to the free-software world
Fridrich Štrba
Software Engineer, SUSE
Agenda
LibreOffice's contribution to the wider FOSS ecosystem
Visio, CorelDraw, Publisher, ...
Interesting parts of the reverse-engineering
Incremental reverse-engineering
Evolution of file formats observed
LibreOffice's contribution to the wider FOSS ecosystem
Designed to be reused
LibreOffice uses technologies available in the FOSS ecosystem
We love to give back and share the fruit of our sweat
libwpg, libvisio, libcdr and libmspub
Standalone libraries
Using the same interface
Internal class generating SVG for lazy hackers :)
More users, more bug reports and (eventually) fixes
Reverse-engineering is by its nature a trial-and-error exercise
Visio import filter - libvisio
Google Summer of Code 2011
Eilidh McAdam
Previous reverse-engineering work by re-lab's Valentin Filippov
Started with the Visio 2000 – Visio 2010 file formats
LibreOffice 3.5 release
Visio 2000 and Visio 2002 – version 6 file format
Visio 2003 to Visio 2010 – version 11 file format
Extended in 2012 to ALL Visio file-format versions that ever existed
Upcoming LibreOffice 4.0 release
Visio 2013 – OOXML-ish version (*.vsdx)
Visio 1 – 5
Visio XML Drawings (*.vdx)
The team
Valentin Filippov, Fridrich Štrba, Eilidh McAdam
CorelDraw import filter - libcdr
Work started in late 2011
Released in LibreOffice 3.6.x
Still improving
Valek's reverse-engineering work
cdr_explorer
Some of it reused in the sk1 project, which is currently dormant
An interesting challenge after the success of libvisio
Continuation of a fruitful collaboration
Support for ALL CorelDraw file formats
Starting from version 1 (code name Waldo)
Ending with CorelDraw X6, released in March 2012
Microsoft Publisher import filter - libmspub
Google Summer of Code 2012
Brennan T. Vincent
Flagship feature of LibreOffice 4.0
Reverse-engineering started by Valek Filippov
Completed in tandem
Version support
MS Publisher 97
MS Publisher 98/2000
MS Publisher 2002-2013
Interesting elements:
Incremental reverse-engineering
Progressive development of file formats
Nobody reinvents the wheel from scratch
It is useful to know the release dates of different versions when doing reverse-engineering
Two consecutive versions of the same file format will have many things in common
Design the parser to be able to parse lower and higher versions
Open-ended version conditions
Guard assumptions with exceptions and be verbose in debug mode
Try to parse a lower or higher version using the existing parser
Fix issues as they appear
Importance of a small number of reference documents covering many features
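
The "open-ended version conditions" and "guard assumptions with exceptions" points can be sketched roughly as follows. This is only an illustration in Python, not code from libvisio or libcdr: the record layout (a pair of coordinates that widened from 16 to 32 bits at version 6) and the names `ParseError`, `coordinate_format` and `read_point` are all made up for the example.

```python
import logging
import struct

log = logging.getLogger("format_sketch")


class ParseError(Exception):
    """Raised when the input contradicts an assumption the parser relies on."""


def coordinate_format(version):
    # Open-ended condition: any version from 6 upwards is assumed to behave
    # like version 6 until a real document proves otherwise. A closed test
    # such as `version == 6` would reject every unknown version outright.
    # The 16-bit/32-bit split at version 6 is a hypothetical layout.
    return "<i" if version >= 6 else "<h"


def read_point(data, offset, version):
    """Read one (x, y) pair and return it together with the next offset."""
    fmt = coordinate_format(version)
    size = struct.calcsize(fmt)
    end = offset + 2 * size
    # Guard the assumption that the record fits, instead of reading garbage.
    if end > len(data):
        raise ParseError(
            f"point at offset {offset} runs past the end of the data "
            f"(version {version})"
        )
    (x,) = struct.unpack_from(fmt, data, offset)
    (y,) = struct.unpack_from(fmt, data, offset + size)
    # Verbose only when debug logging is switched on.
    log.debug("version=%d offset=%d point=(%d, %d)", version, offset, x, y)
    return (x, y), end
```

With conditions written this way, feeding an unknown higher version through the existing code path either works or fails loudly with `ParseError`, and the debug log shows where the first wrong assumption was hit.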
Extending the CorelDraw version coverage (1)
Starting point
Support for versions 7 to X3
Basically the knowledge from cdr_explorer
Extending the coverage upwards
X4 and X5
Support for RIFF documents inside structured ZIP storage
X6
More complicated structure inside the ZIP storage
Extending the coverage downwards
Version 6 (first 32-bit version)
Only some RIFF names differ
Versions 4 and 5 (16-bit versions)
Different way to express coordinates
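
For readers unfamiliar with RIFF, the container that most CDR versions build on, a minimal chunk walker might look like the sketch below. It relies only on the generic RIFF layout (a four-byte identifier, a little-endian 32-bit size, the payload, and one padding byte after odd-sized payloads; `RIFF` and `LIST` chunks hold a four-byte type followed by nested chunks). It knows nothing about CorelDraw's own chunk names, and `walk_riff` and `make_chunk` are hypothetical helpers written for this illustration.

```python
import struct
from typing import Iterator, Optional, Tuple

CONTAINERS = (b"RIFF", b"LIST")


def walk_riff(data: bytes, start: int = 0, end: Optional[int] = None,
              depth: int = 0) -> Iterator[Tuple[int, bytes, bytes]]:
    """Yield (depth, fourcc, payload) for every chunk in data[start:end].

    For container chunks the yielded payload is just the 4-byte type;
    their children follow with depth + 1.
    """
    if end is None:
        end = len(data)
    pos = start
    while pos + 8 <= end:
        fourcc, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if body + size > end:
            raise ValueError(f"chunk {fourcc!r} at {pos} overruns its parent")
        if fourcc in CONTAINERS and size >= 4:
            yield depth, fourcc, data[body:body + 4]
            yield from walk_riff(data, body + 4, body + size, depth + 1)
        else:
            yield depth, fourcc, data[body:body + size]
        # Payloads are padded to an even length; the pad byte is not
        # counted in the size field.
        pos = body + size + (size & 1)


def make_chunk(fourcc: bytes, payload: bytes) -> bytes:
    """Build a chunk for experiments (test helper, not part of any format)."""
    pad = b"\x00" if len(payload) % 2 else b""
    return fourcc + struct.pack("<I", len(payload)) + payload + pad
```

A walker of this kind makes it cheap to dump the chunk tree of a file from a neighbouring version and spot which names changed.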
Extending the CorelDraw version coverage (2)
Extending the coverage downwards (cont'd)
Version 3
First RIFF-based CDR file format
but we did not know that at the time
Fill and outline information embedded inside the shape
Shape transform does not accumulate group transforms
Versions 2 and 1
Not RIFF-based at all
Version 2 more structured
With some exception handling, both can be parsed alike
A header with pointers to different sequences of chunks
Implementation of linked list (“type 1”) and shape information (“type 2”)
Embedded raster (“type 3” and “6”), group transforms (“type 7”), arrow information (“type 8”), ...
Extending the Visio version coverage (1)
Starting point
Versions 6 and 11
Differences in some offsets and in text encoding
Common structure
A trailer pointing to “streams”
Some “streams” consist of a hierarchical sequence of “chunks”
Shapes and text content in “chunks”
Bug-driven rewrite
A document (most likely generated by an SDK)
It completely challenged our assumptions and led to a more generalized parser
Extending the Visio version coverage (2)
Microsoft Visio 2013 Preview
We wanted to support it before the official release
XML-based (OOXML-ish) file format (*.vsdx)
Another rewrite of the parsers
Need to separate parsing from information processing more clearly
Side effect: support for Visio XML Drawing (*.vdx)
Versions 1 to 5
Some “chunks” of type list are different
An override for readers of some chunks
“streams” format very similar
Only minor abstractions and generalizations needed
Improved understanding of the file format
Cleaner and simpler parser
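
One way to picture the separation of parsing from information processing, and the "override for readers of some chunks", is the sketch below. It is an assumption-laden illustration in Python, not the libvisio design: the `Collector` interface, the three parser classes, the binary record layout (`uint16` id, `uint8` length, text bytes) and the XML element names are all invented for the example.

```python
import struct
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class Collector(ABC):
    """Receives parsed information; knows nothing about file layouts."""

    @abstractmethod
    def start_shape(self, shape_id): ...

    @abstractmethod
    def add_text(self, text): ...

    @abstractmethod
    def end_shape(self): ...


class ListCollector(Collector):
    """Trivial information processing: gather shapes into a list."""

    def __init__(self):
        self.shapes = []
        self._current = None

    def start_shape(self, shape_id):
        self._current = {"id": shape_id, "text": ""}

    def add_text(self, text):
        self._current["text"] += text

    def end_shape(self):
        self.shapes.append(self._current)
        self._current = None


class BinaryParser:
    """Hypothetical layout: records of <uint16 id><uint8 length><text>."""

    def __init__(self, collector):
        self.collector = collector

    def read_text(self, raw):
        return raw.decode("utf-16-le")

    def parse(self, data):
        pos = 0
        while pos < len(data):
            shape_id, length = struct.unpack_from("<HB", data, pos)
            pos += 3
            raw = data[pos:pos + length]
            if len(raw) != length:
                raise ValueError(f"truncated text in shape {shape_id}")
            pos += length
            self.collector.start_shape(shape_id)
            self.collector.add_text(self.read_text(raw))
            self.collector.end_shape()


class OldBinaryParser(BinaryParser):
    """Older hypothetical variant: only the text reader is overridden."""

    def read_text(self, raw):
        return raw.decode("latin-1")


class XmlParser:
    """Hypothetical XML flavour feeding the very same collector."""

    def __init__(self, collector):
        self.collector = collector

    def parse(self, text):
        for shape in ET.fromstring(text).iter("Shape"):
            self.collector.start_shape(int(shape.get("ID")))
            self.collector.add_text(shape.findtext("Text", default=""))
            self.collector.end_shape()
```

Because every parser talks only to the collector interface, a new on-disk flavour costs one more parser class, while the code that turns shapes into output stays untouched.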
Getting involved ...
how you can make a difference
Future file formats to import?
Google Summer of Code
The possibility for a student to work with outstanding mentors
Valentin Filippov
Yours truly
(Altsys, Aldus, Macromedia & Adobe) Freehand
File format partially reverse-engineered
The broad outline of the structure
Ripe to be a successful project
A talented student can make a difference in LibreOffice
Impact within LibreOffice and the known universe
Happy users will reward you
You will be the hero of the people who can now read their documents...
… and they will get on your nerves listing features that are not converted.
Users outside LibreOffice
Inkscape reuses libvisio and libcdr in 0.49
Calligra reuses libvisio and (possibly) libcdr since 2.5
Q&A and stoning session
All text and image content in this document is licensed under the Creative Commons Attribution-Share Alike 3.0 License (unless otherwise specified). "LibreOffice" and "The Document Foundation" are registered trademarks. Their respective logos and icons are subject to international copyright laws. The use of these therefore is subject to the trademark policy.