OpenDocIda04

____________ Open Document Formats ____________



Web Resources:

Bob Sutor, Vice President of Standards and Open Source for the IBM Corporation, on topic:

I. IDA documentation on the Promotion of Open Document Exchange Format

http://europa.eu.int/ida/en/document/3439

"By decision of the European Parliament and Council, The IDA Programme was set up in order to encourage information content interoperability through the promotion of trans-European telematic networks between Administrations, Institutions and Agencies."

(*) = Reviewed and quoted in this text. Feel free to work on the missing ones!

1. (*) Valoris report on Open Document Formats

2. (*) Responses from Microsoft to the Valoris report - April 2004

3. (*) Responses from Sun to the Valoris report - April 2004

4. ( ) Recommendations endorsed by the IDA Telematic Advisory Committee (TAC) - May 2004

5. (*) Responses from Microsoft to the TAC recommendations

6. (*) Responses from Sun to the TAC recommendations - Sep. 2004

7. (*) Responses from IBM to the TAC recommendations - Nov. 2004

8. ( ) Positions from other players

9. ( ) IDA Work programmes

10. ( ) The HAM work programme 2004 - PDF

11. ( ) The HAM work programme 2003 - PDF

Markers: ---> <--- surround quoted parts, each part included in "", parts in original doc order. Within quotes, a heading # marks a custom insertion (comment).


II. The Valoris report


---> 1. (report)

"After increasing pressure from users and EU Administrations, Microsoft has announced on 17 Nov. 2003, the publication of the XML Reference Schemas for its office suite. Except for some technical details, these XML formats seem to be completely documented, and licence to use them is granted freely. However, the licence includes some constraints, which need to be examined carefully as they might be too binding and incompatible with GPL software integration. OpenOffice.org (OOo) is both an office applications suite and an XML based format. OpenOffice.org is a community-based project and is based on the open-sourced code from an older version of StarOffice bought by Sun Microsystems from German software company StarDivision, founded in 1991. In 2000, Sun released the source code of StarOffice software publicly through OpenOffice.org, thus initiating the world's largest open source project. OpenOffice is currently being standardised by Oasis. The chairman of the Oasis technical committee in charge of OpenOffice predicts that the standard will be voted in the first half of 2004. OOo boasts a confirmed and growing user base estimated between 10 and 40 million users, mainly in governments and administrations. In its attempt to standardize the OpenOffice.org format, Sun Microsystems is not backed up by other market players except for the Open Source community."

"Our view is that the corresponding formats, namely MS XML Reference schemas and OOo will naturally follow the adoption of the major tools behind them. Microsoft's XML lead and market dominance will remain for the few years to come. On the other hand, OpenOffice user base size is now such that it is irreversible, and it constitutes a viable alternative to Microsoft. In terms of wide adoption of the format, we believe that none of the two will be winner or a knockout loser, with MS dominating the user base at 85%. The two formats will coexist, but OOo will become more and more the open format reference for interoperability across platforms."

"Lack of presentation fidelity could also have dangerous effects. On June 27, 1988, a train crashed into the Gare de Lyon in Paris at high speed and without braking. More than fifty people were killed in the accident and a considerable number wounded. The accident was found to be caused by the coincidence of two badly formatted indented instructions in the maintenance manual."

"2.2.7 Support for emerging word processor features This criterion extends the above one to those more advanced features, which we believe will be part of the future word processor tools:

As discussed earlier, we foresee electronic document to become more and more decoupled from application. More and more applications will intervene on subparts of the document. The recently added support for user-defined XML schemas in Word2003 confirms this trend. A Word document can now contain Word mark-up, in combination with extra attributes to be used by other business applications. Situations are imaginable where many applications intervene in a similar way on the same document."

Office 2003 and MS XML

"In Microsoft Word, users can:

Document Workspaces allow users to:

Also, using the XML schemas, it is possible to integrate Office documents into business processes. For example, a letter is written to a client that includes the customer number, the subject and the date; each of these has a style attached to it. Based on the style, a downstream process can extract these fields and put them in a CRM system with a pointer to the actual document. The administrator just writes the letter and the rest happens automatically."
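# Illustration (not part of the report): the style-based extraction described in the quote above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the namespace is the one used by Word 2003 WordprocessingML, while the style names (CustomerNumber, Subject, LetterDate), the sample letter and the function extract_fields are hypothetical and only serve the example. A real template would define its own styles, and the hand-over to a CRM system is left out.

```python
import xml.etree.ElementTree as ET

# Namespace of Word 2003 WordprocessingML (verify against the published schemas).
W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Hypothetical mapping from paragraph style names to business field names.
FIELD_STYLES = {
    "CustomerNumber": "customer_number",
    "Subject": "subject",
    "LetterDate": "date",
}

# Hypothetical, heavily simplified letter; real files carry much more markup.
SAMPLE_LETTER = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="CustomerNumber"/></w:pPr><w:r><w:t>C-10423</w:t></w:r></w:p>
    <w:p><w:pPr><w:pStyle w:val="Subject"/></w:pPr><w:r><w:t>Renewal of </w:t></w:r><w:r><w:t>your contract</w:t></w:r></w:p>
    <w:p><w:pPr><w:pStyle w:val="LetterDate"/></w:pPr><w:r><w:t>2004-04-15</w:t></w:r></w:p>
    <w:p><w:pPr><w:pStyle w:val="BodyText"/></w:pPr><w:r><w:t>Dear customer, ...</w:t></w:r></w:p>
    <w:p><w:r><w:t>Unstyled closing line.</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def extract_fields(xml_text):
    """Return {field_name: text} for paragraphs whose style is in FIELD_STYLES."""
    root = ET.fromstring(xml_text)
    fields = {}
    for para in root.iter(f"{{{W}}}p"):
        style = para.find("w:pPr/w:pStyle", NS)
        if style is None:
            continue  # paragraph has no explicit style
        field = FIELD_STYLES.get(style.get(f"{{{W}}}val"))
        if field is None:
            continue  # style is not one of the business fields
        # A paragraph may be split into several runs; join their text.
        fields[field] = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
    return fields


if __name__ == "__main__":
    print(extract_fields(SAMPLE_LETTER))
```

# The point of the sketch is that any XML-aware tool can do this without the word processor itself, which is the interoperability argument made in the report.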

MS XML intellectual property issues

"MS does not believe the patent licence will hinder vendors such as Oracle or IBM or other players from developing WordProcessingML compatible tools. MS believes that standardising the XML Reference Schemas will bring the risk of slowing down their evolution, which could hinder development of future features in MS Office. According to Microsoft, 'the XML Reference Schema announcement underlines Microsoft's commitment to constructive dialogue with governments and the industry with regard to intellectual property issues. Microsoft listened to requests for clarification of its licensing policy with regard to the Office 2003 XML Reference Schemas and is now responding to those requests by delivering a world-wide open and royalty-free licensing program. Individuals and organisations, including governments, academics and commercial software vendors, can enter into the license.'"

"Regarding MS XML Reference schemas and the above definition, we could express the following reservations:

---> 2. Comment by Microsoft: "This statement is true, and is equally true of other formats as well, including the OOo file format. XML is extensible. Therefore, proprietary objects may be embedded. For instance, proprietary Java applets can be embedded in StarOffice documents. Perhaps this point can be stated as a neutral observation about extensible formats in general rather than a 'reservation' that is specific to the Microsoft XML Reference schemas." <--- 2.

---> 2. Comment by Microsoft: "Proprietary macros may be included in documents, but this is not a weakness of Microsoft's implementation of XML. The ability to manage macros is a feature of a word processing application. The Office XML schemas are a viable cross-platform solution. There is no inherent platform restriction in the Microsoft schemas." <--- 2.

"3.4.3.1 The intellectual property

The intellectual property remains with Microsoft. The license precludes the modification or extension of the schemas. Microsoft is not offering these schemas to a standards body. Patents are associated to the license.

License excerpts: 'Microsoft may have patents and/or patent applications that are necessary for you to license in order to make, sell, or distribute software programs that read or write files that comply with the Microsoft specifications for the Office Schemas...' 'Except as provided below, Microsoft hereby grants you a royalty-free license under Microsoft's Necessary Claims to make, use, sell, offer to sell, import, and otherwise distribute Licensed Implementations solely for the purpose of reading and writing files that comply with the Microsoft specifications for the Office Schemas.'

The schema download contains language that allows to copy and distribute the schema, subject to certain limitations (credit it and link to a particular page at Microsoft). But the download doesn't grant the right to implement a program that can use the specifications.

This part is ambiguous. Two theories conflict on this matter. The first one translates "not being licensed to distribute under other license terms in the Patent License" as a clause designed to prevent applications that use the GNU General Public License (GPL) from implementing Office XML compatibility. Developers writing open source software should be careful before using these schemas. The second, more positive, is from Eben Moglen, the Free Software Foundation FSF's pro bono counsel. He told www.theregister.co.uk he didn't think the alarm is justified. 'This is not a license that I would like to accept; Microsoft is saying we might have some patents. But it's not a problem if Microsoft is making it available to everyone to make use and sell.'"

"While Microsoft will make available the Office schemas, the company will retain control over how those schemas are developed in the future. That puts the burden on competitors to keep up with Microsoft's changes. At the same time, Microsoft reserves the right to change its policy and/or the terms of the licenses with respect to future versions of Office."

"The actual Microsoft patent statement says you must obtain a license if you use the information in a separate application for compatibility. Quoting them: 'There is a separate patent license available to parties interested in implementing software programs that can read and write files that conform to the Specification.'"

---> 2. Statement by Microsoft: "This Microsoft license follows a common approach to intellectual property license grants by expressly stating the activities included within the scope of the license, as opposed to describing the activities that are not. As stated in the License Agreement, the licensee has the right 'to make, use, sell, offer to sell, import, and otherwise distribute Licensed Implementations solely for the purpose of reading and writing files that comply with the Microsoft specifications for the Office Schemas'. A 'Licensed Implementation' means only those specific portions of a software product that read and write files that are fully compliant with the specifications for the Office Reference Schemas." <--- 2.

The OpenOffice.org Office-Suite

"OpenOffice.org (OOo) is an office applications suite. OpenOffice.org is a community-based project and is based on the open-sourced code from an older version of StarOffice created by Sun Microsystems. The goal of the OpenOffice.org community is to 'create the leading international office suite that will run on all major platforms and provide access to all functionality and data through open-component based APIs and an XML-based file format.' As described in the overview document, OpenOffice.org is both an Open Source product and a project. The product is a multi-platform office productivity suite. It includes the key desktop applications, such as a word processor, spreadsheet, presentation manager, and drawing program, with a user interface and feature set similar to other office suites. OpenOffice.org also works transparently with a variety of file formats, including those of Microsoft Office. OpenOffice.org is available for download on the OpenOffice.org website and distributed by partner vendors."

"The OpenOffice.org format is an XML format, which is fully documented and freely available from the OpenOffice.org open source community. Its use and extensibility is provided freely with no legal constraints. Furthermore, OpenOffice format is being standardized by OASIS, the 'Organization for the Advancement of Structured Information Standards'. OASIS is a not-for-profit, global consortium that drives the development, convergence and adoption of e-business standards. OASIS has more than 600 corporate and individual members in 100 countries around the world. OASIS and the United Nations jointly sponsor ebXML, a global framework for e-business data exchange. The purpose of the OASIS OpenOffice Technical Committee is to create an open, XML-based file format specification for office applications."

"4.3.2 Cross-platform interoperability This criterion holds true for the OOo format, at least for the platforms on which the OpenOffice/StarOffice tools have been implemented. Nothing in the OOo format as such should prevent it from being processed on further existing platforms, or future ones." "The user base is difficult to establish precisely, mainly due to the fact that one could not distinguish users from those who simply download. One thing seems to be sure, however, is that OOo adoption in governments and Administrations is beyond any doubt as the following sample illustrate:

Other countries and administrations include China, Thailand, Israel, Australia, Philippines, Uganda, and Vietnam. The main criteria in favour of OpenOffice/StarOffice adoption are the price (free or extremely low), openness, and multiplatform capabilities especially for Linux and Windows."

Some of the reports reference links:

Oasis http://www.oasis-open.org

Market Overview 2003: Office Productivity Suites Erosion Begins for Microsoft Office Dominance http://www.gigaweb.com

Microsoft Licenses Office 2003 XML Reference Schemas http://xml.coverpages.org/LicenseOfficeSchemas.html

German study by Soreon: 25% cost savings with Microsoft alternatives http://de.internet.com/?id=2025258&section=Marketing-Statistics

Office 2003: une entreprise sur cinq envisage de migrer http://www.01net.com/article/220532.html

STATSKONTORET, Interoperability Test and XML Evaluation of StarOffice Writer 6.0 and Office Word 2003 Beta 2 https://createpdf.adobe.com/cgi-feeder.pl/help_general?BP=&LOC=en_US

Microsoft licensing could push users to StarOffice http://www.computerworld.com/printthis/2002/0,4814,70710,00.html

Aristote Seminar (Huc Presentation) http://www.microsoft.com/presspass/press/2003/nov03/11-17XMLRefSchemaEMEAPR.asp

OpenOffice.org XML File Format http://xml.coverpages.org/starOfficeXML.html <--- 1. (report)

---> 3. (Sun) "To meet the criteria established within the Valoris report, a document format must be non-binary (section 2.2.2). As a result, Microsoft's XML Reference Schemas were researched for evaluating the Microsoft Office 2003 suite. However, the report never mentioned that the XML Reference Schemas do not fully describe all document formats used by the applications within Microsoft's Office suite. For example, there is no way to save Microsoft PowerPoint documents in a non-binary/XML format, nor is there an associated XML Reference Schema available for PowerPoint. Since PowerPoint is a core application within Microsoft's Office suite, and creating presentations is a common function performed by users of office suites, this is a significant omission. Also, some of the advanced features supported by Office 2003, such as Information Rights Management (IRM), are only supported when using the proprietary binary Office document formats. Since part of the evaluation criteria established for this evaluation project include support for 'emerging word processor features' (section 2.2.7) these details are important to note."

"Further, the chart in section 6.1 listing market share statistics for Microsoft's Office suite doesn't include Office 2003. Since the XML Reference Schemas are only applicable to Office 2003, and the project did not consider evaluating the Microsoft proprietary binary document formats used by other versions of Office, caution should be used when attempting to infer conclusions based on the stated market share statistics." <--- 3.

---> 6. (Sun on TAC) "We believe that documents are the intellectual capital of those who create them, whether they be governments, private entities or citizens. In the case of governments this means that the documents you create are your property - nobody else's - and they should therefore be stored in well-designed, truly open, long-lived and sharable data formats which can be retrieved and re-used by future generations. Sun fully shares the European Commission's point of view about the importance of standardization across the spectrum of this technology; this is why we submitted the OpenOffice.org format to OASIS in the first place. Upon consideration, we agree with your recommendation concerning ISO, and I am delighted to inform you that on September 1st we publicly notified the relevant OASIS technical committee of our position. Since then, this idea has been welcomed by the committee and by Best, OASIS' Vice President. Note that since we do not control the committee, or OASIS, or ISO, we cannot promise success, but we do not foresee serious obstacles at the moment and I think we can be optimistic that the OASIS Open Office XML format will become an ISO standard." "I am also very pleased to announce that Sun has developed Open Source WordML and ExcelML filters which will be in the next release of Sun's StarOffice (version 8) and in the next release of OpenOffice.org from the OpenOffice.org open source project. These filters will be available for adaptation and re-use by others in industry." <--- 6.

---> 7. (IBM on TAC) "These principles are particularly relevant for e-government applications with respect to the creation, manipulation and exchange of documents within public administrations, as well as between such administrations and citizens. Public administrations are driven to use a select number of common file formats by their need to share and maintain the fidelity of their documents. It is essential that public sector documents be available in a commonly used open file format so as to avoid use of closed, proprietary formats which result in "vendor lock-in" and the imposition of a single technology choice on citizens, enterprises and other organisations seeking to exchange documents with public administrations. The ongoing work on open file formats in OASIS is an excellent step forward in efforts to develop a file format which meets the requirements outlined above. IBM follows closely the activities of the Open Office XML Format Technical Committee in OASIS and has informed OASIS that we intend to join the relevant technical committee. Indeed, we already offer products (IBM Workplace Client Technology) which conform with the current draft specifications developed within the OASIS TC." <--- 7.

Analysis
