
Mac Forum / Applications / Word / November 2005




removing meta data

G. Michael Paine - 07 Nov 2005 19:01 GMT
I use MSWord X.
Is there a way to completely remove all meta data from a Word doc?

Michael
John McGhie [MVP - Word and Word Macintosh] - 12 Nov 2005 10:59 GMT
Didn't we just answer this?

In Word X, there is no tool or built-in command; you have to do it manually.

The simplest way is to save it to plain text.  Of course, you lose all your
formatting when you do that.

Alternatively, save as HTML (Web Page), then open it as Text.  You can then
see the meta data at the top of the file: select it and delete it.  Save
back as text, and do not re-open in Word, or Word will put it back.
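
For anyone who would rather script that manual clean-up, here is a minimal Python sketch. It assumes the file was written by Word's "Web Page" save and that the metadata sits in `<meta>` tags and in `<!--[if gte mso 9]><xml> ... </xml><![endif]-->` blocks near the top of the file. The function name `strip_word_html_metadata` is made up for illustration, and the patterns are not exhaustive (the `<title>` element and the File-List link are left alone).

```python
import re

# Blocks of the form <!--[if gte mso 9]><xml> ... </xml><![endif]-->,
# which carry the document properties and Word settings.
MSO_XML_BLOCK = re.compile(
    r"<!--\[if gte mso \d+\]><xml>.*?</xml><!\[endif\]-->",
    re.DOTALL | re.IGNORECASE,
)

# <meta ...> tags, except the http-equiv Content-Type declaration,
# which a browser still needs to pick the right character set.
META_TAG = re.compile(r"<meta\b(?![^>]*http-equiv)[^>]*>\s*", re.IGNORECASE)


def strip_word_html_metadata(html: str) -> str:
    """Return the HTML text with Word's metadata blocks and meta tags removed."""
    html = MSO_XML_BLOCK.sub("", html)
    html = META_TAG.sub("", html)
    return html
```

Read the saved file as text, pass it through the function, and write the result back out as text. As with the manual method, do not re-open the result in Word, or Word will put the meta data back.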

Cheers

On 8/11/05 6:01 AM, in article
mipaine-C5DE4D.11012707112005@comcast.dca.giganews.com, "G. Michael Paine"
<mipaine@comcast.net> wrote:

> I use MSWord X.
> Is there a way to completely remove all meta data from a Word doc?
>
> Michael

Signature

Please reply to the newsgroup to maintain the thread.  Please do not email
me unless I ask you to.

John McGhie <john@mcghie.name>
Microsoft MVP, Word and Word for Macintosh.  Consultant Technical Writer
Sydney, Australia +61 4 1209 1410

Joseph Chamberlain, DDS - 15 Nov 2005 11:36 GMT
John:

I thought this thread was interesting enough that I wanted to try what
you suggested and participate in the discussion.

Do all documents produced by Word include accompanying metadata?

I took a document I had created about a meeting I attended and first saved it
as HTML (it actually says "Web Page" in parentheses in the file type next
to HTML). Then I went to File>Open, chose to open a text document and
clicked on the page I had just created. I could not see anything in addition
to what I was seeing before in either the Word doc or the HTML doc.

Have I done something wrong? Where would I check (I know this is probably a
very basic question) in Word to see what metadata my installed version of
Word is set up to embed in my files?

Thank you in advance.

Joseph

---

Dr. Joseph Chamberlain
Oral and Maxillofacial Surgery

----------------------------------------------------------------------------

On 11/12/05 2:59 AM, in article BF9C163F.24118%john@mcghie.name, "John
McGhie [MVP - Word and Word Macintosh]" <john@mcghie.name> wrote:

> Didn't we just answer this?
>
[quoted text clipped - 17 lines]
>>
>> Michael
John McGhie [MVP - Word and Word Macintosh] - 18 Nov 2005 10:46 GMT
Hi Joseph:

This depends a bit on which version of Word you are using, and how you save
it.

If you are using Word 2004, you have two "Privacy" options on the "Security"
tab in Word>Preferences.  If you check both of those, Word will save very
little meta-data.  The original poster was using an old version that does
not have that feature.

If you choose the Save option "Save entire file..." you will write out all
the meta data.  If you choose "Display only..." you will remove most of it.

If I save a simple document, I get something like what you see below... I've
removed some of it :-)  The interesting stuff is after the "<!--[if gte mso
9]><xml>" tag, which switches the code stream into XML and allows Word to
express all sorts of things that HTML cannot describe.

<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta name=Title content="Type Typing">
<meta name=Keywords content="">
<meta http-equiv=Content-Type content="text/html; charset=macintosh">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 11">
<meta name=Originator content="Microsoft Word 11">
<link rel=File-List href="test_files/filelist.xml">
<title>Type Typing</title>
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
 <o:Author>John McGhie</o:Author>
<o:Template>Normal</o:Template>
 <o:LastAuthor>John McGhie</o:LastAuthor>
 <o:Revision>1</o:Revision>
 <o:Created>2005-11-16T09:09:00Z</o:Created>
<o:LastSaved>2005-11-18T10:31:00Z</o:LastSaved>
 <o:Pages>1</o:Pages>
 <o:Company>McGhie Information Pty Ltd</o:Company>
 <o:Lines>1</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:Version>11.512</o:Version>
</o:DocumentProperties>
<o:OfficeDocumentSettings>
 <o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:WordDocument>
 <w:Zoom>BestFit</w:Zoom>
<w:DisplayHorizontalDrawingGridEvery>0</w:DisplayHorizontalDrawingGridEvery>
<w:DisplayVerticalDrawingGridEvery>0</w:DisplayVerticalDrawingGridEvery>
<w:UseMarginsForDrawingGridOrigin/>
</w:WordDocument>
</xml><![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
           {font-family:"Times New Roman";
           panose-1:0 2 2 6 3 5 4 5 2 3;
           mso-font-charset:0;
           mso-generic-font-family:auto;
           mso-font-pitch:variable;
           mso-font-signature:50331648 0 0 0 1 0;}
@font-face
           {font-family:Arial;
           panose-1:0 2 11 6 4 2 2 2 2 2;
           mso-font-charset:0;
           mso-generic-font-family:auto;
           mso-font-pitch:variable;
           mso-font-signature:50331648 0 0 0 1 0;}
@font-face
           {font-family:Verdana;
           panose-1:0 2 11 6 4 3 5 4 4 2;
           mso-font-charset:0;
           mso-generic-font-family:auto;
           mso-font-pitch:variable;
           mso-font-signature:50331648 0 0 0 1 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
           {mso-style-parent:"";
           margin:0cm;
           margin-bottom:.0001pt;
           mso-pagination:widow-orphan;
           font-size:12.0pt;
           font-family:"Times New Roman";
           mso-ansi-language:EN-AU;}
h1
           {mso-style-next:Normal;
           margin-top:12.0pt;
           margin-right:0cm;
           margin-bottom:3.0pt;
           margin-left:0cm;
           mso-pagination:widow-orphan;
           page-break-after:avoid;
           mso-outline-level:1;
           font-size:16.0pt;
           font-family:Arial;
           mso-font-kerning:16.0pt;
           mso-ansi-language:EN-AU;}
h3
           {mso-style-next:Normal;
           margin-top:12.0pt;
           margin-right:0cm;
           margin-bottom:3.0pt;
           margin-left:0cm;
           mso-pagination:widow-orphan;
           page-break-after:avoid;
           mso-outline-level:3;
           font-size:13.0pt;
           font-family:Arial;
           mso-ansi-language:EN-AU;}
a:link, span.MsoHyperlink
           {color:blue;
           text-decoration:underline;
           text-underline:single;}
a:visited, span.MsoHyperlinkFollowed
           {color:purple;
           text-decoration:underline;
           text-underline:single;}
table.MsoNormalTable
           {mso-style-parent:"";
           font-size:10.0pt;
           font-family:"Times New Roman";}
@page Section1
           {size:595.25pt 841.85pt;
           margin:72.0pt 90.0pt 72.0pt 90.0pt;
           mso-header-margin:35.4pt;
           mso-footer-margin:35.4pt;
           mso-paper-source:0;}
div.Section1
           {page:Section1;}
-->
</style>
</head>
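
To check which properties ended up in a file saved like the one above, a short Python sketch along these lines could list them. It assumes the properties sit inside a single `<o:DocumentProperties>` block as simple, un-nested `<o:Name>value</o:Name>` elements, as in the sample. The function name `document_properties` is made up for illustration, and plain pattern matching is used rather than an XML parser because the surrounding file is not well-formed XML.

```python
import re

# The whole <o:DocumentProperties> ... </o:DocumentProperties> block.
DOC_PROPS = re.compile(
    r"<o:DocumentProperties>(.*?)</o:DocumentProperties>", re.DOTALL
)

# One property inside that block, e.g. <o:Author>John McGhie</o:Author>.
PROP = re.compile(r"<o:(\w+)>(.*?)</o:\1>", re.DOTALL)


def document_properties(html: str) -> dict:
    """Return the document properties found in Word-saved HTML as a dict."""
    block = DOC_PROPS.search(html)
    if block is None:
        # No properties block, e.g. the file was saved without it.
        return {}
    return {name: value.strip() for name, value in PROP.findall(block.group(1))}
```

Run against the sample above, this would report entries such as Author, LastAuthor and Company, which are the items most people want gone before sending a file out.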

From a complex document, you will get a lot more than that.  The idea is
that Word can express a document in mark-up language so completely that you
can save out to "Web Page" and have the document appear in a modern browser
just as if it were displayed in Word.  Better: You can open such an "HTML"
file in Word, and it will completely and accurately reconstruct a Word
document for you without losing anything.

Of course, the "format" used is not "HTML" and was never intended to be.
Microsoft Marketing has a bit of a reputation for completely missing the
point of a lot of the valuable high-end features embedded in the software
they sell, and attempting to "dumb-down" both the design and the
description.  This is one of those times...

The coding in use in the first versions of Word was loosely called "XHTML";
strictly, it is HTML extended with embedded XML blocks, which enable an
application to describe things that HTML just can't encode.  Later versions of Word (Word 2002+) use
full-blown XML, which can describe "anything".  In Word X and Word 2002,
while the language supports a full document, the application's output writer
did not, so you did lose stuff.  In Word 2004, you lose very little, and in
Word 2003, almost nothing.

In the next version of Word, Word 12, XML is the native file format, so you
can be guaranteed not to lose "anything".

Hope this helps

On 15/11/05 10:36 PM, in article
BF9F0833.2FF85%drjchamberlain@earthlink.net, "Joseph Chamberlain, DDS"
<drjchamberlain@earthlink.net> wrote:

> John:
>
[quoted text clipped - 48 lines]
>>>
>>> Michael


 


