Fake returns?

 
WilliamWMeyer




(Msg. 1) Posted: Tue Oct 07, 2008 11:18 am
Post subject: Fake returns?
Archived from groups: microsoft>public>word>formatting>longdocs

Hi folks,

I often deal with Word files I didn't create. That is, a human being
other than myself created them (perhaps on a Mac), or a Save As/Export filter
created them: for example, saving as .doc/.rtf out of Adobe Acrobat, or OCR
scanning software that saves as .doc/.rtf.

Because of this I often run into fake returns, which I can manipulate to
some degree, in Find & Replace with ^p and ^13.

However, if I select a group of these paragraphs and apply a Style to them,
their fakeness is revealed and the block of paragraphs is treated as if it
were one paragraph.

I used to see and solve a problem like this, in which a para mark had a ^10
before or after it, but I haven't seen those ^10s for a couple of Word or
operating system versions.

Any ideas?
WilliamW
Klaus Linke




(Msg. 2) Posted: Wed Oct 08, 2008 8:03 pm
Post subject: Re: Fake returns?

Hi William,

Yes, as you say, that's a problem that has been around for a while.
The issues with the non-working ¶ para marks should go away once you save
the file in a native Word format (doc, rtf, docx...).

Or, if that's not a good option, replace ^13 with ^p, and maybe ^10 with ^p
to be on the safe side. I do that routinely at the beginning of macros that
process text files or other non-Word files.
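
As a rough sketch of what such a macro preamble might look like (the macro
name is made up for illustration, and this is not Klaus's actual code), the
two replacements can be run over the whole document with Word's Find object:

```vba
Sub NormalizeParagraphMarks()
    ' Hypothetical helper: replaces raw ^13 and ^10 characters with
    ' real paragraph marks (^p) throughout the active document.
    Dim code As Variant
    For Each code In Array("^13", "^10")
        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = code
            .Replacement.Text = "^p"
            .Forward = True
            .Wrap = wdFindStop          ' one pass, no wrap-around prompt
            .MatchWildcards = False
            .Execute Replace:=wdReplaceAll
        End With
    Next code
End Sub
```

Try it on a copy of the file first: replacing ^10 turns every stray line
feed into a paragraph break, which may not be wanted in every document.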

Regards,
Klaus

WilliamWMeyer




(Msg. 3) Posted: Thu Oct 09, 2008 11:53 am
Post subject: Re: Fake returns?



Hi Klaus,

Thanks for responding. Yes, I remembered after posting that I asked this
before (!), and that the response you gave me then, about changing ^13 to ^p,
does solve the problem. (^13 to ^p does the same thing with or without
*wildcards* checked.)

The main thing that throws me is that I use a wonderful macro I got from a
Microsoft-provided template called Macros8 that was supplied with Word
several versions back. That template has a number of useful macros, but the
one I use most is called ANSIValue, which displays the ANSI values of a
swiped (selected) group of characters.

Therefore, when I swipe these four characters surrounding a para return:

e.¶
G

I get 101 46 13 71

Regardless of whether it's fake paras or real paras I get that 13 -- and
only that 13.
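
For readers without the Macros8 template, a minimal stand-in could look like
the sketch below. It is not the ANSIValue macro itself (its source isn't
shown in this thread); the name ShowSelectionCodes is invented here, and it
reports Unicode code values via AscW rather than ANSI values.

```vba
Sub ShowSelectionCodes()
    ' Hypothetical stand-in for ANSIValue: shows the decimal code of
    ' every character in the current selection, separated by spaces.
    Dim txt As String
    Dim result As String
    Dim i As Long

    txt = Selection.Text
    For i = 1 To Len(txt)
        ' AscW returns a signed Integer; the mask keeps codes in 0..65535.
        result = result & (AscW(Mid$(txt, i, 1)) And &HFFFF&) & " "
    Next i
    MsgBox Trim$(result)
End Sub
```

With the four characters above selected, this should report the same
101 46 13 71, since a working paragraph mark and a "fake" one both come back
as character 13.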

However, in earlier days of Word I would see 10 13 often enough (or 13 10, I
don't remember the order), but I haven't seen a true 10 13 this way in
several years. If you look at the Word file in a text editor there's no sign
of a difference, so I figure the difference must be in the header of the
Word file that specifies that there is one type of para-break encoding, when
in fact the file contains mixed para-break encodings.

Along these same lines of getting under the hood of what's happening in the
Word file, I'd love to have a better understanding of how to determine when
files contain Unicode versus when they don't, whether files sometimes
*think* they contain Unicode but in fact they don't and vice versa, etc.

--WilliamW

Klaus Linke




(Msg. 4) Posted: Thu Oct 09, 2008 6:55 pm
Post subject: Re: Fake returns?

> [...] If you look at the Word file in a text editor there's no sign of a
> difference, so I figure the difference must be in the header of the Word
> file that specifies that there is one type of para-break encoding, when in
> fact the file contains mixed para-break encodings.

Yes, something like that is my guess too. If you could look into the binary
*.doc format (or its equivalent in memory once Word has loaded a doc),
functioning paragraph marks would likely have a pointer associated with them
that points to a data structure with the style and all the paragraph
formatting.
In the problematic cases, that pointer wasn't created. Just speculation,
though.


> Along these same lines of getting under the hood of what's happening in
> the Word file, I'd love to have a better understanding of how to determine
> when files contain Unicode versus when they don't, whether files sometimes
> *think* they contain Unicode but in fact they don't and vice versa, etc.

Interesting questions... One quick way to tell if a file has "Unicode
characters" (precisely, characters that aren't in the old Windows code page
1252) is to try to save as Plain Text (*.txt), choosing the Windows
(Standard) encoding.
If the file contains such characters, the dialog shows a yellow exclamation
mark, and the characters that can't be saved are marked red in the preview
window.
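
The same check can be roughed out in VBA, without the Save As dialog. This is
only a sketch under stated assumptions: the names InCp1252 and
ReportNonCp1252 are invented for illustration, and the undefined code page
1252 slots (0x81, 0x8D, 0x8F, 0x90, 0x9D) are treated as "not in the code
page".

```vba
' Returns True if the Unicode code point has a slot in Windows code page 1252.
Function InCp1252(ByVal code As Long) As Boolean
    Dim extras As Variant, v As Variant
    If code >= 0 And code <= &H7F Then InCp1252 = True: Exit Function
    If code >= &HA0 And code <= &HFF Then InCp1252 = True: Exit Function
    ' The 27 characters that code page 1252 places in 0x80..0x9F.
    extras = Array(&H20AC&, &H201A&, &H192&, &H201E&, &H2026&, &H2020&, _
                   &H2021&, &H2C6&, &H2030&, &H160&, &H2039&, &H152&, _
                   &H17D&, &H2018&, &H2019&, &H201C&, &H201D&, &H2022&, _
                   &H2013&, &H2014&, &H2DC&, &H2122&, &H161&, &H203A&, _
                   &H153&, &H17E&, &H178&)
    For Each v In extras
        If v = code Then InCp1252 = True: Exit Function
    Next v
End Function

Sub ReportNonCp1252()
    ' Counts the characters in the active document that would not
    ' survive a save with the Windows (code page 1252) encoding.
    Dim txt As String, i As Long, code As Long, hits As Long
    txt = ActiveDocument.Content.Text
    For i = 1 To Len(txt)
        code = AscW(Mid$(txt, i, 1)) And &HFFFF&
        If Not InCp1252(code) Then hits = hits + 1
    Next i
    MsgBox hits & " character(s) fall outside code page 1252."
End Sub
```

Note that curly quotes, dashes and the euro sign count as "in" here, because
code page 1252 does contain them even though their Unicode values are above
255.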

Greetings,
Klaus
WilliamWMeyer




(Msg. 5) Posted: Thu Oct 09, 2008 6:55 pm
Post subject: Re: Fake returns?

"Klaus Linke" <info.TakeThisOut@fotosatz-kaufmann.de> wrote in message
news:uc5%23m$iKJHA.3808@TK2MSFTNGP04.phx.gbl...
>> Along these same lines of getting under the hood of what's happening in
>> the Word file, I'd love to have a better understanding of how to
>> determine when files contain Unicode versus when they don't, whether
>> files sometimes *think* they contain Unicode but in fact they don't and
>> vice versa, etc.
>
> Interesting questions... One quick way to tell if a file has "Unicode
> characters" (precisely, characters that aren't in the old Windows code
> page 1252) is to try to save as Plain Text (*.txt), choosing the Windows
> (Standard) encoding.



I know about this, and do it, but I'd like to be able to control these things
without a human being having to look at the file.

I've been able to use VBA to cycle through a file character by character.
Typically, in files that use Unicode chars, the chars used fall within a
100-200 char Unicode range. Once I've identified what that range is, I
change the values my macro searches for to the values in that range. But
going char by char *and* going Unicode value by Unicode value through the
whole 65,536-value range of 16-bit Unicode values would take a verrry long
time. Already, going char by char through the file takes pretty long.
Klaus Linke




(Msg. 6) Posted: Fri Oct 10, 2008 12:53 am
Post subject: Re: Fake returns?

> I know about this, and do it, but I'd like to be able to control these things
> without a human being having to look at the file.

Yes, that was the "low tech" approach <g>


> I've been able to use VBA to cycle through a file character by character.
> Typically, in files that use Unicode chars, the chars used fall within a
> 100-200 char Unicode range. Once I've identified what that range is, I
> change the values my macro searches for to the values in that range. But
> going char by char *and* going Unicode value by Unicode value through the
> whole 65,536-value range of 16-bit Unicode values would take a verrry long
> time. Already, going char by char through the file takes pretty long.

Then maybe I have something a little more high-tech for ya (see code
below)...
If you'd rather put the results in an array and process it, instead of
printing it out at the end of the document, I'm sure you can adapt the code.

Klaus

Sub CodesFast()
    ' Lists every distinct character in the document with its decimal code,
    ' its U+ hex code, the character itself and its number of occurrences,
    ' then appends the list, sorted by code, to the end of the document.
    Dim myString As String, myStringNew As String, myChar As String
    Dim strOutput As String, HexString As String
    Dim myCode As Long, myCharCount As Long

    myString = ActiveDocument.Content.Text
    strOutput = ""
    Do
        ' Take the first remaining character and strip all of its copies
        ' in one pass; the drop in length is its number of occurrences.
        myChar = Left$(myString, 1)
        myStringNew = Replace(myString, myChar, "", 1, Compare:=vbBinaryCompare)
        myCharCount = Len(myString) - Len(myStringNew)
        ' AscW returns a signed Integer; the mask maps it to 0..65535.
        myCode = AscW(myChar) And &HFFFF&
        strOutput = strOutput & myCode & vbTab
        StatusBar = myCode
        ' Pad the hex code to four digits.
        HexString = Hex$(myCode)
        While Len(HexString) < 4
            HexString = "0" & HexString
        Wend
        strOutput = strOutput & "U+" & HexString & vbTab
        ' Skip control characters, which would break the table below.
        If myCode > 31 Then
            strOutput = strOutput & myChar
        End If
        strOutput = strOutput & vbTab & LTrim(Str$(myCharCount))
        strOutput = strOutput & vbCr
        myString = myStringNew
    Loop Until Len(myString) = 0

    ' Add a bookmarked paragraph at the end of the document for the output.
    ActiveDocument.Content.Select
    Selection.Collapse Direction:=wdCollapseEnd
    Selection.Range.InsertParagraphBefore
    Selection.TypeText Text:=" "
    Selection.Expand Unit:=wdParagraph
    With ActiveDocument.Bookmarks
        .Add Range:=Selection.Range, Name:="Codes"
        .DefaultSorting = wdSortByName
        .ShowHidden = False
    End With
    Selection.Collapse Direction:=wdCollapseStart
    Selection.TypeText strOutput

    ' Sort the rows numerically by code via a temporary table.
    Selection.GoTo What:=wdGoToBookmark, Name:="Codes"
    Selection.ConvertToTable Separator:=wdSeparateByTabs
    Selection.Sort ExcludeHeader:=False, FieldNumber:=1, _
        SortFieldType:=wdSortFieldNumeric, _
        SortOrder:=wdSortOrderAscending
    Selection.Rows.ConvertToText Separator:=wdSeparateByTabs
    ActiveDocument.Bookmarks("Codes").Delete
End Sub
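
The hex and bit-mask steps in the macro can be looked at in isolation. The
function below is only an illustrative sketch (the name UPlus is not part of
the macro above); it turns one character into the same "U+" notation the
macro prints.

```vba
Function UPlus(ByVal ch As String) As String
    ' ch is assumed to be non-empty; only its first character is used.
    Dim code As Long
    ' AscW returns a signed 16-bit Integer, so characters from U+8000 up
    ' come back negative; And &HFFFF& folds them into 0..65535.
    code = AscW(ch) And &HFFFF&
    ' Hex$ gives the base-16 digits; Right$ pads them to four places.
    UPlus = "U+" & Right$("000" & Hex$(code), 4)
End Function
```

For example, UPlus(vbCr) gives "U+000D" (decimal 13, the paragraph mark) and
UPlus(ChrW(233)) gives "U+00E9".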
WilliamWMeyer




(Msg. 7) Posted: Fri Oct 10, 2008 6:01 pm
Post subject: Re: Fake returns?

Wow. Thanks, Klaus.

I tried it, and saw the results. The Hex and binary stuff in the code are
beyond my depth right now, but I think I can use this as a jumping off point
for further exploration.

-WilliamW