
Entities in alt and title text

 
Andreas Prilop

External


Since: Jul 04, 2007
Posts: 23



(Msg. 16) Posted: Tue Jul 31, 2007 3:26 pm
Post subject: Re: Entities in alt and title text
Archived from groups: comp.infosystems.www.authoring.html (more info?)

On Tue, 31 Jul 2007, The Bicycling Guitarist wrote:

> X-Newsreader: Microsoft Outlook Express 6.00.2900.3138
>
> I am replacing straight vertical apostrophes with ’ throughout
> my web site.

Fine!
Btw: You can do the same in your newsreader^W Outlook Express
with the settings

Tools > Options > Send
Mail Sending Format > Plain Text Settings > Message format MIME
News Sending Format > Plain Text Settings > Message format MIME
Encode text using: None

ASCII apostrophe (') and curly apostrophe (’).

> Should I do this in alt text

Yes.

> and title attributes

May be dangerous because the TITLE attribute is often shown
in a font of restricted character set.
http://www.cs.tut.fi/~jkorpela/html/alt.html#tech
reports problems with obsolete Windows versions.

> What about in page titles?

OK.

--
In memoriam Alan J. Flavell
http://groups.google.com/groups/search?q=author:Alan.J.Flavell
The Bicycling Guitarist

External


Since: Nov 03, 2004
Posts: 121



(Msg. 17) Posted: Tue Jul 31, 2007 3:26 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

"Andreas Prilop" <Prilop2007.DeleteThis@trashmail.net> wrote in message
news:Pine.GSO.4.63.0707311514040.18108@s5b004.rrzn.uni-hannover.de...
> On Tue, 31 Jul 2007, The Bicycling Guitarist wrote:
> ASCII apostrophe (') and curly apostrophe (’).
>
>> Should I do this in alt text
>
> Yes.
>
>> and title attributes
>
> May be dangerous because the TITLE attribute is often shown
> in a font of restricted character set.

Thank you for the specific information. I never would have guessed that
character entities are okay in alt text but maybe not okay in title
attribute, and that would be a tricky thing to find by search if you didn't
already know.
Harlan Messinger

External


Since: Apr 25, 2004
Posts: 1190



(Msg. 18) Posted: Tue Jul 31, 2007 3:27 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

Andy Dingley wrote:
> On 31 Jul, 16:47, Harlan Messinger <hmessinger.removet....TakeThisOut@comcast.net>
> wrote:
>
>>> Of course they can happen with &#8217;, just pass it through an XML
>>> tool that works on the entirely correct and specification-conformant
>>> basis that &#8217; can be converted transparently to and from the
>>> literal character "’" at the serializer's whim. Then you've entered
>>> the domain where incorrect encodings will break the content.
>> *Anything* that works fine to begin with won't work if you first pass it
>> through something that breaks it.
>
> XML doesn't "break" it. It does something entirely legal.

If the original works fine in HTML browsers, and then the XML processor
that's *supposed* to keep it functional actually turns it into something
that doesn't work as intended, then the processor has broken it, by
definition of the word "break". What XML, in and of itself, can legally
do is irrelevant.

> The risk here is that the world, and certainly not the web world,
> isn't simple. Even if the OP thinks they're using a simple process,
> how simple is it really? What happens when they post that code into a
> blog engine? Through something that's collected by RSS and re-
> distributed? Now there's an XML-based process that certainly does
> hammer on numeric entities.

Why should he care if downstream purloiners of his material don't have
proper processing on *their* systems?
Andreas Prilop

External


Since: Jul 04, 2007
Posts: 23



(Msg. 19) Posted: Tue Jul 31, 2007 3:28 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On Tue, 31 Jul 2007, The Bicycling Guitarist wrote:

> Won't those characters in that order be recognized as
> a character entity and be rendered as a curly apostrophe?

Yes.
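
A minimal sketch of that answer, assuming Python's standard html.parser behaves like a browser's HTML parser for this purpose. The markup, file name, and wording are made up for illustration. It shows the numeric character reference in alt and title being resolved to U+2019 during parsing.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collects the attributes of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as (name, value) pairs with references already resolved
        self.attributes.update(dict(attrs))


# Hypothetical markup; the file name and wording are invented for this example.
markup = '<img src="bike.png" alt="Chris&#8217;s bike" title="Chris&#8217;s bike">'

collector = AttributeCollector()
collector.feed(markup)
collector.close()

alt_value = collector.attributes["alt"]
title_value = collector.attributes["title"]

print(ascii(alt_value))          # shows the curly apostrophe as an escape
print(alt_value == title_value)  # both attributes decode the same way
```

Parsing is only half the story: whether the glyph then shows up in a tooltip depends on the font the browser uses for TITLE, which is the risk raised earlier in the thread.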

> What about in meta tags such as description?
> Do I dare replace the straight apostrophes in the
> description with the character entity for a curly apostrophe?

Yes - but <meta description> is mostly useless;
<meta keywords> is completely useless.

--
In memoriam Alan J. Flavell
http://groups.google.com/groups/search?q=author:Alan.J.Flavell
Andreas Prilop

External


Since: Jul 04, 2007
Posts: 23



(Msg. 20) Posted: Tue Jul 31, 2007 3:53 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On Tue, 31 Jul 2007, Andy Dingley wrote:

> For "modern" web browsers, Unicode will be supported and this will
> work.

Your definition of "modern" thus means Netscape 4 and above.

> http://www.badscience.net/?p=398

Good example! Such things can happen only with UTF-8-encoded
characters but never with character references like &#8217; .
Your example shows that &#8217; is indeed safer.
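
The difference can be sketched in a few lines of Python. This is an illustration only: the sample word is made up, and the codec names are Python's. A literal U+2019 stored as UTF-8 turns into three wrong characters when the bytes are read as Windows-1252, while the character reference is plain ASCII and decodes identically under every encoding tried here.

```python
# Literal curly apostrophe, stored as UTF-8
text = "I\u2019m"
utf8_bytes = text.encode("utf-8")      # b'I\xe2\x80\x99m'

# The same bytes read with the wrong label
misread = utf8_bytes.decode("cp1252")

# The character reference uses ASCII characters only
reference_bytes = "I&#8217;m".encode("ascii")
decodings = {
    encoding: reference_bytes.decode(encoding)
    for encoding in ("ascii", "utf-8", "cp1252", "iso-8859-1")
}

print(ascii(misread))
print(decodings)
```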

--
In memoriam Alan J. Flavell
http://groups.google.com/groups/search?q=author:Alan.J.Flavell
David Trimboli

External


Since: Apr 06, 2006
Posts: 14



(Msg. 21) Posted: Tue Jul 31, 2007 4:12 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

The Bicycling Guitarist <Chris.RemoveThis@TheBicyclingGuitarist.net> wrote:
> Well there is an entity called apostrophe that I will use there, and
> ’ will only be used for right single quote. I want to be
> technically correct no matter how it looks.
>
> example from my web site:
>
> “How could you ‘bust’ me, I'm
> irresistible!”

One day I realized this was a big waste of time. Why bother with
entities when you can simply use an encoding that supports those
characters? Use UTF-8 and you can use those characters directly.

“How could you ‘bust’ me; I’m irresistible!”

David
Stardate 7580.6
David Trimboli

External


Since: Apr 06, 2006
Posts: 14



(Msg. 22) Posted: Tue Jul 31, 2007 4:39 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

Andy Dingley <dingbat.DeleteThis@codesmiths.com> wrote:
> * An apostrophe is not the same thing as a right single quote.

According to the Unicode chart
(http://www.unicode.org/charts/PDF/U2000.pdf), U+2019 is not only the
right single quotation mark, it is also the “preferred character to use
for apostrophe.”

David
Stardate 7580.7
David Trimboli

External


Since: Apr 06, 2006
Posts: 14



(Msg. 23) Posted: Tue Jul 31, 2007 4:46 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

The Bicycling Guitarist <Chris.RemoveThis@TheBicyclingGuitarist.net> wrote:
> Dang. I found more than one reputable-looking resource that said
> ’ did double duty as right single quote AND as apostrophe. Some
> of the regulars here also seem to think it's okay to use the right
> single quote as an apostrophe. Obviously there is disagreement. Is
> there an "official" position on this matter?

Unicode (http://www.unicode.org/charts/PDF/U2000.pdf), U+2019 (’)
“is the preferred character to use for apostrophe.” However, the
character’s name is “right single quotation mark.”

So yes, that entity works for both purposes.
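
The naming can be checked against the Unicode data that ships with Python. A small sketch, assuming the standard unicodedata and html modules; the variable names are only for illustration:

```python
import html
import unicodedata

straight = "'"       # U+0027
curly = "\u2019"     # U+2019

# Official Unicode character names
straight_name = unicodedata.name(straight)
curly_name = unicodedata.name(curly)

# The named and numeric references resolve to the same character
from_named = html.unescape("&rsquo;")
from_numeric = html.unescape("&#8217;")

print(straight_name)
print(curly_name)
print(from_named == from_numeric == curly)
```

The character name only records what the code point is called. The "preferred character for apostrophe" remark is an annotation in the code chart and is not part of the data queried here.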

David
Stardate 7580.7
Andreas Prilop

External


Since: Jul 04, 2007
Posts: 23



(Msg. 24) Posted: Tue Jul 31, 2007 6:10 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On Tue, 31 Jul 2007, Andy Dingley wrote:

> ’
>
> My point isn't about encodings, it's about the fact that
> this codepoint is from a relatively obscure part of Unicode that
> _requires_ Unicode support, rather than a simpler solution needing no
> more than ASCII characters.

This character (’) is included in Windows code page 1252
"West European"
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
and in the MacRoman character set.
http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ROMAN.TXT
It is *not* from a "relatively obscure part of Unicode".

By your arguments, we should not even write in German with
German special letters (ä ö ü).
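
The two mapping tables cited above can be spot-checked with Python's built-in codecs. This sketch assumes Python's "cp1252" and "mac_roman" codecs follow those tables:

```python
curly = "\u2019"

# U+2019 has a single-byte slot in both vendor code pages
cp1252_bytes = curly.encode("cp1252")
mac_roman_bytes = curly.encode("mac_roman")

# The German letters a-umlaut, o-umlaut, u-umlaut in Windows-1252
german_bytes = "\u00e4\u00f6\u00fc".encode("cp1252")

# ISO-8859-1 has no slot for the curly apostrophe
try:
    curly.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

print(cp1252_bytes, mac_roman_bytes, german_bytes, fits_latin1)
```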
Helmut Richter

External


Since: Jun 12, 2007
Posts: 18



(Msg. 25) Posted: Tue Jul 31, 2007 8:03 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On Tue, 31 Jul 2007, Andy Dingley wrote:

> The character might be (as 0x92), but the codepoint (0x2019 / ’)
> certainly isn't.

.... and the connexion between the two is found in
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT

--
Helmut Richter
Andy Dingley

External


Since: Feb 14, 2004
Posts: 1110



(Msg. 26) Posted: Tue Jul 31, 2007 8:03 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On Tue, 31 Jul 2007 20:03:31 +0200, Helmut Richter <hhr-m DeleteThis @web.de> wrote:

>On Tue, 31 Jul 2007, Andy Dingley wrote:
>
>> The character might be (as 0x92), but the codepoint (0x2019 / ’)
>> certainly isn't.
>
>... and the connexion between the two is found in
>http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT

So you claim that serving 0x2019 in a Windows-1252 encoded document will
deliver a right single quote?
Helmut Richter

External


Since: Jun 12, 2007
Posts: 18



(Msg. 27) Posted: Tue Jul 31, 2007 9:32 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On Tue, 31 Jul 2007, Andy Dingley wrote:

> So you claim that serving 0x2019 in a Windows-1252 encoded document will
> deliver a right single quote?

No. I just wanted to make clear that the two characters are not similar by
coincidence but they are the same by definition.

--
Helmut Richter
Andy Dingley

External


Since: Feb 14, 2004
Posts: 1110



(Msg. 28) Posted: Tue Jul 31, 2007 9:32 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On Tue, 31 Jul 2007 21:32:19 +0200, Helmut Richter <hhr-m.RemoveThis@web.de> wrote:

>No. I just wanted to make clear that the two characters are not similar by
>coincidence but they are the same by definition.

Agreed, but that's by their definition as _characters_, not codepoints.

Unfortunately we're working with codepoints here (as for all HTML
documents), not characters -- you can't represent the same codepoint as
’ in a Windows-1252 document, no matter how you encode it.
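
One way to test the claim is to build two Windows-1252 documents, one carrying the raw byte 0x92 and one carrying the ASCII-only reference, and compare what a consumer ends up with. This is a rough sketch: Python's cp1252 codec and html.unescape stand in for a browser's decoding and parsing steps, and the sample sentence is made up.

```python
import html

# Two hypothetical Windows-1252 documents
literal_document = b"I\x92m irresistible"        # raw byte 0x92
reference_document = b"I&#8217;m irresistible"   # ASCII-only reference

# Step 1: bytes to characters, using the declared encoding
# Step 2: resolve character references
literal_text = html.unescape(literal_document.decode("cp1252"))
reference_text = html.unescape(reference_document.decode("cp1252"))

print(ascii(literal_text))
print(ascii(reference_text))
print(literal_text == reference_text)
```

In this model a character reference names a Unicode code point regardless of the document's encoding, so both documents yield U+2019.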
Stan Brown

External


Since: Jul 13, 2004
Posts: 1233



(Msg. 29) Posted: Tue Jul 31, 2007 10:34 pm
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

Tue, 31 Jul 2007 16:12:12 -0400 from David Trimboli
<david.DeleteThis@trimboli.name>:
> One day I realized this was a big waste of time. Why bother with
> entities when you can simply use an encoding that supports those
> characters? Use UTF-8 and you can use those characters directly.
>
> â??How could you â??bustâ?? me; Iâ??m irresistible!â?

Res ipsa loquitur.

--
Stan Brown, Oak Road Systems, Tompkins County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2.1 spec: http://www.w3.org/TR/CSS21/
validator: http://jigsaw.w3.org/css-validator/
Why We Won't Help You:
http://diveintomark.org/archives/2003/05/05/why_we_wont_help_you
Andy Dingley

External


Since: Jun 01, 2007
Posts: 134



(Msg. 30) Posted: Wed Aug 01, 2007 2:47 am
Post subject: Re: Entities in alt and title text
Archived from groups: per prev. post (more info?)

On 31 Jul, 20:27, Harlan Messinger <hmessinger.removet... DeleteThis @comcast.net>
wrote:

> >>> Of course they can happen with &#8217;, just pass it through an XML
> >>> tool that works on the entirely correct and specification-conformant
> >>> basis that &#8217; can be converted transparently to and from the
> >>> literal character "’" at the serializer's whim. Then you've entered
> >>> the domain where incorrect encodings will break the content.
> >> *Anything* that works fine to begin with won't work if you first pass it
> >> through something that breaks it.
>
> > XML doesn't "break" it. It does something entirely legal.
>
> If the original works fine in HTML browsers, and then the XML processor
> that's *supposed* to keep it functional actually turns it into something
> that doesn't work as intended, then the processor has broken it, by
> definition of the word "break".

That's wrong in two ways:

Firstly, look at systems theory for your definition of "break". If you
expect to build a big system and have it work overall, you have to be
careful with this stuff. The function of XML tools is carefully
defined within their own scope. Converting the numeric character
reference &#8217; to its literal character is permitted, if the output
encoding can cope.

In a _system_ analysis, we may well have a working system that takes
ASCII characters in, processes them with an XML tool to UTF-8 output,
then serves them labelled as ISO-8859-*. Everything works fine,
because the scope of the system's data doesn't go outside the
partition of its correctly working domain. It's not even _wrong_ to do
this -- it's not "serving mislabelled UTF-8", it's a system that
outputs ISO-8859-* by design (using a subset of UTF-8 internally to do
so). For a restricted input scope, this is entirely correct.

Now feed it &#8217;. It breaks because the (correct) behaviour of the
XML processor has caused something to break further down the line. An
"unexpected consequence" if you like. Now if you're claiming that this
is the _XML_processor_ that has broken it, how are you planning to fix
it? By changing the behaviour of the XML processor, such that it no
longer behaves in the way described by the standard? Now hands up all
those who've seen such a thing coded into a live system! How many
times have you seen an XML serialiser coded up locally on top of an
existing major XML processor like Xalan / Xerces, "because the
standard one doesn't work correctly" ?! This isn't a fault in the
original XML processor (the big name ones are pretty robust and well-
tested pieces of code!), it's a fault in the local designer's(sic)
understanding of how XML and encodings work. Bolting on kludges
"because Xerces is broken and we know better" is farcically wrong-
headed and it's most unlikely to end with a system that works
correctly afterwards.

Secondly, it's not the XML processor itself that has broken it. If
anything, it's the over-simple binding of "all XML content" to an
ISO-8859-* content-type HTTP header. That's caused by an over-
simplistic .htaccess or similar, not the XML processor.
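
The failure mode described in this post can be sketched with Python's xml.etree.ElementTree standing in for the XML processor. That is an assumption: the post names Xalan / Xerces, which are not used here, and the sample sentence is made up. The input is pure ASCII, the serializer legally emits the literal character as UTF-8, and the damage appears only when those bytes are read under an ISO-8859-1 label.

```python
import xml.etree.ElementTree as ET

# ASCII-only input containing the numeric character reference
source = "<p>I&#8217;m irresistible</p>"
element = ET.fromstring(source)

# Legal behaviour: the reference becomes the literal character,
# written out as three UTF-8 bytes
utf8_output = ET.tostring(element, encoding="utf-8")

# Downstream step that labels every response as ISO-8859-1
mislabelled = utf8_output.decode("iso-8859-1")

# Serializing to ASCII forces the processor to emit a reference again
ascii_output = ET.tostring(element, encoding="us-ascii")

print(utf8_output)
print(ascii(mislabelled))
print(ascii_output)
```

In this sketch, correcting the label or choosing an output encoding that cannot carry the literal both avoid the breakage without touching the processor itself.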