Sanskrit: Devanagari UTF-8 browser tests
Jun. 4th, 2004 04:26 pm
Edited at 20:33 EDT to add that of course
Any Linux users out there?
So anyway, I already knew that most browsers are pretty lame when it comes to rendering the more obscure Unicode characters.
One of our projects is all about Sanskrit, which is written in a script called Devanagari.
This is represented in Unicode by the U+0900 to U+097F block (aka the 0xE0A480 to 0xE0A5BF range in UTF-8 hex).
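For anyone who wants to double-check the code point to byte mapping, here is a minimal Python sketch. It is not part of the original tests, and the helper name `utf8_hex` is made up for illustration. It prints the UTF-8 bytes for U+0900, U+0970, and U+097F (the last code point of the Devanagari block).

```python
def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 bytes of one code point as an uppercase hex string."""
    return chr(code_point).encode("utf-8").hex().upper()


# First code point of the block, U+0970, and the last code point of the block.
CODE_POINTS = (0x0900, 0x0970, 0x097F)

for cp in CODE_POINTS:
    print(f"U+{cp:04X} -> 0x{utf8_hex(cp)}")
```

Every code point in this range takes three bytes in UTF-8, which is why the hex values all start with E0.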
So I decided to put the "lame browsers" theory to the test.
I tested five Mac browsers (Firefox, Camino, Opera, I.E., and Safari) for problems rendering UTF-8-encoded Devanagari.
I took the word "evaM" (which means 'thus', I think) as the test. It ought to look like this in a browser that understands UTF-8:

The dot in red there is the 'M', which gets combined over the 'va' sound. If I put a <span style="color: #FF0000;"> around it to highlight it (such as when a student leaves it out, so we can show them the error), some browsers freak out even more than they already do just trying to handle the UTF-8.
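To make the markup problem concrete, here is a rough Python sketch of how such a test string could be built. The code points are an assumption based on the usual spelling of "evaM" (e + va + anusvara), and `highlight_mark` is a hypothetical helper, not something taken from the actual test files.

```python
# Assumed code points for "evaM".
E = "\u090F"         # DEVANAGARI LETTER E
VA = "\u0935"        # DEVANAGARI LETTER VA
ANUSVARA = "\u0902"  # DEVANAGARI SIGN ANUSVARA, the combining dot ("M")


def highlight_mark(base: str, mark: str, color: str = "#FF0000") -> str:
    """Wrap only the combining mark in a colored span; the base stays outside."""
    return f'{base}<span style="color: {color};">{mark}</span>'


word = E + VA + ANUSVARA                   # the plain word
markup = E + highlight_mark(VA, ANUSVARA)  # same word, dot highlighted

print(word)
print(markup)
```

The span boundary lands between the base letter and its combining mark, so the browser has to position the dot over a letter that sits in a different element. That is one likely reason the highlighted version renders worse than the plain one.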
Also, some browsers don't recognize or deal with XML files (esp. ones called something.xml), so that is also reflected in some of the results.
Clearly on a Mac, Opera rules for this sort of thing: it's the only browser that gets two of the three test files right.
Test Results:
P.S. -- Some of the 'dot to side [right]' or 'dot on left' results may actually vary with the text zoom in some of these browsers, i.e. the dot moves according to how large you make the text (in Safari at least). No time to nail that down definitively right now.
http://carole.fates.org/Sanskrit/Test_Files/utf8-Test.xml
http://carole.fates.org/Sanskrit/Test_Files/utf8-Test.html
http://carole.fates.org/Sanskrit/Test_Files/utf8-Test-NoMargin.html
Note I made the font size for these tests 5.1em, or about five times bigger than the default font size in your browser. This is so that the "dot-to-the-side" (left or right) mistake gets minimized in most browsers, and to make the character more visible. However, if you make the font smaller in your browser (View->Text Zoom->minus or smaller type thing), you'll see how lame it gets in a lot of browsers.
| File | Platform | Browser | Version | Behavior |
|---|---|---|---|---|
| utf8-Test.xml | Panther | Firefox | 0.8 | Dot to the side, not red |
| utf8-Test.xml | Panther | Camino | 0.8b | Dot to the side, not red |
| utf8-Test.xml | Panther | Opera | 7.51 | Dot red, on top correctly |
| utf8-Test.xml | Panther | I.E. | 5.2 | Garbage (doesn't recognize UTF-8) |
| utf8-Test.xml | Panther | Safari | 1.2.2 | Garbage (doesn't recognize UTF-8) |
| utf8-Test.html | Panther | Firefox | 0.8 | Dot red, on top correctly |
| utf8-Test.html | Panther | Camino | 0.8b | Dot red, on top correctly |
| utf8-Test.html | Panther | Opera | 7.51 | Dot red, 'e' and 'va' not visible |
| utf8-Test.html | Panther | I.E. | 5.2 | Garbage (doesn't recognize UTF-8) |
| utf8-Test.html | Panther | Safari | 1.2.2 | Dot red, but on LEFT (?!) |
| utf8-Test-NoMargin.html | Panther | Firefox | 0.8 | Dot to the side, but IS red |
| utf8-Test-NoMargin.html | Panther | Camino | 0.8b | Dot to the side, but IS red |
| utf8-Test-NoMargin.html | Panther | Opera | 7.51 | Dot red, on top correctly |
| utf8-Test-NoMargin.html | Panther | I.E. | 5.2 | Garbage (doesn't recognize UTF-8) |
| utf8-Test-NoMargin.html | Panther | Safari | 1.2.2 | Dot to the side, but IS red |
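The 'Garbage' rows are what you would expect if a browser reads the UTF-8 bytes as a single-byte legacy encoding. Here is a rough Python sketch of that failure mode. Assumptions: MacRoman is only a plausible stand-in (which fallback encoding I.E. 5.2 or Safari actually used is not known), and `misdecode` is a made-up helper name.

```python
def misdecode(text: str, wrong_codec: str = "mac_roman") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


word = "\u090F\u0935\u0902"  # "evaM": e + va + anusvara (assumed spelling)
raw = word.encode("utf-8")   # three bytes per character
garbled = misdecode(word)    # one bogus character per byte

print(len(word), len(raw), len(garbled))
print(garbled)
```

Three Devanagari characters turn into nine unrelated symbols, which matches the general look of a page whose encoding wasn't recognized.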
Date: 2004-06-04 11:56 pm (UTC)
Actually, maybe I'll link to this post from there, too.