[JDEV] International Char Sets..
I think these questions from the expat FAQ may add some useful information to this conversation.
---
How can I get expat to deal with non-ASCII characters?
By default, expat assumes that documents are encoded in UTF-8. In UTF-8, ASCII characters are represented by a single byte, exactly as they would be in ASCII, but non-ASCII characters are represented by a sequence of two or more bytes, all with the 8th bit set. The encoding most widely used for European languages is ISO 8859-1, which is not compatible with UTF-8. To use this encoding, expat must be told about it, either by passing "iso-8859-1" as the encoding argument to XML_ParserCreate or by starting the document with <?xml version="1.0" encoding="iso-8859-1"?>.
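To make the incompatibility concrete, here is a rough sketch of what re-encoding ISO 8859-1 as UTF-8 involves. The function name latin1_to_utf8 is made up for illustration and is not part of expat; it only shows that every byte above 0x7F turns into a two-byte sequence with the 8th bit set on both bytes, which is why a parser expecting UTF-8 rejects raw ISO 8859-1 input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of expat: re-encode ISO 8859-1 bytes
 * as UTF-8. The caller must provide an output buffer of at least
 * 2 * len bytes. Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII range: the same single byte in both encodings. */
            out[n++] = c;
        } else {
            /* 0x80..0xFF: two bytes in UTF-8, both with the 8th bit set. */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));   /* lead byte */
            out[n++] = (unsigned char)(0x80 | (c & 0x3F)); /* continuation byte */
        }
    }
    return n;
}
```

For example, the single ISO 8859-1 byte 0xE9 (e with acute accent) becomes the two UTF-8 bytes 0xC3 0xA9. Fed to expat unconverted and without an encoding declaration, that lone 0xE9 would be treated as the start of a multi-byte sequence and the document would be reported as malformed.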
What encodings does expat support?
expat has built-in support for the following encodings:
utf-8
utf-16
iso-8859-1
us-ascii
Additional encodings can be supported by using XML_SetUnknownEncodingHandler.
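As a sketch of what such a handler has to provide: as far as I know, the handler registered with XML_SetUnknownEncodingHandler receives the encoding name and an XML_Encoding structure whose 256-entry int map gives the Unicode code point for each input byte. The helper below builds that kind of table for ISO 8859-15, chosen here only as an example of an 8-bit encoding expat does not have built in. The name fill_latin9_map is an assumption for illustration, and the code deliberately does not include expat.h so it stands on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of expat: fill a 256-entry table that
 * maps each ISO 8859-15 byte to its Unicode code point. The layout
 * matches the int map[256] member of expat's XML_Encoding structure. */
void fill_latin9_map(int map[256])
{
    /* The eight positions where ISO 8859-15 differs from ISO 8859-1. */
    static const struct { unsigned char byte; int code_point; } diffs[] = {
        { 0xA4, 0x20AC }, /* euro sign */
        { 0xA6, 0x0160 }, /* S with caron */
        { 0xA8, 0x0161 }, /* s with caron */
        { 0xB4, 0x017D }, /* Z with caron */
        { 0xB8, 0x017E }, /* z with caron */
        { 0xBC, 0x0152 }, /* OE ligature */
        { 0xBD, 0x0153 }, /* oe ligature */
        { 0xBE, 0x0178 }  /* Y with diaeresis */
    };

    assert(map != NULL);

    /* Start from ISO 8859-1, where byte value equals code point. */
    for (int i = 0; i < 256; i++)
        map[i] = i;

    /* Patch in the ISO 8859-15 differences. */
    for (size_t k = 0; k < sizeof diffs / sizeof diffs[0]; k++)
        map[diffs[k].byte] = diffs[k].code_point;
}
```

Inside a real handler one would presumably check that the requested name is "iso-8859-15", call fill_latin9_map(info->map), set the data, convert and release members to NULL (they are only needed for multi-byte encodings), and return a nonzero status so expat knows the encoding is now handled. Check expat.h for the exact handler signature before relying on this.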
---
Thomas Charron