How to parse XML when the encoding is not UTF-8?


Post by jayyh » Sun Aug 16, 2009 8:20 am

It seems that SAX can only parse XML whose encoding is UTF-8. When I try to parse a non-UTF-8 XML file, the following exception is thrown:


08-14 04:36:37.756: WARN/System.err(792): org.apache.harmony.xml.ExpatParser$ParseException: At line 5, column 7: not well-formed (invalid token)

Does anyone know how to resolve this problem?

Any suggestions would be highly appreciated. Thank you.

Post by padde » Sun Aug 16, 2009 11:21 am

Every XML file starts with a line similar to this:

    <?xml version="1.0" encoding="utf-8"?>

If you set the correct encoding, Android should parse it without problems.

Post by jayyh » Sun Aug 16, 2009 4:36 pm


Hi padde, thanks for the reply.
What I mean is that I have XML files whose encoding is not UTF-8, such as ISO, GBK, EUC, etc. I can't change the encoding of those files to UTF-8. Is there any way to make the SAX parser handle them?

Thank you.

Post by padde » Sun Aug 16, 2009 6:06 pm

The encoding has to match the content:

    <?xml version="1.0" encoding="ISO-8859-1"?>

That is the declaration for ISO-8859-1 content, and likewise for other encodings. This should work... or is this what you tried?
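
To illustrate the point about matching declaration and content, here is a minimal sketch in which both the declaration and the actual bytes are ISO-8859-1. It uses the standard javax.xml.parsers SAX API on a desktop JVM. The class name DeclaredEncodingDemo, the helper parseText, and the sample document are made up for illustration, and the Expat-based parser on Android may not behave identically.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original thread.
public class DeclaredEncodingDemo {

    // Parses the XML bytes and returns all character data found in the document.
    static String parseText(byte[] xmlBytes) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            // Given a raw byte stream, the parser reads the charset from the XML declaration.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xmlBytes), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Sample document (an assumption): declared as ISO-8859-1 and encoded as ISO-8859-1.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        byte[] latin1Bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseText(latin1Bytes));
    }
}
```

The important detail is that the bytes are produced with the same charset the declaration names. If the two disagree, the parser decodes the bytes with the wrong charset.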

Post by tiger79 » Wed Feb 24, 2010 11:28 am

Actually, this doesn't work for me :(

I also have an ISO-encoded XML stream, which I retrieve with an HttpPost request. When I feed it to the SAXParser, I get several java.lang.RuntimeException: org.apache.harmony.xml.ExpatParser$ParseException not well-formed (invalid token) exceptions...
Even when the header is changed to utf-8, it still throws errors...
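
One possible reason why changing only the header to utf-8 does not help: the header is just a label, and the bytes of the stream stay in their original encoding. The sketch below is an illustration with made-up sample text and a hypothetical helper isValidUtf8, not a diagnosis of this particular feed. It checks whether a byte sequence is valid UTF-8 using the standard java.nio.charset decoder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
public class RelabelDemo {

    // Returns true only if the bytes can be decoded as strict UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "<name>caf\u00e9</name>";
        // In ISO-8859-1 the accented letter is the single byte 0xE9, which is not valid UTF-8 here.
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.ISO_8859_1))); // prints false
        // The same text encoded as UTF-8 decodes without errors.
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8)));      // prints true
    }
}
```

A parser that trusts a utf-8 header will hit the same invalid byte sequence, so either the content has to be re-encoded or the parser has to be told the real encoding.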

Post by pabmadi » Mon Apr 26, 2010 9:39 pm


I use this:

    InputSource is = new InputSource(url.openStream());
    is.setEncoding("ISO-8859-1");
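
For context, a fuller sketch of the same idea might look like the following. It assumes the standard javax.xml.parsers SAX API on a desktop JVM. The class name InputSourceEncodingDemo, the helper parseWithEncoding, and the in-memory sample document (standing in for url.openStream()) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original thread.
public class InputSourceEncodingDemo {

    // Parses the stream using the given charset name and returns the document's character data.
    static String parseWithEncoding(InputStream in, String encoding) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        InputSource source = new InputSource(in);
        // Tell the parser which charset the bytes use instead of letting it guess.
        source.setEncoding(encoding);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Sample document (an assumption): ISO-8859-1 bytes with no XML declaration at all.
        byte[] latin1Bytes = "<name>caf\u00e9</name>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseWithEncoding(new ByteArrayInputStream(latin1Bytes), "ISO-8859-1"));
    }
}
```

Without the setEncoding call, a document like this one would be read as UTF-8 by default and the accented character would be rejected.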


Bye.-

Hi jayyh

Post by zOro » Tue Apr 27, 2010 6:31 pm

I have the same problem. Did you find a solution?

Post by rcheng4g » Sat May 01, 2010 3:18 pm



So do I.

Post by nookie1988 » Wed Sep 01, 2010 6:56 pm

Hi guys!
Has anyone resolved this problem yet? I want to write my own news reader with http://java.sun.com/developer/technicalArticles/javaserverpages/rss_utilities/ but I get the same exception as you. Can anyone help me?

My code:

    try {
        URL url = new URL("http://rss.cnn.com/rss/cnn_world.rss");
        this.parser = RssParserFactory.createDefault();
        Rss rss = parser.parse(url);
        Collection items = rss.getChannel().getItems();
        if (items != null && !items.isEmpty()) {
            // Show each item's title in the widget for five seconds.
            for (Iterator i = items.iterator(); i.hasNext();) {
                Item item = (Item) i.next();
                remoteViews.setTextViewText(R.id.widget_textview, item.getTitle().toString());
                this.manager.updateAppWidget(thisWidget, remoteViews);
                Thread.sleep(5000);
            }
        }
    } catch (Exception e) {
        // On any failure, show the exception message in the widget instead.
        remoteViews.setTextViewText(R.id.widget_textview, e.getMessage());
        this.manager.updateAppWidget(thisWidget, remoteViews);
    }


It is the code of a widget I want to write.

Post by zarahjutz » Wed Sep 08, 2010 6:23 am

I have tried it with the encoding set to ISO_8859_1:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.MalformedURLException;
    import java.net.URL;

    import org.xml.sax.SAXException;

    import android.sax.Element;
    import android.sax.RootElement;
    import android.util.Xml;

    public ParseDemoXML(String feedSource) {
        try {
            this.feedUrl = new URL(feedSource);
        } catch (MalformedURLException e) {
            e.printStackTrace();
        }
    }

    public MyObject parse() {
        final MyObject xmlContents = new MyObject();

        RootElement root = new RootElement(DEMO_ROOT);
        Element child = root.getChild(DEMO_CHILD);
        // do the same for all children, set listeners to fill in the contents of your object

        try {
            // Pass the encoding explicitly instead of relying on detection.
            Xml.parse(this.getInputStream(), Xml.Encoding.ISO_8859_1, root.getContentHandler());
        } catch (IOException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        }

        return xmlContents;
    }

    protected InputStream getInputStream() {
        try {
            return feedUrl.openConnection().getInputStream();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
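
The Xml.Encoding enum only covers US_ASCII, UTF_8, UTF_16 and ISO_8859_1, so it does not help with the GBK or EUC files mentioned earlier in the thread. One option that might work for those is to decode the bytes yourself with an InputStreamReader and hand the parser a character stream. The sketch below assumes the standard javax.xml.parsers SAX API on a desktop JVM. The class name ReaderParseDemo, the helper parseWithCharset, and the sample documents are made up for illustration, and whether a given charset such as GBK is available depends on the runtime.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original thread.
public class ReaderParseDemo {

    // Decodes the bytes with the given charset, then lets the SAX parser read characters.
    static String parseWithCharset(InputStream in, Charset charset) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        Reader reader = new InputStreamReader(in, charset);
        try {
            // With a character stream the parser no longer has to decode bytes itself.
            SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(reader), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // GBK is not guaranteed to exist on every runtime, so check before using it.
        if (Charset.isSupported("GBK")) {
            Charset gbk = Charset.forName("GBK");
            String xml = "<?xml version=\"1.0\" encoding=\"GBK\"?><title>\u4e2d\u6587</title>";
            System.out.println(parseWithCharset(new ByteArrayInputStream(xml.getBytes(gbk)), gbk));
        } else {
            System.out.println("GBK is not supported on this runtime");
        }
    }
}
```

The trade-off is that the charset must be known in advance, for example from the HTTP Content-Type header, because the parser can no longer detect it from the bytes.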