Strange problem trying to remove links from a html string

Postby Kalder » Tue Jan 31, 2012 6:33 am

Hello, here I am again with a new problem.
I parsed some HTML into a string and then removed the unwanted parts at the start and end of the string; that was the easy part.
But take a look at this log output and then at the code.

  • 01-31 06:19:13.486: D/HTML_inString part to remove->: <a href="/wedstrijd/1622160/psv---vitesse/">
  • 01-31 06:19:13.486: D/Should still be the same->:<a href="/wedstrijd/1622160/psv---vitesse/">
  • 01-31 06:19:13.606: E/AndroidRuntime: Caused by: java.lang.StringIndexOutOfBoundsException: length=4959; regionStart=848; regionLength=-605
  • 01-31 06:50:28.805: E/AndroidRuntime: at com.kalder.android.FeyenoordActivity.onCreate(FeyenoordActivity.java:48)

Notice that both outputs are the same, yet it still throws a StringIndexOutOfBoundsException even though the string to be removed is displayed correctly in the log...

Line 3 in the following listing is line 48 in my Java file.

  1. while (HTML_inString.contains("<a href=\"/wedstrijd/")) {
  2.         if (HTML_inString.contains("<a href=\"/wedstrijd/")) {
  3.                 Log.d("HTML_inString part to remove->","" + HTML_inString.substring(HTML_inString.indexOf("<a href=\"/wedstrijd/"),HTML_inString.indexOf("<span class=\"club-home\">"))); ///// This is line 48...
  4.                 String temp = HTML_inString.substring(HTML_inString.indexOf("<a href=\"/wedstrijd/"),HTML_inString.indexOf("<span class=\"club-home\">"));
  5.                 Log.d("Should still be the same->", temp);
  6.                 HTML_inString = HTML_inString.replace("" + temp, ""); // I tried both with "" + and without that.
  7.         } else {
  8.                 break;
  9.         }
  10.         try {
  11.                 Thread.sleep(100);
  12.         } catch (InterruptedException e) {
  13.                 // TODO Auto-generated catch block
  14.                 e.printStackTrace();
  15.         }
  16. }

HTML_inString.indexOf("<span class=\"club-home\">") should be the first index after the closing > of the <a href=...> tag.

So it throws an error even though the string to replace (temp) is correct.
Kalder
Junior Developer
Posts: 17
Joined: Sun Jul 24, 2011 8:20 pm

Re: Strange problem trying to remove links from a html strin

Postby Kalder » Tue Jan 31, 2012 7:59 am

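I figured it out... Because the <span class="club-home"> marker was never removed, indexOf kept finding the same first <span on every pass. After the first link was gone, that <span came before the next <a href, so the end index was smaller than the start index (hence the negative regionLength).
So I made it remove the < from <span as well; from then on the loop stripped the links one by one, each together with the < of its <span, and after the while loop I used replaceAll to put the < characters back in.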
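
For anyone who finds this later, here is a rough sketch of another way to avoid the problem without removing and restoring the < characters. It assumes regionLength in the exception is simply end minus start, which would put the end index at 848 - 605 = 243, in front of the link being removed. The idea is to look for the span starting from the link's own position, using the two-argument String.indexOf(String, int), so the end index can never come before the start index. The names LinkStripper and stripLinks are made up for this example; only the two marker strings come from the code above.

```java
class LinkStripper {
    // Marker strings taken from the original code in this thread.
    static final String LINK_START = "<a href=\"/wedstrijd/";
    static final String SPAN_START = "<span class=\"club-home\">";

    // Hypothetical helper: drops everything from each link start up to
    // (but not including) the next club-home span that follows it.
    static String stripLinks(String html) {
        StringBuilder out = new StringBuilder(html.length());
        int pos = 0; // everything before pos has been copied or skipped
        while (true) {
            int start = html.indexOf(LINK_START, pos);
            if (start < 0) {
                break; // no more links
            }
            // Search for the span only from the link start onwards,
            // so end can never be smaller than start.
            int end = html.indexOf(SPAN_START, start);
            if (end < 0) {
                break; // link without a following span: leave the rest alone
            }
            out.append(html, pos, start); // keep the text before the link
            pos = end;                    // skip the link tag, keep the span
        }
        out.append(html, pos, html.length()); // copy whatever is left
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample input, not the real page.
        String html = "x<a href=\"/wedstrijd/1/\"><span class=\"club-home\">PSV</span></a>"
                + "y<a href=\"/wedstrijd/2/\"><span class=\"club-home\">Vitesse</span></a>";
        System.out.println(stripLinks(html));
    }
}
```

This makes a single pass over the string and never calls substring with a start and end taken from two unrelated searches, so the Thread.sleep and the replaceAll step should not be needed. Like the original loop, it leaves the closing </a> tags in place.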