ZOCOR
Hi
I am using the HTMLEditorKit.Parser class to parse an HTML file. However, I have
found the Swing HTML parser extremely difficult to use.
I am trying to parse an HTML file and extract specific information from it
into a table. Consider this snippet of my HTML and the table I would like to
generate from it:
HTML source:
<HTML>
<TITLE></TITLE>
<BODY>
<PRE>
Identifer: ABCDEFG
</PRE>
data: 123456
<PRE>
</PRE>
</BODY>
</HTML>
TABLE:
ABCDEFG 123456
Here is the code I have so far:
import javax.swing.text.*;
import javax.swing.text.html.*;
import java.io.*;
public class HTMLParser extends HTMLEditorKit
{
    // HTMLEditorKit.getParser() is protected; this override makes it public.
    public HTMLEditorKit.Parser getParser()
    {
        return super.getParser();
    }
    public static void main (String[] args)
    {
        try
        {
            Reader r = new FileReader("html_file.html");
            HTMLEditorKit.Parser parse = new HTMLParser().getParser();
            HTMLEditorKit.ParserCallback cb = new HTMLEditorKit.ParserCallback()
            {
                public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos)
                {
                    if (t == HTML.Tag.PRE)
                    {
                        // print what's between the PRE tags
                    }
                }
                public void handleText(char[] data, int pos)
                {
                    // print what's between the PRE tags
                }
            };
            parse.parse(r, cb, true);
        }
        catch (IOException e)
        {
            System.out.println(e);
        }
    }
}
I would appreciate it very much if someone could help me solve this problem.
I tried the Sun tutorial, but the examples aren't clear enough for me.
Thanks
ZOCOR
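A minimal sketch of one way the callback could be filled in, offered as an illustration rather than a definitive answer. It assumes the labels appear exactly as in the sample HTML ("Identifer:" inside a PRE block, "data:" outside one), and it uses javax.swing.text.html.parser.ParserDelegator directly instead of subclassing HTMLEditorKit. The names PreTableExtractor, Collector, firstMatch and extractRow are made up for this example. Since the parser may deliver text in more than one handleText call, the sketch buffers the text and searches the buffers after parsing.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class PreTableExtractor {

    // Hypothetical callback: buffers text found inside and outside <PRE> separately.
    static class Collector extends HTMLEditorKit.ParserCallback {
        final StringBuilder preText = new StringBuilder();
        final StringBuilder otherText = new StringBuilder();
        private int preDepth = 0;

        @Override
        public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
            if (t == HTML.Tag.PRE) {
                preDepth++;
            }
        }

        @Override
        public void handleEndTag(HTML.Tag t, int pos) {
            if (t == HTML.Tag.PRE && preDepth > 0) {
                preDepth--;
            }
        }

        @Override
        public void handleText(char[] data, int pos) {
            // Route each text chunk to the buffer matching the current context.
            StringBuilder target = (preDepth > 0) ? preText : otherText;
            target.append(data).append('\n');
        }
    }

    // Returns the first whitespace-delimited token after the label, or "" if absent.
    static String firstMatch(String label, CharSequence text) {
        Matcher m = Pattern.compile(Pattern.quote(label) + "\\s*(\\S+)").matcher(text);
        return m.find() ? m.group(1) : "";
    }

    // Parses the HTML and builds one table row: "<identifier> <data>".
    static String extractRow(String html) {
        Collector cb = new Collector();
        try (StringReader r = new StringReader(html)) {
            new ParserDelegator().parse(r, cb, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // "Identifer:" is spelled as in the sample HTML (assumption).
        String id = firstMatch("Identifer:", cb.preText);
        String data = firstMatch("data:", cb.otherText);
        return id + " " + data;
    }

    public static void main(String[] args) {
        String html = "<HTML>\n<TITLE></TITLE>\n<BODY>\n<PRE>\nIdentifer: ABCDEFG\n</PRE>\n"
                + "data: 123456\n<PRE>\n</PRE>\n</BODY>\n</HTML>\n";
        System.out.println(extractRow(html));
    }
}
```

To read from a file as in the original code, a FileReader for html_file.html could be passed to parse() in place of the StringReader. If a file holds several identifier/data pairs, the same buffers could be scanned with a loop over m.find() rather than taking only the first match.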