ATC Productions
Group,
New to Java, but certainly not to programming. I am trying to find an
approach to this problem using Java (already solved with Perl, but I
must port):
Situation:
1. A continuous string is fed into the program via STDIN.
2. Each string between EOL strings/delimiters is a payload of 7
strings, each separated by a "token" delimiter that is different from
the EOL delimiter string.
3. The string is well-formatted, and can have any type of "token"
delimiters or EOL string/characters. This part I have control over.
4. All the typical delimiters like tab, " and ' and CRLF and NewLine,
etc. can be found in the _payload_ part of the string (stream) coming
in.
5. I can't pre-parse the data. This program _is_ the pre-parser. Let me
be clearer: I can't _alter_ the incoming data in the payload, not even a
little.
6. I can insert anything at all for EOL and string delimiters.
Struggle:
1. Find a way to use a String as a delimiter, and not just a single
character.
2. Any single character chosen as a delimiter (which seems to be what
*Tokenizer wants) will eventually appear in the data.
3. Right now, I'm using Perl to first cut the data into lines by
inserting a string as EOL (something like: ]:::[) that would not appear
in the data.
4. From there, I cut that "line" of text again using a "token" string
delimiter like "<--->". Again, this string won't show up in the data.
5. Perl simply loads the "tokens" into an array, and life is good.
6. I have control over what the delimiting characters are at the
sending application, and they can be any string.
7. Another caveat: the STDIN "stream" is usually very, very busy,
usually at the capacity of the NIC. This means the fewer lines of
code, the better!
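For illustration, the two-level cut described in items 3 to 5 could be sketched in Java roughly as follows. This is only a sketch under assumptions: the class name DelimiterSplit and the method parse are hypothetical, the delimiter strings are the examples given above, and the input is assumed to already be in memory as one String. String.split takes a regular expression, so Pattern.quote is used to make the delimiter strings match literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: two-level split on multi-character delimiter strings.
class DelimiterSplit {
    // Example delimiter strings from the post; any strings could be used.
    static final String EOL = "]:::[";
    static final String TOKEN = "<--->";

    // Cut the buffer into records on EOL, then each record into fields on TOKEN.
    static List<String[]> parse(String buffer) {
        List<String[]> records = new ArrayList<>();
        // Pattern.quote turns the delimiter into a literal regex, so characters
        // such as ']' and '[' carry no special meaning.
        for (String record : buffer.split(Pattern.quote(EOL))) {
            if (record.isEmpty()) {
                continue; // skip empty records between adjacent EOL strings
            }
            // Limit -1 keeps empty trailing fields instead of dropping them.
            records.add(record.split(Pattern.quote(TOKEN), -1));
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "a<--->b\tc<--->d]:::[e<---><--->g]:::[";
        for (String[] fields : parse(sample)) {
            System.out.println(fields.length + " fields");
        }
    }
}
```

The payload is never modified here; tabs, quotes and CRLF inside a field pass through untouched because only the two quoted delimiter strings are matched.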
Need:
A simple Java method to identify these strings (versus characters) as
EOL and "token delimiters", and then parse out the "tokens" as Java
would normally.
From there they go into a database, and all that is easy.
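One possible shape for reading the records straight off STDIN is java.util.Scanner with a string delimiter. Again this is a hedged sketch, not a tuned implementation: the names StreamRecordReader and readRecords are hypothetical, the delimiters are the examples from above, and ISO-8859-1 is an assumed encoding, chosen because it maps every byte to exactly one character. Scanner is convenient but may not keep up with a stream at NIC capacity, so treat the throughput as untested.

```java
import java.io.InputStream;
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical reader: pulls EOL-delimited records from a stream one at a time.
class StreamRecordReader {
    // Delimiters compiled once as literal patterns (example strings from the post).
    static final Pattern EOL = Pattern.compile(Pattern.quote("]:::["));
    static final Pattern TOKEN = Pattern.compile(Pattern.quote("<--->"));

    // Hands each record's fields to the handler as soon as the record is complete.
    static void readRecords(InputStream in, Consumer<String[]> handler) {
        // Assumption: ISO-8859-1, so no byte of the payload is lost in decoding.
        Scanner scanner = new Scanner(in, "ISO-8859-1");
        scanner.useDelimiter(EOL); // records end at the EOL string, not at newlines
        while (scanner.hasNext()) {
            String record = scanner.next();
            // Limit -1 keeps empty fields so the field count stays stable.
            handler.accept(TOKEN.split(record, -1));
        }
    }

    public static void main(String[] args) {
        // Placeholder handler: a real one would pass the fields to the database layer.
        readRecords(System.in, fields -> System.out.println(fields.length + " fields"));
    }
}
```

Passing a handler instead of collecting a list means each record can be processed as it arrives, without holding the whole stream in memory.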
Ideas?
TIA!!!
pat