New Logic

S

Spidey

What is best solution for finding "a repeatedly occuring lenghtiest sub
SEQUENCE from a given paragraph"

ex:
input:
hello world is a good hello world. Is that the way it is
a way.
Output:
hello world
 
M

Mike Wahler

Spidey said:
What is best solution for finding "a repeatedly occuring lenghtiest sub
SEQUENCE from a given paragraph"

The 'best solution' in the context of this newsgroup is
to write a C program to do it.
ex:
input:
hello world is a good hello world. Is that the way it is
a way.
Output:
hello world

Nobody is going to write the code for you, however if
you post your best attempts (code) accompanied by specific
questions, you'll get plenty of help.

-Mike
 
P

pemo

Spidey said:
What is best solution for finding "a repeatedly occuring lenghtiest sub
SEQUENCE from a given paragraph"

ex:
input:
hello world is a good hello world. Is that the way it is
a way.
Output:
hello world

It's reasonably difficult, as you're looking for n-gram frequency. So,
there's text chunking to do, and more than a little strtok'ing etc.

Have a look around some Computational Linguistics sites - hint: there are
tools on the web to do this (not neccessarily in C, or with source code
though)
 
M

Malcolm

Spidey said:
What is best solution for finding "a repeatedly occuring lenghtiest sub
SEQUENCE from a given paragraph"

ex:
input:
hello world is a good hello world. Is that the way it is
a way.
Output:
hello world
It's a programming problem rather than a C program.
Look up "dot plot" and "protein sequence" in Google for an example of a
real-life application of this problem.
 
M

Mark McIntyre

What is best solution for finding "a repeatedly occuring lenghtiest sub
SEQUENCE from a given paragraph"

This is an algorithms question. Try asking in comp.programming.
 
S

Simon Biber

Spidey said:
What is best solution for finding "a repeatedly occuring lenghtiest sub
SEQUENCE from a given paragraph"

ex:
input:
hello world is a good hello world. Is that the way it is
a way.
Output:
hello world

Here's my solution...

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Find the longest sub-sequence of src, *
* which is repeated at least once. */

char *longsubseq(const char *src)
{
size_t substrlen, offset, srclen = strlen(src);
char *substr = malloc(srclen/2 + 1), *repeat;
if(!substr) return NULL;
for(substrlen = srclen/2; substrlen > 0; substrlen--)
{
for(offset = 0; offset < srclen - substrlen; offset++)
{
substr[0] = 0;
strncat(substr, src + offset, substrlen);
repeat = strstr(src + offset + substrlen, substr);
if(repeat) return substr;
}
}
free(substr);
return NULL;
}

If there is no subsequence that is repeated at least once, it will
return a null pointer. Otherwise, it returns a pointer to allocated
memory containing a string with the longest repeated subsequence.

This uses a rather naive algorithm, certainly not the most efficient.
 
R

Randy Howard

Simon Biber wrote
(in article said:
Here's my solution...

Thanks for contributing to the increased graduation rates
amongst cheating students. Well done.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,174
Messages
2,570,941
Members
47,476
Latest member
blackwatermelon

Latest Threads

Top