Oscarian said:
Indeed, but I enjoy having goals, even if it is a long term goal.
Fine, but if you want to research writing a Web Crawler, asking on a C
language newsgroup is probably not the best starting point.
<Off-topic>
The principle is fairly simple:-
1. Make a list of URLs to visit
2. Take the next URL from the list
3. Fetch that page, find any links it contains, and add them to your
list, discarding duplicates
4. Go to 2, until the list is empty or you hit some limit
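The steps above can be sketched in a few lines of Python. The
fetch_links function here is a stand-in: in a real crawler it would
do an HTTP request and parse the HTML for links, but a toy link table
keeps the sketch self-contained:

```python
from collections import deque

def crawl(start_url, fetch_links, limit=100):
    """Breadth-first crawl. fetch_links(url) is assumed to return
    the list of URLs linked from the page at url."""
    seen = {start_url}          # discard duplicates (step 3)
    queue = deque([start_url])  # the list of URLs (step 1)
    visited = []
    while queue and len(visited) < limit:
        url = queue.popleft()   # take the next URL (step 2)
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)  # add new links (step 3)
    return visited              # loop back (step 4) until done

# Toy link graph standing in for real fetching and parsing.
pages = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["d"],
    "d": [],
}
order = crawl("a", lambda url: pages.get(url, []))
print(order)  # ['a', 'b', 'c', 'd']
```

The set of seen URLs is what keeps the crawl from looping forever on
pages that link back to each other, and the limit parameter is the
usual safety valve against an effectively unbounded Web.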
What language would you find it easiest to do this in? As I've already
stated, I probably wouldn't choose C - as I work with Java regularly, I
might choose that (out of the box, it has useful classes for accessing
URLs, maintaining lists, etc), or I might use a scripting language such
as Python or even Perl. As Richard has pointed out, the choice of
language is unlikely to have a huge impact on performance.
There is a discussion of web crawlers on Wikipedia, with links to a
variety of implementations in a range of languages. Entering "writing a
webcrawler" into Google leads to an assortment of papers, tutorials,
etc.
</Off-topic>