Glossary script

claus_01 · Jan 28, 2009

Hi,

I'm currently working on two nearly identical websites that
have a glossary. Now, I would like to automatically link
certain keywords inside the glossary to the respective entry,
and several users at http://www.webdesignerforum.co.uk/
recommended JavaScript for this. Since I couldn't find a
free script online, I would like to know if it is feasible
for a comparative beginner (with JavaScript, that is) to write
a script that will do that.

One example: Say, there's the keyword XYZ. Now, I would
like the script to link from each entry inside the glossary
where 'XYZ' is mentioned to the entry for XYZ itself. Is
this very difficult to do? Where would I have to look to
find a sample script I could modify to fit my needs?

TIA,

Claus

RobG · Jan 28, 2009

Hi,

I'm currently working on two nearly identical websites that
have a glossary. Now, I would like to automatically link
certain keywords inside the glossary to the respective entry,
and several users at http://www.webdesignerforum.co.uk/
recommended JavaScript for this. Since I couldn't find a
free script online, I would like to know if it is feasible
for a comparative beginner (with JavaScript, that is) to write
a script that will do that.

For some definition of "feasible". Possible, yes. Efficient -
unlikely. But you might enjoy the learning process, so perhaps
interesting.

One example: Say, there's the keyword XYZ. Now, I would
like the script to link from each entry inside the glossary
where 'XYZ' is mentioned to the entry for XYZ itself. Is
this very difficult to do?

No, but JavaScript doesn't seem the right tool. Glossaries are pretty
static, whereas JavaScript is more suited to interaction with the
user. It seems a better idea to do the markup on the server - why
generate the links every time the page is loaded on the client?
Particularly given the vagaries of JavaScript in the multitude of
clients when the alternative is likely a server-side language that is
much more portable across a limited and controlled set of platforms
(in comparison to user agents).

Where would I have to look to
find a sample script I could modify to fit my needs?

You could search the archives here, but that may not be too helpful.
I suppose you want to cycle through each term in the glossary, then
replace each instance of the word in the page with a link. How will
you deal with plural versions of words? Or those that might be verbs
with different tenses? Should *every* instance of a word be linked to
its entry in the glossary? Many of these decisions are better made by
a human when creating each entry rather than depending on a simple
pattern match to create the links.
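
The loop described above (go through each term, replace each instance with a link) can be sketched on a plain string. This is only an illustration: `linkTerms` and the sample glossary are hypothetical names, and the terms are assumed to consist of letters, digits and hyphens, so no regular-expression escaping is done. Note how the plural "rabbits" is left unlinked, which is exactly the kind of decision a simple pattern match cannot make.

```javascript
// Hypothetical helper: wrap whole-word, case-insensitive matches of each
// glossary term in a link. `glossary` maps a lower-case term to the name
// of its anchor. Assumes terms contain only letters, digits and hyphens.
function linkTerms(text, glossary) {
  // Longest terms first, so a longer term wins over a shorter prefix.
  var terms = Object.keys(glossary).sort(function (a, b) {
    return b.length - a.length;
  });
  if (terms.length === 0) {
    return text;
  }
  // One pass with an alternation, so text inside a freshly inserted
  // link is never scanned again.
  var pattern = new RegExp("\\b(" + terms.join("|") + ")\\b", "gi");
  return text.replace(pattern, function (match) {
    return '<a href="#' + glossary[match.toLowerCase()] + '">' +
      match + "</a>";
  });
}

var glossary = { rabbit: "rabbit", frith: "frith" };
console.log(linkTerms("Frith made the rabbit. The rabbits ran.", glossary));
// prints: <a href="#frith">Frith</a> made the <a href="#rabbit">rabbit</a>. The rabbits ran.
```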

In any case, if you proceed with JavaScript for the sake of it, have a
go and post the results here. I'm sure you'll get plenty of
hints.

Elegie · Jan 29, 2009

claus_01 wrote :

Hello,

One example: Say, there's the keyword XYZ. Now, I would
like the script to link from each entry inside the glossary
where 'XYZ' is mentioned to the entry for XYZ itself.

<snip>

You'll find below a simple script that should help you in your task. I
have also built the script in such a way that the "glossarize" process
may be used by other pages than the glossary page itself (example
provided, 2 html pages).

In the end though, Rob's points look very reasonable, especially the
syntax-related ones, so sticking to a manual-linking process should be a
relevant option.

Enjoy,
Elegie.

--- test.html ---
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>The Story of the Blessing of El-ahrairah</title>
<style type="text/css">
* {
font-family : trebuchet MS, sans-serif ;
}
h1 {
color : #c00 ;
font-weight : 700 ;
font-size : 1.2em ;
}
p {
background-color : #ff9 ;
color : #000 ;
padding : 5px ;
width : 70% ;
}
</style>
<script type="text/javascript">
// When the page has loaded, load glossary.html into a dynamically
// created, hidden iframe (used as a data buffer). The glossary page
// builds a glossary object from its table, then calls its glossarize
// function, which modifies the content of this page.
window.onload = function (evt) {
  // Execute the script only if the required features are supported
  if (
    document.createElement &&
    document.appendChild &&
    document.body
  ) {
    // Load the glossary; its onload handler links our paragraphs
    var iframe = document.createElement("iframe") ;
    iframe.style.display = "none" ;
    document.body.appendChild(iframe) ;
    iframe.src = "glossary.html" ;
  }
} ;
</script>
</head>
<body>
<h1>The Story of the Blessing of El-ahrairah</h1>
<div><i>Richard Adams, Watership Down, Chapter 6.</i></div>
<p>
"Now, El-ahrairah was among the animals in those days and he had many
wives. He had so many wives that there was no counting them, and the
wives had so many young that even Frith could not count them, and
they ate the grass and the dandelions and the lettuces and the
clover, and El-ahrairah was the father of them all." (Bigwig growled
appreciatively.) "And after a time," went on Dandelion, "after a time
the grass began to grow thin and the rabbits wandered everywhere,
multiplying and eating as they went."
</p>
</body>
</html>
---

--- glossary.html ---
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>Glossary</title>
<style type="text/css">
* {
font-family : trebuchet MS, sans-serif ;
}
h1 {
color : #c00 ;
font-weight : 700 ;
font-size : 1.2em ;
}
th {
background-color : #fc3 ;
color : #c00 ;
font-weight : 700 ;
}
td {
background-color : #ff9 ;
color : #000 ;
padding : 5px ;
}
</style>
<script type="text/javascript">
// This page can be accessed either directly, or from a frame.
// If it is called directly, we modify the current page; otherwise,
// we modify the top page.
window.onload = function (evt) {
  // Execute the script only if the required features are supported
  if (
    document.getElementsByTagName &&
    document.body &&
    typeof document.body.innerHTML != "undefined" &&
    typeof document.childNodes != "undefined"
  ) {
    // Build a glossary object from the table
    var glossary = {} ;
    // Collect all A elements, keep those that are named anchors
    // (no href), and map their text (the glossary key) to the
    // URL of the anchor (the glossary href)
    var entries = document.getElementsByTagName("a") ;
    for (var ii=0; ii<entries.length; ii++) {
      if (!entries[ii].href) {
        var key = (
          entries[ii].textContent ||
          entries[ii].innerText ||
          ""
        ).toLowerCase() ;
        // Skip anchors without text: an empty key would match
        // at every word boundary
        if (key) {
          glossary[key] =
            location.href.replace(/\#[a-z0-9-]+$/i,"") +
            "#" + entries[ii].name ;
        }
      }
    }
    glossarize(glossary, top!=this ? top.document : document) ;
  }
} ;

// We intend to replace the innerHTML property of all paragraphs,
// so we need to identify precisely where the glossary entries are
// located: we search in text nodes only, disregarding other
// locations reported by innerHTML, such as attribute names or
// attribute values.
function glossarize(glossary, owner) {

  // Get all paragraphs
  var p = owner.getElementsByTagName("p") ;

  // For all paragraphs
  for (var ii=0; ii<p.length; ii++) {
    // Normalize the text nodes we will search later
    if (p[ii].normalize) {
      p[ii].normalize() ;
    }

    // Find and mark all glossary entries in text nodes,
    // using a special token : \x01
    // (we could also create a unique token, with additional
    // code, but the one here should be rare enough)
    for (var j=0; j<p[ii].childNodes.length; j++) {
      var child = p[ii].childNodes[j] ;
      if (child.nodeType == 3) { // TEXT_NODE
        for (var entry in glossary) {
          // Mark the entry, using the special token; escape
          // characters that are special in a regular expression
          child.nodeValue =
            child.nodeValue.replace(
              new RegExp(
                "\\b(" +
                  entry.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") +
                  ")\\b",
                "gi"
              ),
              function (a, b) {
                return "\x01"+b+"\x01" ;
              }
            ) ;
        }
      }
    }

    // Replace the marked entries with the appropriate links
    p[ii].innerHTML = p[ii].innerHTML.replace(
      /\x01([^\x01]+)\x01/g,
      function (a, b) {
        return "<a href='"+glossary[b.toLowerCase()]+"'>"+b+"<\/a>" ;
      }
    ) ;
  }
}
</script>
</head>
<body>
<h1>Glossary</h1>
<table>
<thead>
<tr><th>Entry</th><th>Description</th></tr>
</thead>
<tbody>
<tr>
<td><a name="frith">Frith</a></td>
<td><p>The god that made the world.</p></td>
</tr>
<tr>
<td><a name="elahrairah">El-ahrairah</a></td>
<td><p>The legendary rabbit, who was tricked by Frith.</p></td>
</tr>
<tr>
<td><a name="bigwig">Bigwig</a></td>
<td><p>A tough rabbit, who admires El-ahrairah.</p></td>
</tr>
<tr>
<td><a name="dandelion">Dandelion</a></td>
<td><p>A story-teller rabbit.</p></td>
</tr>
</tbody>
</table>
</body>
</html>
---
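
The two-pass idea in glossarize (first mark each match with a rare token, then turn the marked spans into links) can be tried on a plain string, without a browser. The following is only a sketch: `markEntries` and `linkMarked` are hypothetical names, not part of the script above, and the glossary keys are assumed to be lower-case and made of letters, digits and hyphens.

```javascript
// Rare control character used to delimit matches, as in the script above.
var TOKEN = "\x01";

// Pass 1: surround every whole-word match of a glossary key with the token.
// In the browser version this runs on text nodes only, so attribute
// values are never touched.
function markEntries(text, glossary) {
  for (var entry in glossary) {
    text = text.replace(
      new RegExp("\\b(" + entry + ")\\b", "gi"),
      function (match, word) {
        return TOKEN + word + TOKEN;
      }
    );
  }
  return text;
}

// Pass 2: replace each token-delimited span with a link to its entry.
function linkMarked(text, glossary) {
  return text.replace(/\x01([^\x01]+)\x01/g, function (match, word) {
    return '<a href="' + glossary[word.toLowerCase()] + '">' + word + "</a>";
  });
}

var sample = {
  "frith": "glossary.html#frith",
  "el-ahrairah": "glossary.html#elahrairah"
};
console.log(linkMarked(markEntries("El-ahrairah met Frith.", sample), sample));
```

Splitting the work this way keeps the matching step independent of the markup step, which is why the original can search text nodes and still write the links through innerHTML.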

Dr J R Stockton · Jan 29, 2009

In comp.lang.javascript message
<35554704-d182-4179-ba63-a0559647895d@v5g2000prm.googlegroups.com>,
Wed, 28 Jan 2009 15:34:36, RobG wrote:

No, but JavaScript doesn't seem the right tool. Glossaries are pretty
static, whereas JavaScript is more suited to interaction with the
user. It seems a better idea to do the markup on the server - why
generate the links every time the page is loaded on the client?
Particularly given the vagaries of JavaScript in the multitude of
clients when the alternative is likely a server-side language that is
much more portable across a limited and controlled set of platforms
(in comparison to user agents).

I suggest that it might be better not to do it even on the server. This
is a task which should be done on the authoring system (where one can
make mistakes without the clients being able to see them).
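
As a rough illustration of doing this on the authoring system, the function below links terms in an HTML string once, before the file is uploaded. It is a sketch under stated assumptions, not a robust HTML processor: `linkTermsInHtml` is a hypothetical name, the markup is assumed to be simple (no ">" inside attribute values, no comments or scripts), the terms are assumed to be single words made of letters, digits and hyphens, and the links are assumed to point at a page called glossary.html.

```javascript
// Hypothetical authoring-time helper: link glossary terms in an HTML
// string, touching only the text between tags and leaving text that is
// already inside an <a>...</a> element alone.
function linkTermsInHtml(html, glossary) {
  var terms = Object.keys(glossary);
  if (terms.length === 0) {
    return html;
  }
  var pattern = new RegExp("\\b(" + terms.join("|") + ")\\b", "gi");
  var insideAnchor = false;
  // Splitting on a capturing group keeps the tags in the resulting array.
  return html.split(/(<[^>]+>)/).map(function (piece) {
    if (piece.charAt(0) === "<") {
      if (/^<a[\s>]/i.test(piece)) { insideAnchor = true; }
      if (/^<\/a\s*>/i.test(piece)) { insideAnchor = false; }
      return piece; // tags pass through unchanged
    }
    if (insideAnchor) {
      return piece; // do not nest links
    }
    return piece.replace(pattern, function (match) {
      return '<a href="glossary.html#' + glossary[match.toLowerCase()] +
        '">' + match + "</a>";
    });
  }).join("");
}

console.log(linkTermsInHtml(
  '<p title="Frith">Frith and <a name="bigwig">Bigwig</a></p>',
  { frith: "frith", bigwig: "bigwig" }
));
```

Because the output is ordinary static HTML, the result can be inspected and corrected by hand before anyone else sees it.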

The process of inserting the links should not be 100% automated;
homonyms
<http://www.chambersharrap.co.uk/chambers/features/chref/chref.py/main?query=homonym&title=21st>
should be treated as distinct, and it takes intelligence to recognise
them (e.g. Google Translate may still think that the French for "may"
is "mai", just because the French for "May" is usually "Mai").
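
One way to keep a human in the loop is to have the script only report candidate matches, each with a little surrounding text, and leave the decision to link to the author. A minimal sketch, assuming a hypothetical function `findCandidates` and terms made of letters, digits and hyphens:

```javascript
// Hypothetical helper: list every whole-word, case-insensitive match of
// the given terms, with `radius` characters of context on each side,
// so that a person can decide which occurrences deserve a link.
function findCandidates(text, terms, radius) {
  radius = radius || 20;
  if (terms.length === 0) {
    return []; // an empty alternation would match everywhere
  }
  var pattern = new RegExp("\\b(" + terms.join("|") + ")\\b", "gi");
  var found = [];
  var m;
  while ((m = pattern.exec(text)) !== null) {
    found.push({
      term: m[1],
      index: m.index,
      context: text.slice(
        Math.max(0, m.index - radius),
        m.index + m[1].length + radius
      )
    });
  }
  return found;
}

// Both "May" (the month) and "may" (the verb) are reported; only a
// reader can tell which one the glossary entry is about.
console.log(findCandidates("In May we may go.", ["may"], 3));
```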

YSCIB.

claus_01 · Jan 29, 2009

Many thanks for your replies. Much appreciated! ;-)

claus_01 wrote :

Hello,

<snip>

You'll find below a simple script that should help you in your task. I
have also built the script in such a way that the "glossarize" process
may be used by other pages than the glossary page itself (example
provided, 2 html pages).

In the end though, Rob's points look very reasonable, especially the
syntax-related ones, so sticking to a manual-linking process should be a
relevant option.

I thought about this as well, but as the sites are getting more and more
complex, I ruled this option out - it's just too much to do this
manually. I was told that PHP would be a solution, but the pages are
strictly HTML (XHTML 1.0 Transitional), so I would have to rewrite
some 150 HTML files for each site. In any case, I will look into the
code you attached.

One of the glossary pages is located at

http://www.ergotherapie-sillenbuch.de/infos/index.html

In fact, these are two nearly identical sites.

Thanks,

Claus

Thomas 'PointedEars' Lahn · Jan 29, 2009

claus_01 said:
Elegie said:

claus_01 wrote :
[...]

One example: Say, there's the keyword XYZ. Now, I would
like the script to link from each entry inside the glossary
where 'XYZ' is mentioned to the entry for XYZ itself.

<snip>

You'll find below a simple script that should help you in your task. I
have also built the script in such a way that the "glossarize" process
may be used by other pages than the glossary page itself (example
provided, 2 html pages).

In the end though, Rob's points look very reasonable, especially the
syntax-related ones, so sticking to a manual-linking process should be a
relevant option.

I thought about this as well, but as the sites are getting more and more
complex, I ruled this option out - it's just too much to do this
manually. I was told that PHP would be a solution, but the pages are
strictly HTML (XHTML 1.0 Transitional),

XHTML isn't HTML, nor is XHTML 1.0 Transitional strict in any way
(except for its well-formedness). HTML 4.01 Strict or XHTML 1.0 Strict is.
You don't want to use XHTML at this point, though.

so I would have to rewrite some 150 HTML files for each site.

PHP (PHP: Hypertext Preprocessor) can make HTML out of XHTML, and vice-versa.
I have recently written a simple algorithm that makes XHTML 1.0 out of
Valid HTML 4.01.

[...]

[snipped 100+ unreferred lines]

Trim your quotes, please.

<http://jibbering.com/faq/#posting>

PointedEars
