regexp test for italics and bold

Bob

Hello:

I could use some regular expression help. I'm trying to create a
client-side javascript function that will test user input to determine
the markup of text that has been copied into a textarea input field,
then will insert html tags accordingly.

So, the user writes up a document in Word that contains instances of
italics, bold, and underline. The user then copies the text of that
document into my textarea field. I am trying to create a function that
will then search through all instances of that markup and insert the
appropriate html tags directly into the textarea field.

My problem is that I cannot find a way in javascript with regexp or
otherwise to test user input for italics, bold, or underline. I found
something similar in a PHP script here:

// Check for Italics text
$Text = preg_replace("(\[i\](.+?)\[\/i\])is",'<span class="italics">$1</span>',$Text);

But I cannot find a client-side javascript equivalent to this. This
only needs to work in the latest IE, as this is part of an internal
tool to help build HTML emails.

Any help would be appreciated greatly.

Thanks,

Bob
 
Bob

That's not quite what I'm going for. That will search through the html
document itself for text that is surrounded by italics tags. But what
I'm trying to do is search through user input: whatever the user copies
into the textarea field. That content may have attributes like bolding
and italics that are carried over from the original source (e.g., MS
Word) and I am hoping to detect these attributes through regular
expressions and add tags into the textarea field accordingly.
 
jdlwright

Sorry, textarea elements don't support rich formatting. You need a rich
text iframe editor, like PowerWEB TextBox, HtmlArea, or FreeTextBox.

Once you're using one of those you can do this: access the
.innerHTML of the iframe's body element and go from there.

Good luck.

Jim
 
Bob

Thanks for the reply. I checked out some of the solutions you listed.
I could be out of my depth, but they seem to take a textarea field and
apply some javascript magickery to get their effect, which leads me to
believe that my goal should be possible. The input starts out in the
textarea field, so there has to be some way to determine attributes,
iframes or no. I'd rather not use any of these packaged solutions
because their output code is pretty ugly.
 
Bob

Just to clarify, I don't care what it looks like in the textarea field.
I just want to detect if it was italicized when it was copied over and
then add the tags <i>foo</i> appropriately. It can all look like
straight text in the textarea field.
 
jdlwright

Sure. Textareas don't output html tags - remember they were created
about a decade ago when everyone was just pleased to get online and
share views on who was the best Star Trek captain... ;)
 
Bob

I don't expect the textarea to output tags. I will insert the tags
directly into the textarea with the script. So, for example, a user
copies in "This is italics. This is not." from a Word document and the
first sentence is in italics. Then my script gets run and recognizes
that the first sentence is italicized and inserts the tags directly
into the textarea.value string to change it to "<i>This is italics.</i>
This is not." I'd like to do something similar to what this PHP code
is supposed to do:

// Check for Italics text
$Text = preg_replace("(\[i\](.+?)\[\/i\])is",'<span class="italics">$1</span>',$Text);
 
Lee

Bob said:
Just to clarify, I don't care what it looks like in the textarea field.
I just want to detect if it was italicized when it was copied over

Only characters are pasted.
Information about formatting never makes it into the textarea.
There is nothing to detect.
 
jdlwright

Then what I'd suggest you do is:

a) add an iframe with contentEditable set to true to your page

b) have users paste word documents into this

c) listen to the paste event in the iframe.document object

d) get the content of the iframe with iframe.document.body.innerHTML,
this will be your Word document in HTML markup (in my view this is
pretty amazing - such an easy way to convert a word doc to html markup,
AND on the client side!)

e) do what Mark Twain suggested to the content you got from d)

f) insert it into a textarea with textareaID.value = ....
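
Steps a) to f) could be sketched roughly as follows. This is only an
illustration, not a tested IE solution: the names simplifyPastedHtml and
wirePasteArea are made up for the example, it watches a contentEditable
element directly rather than an iframe document, and it assumes the pasted
markup expresses formatting with i/em, b/strong and u tags. Formatting that
Word carries only in style attributes, and the contents of style blocks or
conditional comments, are not handled.

```javascript
// Hypothetical helper: reduce pasted rich-text HTML to bare <i>, <b>, <u>
// tags. Regex-based and deliberately simple; NOT a general HTML sanitizer.
function simplifyPastedHtml(html) {
  return html
    // Normalise <em>/<strong> (with any attributes) to <i>/<b>.
    .replace(/<(\/?)em\b[^>]*>/gi, '<$1i>')
    .replace(/<(\/?)strong\b[^>]*>/gi, '<$1b>')
    // Strip attributes from the tags we keep and lowercase them.
    .replace(/<(\/?)([ibu])\b[^>]*>/gi, function (match, slash, tag) {
      return '<' + slash + tag.toLowerCase() + '>';
    })
    // Turn paragraph ends and line breaks into newlines.
    .replace(/<\/p\s*>|<br\b[^>]*>/gi, '\n')
    // Remove every remaining tag (spans, fonts, Word's o:p, ...).
    .replace(/<(?!\/?[ibu]>)[^>]*>/g, '')
    // Decode non-breaking spaces, then trim leading/trailing whitespace.
    .replace(/&nbsp;/g, ' ')
    .replace(/^\s+|\s+$/g, '');
}

// Browser-only wiring (never called in this sketch): after each paste into
// the editable element, copy the cleaned markup into the textarea.
function wirePasteArea(editable, textarea) {
  editable.onpaste = function () {
    // The pasted content is not in the DOM until the event has finished,
    // so read innerHTML on the next tick.
    setTimeout(function () {
      textarea.value = simplifyPastedHtml(editable.innerHTML);
    }, 0);
  };
}
```

With this, pasted markup such as
<P class=MsoNormal><EM>This is italics.</EM> This is not.</P>
would end up in the textarea as "<i>This is italics.</i> This is not."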

Jim
 
jdlwright

I think what I've failed to impart to you is that once your user
pastes into a textarea, all the formatting is gone - no way AT ALL to
access any kind of formatting info.

My last suggestion above is in my opinion (8 yrs of comm. web
development) the only option outside of using some sort of ActiveX or
plugin on the client side...


Jim
 
Bob

Thank you for the replies and patient explanation. I'll check further
into your suggested iframes solution.
 
Julian Turner

Bob said:
Hello:

I could use some regular expression help. I'm trying to create a
client-side javascript function that will test user input to determine
the markup of text that has been copied into a textarea input field,
then will insert html tags accordingly.

Some of my thoughts for you.

Bob said:
So, the user writes up a document in Word that contains instances of
italics, bold, and underline. The user then copies the text of that
document into my textarea field. I am trying to create a function that
will then search through all instances of that markup and insert the
appropriate html tags directly into the textarea field.

When you refer to "instances of that markup", what mark-up are you
referring to? I assume the underlying .doc format. In which case, are
you looking for someone who is familiar with .doc formats as well, in
order to be able to write a RegExp?

Also, as others have noted, when you cut and paste .doc text into a
TEXTAREA, all .doc formatting is stripped out. How do you then
plan to get hold of the .doc formatting?

Bob said:
My problem is that I cannot find a way in javascript with regexp or
otherwise to test user input for italics, bold, or underline. I found
something similar in a PHP script here:

// Check for Italics text
$Text = preg_replace("(\[i\](.+?)\[\/i\])is",'<span class="italics">$1</span>',$Text);

But I cannot find a client-side javascript equivalent to this. This
only needs to work in the latest IE, as this is part of an internal
tool to help build HTML emails.

Some options are:-

(a) Use a content mark-up, as used for Wiki or some other news groups
(e.g. http://www.webdeveloper.com). I.e. if the user wants italics,
they must physically mark it up themselves. But this is
obviously not ideal.

(b) As suggested by others, use the rich text edit components in the
latest browsers (e.g. contentEditable="true" DIVS or designMode="on"
IFRAMES). When you cut and paste, formatting is preserved, although
the automatically generated HTML is not the best. Equally, users can
type directly into those rich text components and create mark-up
directly.

(c) Create some form of Macro in Word to talk to the web page, or try
to get the web page to talk to (i.e. automate) Microsoft Word, using
the ActiveXObject ("Word.Application"), and directly inspect the
Document Object Model for Microsoft Word to extract the formatting.
These are likely to be difficult to implement.
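
For option (a), the PHP line quoted above translates fairly directly to
String.replace in JavaScript. A minimal sketch, assuming users type
[i]...[/i], [b]...[/b] or [u]...[/u] themselves; the function names are
made up for the example, and nested tags are not handled in a single pass.

```javascript
// Direct port of the quoted PHP preg_replace, for [i] only.
// [\s\S] matches any character including newlines (the job of PHP's "s"
// modifier); "g" replaces every occurrence and "i" ignores case.
function italicsToSpan(text) {
  return text.replace(/\[i\]([\s\S]+?)\[\/i\]/gi,
                      '<span class="italics">$1</span>');
}

// Hypothetical variant producing plain <i>, <b>, <u> tags instead.
// The backreference \1 makes the closing tag match the opening one.
function bracketsToHtml(text) {
  return text.replace(/\[([ibu])\]([\s\S]+?)\[\/\1\]/gi,
    function (match, tag, inner) {
      var t = tag.toLowerCase();
      return '<' + t + '>' + inner + '</' + t + '>';
    });
}
```

The result could then be written back with something like
textarea.value = bracketsToHtml(textarea.value). Note that this only
works on mark-up the user has typed; it cannot recover formatting that
was lost when pasting from Word.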

Julian
 
