Can I upload a Perl program in Unicode?

John

Hi

Imagine that I have the following statement in a Perl program.

my $word = '??';

If I save the Perl program as 'UTF8' the characters remain.

If I save it in ASCII, the line appears as my $word = '????‘?';

Now, the problem is that although I can save it as UTF8, the Perl program
needs to be in ASCII to be run.

How do I get around this problem?

Regards
John
 
Ben Bullock

> Imagine that I have the following statement in a Perl program.
>
> my $word = '??';
>
> If I save the Perl program as 'UTF8' the characters remain.
>
> If I save it in ASCII, the line appears as my $word = '????‘?';
>
> Now, the problem is although I can save it as UTF8 the Perl program needs to
> be in ASCII to be run

No, it doesn't.

> How do I get around this problem?

The above actually won't cause any problems for Perl.

Do you have a working example program which illustrates some problem
you've encountered?

If you want your string to be recognized as UTF-8 by Perl and decoded into
Perl's internal form, where characters like あ are recognized as single
characters by things like /(.)/ or "length", you can put

use utf8;

at the top of the script. But Perl will run without it.

If you need to use variables with non-ASCII names, like

my $あ = "bingo";

then you also need to

use utf8;

But again, this is not necessary to make Perl run with text in UTF-8
encoding.
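
To make the difference concrete, here is a minimal sketch. It assumes the file itself is saved as UTF-8; the variable names are made up for illustration. The same literal is read once without the pragma and once with it, and only the reported length changes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8, so the literal below is one
# character (U+3042) stored on disk as three bytes.
my $bytes = do { no utf8;  'あ' };   # pragma off: Perl sees three separate bytes
my $chars = do { use utf8; 'あ' };   # pragma on: Perl sees one decoded character

printf "without 'use utf8': length %d\n", length $bytes;
printf "with 'use utf8':    length %d\n", length $chars;
```

Either way the script compiles and runs; the pragma only changes how the literal is interpreted.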
 
xhoster

John said:
> Hi
>
> Imagine that I have the following statement in a Perl program.
>
> my $word = '??';

That's just two question marks, chr 63, right? If not, then what
is it?

> If I save the Perl program as 'UTF8' the characters remain.

"Save as UTF8" sounds like something a word processor would do.
We don't know what word processor you are using.

> If I save it in ASCII, the line appears as my $word = '????‘?';

Is that equivalent to the below?
join "", chr(65),chr(65),chr(65),chr(65),chr(145),chr(65);

> Now, the problem is although I can save it as UTF8 the Perl program needs
> to be in ASCII to be run

What makes you think that? What errors are you getting?

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
Peter J. Holzer

> Imagine that I have the following statement in a Perl program.
>
> my $word = '??';

These are two question marks. I suppose that should have been something
else. If you want to write non-ASCII characters in your postings, please
use a newsreader which is able to do so.

> If I save the Perl program as 'UTF8' the characters remain.
>
> If I save it in ASCII, the line appears as my $word = '????‘?';
>
> Now, the problem is although I can save it as UTF8 the Perl program needs to
> be in ASCII to be run

Why? The perl interpreter is perfectly fine with scripts in UTF-8.

> How do I get around this problem?

I could tell you, but then I'd have to kill you.

No, seriously, you seem to be lacking some basics about perl character
strings, so I'll refer you to Juerd's rather good Unicode tutorial:
http://juerd.nl/site.plp/perluniadvice

hp
 
Bill H

Peter J. Holzer said:
> These are two question marks. I suppose that should have been something
> else. If you want to write non-ASCII characters in your postings, please
> use a newsreader which is able to do so.
>
> Why? The perl interpreter is perfectly fine with scripts in UTF-8.
>
> I could tell you, but then I'd have to kill you.
>
> No, seriously, you seem to be lacking some basics about perl character
> strings, so I'll refer you to Juerd's rather good Unicode tutorial:
> http://juerd.nl/site.plp/perluniadvice
>
> hp

I think (and I could be wrong here) that John is writing his code using
something like Notepad, which warns you about saving in UTF or not. If
you don't save in UTF, the characters go away. It may be that he saves
in UTF and then, when uploading, the file is messed up by his FTP program.
Then again, I could be completely wrong here.

Bill H
 
Jürgen Exner

Bill H said:
> I think - and I could be wrong here, John is writing his code using
> something like Notepad that warns you about saving in utf or not. If
> you don't save in utf the characters go away. It may be that he saves
> in utf and then when uploading it is messed up by his ftp program.

Quite possible. Another wild guess:
- If he is using Notepad and saves as UTF8, then Notepad prepends a
totally useless BOM(*) to the file. Maybe perl is choking on that BOM.

*: UTF8 is a _byte_ stream and does not have a byte order

jue
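
Whether a BOM is really the culprit is only a guess, but it is easy to rule out. The sketch below strips a leading UTF-8 BOM (the three bytes EF BB BF) from a byte string. The helper name strip_utf8_bom is hypothetical, and the sample string stands in for the raw contents of an uploaded script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the content without a leading UTF-8 BOM.
sub strip_utf8_bom {
    my ($content) = @_;
    $content =~ s/\A\xEF\xBB\xBF//;   # remove EF BB BF only at the very start
    return $content;
}

# Stand-in for the raw bytes of a script saved by an editor that adds a BOM.
my $with_bom = "\xEF\xBB\xBF#!/usr/bin/perl\n";
my $clean    = strip_utf8_bom($with_bom);

print substr($clean, 0, 2) eq '#!' ? "starts with #!\n" : "still has a BOM\n";
```

Content that has no BOM passes through unchanged, so the helper is safe to apply unconditionally.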
 
John

Good morning, all

If a Perl script can be encoded in UTF8, the problem must lie with the
upload. I use WS_FTP Pro on my Windows (sorry) machine and vsftpd on my
linux server. I need to look at the setup on both. The answer lies there.

Many thanks and regards
John
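
One way to check whether an upload damaged the file is to test whether the bytes that arrived on the server are still well-formed UTF-8. This is a rough sketch using the core Encode module; the helper name is_valid_utf8 is made up here, and in practice the byte string would come from reading the uploaded script through a filehandle opened with the ':raw' layer.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # decode() may modify its argument when a CHECK value is passed,
    # so hand it a copy.
    my $copy = $bytes;
    return eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;
}

print is_valid_utf8("\xE3\x81\x82") ? "valid\n" : "invalid\n";   # bytes of U+3042
print is_valid_utf8("\xFF")         ? "valid\n" : "invalid\n";   # 0xFF never occurs in UTF-8
```

A file that passes this check can still be wrong in other ways (for example, if it was converted to a different but valid encoding), so treat a failure as informative and a pass as only a first hint.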
 
Ben Bullock

John wrote:

A lot of people here use Perl on Windows. One of my main uses for Perl
is manipulating Microsoft Word/Excel documents via Win32::OLE.

Did you know you can use Explorer as an FTP client on Windows? It's fairly
handy.

> Also have a look at line endings, windows-style line ends
> (CRLF) generally don't mix well with Perl on *nix systems

The only problem with these on Unix systems is "bad interpreter" messages
if there is a ^M at the end of the #! line. But these messages come from the
shell, not Perl itself. Perl is able to cope with any kind of line
endings on Unix, so if you run your scripts as "perl myscript", it will
be OK.
 
szr

Ben said:
> A lot of people here use Perl on Windows. One of my main uses for Perl
> is manipulating Microsoft Word/Excel documents via Win32::OLE.
>
> Did you know you can use Explorer as an ftp client on Windows? It's
> fairly handy.
>
> The only problem with these on Unix systems is "bad interpreter"
> messages if there is a ^M at the end of the #! line. But these
> messages come from the shell, not Perl itself. Perl is able to cope
> with any kind of line endings on Unix, so you can run your scripts
> as "perl myscript" it will be OK.

Another little trick I've used before in such situations is, after
uploading, to go to your Linux shell, open the script in 'pico', and then
save the file. When pico saves the file, it will have converted any
CR LF sequences to just LF. I am not sure how it handles UTF8, however.
And I'm sure there are other tools for converting as well.
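
Perl itself can serve as one of those other tools. The following is a minimal sketch of the conversion; the helper name crlf_to_lf is hypothetical, and the sample string stands in for the contents of an uploaded script. It touches only CR LF pairs, so UTF-8 text passes through unchanged, because the bytes 0x0D and 0x0A never appear inside a multi-byte UTF-8 sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: convert Windows (CR LF) line endings to Unix (LF).
sub crlf_to_lf {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# Stand-in for a script uploaded with Windows line endings.
my $windows = "#!/usr/bin/perl\r\nprint 1;\r\n";
my $unix    = crlf_to_lf($windows);

print $unix =~ /\r/ ? "CR still present\n" : "no CR left\n";
```

On the server, the same substitution is often run as a one-liner, for example perl -pi -e 's/\r\n/\n/' myscript, which edits the file in place.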
 
John

szr said:
> Another little trick I've used before in such situations is, after
> uploading, go to your Linux shell and open the script in 'pico', then save
> the file. When pico saves the file, it will have converted any CR LF
> sequences to just LF. I am not sure how it handles UTF8, however. And I'm
> sure there are other tools for converting as well.


Good news. It is all working fine now. I tested it with French, German,
Persian, Chinese and Greek.
I had some problems with DBI and MySQL with Perl and UTF8 but I managed to
get that to work too.
Many thanks for all the input.
Regards
John
 
John

> > Just out of curiosity:
> >
> > What was now the root cause?
> >
> > bye
> >
> > N
>
> Was wondering that myself
>
> Bill H


Hi

Here's the solution.

On my Windows machine I use SciTE as my editor for Perl. I set the ENCODING
to UTF8 and I can see all the correct characters, whether German, Chinese,
etc. However, I save with ENCODING set to 8-bit. I upload with WS_FTP Pro.
WS_FTP is set to auto mode so that any .PL file is transferred as ASCII. On the
Linux server, I use vsftpd. In the conf file I set both ascii_upload_enable
and ascii_download_enable to YES. The Unicode characters are displayed
correctly.

There is probably a more elegant solution.

Hope that helps.
John
 
