Message5140
John Rouillard wrote:
> Are you in a position to try out roundup with asian
> languages without using the CJKCodecs?
Well, we can try it right now. :-)
このメッセージは日本語で書かれました。
You should be able to paste that into Google Translate or
similar and ask for Japanese -> English and get something
sensible back. (I'm assuming Roundup's own tracker doesn't
have the cjkcodecs installed. If that's wrong, I also
submitted a similar issue on a newly installed 1.5.0 tracker,
with no ill effects there either.)
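For anyone who wants to check this outside of a tracker, here
is a minimal sketch (my own, assuming a stock Python 3 with
no extra packages installed; the helper name round_trips is
made up for illustration). It round-trips the sentence above
through a few of the Japanese codecs bundled with Python:

```python
# The Japanese test sentence from this message.
text = "このメッセージは日本語で書かれました。"

# A few Japanese encodings that ship with CPython's standard library.
CODECS = ("utf-8", "shift_jis", "euc_jp", "iso2022_jp", "cp932")


def round_trips(s, encoding):
    """Return True if s survives an encode/decode cycle unchanged."""
    return s.encode(encoding).decode(encoding) == s


for name in CODECS:
    print(name, round_trips(text, name))
```

If every line prints True, the standard library alone handles
this text; no external cjkcodecs package is involved.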
Additionally, if you download the cjkcodecs package from
Sourceforge (http://sourceforge.net/projects/cjkpython.berlios/)
you'll see that its author is Hye-Shik Chang <perky@FreeBSD.org>
and that it was last updated in 2004.
The Python revision where the cjkcodecs were added to
cpython is:
http://hg.python.org/cpython/rev/69dadc2ca14d
also in 2004 and by Hye-Shik Chang <hyeshik@gmail.com>
(That is, the cjkcodecs added to Python *are* the ones
from the cjkcodecs package.)
I compared the codecs in the cjkcodecs package with those
listed in the Python codecs module docs. For the Python
list I copy-pasted the table from the docs, removed all rows
that weren't CJK encodings, and deleted all but the first
column; for the package list I took the names of the *.py
files in the package. The result (first column: codecs in
the cjkcodecs package but not in Python; second column:
codecs in both):
cjkcodecs only    both
--------------    ---------------
                  big5
                  big5hkscs
                  cp932
                  cp949
                  cp950
                  euc_jis_2004
                  euc_jisx0213
                  euc_jp
                  euc_kr
euc_tw
                  gb18030
                  gb2312
                  gbk
                  hz
iso2022_cn
                  iso2022_jp
                  iso2022_jp_1
                  iso2022_jp_2
                  iso2022_jp_2004
                  iso2022_jp_3
                  iso2022_jp_ext
                  iso2022_kr
                  johab
                  shift_jis
                  shift_jis_2004
                  shift_jisx0213
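The same comparison can be sketched programmatically instead
of by copy-pasting the docs table. This is only an
illustration, assuming Python 3; PACKAGE_CODECS is the list
above typed in by hand, and split_by_availability is a
made-up helper name. It asks the running interpreter which of
the names it can resolve:

```python
import codecs

# Codec names taken from the *.py files in the cjkcodecs package (the list above).
PACKAGE_CODECS = [
    "big5", "big5hkscs", "cp932", "cp949", "cp950",
    "euc_jis_2004", "euc_jisx0213", "euc_jp", "euc_kr", "euc_tw",
    "gb18030", "gb2312", "gbk", "hz",
    "iso2022_cn", "iso2022_jp", "iso2022_jp_1", "iso2022_jp_2",
    "iso2022_jp_2004", "iso2022_jp_3", "iso2022_jp_ext", "iso2022_kr",
    "johab", "shift_jis", "shift_jis_2004", "shift_jisx0213",
]


def split_by_availability(names):
    """Split codec names into (available, missing) for this interpreter."""
    available, missing = [], []
    for name in names:
        try:
            codecs.lookup(name)  # raises LookupError for unknown encodings
        except LookupError:
            missing.append(name)
        else:
            available.append(name)
    return available, missing


available, missing = split_by_availability(PACKAGE_CODECS)
print("in both:", available)
print("cjkcodecs only:", missing)
```

On a standard CPython build I would expect the "cjkcodecs
only" list to contain just euc_tw and iso2022_cn, matching
the table, though a stripped-down build could differ.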
I do not know why euc-tw and iso2022-cn were left out of
Python. However, Wikipedia
(http://en.wikipedia.org/wiki/Extended_Unix_Code)
says,
"It [euc-tw] is a rarely used encoding for traditional
Chinese characters as used on Taiwan. Big5 is much
more common."
As for iso2022-cn, other projects had problems with it and
decided to do without it, e.g.:
https://bugzilla.mozilla.org/show_bug.cgi?id=470523
Given the many other, more popular encodings for Chinese,
the lack of those two would not seem to present a serious
barrier to communication.
So I see no reason why Roundup should continue to recommend
the cjkcodecs package.

Date | User | Action | Args
2014-09-13 01:56:39 | smcgraw | set | messageid: <1410573399.54.0.0690584272296.issue2550851@psf.upfronthosting.co.za>
2014-09-13 01:56:39 | smcgraw | set | recipients: + smcgraw, rouilj
2014-09-13 01:56:39 | smcgraw | link | issue2550851 messages
2014-09-13 01:56:38 | smcgraw | create |