Roundup Tracker - Issues

Issue 2550851

classification
cjk codecs section in docs obsolete
Type: behavior Severity: minor
Components: Documentation Versions: 1.5
process
Status: closed Resolution: fixed
Assigned To: rouilj Nosy List: ber, r.david.murray, rouilj, smcgraw
Priority:

Created on 2014-09-12 22:07 by smcgraw, last changed 2016-06-18 01:37 by rouilj.

Messages
msg5138 Author: [hidden] (smcgraw) Date: 2014-09-12 22:07
The Roundup install docs have a section that reads:

  Additional Language Codecs
  If you intend to send messages to Roundup that use Chinese, 
  Japanese or Korean encodings the[sic] you'll need to obtain
  CJKCodecs from http://cjkpython.berlios.de/

Isn't this obsolete?  AFAIK, these days (and for a long time
now, certainly for any Python version Roundup supports) Python 
comes with all common CJK codecs included.  Although the package
is still available on Sourceforge, the link given is a 404 and 
a quick grep for cjkcodecs does not find anything in the Roundup 
source code.
msg5139 Author: [hidden] (rouilj) Date: 2014-09-12 22:42
Hi Mr. McGraw:

In message <1410559674.92.0.12192048544.issue2550851@psf.upfronthosting.co.za>,
Stuart McGraw writes:
>
>The Roundup install docs have a section that reads:
>
>  Additional Language Codecs
>  If you intend to send messages to Roundup that use Chinese, 
>  Japanese or Korean encodings the[sic] you'll need to obtain
>  CJKCodecs from http://cjkpython.berlios.de/
>
>Isn't this obsolete?  AFAIK, these days (and for a long time
>now, certainly for any Python version Roundup supports) Python 
>comes with all common CJK codecs included.

Are you in a position to try out Roundup with Asian languages without
using the CJKCodecs?

AFAIK currently none of the developers use any of those languages so
while it may be obsolete, I don't think anybody can tell 8-).
msg5140 Author: [hidden] (smcgraw) Date: 2014-09-13 01:56
John Rouillard wrote:
> Are you in a position to try out Roundup with Asian
> languages without using the CJKCodecs?

Well, we can try it right now. :-)

  このメッセージは日本語で書かれました。

You should be able to paste that into Google Translate or
similar and ask for Japanese -> English and get something 
sensible back.  (I'm assuming Roundup's tracker didn't
install the cjkcodecs.  If that's wrong I submitted a similar
issue on a newly installed 1.5.0 tracker with no ill effects
there either.)

Additionally, if you download the cjkcodecs package from
Sourceforge (http://sourceforge.net/projects/cjkpython.berlios/)
you'll see its author is Hye-Shik Chang <perky@FreeBSD.org>
and last update was 2004.

The Python revision where the cjkcodecs were added to 
cpython is:
  http://hg.python.org/cpython/rev/69dadc2ca14d
also in 2004 and by Hye-Shik Chang <hyeshik@gmail.com>
(That is, the cjkcodecs added to Python *are* the ones
from the cjkcodecs package.)

If you compare the codecs in the cjkcodecs package with 
those listed in the Python codecs module docs, you get the
table below.  (For the Python list I copy-pasted the docs
table, removed all rows that weren't CJK encodings, and
deleted all but the first column; for the cjkcodecs list I
took the names of the *.py files in the package.)  The first
column shows codecs in the cjkcodecs package but not in
Python; the second column shows codecs in both:

		big5
		big5hkscs
		cp932
		cp949
		cp950
		euc_jis_2004
		euc_jisx0213
		euc_jp
		euc_kr
euc_tw
		gb18030
		gb2312
		gbk
		hz
iso2022_cn
		iso2022_jp
		iso2022_jp_1
		iso2022_jp_2
		iso2022_jp_2004
		iso2022_jp_3
		iso2022_jp_ext
		iso2022_kr
		johab
		shift_jis
		shift_jis_2004
		shift_jisx0213
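
The comparison above can also be checked against whatever interpreter is at hand. The following is only a minimal sketch, assuming Python 3; the helper name `partition_codecs` is made up for illustration, and the name list is copied from the table. It asks `codecs.lookup` for each name and sorts the names into found and not found.

```python
import codecs

# Codec names copied from the table above (the cjkcodecs package's *.py files).
CJK_CODEC_NAMES = [
    "big5", "big5hkscs", "cp932", "cp949", "cp950",
    "euc_jis_2004", "euc_jisx0213", "euc_jp", "euc_kr", "euc_tw",
    "gb18030", "gb2312", "gbk", "hz", "iso2022_cn",
    "iso2022_jp", "iso2022_jp_1", "iso2022_jp_2", "iso2022_jp_2004",
    "iso2022_jp_3", "iso2022_jp_ext", "iso2022_kr", "johab",
    "shift_jis", "shift_jis_2004", "shift_jisx0213",
]


def partition_codecs(names):
    """Split names into (available, missing) for the running interpreter."""
    available, missing = [], []
    for name in names:
        try:
            codecs.lookup(name)  # raises LookupError for unknown encodings
        except LookupError:
            missing.append(name)
        else:
            available.append(name)
    return available, missing


if __name__ == "__main__":
    available, missing = partition_codecs(CJK_CODEC_NAMES)
    print("bundled with Python:", ", ".join(available))
    print("not found:", ", ".join(missing))
```

The result depends on the Python version it is run under, so it is a spot check rather than a statement about every release.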

I do not know why euc-tw and iso2022-cn were left out of 
Python.  However, Wikipedia 
  (http://en.wikipedia.org/wiki/Extended_Unix_Code)
says,
 "It [euc-tw] is a rarely used encoding for traditional 
  Chinese characters as used on Taiwan. Big5 is much
  more common."

As for iso2022-cn, other projects had problems with it and 
decided to do without it, e.g.:
  https://bugzilla.mozilla.org/show_bug.cgi?id=470523

Given the many other, more popular encodings for Chinese, 
the lack of those two would not seem to present a serious 
barrier to communication.

So I see no reason why Roundup should continue to recommend 
the cjkcodecs package.
msg5141 Author: [hidden] (smcgraw) Date: 2014-09-13 02:24
It just occurred to me that my "test" in the previous message was
pretty meaningless given that any encoding/decoding done on it 
was utf-8.
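
A test that does exercise the CJK codecs would have to go through a non-UTF-8 encoding. The sketch below assumes Python 3, and the choice of `shift_jis`, `euc_jp` and `iso2022_jp` is mine. It round-trips the sentence from the previous message through the interpreter's bundled codecs. It only tests Python itself, not any Roundup code path.

```python
TEXT = "このメッセージは日本語で書かれました。"


def round_trip(text, encoding):
    """Encode text to bytes in `encoding` and decode it back."""
    return text.encode(encoding).decode(encoding)


if __name__ == "__main__":
    for encoding in ("shift_jis", "euc_jp", "iso2022_jp"):
        encoded = TEXT.encode(encoding)
        # The bytes differ from the UTF-8 form, so a CJK codec did the work.
        assert encoded != TEXT.encode("utf-8")
        assert round_trip(TEXT, encoding) == TEXT
        print(encoding, len(encoded), "bytes: round trip OK")
```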

AFAICT there is no import or other mention of "cjkcodecs" anywhere
in the roundup code.  So does that mean that any "need" for it 
would be in user-written extensions?
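
For anyone who wants to repeat the search without grep, here is a rough sketch, assuming Python 3. The function name `find_mentions` and the path in the usage note are placeholders, not anything from Roundup.

```python
from pathlib import Path


def find_mentions(root, needle="cjkcodecs"):
    """Return (path, line number) pairs for lines in *.py files under root
    that contain needle, compared case-insensitively."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle.lower() in line.lower():
                hits.append((path, lineno))
    return hits
```

Calling `find_mentions("path/to/roundup-checkout")` on a source tree should return an empty list if the package really is never referenced. This sketch only looks at `*.py` files, so docs and templates would need a wider pattern.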

If that's the case, then my points about the equivalence of the
old cjkcodecs package and what comes with Python hold: anyone
writing code that needs CJK codecs should be using the ones in 
Python and not in the (10-year-old) cjkcodecs package.
msg5143 Author: [hidden] (ber) Date: 2014-09-15 09:28
Stuart,
thanks for reporting the issue.

Looks like you are right, but confirmation would be nice.

Maybe we should ask on the mailing list. 
I think there was a Chinese user. :)
Otherwise I would try getting some characters and converting them to the 
encoding. My problem is that I don't know precisely where the encoding
would affect Roundup, so I can't test that myself.

best,
Bernhard
msg5598 Author: [hidden] (rouilj) Date: 2016-06-12 00:56
Sent the following to roundup-users:

===========
In the roundup installation docs, there is a section that reads:

=============
Additional Language Codecs
--------------------------

If you intend to send messages to Roundup that use Chinese, Japanese or
Korean encodings the you'll need to obtain CJKCodecs from
http://cjkpython.berlios.de/
=============

The issue: http://issues.roundup-tracker.org/issue2550851
claims these directions are obsolete.

Is anybody using Roundup with Asian languages who has *not* had to
explicitly install the CJKCodecs?

I took an example piece of Japanese text and was able to paste it into
a demo.py tracker from the current development version. I saw the
Japanese characters properly displayed. I was able to cut/paste them
from the demo tracker into Google Translate and get a valid translation
to English.

I claim this is a good enough test that we can remove the directions
from the install docs.
===========

Hopefully somebody will say they didn't need to install the codecs, or
my test is sufficient and we can put this to bed.
msg5602 Author: [hidden] (r.david.murray) Date: 2016-06-14 16:19
The CJK codecs have been shipped with python for years now.  I don't
think roundup supports a version of python where they are not included,
but it is easy enough to check: in a checkout for the oldest version of
python roundup supports, look for Lib/test/cjkencodings.  If that
directory exists, the encodings are shipped with that python version.
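
The check described above is a one-liner to script. This is a sketch only, assuming Python 3; `has_cjk_test_data` is a made-up name, and the checkout path is whatever local CPython source tree is being inspected.

```python
from pathlib import Path


def has_cjk_test_data(checkout):
    """True if the given CPython checkout contains Lib/test/cjkencodings."""
    return (Path(checkout) / "Lib" / "test" / "cjkencodings").is_dir()
```

It only reports whether that directory exists; it says nothing about which individual encodings are present.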
msg5603 Author: [hidden] (rouilj) Date: 2016-06-18 01:37
I just checked a 2.6 install on centos 6 (which is the lower bound for
roundup-1.6). It has the encodings as part of python-lib and I see big5
and the other encodings for Chinese characters mentioned above.
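
A similar look at an installed interpreter can be done from Python itself. The sketch below assumes Python 3.6 or later (for the `name` attribute on `pkgutil` results), and the list of Chinese encodings it prints is taken from the table earlier in this issue. It lists the modules in the standard `encodings` package.

```python
import encodings
import pkgutil


def bundled_encoding_modules():
    """Names of the modules shipped in the interpreter's `encodings` package."""
    return sorted(m.name for m in pkgutil.iter_modules(encodings.__path__))


if __name__ == "__main__":
    names = bundled_encoding_modules()
    for wanted in ("big5", "big5hkscs", "cp950", "gb2312", "gbk", "gb18030", "hz"):
        print(wanted, "present" if wanted in names else "absent")
```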

So I am going to remove this section and note it in the change log,
referring to this issue so people can find it if they want.


checkin: ac143db86fcc
History
Date                 User            Action  Args
2016-06-18 01:37:19  rouilj          set     status: new -> closed; assignee: rouilj; resolution: fixed; messages: + msg5603; type: behavior
2016-06-14 16:19:36  r.david.murray  set     nosy: + r.david.murray; messages: + msg5602
2016-06-12 00:56:52  rouilj          set     messages: + msg5598
2014-09-15 09:28:24  ber             set     nosy: + ber; messages: + msg5143
2014-09-13 02:24:09  smcgraw         set     messages: + msg5141
2014-09-13 01:56:39  smcgraw         set     messages: + msg5140
2014-09-12 22:42:06  rouilj          set     nosy: + rouilj; messages: + msg5139
2014-09-12 22:07:54  smcgraw         create