Roundup Tracker - Issues

Issue 2551084

classification
Title: Inefficiency in roundup-admin
Type: behavior Severity: normal
Components: Command-line interface Versions: devel
process
Status: new Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: rouilj, tekberg
Priority: Keywords:

Created on 2020-05-14 14:50 by tekberg, last changed 2020-05-19 03:36 by rouilj.

Messages
msg6909 Author: [hidden] (tekberg) Date: 2020-05-14 14:50
I'm using the roundup admin code to create messages, files and issues. Some of the files are rather large. Looking at the roundup/admin.py code at line 104 a split is done to separate a property name from its value. Here are the few lines that matter.

l = arg.split('=')
if len(l) < 2:
    raise UsageError(_('argument "%(arg)s" not propname=value'
        )%locals())
key, value = l[0], '='.join(l[1:])

As you can see, a split is done on '=' and then a length check is done. After that, the variables key and value are assigned. The value part does a join on '=' to reconstitute it. A better approach is to use the maxsplit argument for split:

l = arg.split('=', 1)
if len(l) < 2:
    raise UsageError(_('argument "%(arg)s" not propname=value'
        )%locals())
key, value = l

Here, the arg is split into at most 2 pieces (1 split), the length check is done, and the key and value are assigned.

While this may not be important for small propname/value pairs in roundup-admin, it is important when the value is large, like files. In addition, the code is simpler. If you would like a diff and/or a test case for this, please let me know.
msg6910 Author: [hidden] (tekberg) Date: 2020-05-14 15:12
In addition, line 101 in roundup/admin.py does a check for '=' in the arg. The propname/value split does the same check and throws the same exception. Lines 101-103 can be removed. Note that 'foo'.split('=') => ['foo']. This may be a hold-over from an earlier version of python where split generated an exception when the split arg is not present in the string being split.
msg6911 Author: [hidden] (rouilj) Date: 2020-05-14 18:03
Hi Tom:

In message <1589467839.07.0.917750020348.issue2551084@roundup.psfhosted.org>,
Tom Ekberg writes:
>I'm using the roundup admin code to create messages, files and issues.
>Some of the files are rather large.

I assume the message/file contents include the = character. Am I
correct that this is a performance issue and doesn't cause a bad
message/file to be created?

> ...
>As you can see, a split is done on '=' and then a length check is
>done. After that, the variables key and value are assigned. The value
>part does a join on '=' to reconstitute it. A better approach is to
>use the maxsplit argument for split:
> ...
>While this may not be important for small propname/value pairs in
>roundup-admin, it is important when the value is large, like
>files. In addition, the code is simpler. If you would like a diff
>and/or a test case for this, please let me know.

Yes please diff and test case since you offered.

Also can you share how you are calling roundup-admin to add the
messge/file bodies?

I always thought that it would be good to provide a way to have the
content read from a file. I had issues attaching an image using
roundup-admin a while ago.

Maybe something like content=<<images/filename.jpg>> to read
images/filename.jpg if it exists. If <<images/filename.jpg>> doesn't
exist, the literal string '<<images/filename.jpg>>' would be the
content.

I am ok with line 101-103 deletion.

Have a great day.
--
				-- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.
msg6912 Author: [hidden] (tekberg) Date: 2020-05-14 18:35
This is the class I created to return the ids, instead of writing them to stdout. (There is more text after the class.)

from roundup.admin import AdminTool

class Admin(AdminTool):
    """
    Class to allow one to run roundup-admin using its API.
    Roundup writes results to (IDs on creates) to stdout.
    This version returns ints.
    Currently only create has been implemented. The other roundup-admin
    commands should work but haven't been tested.
    """
    def __init__(self, args):
        super().__init__()
        self.args = args

        # These are set in the AdminTool.main method. Since that isn't called we set them here.
        self.tracker_home = self.args.tracker_home
        self.separator = None
        self.print_designator = 0
        self.verbose = 1 if self.args.debug else 0

        # Override the create command with our version.
        self.commands['create'] = self.do_create
        if self.args.debug:
            print('Exit Admin.__init__')


    def do_create(self, args):
        """
        Same as AdminTool.do_create except this returns the result as an int,
        instead of writing to stdout.
        """
        try:
            # Capture output.
            old_stdout = sys.stdout
            sys.stdout = io.StringIO()
            super().do_create(args)
            sys.stdout.seek(0)
            return int(sys.stdout.readline())
        finally:
            sys.stdout = old_stdout
        return -1

This class gets instantiated the normal way

    tool = Admin(args)

and gets called with

    id = tool.run_command(command)
    ids[command[1]] = id

With the id being stored in a dict. A command is a list of roundup-admin arguments:

 ['create', 'msg', ...
 ['create', 'file', ...
 ['create', 'issue', 'files=%s' % ids['file'], 'messages=%s' % ids['message'], ...

Hopefully that make sense. Now on to your comment. This is purely a performance issue. The file data I'm creating is a PDF file, which may contain the '=' character. I'm still debugging the code. On my last test the acrobat reader didn't like the PDF file format. I'll need to compare what is in the tracker file system with the original PDF file.

To get to this point, especially the create issue command, I realized that the roundup-admin documentation could be beefed up a bit. I eventually determined that 'roundup-admin -i dir specification issue' helped with the files and messages properties of the create issue command.
msg6913 Author: [hidden] (rouilj) Date: 2020-05-16 16:05
Hi Tom:

In message <1589481316.18.0.953671432578.issue2551084@roundup.psfhosted.org>,
Tom Ekberg writes:
>This is the class I created to return the ids, instead of writing
>them to stdout. (There is more text after the class.)
>
>[...]
>This class gets instantiated the normal way
>
>    tool = Admin(args)
>
>and gets called with
>
>    id = tool.run_command(command)
>    ids[command[1]] = id
>
>With the id being stored in a dict. A command is a list of roundup-admin arguments:
>
> ['create', 'msg', ...
> ['create', 'file', ...
> ['create', 'issue', 'files=%s' % ids['file'], 'messages=%s' % ids['message'], ...

Got it. I thought you were using the CLI to create new issues
attachments.

That's an interesting use of the admin module. Are you familiar with
scripts/add-issue that uses the native tracker instance to create
issues. I suspect it performs better than wrapping admin, even with
your needed performance cleanups.

>Hopefully that make sense. Now on to your comment. This is purely a
>performance issue. The file data I'm creating is a PDF file, which
>may contain the '=' character. I'm still debugging the code. On my
>last test the acrobat reader didn't like the PDF file format. I'll
>need to compare what is in the tracker file system with the original
>PDF file.

I would be interested in adding your class/command when you get it
working as I think it's an easier path for occasional use by a
non-programmer.

>To get to this point, especially the create issue command, I realized
>that the roundup-admin documentation could be beefed up a bit. I
>eventually determined that 'roundup-admin -i dir specification issue'
>helped with the files and messages properties of the create issue
>command.

There is a man page available that mentions the specification command
(I fixed a formatting glitch in it just this morning). However that is
not linked from anywhere in the official doc tree. I'll have to see
how I can do that.

I think having additional documentation of roundup-admin in
doc/admin_guide.txt would be good. In addition to a test case and
patch for the admin.py module, would you be able to write up a section
for the admin_guide on using roundup-admin. Maybe include a few use
cases that have been most useful to you?

Have a great weekend.
msg6915 Author: [hidden] (rouilj) Date: 2020-05-19 03:36
Tom, just a heads up. I added some code to test_admin.py including a new 
method for capturing stdout/stderr.

It may help you when adding a test for create.

-- rouilj
History
Date User Action Args
2020-05-19 03:36:07rouiljsetmessages: + msg6915
2020-05-16 16:05:33rouiljsetmessages: + msg6913
2020-05-14 18:35:16tekbergsetmessages: + msg6912
2020-05-14 18:03:28rouiljsetmessages: + msg6911
2020-05-14 15:12:26tekbergsetmessages: + msg6910
2020-05-14 14:50:39tekbergcreate