Roundup Tracker - Issues

Message6749

Author rouilj
Recipients rouilj, schlatterbeck
Date 2019-10-16.23:53:42
Message-id <20191016235337.249034C0273@itserver6.cs.umb.edu>
In-reply-to <20191015131334.3h4qoopfsqiphydy@runtux.com>
Hi Ralf:

In message <20191015131334.3h4qoopfsqiphydy@runtux.com>,
Ralf Schlatterbeck writes:
>On Tue, Oct 15, 2019 at 02:22:21AM +0000, John Rouillard wrote:
>> Files however are not the same. We support creation of the metadata
>> using json, but I am not sure how we can post the metadata and then
>> set the content separately.

Answer: we can't. Content is a required property 8-).

>> Would we use a multipart/form-data posted to /rest/data/file to
>> include the metadata and content?
>
>I've successfully written files (metadata and content property) using
>the method from the request library. The idea is instead of specifying
>json = dictionary we specify data = dictionary.
>
>        d = dict (name = filename, content = content, type = content_type)
>        j = self.post ('file', data = d)
>
>The self.post method sets up the necessary headers and prefixes the
>given path.
>
>I *think* this by default sends the contents as
>application/x-www-form-urlencoded

I couldn't make curl do this.

>which is sub-optimal for large files.

Agreed. The encoding would make the file larger.
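
To put a rough number on that overhead, here is a minimal sketch using only the Python standard library. It percent-encodes a payload the way an application/x-www-form-urlencoded body would and compares the sizes. The field name `content` matches the file property; the payload itself is made up for illustration.

```python
from urllib.parse import urlencode

# Illustrative payload: every byte value once, standing in for binary file data.
raw = bytes(range(256))

# urlencode() percent-encodes bytes values, as a form-urlencoded POST body would.
body = urlencode({"content": raw})

ratio = len(body) / len(raw)
print(f"raw: {len(raw)} bytes, urlencoded: {len(body)} bytes, ratio: {ratio:.2f}")
```

Most byte values outside the small unreserved set become three characters (`%XX`), so binary data grows much more than ordinary text would.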

>You can force the requests library to use multipart/form-data by
>specifying both, files= *and* data= parameters, e.g.,
>
># A binary string that can't be decoded as unicode
>content = open ('random-junk', 'rb').read ()
>fname   = 'a-bigger-testfile'
>d = dict \
>    ( name = fname
>    , type='application/octet-stream'
>    )
>c = dict (content = content)
>r = session.post (url + 'file', files = c, data = d)
>print (r.json ())
>
>This produces something like
>
>POST /path/to/tracker/.../file HTTP/1.1
>Host: bee:8080
>Connection: keep-alive
>Accept-Encoding: gzip, deflate
>Accept: */*
>User-Agent: python-requests/2.12.4
>Content-Length: 2405
>Content-Type: multipart/form-data; boundary=788e954792774a6cbe747ba2ca2a276a
>Authorization: Basic <censored>
>
>--788e954792774a6cbe747ba2ca2a276a
>Content-Disposition: form-data; name="type"
>
>application/octet-stream
>--788e954792774a6cbe747ba2ca2a276a
>Content-Disposition: form-data; name="name"
>
>a-bigger-testfile
>--788e954792774a6cbe747ba2ca2a276a
>Content-Disposition: form-data; name="content"; filename="content"
>
>i.S...Em..3/].T...e1ag.G..?N.b.%..P`M..#a...r.S......}>..d.>7.3a...n.."..`
>.P.[.aQc..Rg.....q...s1z.9........%..]..|..1.|...M..p.GC....=..BV.L.5..
>+.F.!..H...gI..cdg?.........k...t..A..........}`...J.....
>....Y.....>....{..E..
>%.E...:a.o.F.......o...../..).>o..qmm.U7..BT..
>--788e954792774a6cbe747ba2ca2a276a--
>
>And, yes, this works for creating files :-)

Cool. I think this curl command does the same using multipart/form-data:

   curl -u demo:demo -s  -X POST -H "Referer: https://.../demo/" \
       -H "X-requested-with: rest" \
       -F "name=afile" -F "status=1" -F "type=image/vnd.microsoft.icon" \
      -F  "content=@doc/roundup-favicon.ico"  \
       https://.../demo/rest/data/file

which returns:

  {
    "data": {
        "id": "11",
        "type": "file",
        "link": "https://.../demo/rest/data/file/11",
        "attributes": {
            "acl": null,
            "content": {
                "link": "https://.../demo/file11/"
            },
            "name": "afile",
            "status": {
                "id": "1",
                "link": "https://.../demo/rest/data/filestatus/1"
            },
            "type": "image/vnd.microsoft.icon"
        },
        "@etag": "\"74276f75ef71a30a0cce62dc6a8aa1bb\""
    }
  }

but I can't actually use https://.../demo/file11/ to get the contents
of the file.

That returns the full file11 page including page.html and the form to
change the file's metadata. To get the file contents, I need to use:

   https://.../demo/file11/afile

Should we change that response? Currently, you need to get
demo/rest/data/file/11, pull the name, and append it to the content link.
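
As a stopgap on the client side, that workaround could look roughly like the sketch below. It takes the already-parsed JSON for the file item and joins the `name` attribute onto the `content` link. The helper name `content_download_url` is hypothetical, and the sample dictionary only mirrors the relevant parts of the response shown above, with the elided host left as-is.

```python
from urllib.parse import quote


def content_download_url(item):
    """Build the raw-download URL from a parsed file-item response.

    Hypothetical helper: appends the file's name to its content link.
    """
    attrs = item["data"]["attributes"]
    link = attrs["content"]["link"]
    # Percent-encode the name so spaces or slashes cannot break the path.
    return link.rstrip("/") + "/" + quote(attrs["name"], safe="")


# Trimmed copy of the response above (host elided in the original).
sample = {
    "data": {
        "id": "11",
        "attributes": {
            "content": {"link": "https://.../demo/file11/"},
            "name": "afile",
        },
    }
}

print(content_download_url(sample))  # https://.../demo/file11/afile
```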

Also if I get demo/rest/data/file/11/content, I see:

      "data": "file11 is not text, retrieve using binary_content property. mdsum: bd990c0f8833dd991daf610b81b62316",

but using demo/rest/data/file/11/binary_content I get:

  "data": "b'\\x00\\x00\\x01\\x00\\x01\\x00\\x10\\x10\\x00\\x00\\x00\\x00\\x00\\x00h\\x05\\x00\\x00\\x16\\x00\\x00\\x00(\\x00\\x1c\\x1c\\x1c\\x00ttt ...

etc. It is encapsulated in the JSON data wrapper. I assume that is
some encoded form of the actual binary data? Is that encoded form
decodable from JavaScript?
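
For reference, the value looks like the Python repr of a bytes object (`b'...'`) stored as a JSON string, rather than a standard encoding such as base64. Assuming that is what it is, a Python client can recover the bytes with `ast.literal_eval`, as sketched below. JavaScript has no built-in parser for Python literals, so a browser client would need custom code. The sample value is only the first eight bytes from the output above.

```python
import ast
import json

# JSON text as it would arrive on the wire (truncated sample, first 8 bytes only).
wire = r'''{"data": "b'\\x00\\x00\\x01\\x00\\x01\\x00\\x10\\x10'"}'''

text = json.loads(wire)["data"]   # the str "b'\x00\x00\x01...'" (repr form)
raw = ast.literal_eval(text)      # parse the repr back into a bytes object

print(len(text), "characters in the JSON string carry", len(raw), "bytes")
```

`ast.literal_eval` only evaluates literals, so it is safe to use on untrusted input in a way that `eval` is not.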

Given how this bloats the file size, I wonder if we should provide a
way via the rest interface to download just the content data in raw
form.

I think the right way to do this is to make the request to
demo/rest/data/file/11/content but set the header:

 Accept:  image/vnd.microsoft.icon

If the Accept type matches the file's type, respond with a binary data
stream with the appropriate Content-Type (either the same as the Accept
type or application/octet-stream) and Content-Length. If it doesn't
match, we return 406 (Not Acceptable).
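
A minimal sketch of that rule, to make the proposal concrete. This is not existing Roundup code: the function name `negotiate_raw_content` and the tuple it returns are made up for illustration, and wildcards (`*/*`) and q-values in the Accept header are deliberately ignored.

```python
def negotiate_raw_content(accept_header, file_type, content):
    """Return (status, headers, body) for a raw-content request.

    Hypothetical helper illustrating the proposal; not part of Roundup.
    """
    # Split "a/b;q=0.8, c/d" into bare media types, dropping parameters.
    accepted = {
        part.split(";", 1)[0].strip().lower()
        for part in accept_header.split(",")
        if part.strip()
    }
    if file_type.lower() in accepted:
        headers = {
            "Content-Type": file_type,
            "Content-Length": str(len(content)),
        }
        return 200, headers, content
    # The requested type does not match the stored file type.
    return 406, {}, b""


status, headers, _ = negotiate_raw_content(
    "image/vnd.microsoft.icon", "image/vnd.microsoft.icon", b"\x00\x00\x01\x00"
)
print(status, headers)

mismatch = negotiate_raw_content("text/plain", "image/vnd.microsoft.icon", b"\x00")
print(mismatch[0])  # 406
```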

Thoughts?
History
Date                 User    Action  Args
2019-10-16 23:53:43  rouilj  set     recipients: + rouilj, schlatterbeck
2019-10-16 23:53:43  rouilj  link    issue2551067 messages
2019-10-16 23:53:42  rouilj  create