Wrong filename of pages with umlauts when using the REST-API

Comments

  • Jens Rutschmann

    Hi Sven,

    which client did you use to download the PDF file: a regular browser or a tool like wget?

    The exporter has some logic that determines the output filename based on the User-Agent string the client sends. In your case it seems to fall back to the safe replacement name because it assumes the client can't handle umlauts in the Content-Disposition header.

     

    Cheers,
    Jens

  • Sven Leichsenring

    Hi Jens,

    you are right. I am using wget.

     

    I then tried to fake the User-Agent string with the option

    --user-agent="xxx"

    where xxx is a common User-Agent string of Firefox (Windows) or of the cURL library, but neither worked.

     

    BTW: The API-URL works fine with standard browsers like Firefox, Chrome and so on.

     

    Thanks in advance,

    Sven

  • Sven Leichsenring

    I also tried an empty User-Agent string, so that wget doesn't send the User-Agent header in its request, but umlauts still did not work.

  • Jens Rutschmann

    Hi Sven,

    I double-checked curl and wget, and it seems they don't support filename parameters as defined in RFC 6266 / RFC 5987.

    We have a fallback in place for old Internet Explorer, Firefox and Safari versions, but everything else gets response headers like this:

         Content-Disposition: attachment;
                              filename="EURO rates";
                              filename*=utf-8''%e2%82%ac%20rates
    

    The "filename" value contains the ASCII-only filename for legacy tools; this is the name you are getting: scroll-export-...pdf
    The "filename*" value, in contrast, contains the full name (with umlauts in your case), encoded according to RFC 5987.

    Unfortunately wget and curl don't seem to support this standard. I also tried returning only the UTF-8 filename, but that made it even worse.

    So until wget or curl support this, I fear there is no way to make it work.

    For curl I found a patch here: http://curl.haxx.se/mail/archive-2012-10/0039.html
    So far it doesn't seem to have made it into any official curl release.

     

    Hope that explains it,
    Jens
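
The header layout Jens describes can be sketched in a few lines of Python. This is only an illustration of the RFC 6266 / RFC 5987 format, not the exporter's actual code; the function name `content_disposition` and its parameters are hypothetical, and the ASCII fallback is assumed to contain no quotes or backslashes.

```python
from urllib.parse import quote


def content_disposition(filename: str, ascii_fallback: str) -> str:
    """Build an attachment header value carrying both filename forms.

    ascii_fallback is assumed to be plain ASCII without '"' or '\\',
    so it can be placed in a quoted-string unescaped.
    """
    # RFC 5987 ext-value: charset, an (empty) language tag between the
    # two single quotes, then the percent-encoded UTF-8 bytes of the name.
    encoded = quote(filename, safe="")
    return (
        f'attachment; filename="{ascii_fallback}"; '
        f"filename*=UTF-8''{encoded}"
    )


print(content_disposition("€ rates", "EURO rates"))
```

Clients that understand `filename*` use the full name; clients that do not, such as wget and curl in this thread, only see the ASCII `filename` value.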

  • Sven Leichsenring

    Hi Jens,

    what a pity. :-(

    So there is no solution at this time.

    Applying the curl patch exceeds my knowledge, and even if I were able to do it, I'm not sure I would want to.

    How about a URL parameter like "fallback=false" for the REST API?

    I tried httrack but ran into even more problems (perhaps because of my lack of experience).

     

    Cheers,

    Sven
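
One possible client-side workaround, if a small script is acceptable in place of wget, is to read the Content-Disposition header yourself and prefer the `filename*` value. The following is a minimal sketch, not a full RFC 6266 parser; the helper name `filename_from_header` is hypothetical, and fetching the response and saving the file are left out.

```python
import re
from typing import Optional
from urllib.parse import unquote


def filename_from_header(value: str) -> Optional[str]:
    """Pick a filename from a Content-Disposition header value."""
    # Prefer the RFC 5987 form: filename*=charset'language'percent-encoded
    ext = re.search(
        r"filename\*\s*=\s*([^';\s]+)'[^';]*'([^;\s]+)", value, re.IGNORECASE
    )
    if ext:
        charset, encoded = ext.group(1), ext.group(2)
        return unquote(encoded, encoding=charset)
    # Otherwise fall back to the plain ASCII form, quoted or unquoted.
    quoted = re.search(r'filename\s*=\s*"([^"]*)"', value, re.IGNORECASE)
    if quoted:
        return quoted.group(1)
    bare = re.search(r"filename\s*=\s*([^;\s]+)", value, re.IGNORECASE)
    return bare.group(1) if bare else None


header = "attachment; filename=\"EURO rates\"; filename*=utf-8''%e2%82%ac%20rates"
print(filename_from_header(header))
```

The header value would come from the HTTP response of the export URL, for example via `urllib.request.urlopen(...).headers`.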

     

