Opened 8 years ago

Closed 5 years ago

#6629 closed enhancement (wontfix)

/vsizip: troubles with cyrillic filenames, part #2

Reported by: oleinik Owned by: warmerdam
Priority: normal Milestone: closed_because_of_github_migration
Component: GDAL_Raster Version: svn-trunk
Severity: normal Keywords:
Cc:

Description

As a continuation of #5361:
The first part of the dataset name (the zip archive name) follows the GDAL_FILENAME_IS_UTF8 rules, so on Windows it can be either ANSI or UTF-8, but the second part (the actual dataset name inside the archive) is always UTF-8. This is very inconvenient.
The main program needs the full dataset name in a single encoding for on-screen display, sorting, and so on. This is only possible if everything in the main program is ANSI-encoded, with names converted to UTF-8 and back around each call to GDAL.
What, then, is the point of GDAL_FILENAME_IS_UTF8 in this case?
I propose adding GDAL_FILENAME_IS_UTF8 support to /vsizip and the other VSIFileManager handlers.

Change History (3)

comment:1 by Even Rouault, 8 years ago

GDAL_FILENAME_IS_UTF8=NO is aimed at handling the case where a user uses the GDAL API on Windows with non-UTF-8 filenames, which is not recommended. In that case, GDAL uses the Win32 ANSI API instead of the Unicode one. It seems you would want a similar hack extended to the filename within a zip file? I don't think we want to add another encoding hack to GDAL. Recoding from/to UTF-8 can be handled by the application if it needs it. Most GUI frameworks now speak UTF-8 natively, even on Windows.
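
As an illustration of the application-side recoding mentioned above, here is a minimal, self-contained sketch that converts a Windows-1251 (Cyrillic ANSI) string to UTF-8. The function names `append_utf8` and `cp1251_to_utf8` are hypothetical and not part of GDAL, and the mapping is deliberately partial: only ASCII, the letters А..я and Ё/ё are handled. A real application would more likely rely on a complete conversion routine such as GDAL's `CPLRecode()` or the Win32 `MultiByteToWideChar`/`WideCharToMultiByte` pair.

```cpp
#include <string>

// Append one Unicode code point (BMP only, which covers Cyrillic) as UTF-8.
static void append_utf8(std::string& out, unsigned cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: recode a Windows-1251 string to UTF-8.
// Assumption: only ASCII, 0xC0..0xFF (U+0410..U+044F), 0xA8 (U+0401) and
// 0xB8 (U+0451) are mapped; every other byte becomes U+FFFD.
std::string cp1251_to_utf8(const std::string& ansi) {
    std::string out;
    for (unsigned char b : ansi) {
        if (b < 0x80) {
            append_utf8(out, b);
        } else if (b >= 0xC0) {
            append_utf8(out, 0x0410 + (b - 0xC0));
        } else if (b == 0xA8) {
            append_utf8(out, 0x0401);
        } else if (b == 0xB8) {
            append_utf8(out, 0x0451);
        } else {
            append_utf8(out, 0xFFFD);
        }
    }
    return out;
}
```

With a helper like this, the application can keep its own strings in ANSI, leave GDAL_FILENAME_IS_UTF8 at its default, and recode the whole dataset name once just before passing it to GDAL.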

comment:2 by oleinik, 8 years ago

> GDAL_FILENAME_IS_UTF8=NO is aimed at handling the case where a user would use the GDAL API on Windows with non-UTF8 filenames, which is not recommended. In that case, GDAL will use the Win32 ANSI API instead of the Unicode ones.

This is exactly my case. I program on Windows in Delphi 7. It is a non-Unicode framework with no native support for UTF-8 encoding; all I have are transcoding functions.
What I meant is that GDAL_FILENAME_IS_UTF8 should apply to the full dataset name, including the filename within the zip file and the other /vsi handlers. Otherwise it is a useless extension that does not help and only complicates programming.

  1. GDAL_FILENAME_IS_UTF8=YES - the full dataset name is in UTF-8. In this case I do not get the benefits for which GDAL_FILENAME_IS_UTF8 was invented, and I have to use conversion functions.
  2. GDAL_FILENAME_IS_UTF8=NO - the zip file name is ANSI-encoded, while the file within the zip is in UTF-8. I still have to use conversion functions.

Moreover, the dataset description in this case would be in mixed encodings:

      poDS->SetDescription( poOpenInfo->pszFilename );
      poDS->TryLoadXML();

and after reading it back I can't do anything with it, because I don't know which part of the description is in ANSI and which is in UTF-8.
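
To make the mixed-encoding problem concrete, here is a rough sketch of how an application might separate the two parts of a /vsizip name so that each can be recoded on its own. The struct `VsizipParts` and the function `split_vsizip_name` are hypothetical, and the rule used to find the archive boundary (the first ".zip/" after the prefix, matched case-sensitively) is an assumption made for illustration; it is not taken from GDAL's own path parsing.

```cpp
#include <string>

// Hypothetical container for the two parts of a /vsizip dataset name.
struct VsizipParts {
    std::string archive;  // "/vsizip/..." up to and including ".zip"
    std::string inner;    // path of the file inside the archive (may be empty)
};

// Hypothetical helper: split "/vsizip/<archive>.zip/<inner>" into its parts.
// Assumption: the archive ends at the first ".zip/" after the prefix, or at
// a trailing ".zip" when there is no inner path.
bool split_vsizip_name(const std::string& name, VsizipParts& parts) {
    const std::string prefix = "/vsizip/";
    const std::string ext = ".zip";
    if (name.compare(0, prefix.size(), prefix) != 0) {
        return false;  // not a /vsizip name
    }
    const std::string::size_type pos = name.find(ext + "/", prefix.size());
    if (pos == std::string::npos) {
        // No inner path: accept only names that end in ".zip".
        if (name.size() < prefix.size() + ext.size() ||
            name.compare(name.size() - ext.size(), ext.size(), ext) != 0) {
            return false;
        }
        parts.archive = name;
        parts.inner.clear();
        return true;
    }
    parts.archive = name.substr(0, pos + ext.size());
    parts.inner = name.substr(pos + ext.size() + 1);
    return true;
}
```

Under the behaviour described in this ticket, `parts.archive` would be the part governed by GDAL_FILENAME_IS_UTF8 and `parts.inner` the part that is always UTF-8.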

comment:3 by Even Rouault, 5 years ago

Milestone: closed_because_of_github_migration
Resolution: wontfix
Status: new → closed

This ticket has been automatically closed because Trac is no longer used for GDAL bug tracking, since the project has migrated to GitHub. If you believe this ticket is still valid, you may file it to https://github.com/OSGeo/gdal/issues if it is not already reported there.
