Opened 4 years ago

Closed 12 months ago

#5626 closed defect (fixed)

DXF Chinese Encoding error

Reported by: liminlu0314
Owned by: warmerdam
Priority: normal
Milestone:
Component: OGR_SF
Version: 1.11.0
Severity: normal
Keywords: DXF ANSI936
Cc:

Description

When OGR opens the attached DXF file, the Chinese text appears garbled. With --config DXF_ENCODING "UTF-8" it displays normally. The DXF file itself is encoded in ANSI_936. The attachment is the test data.

Command line: ogrinfo -ro -al "test-text1.dxf" -geom=NO

Output:

INFO: Open of `F:\RsSrcDir\test-text1.dxf'
      using driver `DXF' successful.

Layer name: entities
Geometry: Unknown (any)
Feature Count: 1
Extent: (84531.256076, 296647.901946) - (84531.256076, 296647.901946)
Layer SRS WKT:
(unknown)
Layer: String (0.0)
SubClasses: String (0.0)
ExtendedEntity: String (0.0)
Linetype: String (0.0)
EntityHandle: String (0.0)
Text: String (0.0)
OGRFeature(entities):0
  Layer (String) = JMDSS
  SubClasses (String) = AcDbEntity:AcDbText:AcDbText
  ExtendedEntity (String) = 遥感工程院用于建设用地调查2013-4-7 0
  Linetype (String) = Continuous
  EntityHandle (String) = 251
  Text (String) = 姝e湪鏂藉伐
  Style = LABEL(f:"Arial",t:"姝e湪鏂藉伐",s:4g,c:#000000)

Command line: ogrinfo -ro -al "test-text1.dxf" -geom=NO --config DXF_ENCODING "UTF-8"

Output:

INFO: Open of `F:\RsSrcDir\test-text1.dxf'
      using driver `DXF' successful.

Layer name: entities
Geometry: Unknown (any)
Feature Count: 1
Extent: (84531.256076, 296647.901946) - (84531.256076, 296647.901946)
Layer SRS WKT:
(unknown)
Layer: String (0.0)
SubClasses: String (0.0)
ExtendedEntity: String (0.0)
Linetype: String (0.0)
EntityHandle: String (0.0)
Text: String (0.0)
OGRFeature(entities):0
  Layer (String) = JMDSS
  SubClasses (String) = AcDbEntity:AcDbText:AcDbText
  ExtendedEntity (String) = 遥感工程院用于建设用地调查2013-4-7 0
  Linetype (String) = Continuous
  EntityHandle (String) = 251
  Text (String) = 正在施工
  Style = LABEL(f:"Arial",t:"正在施工",s:4g,c:#000000)

Attachments (1)

test-text1.dxf (164.9 KB) - added by liminlu0314 4 years ago.

Change History (4)

Changed 4 years ago by liminlu0314

Attachment: test-text1.dxf added

comment:1 Changed 4 years ago by Even Rouault

There are two issues:

  • You are probably using a Windows console with CP936 encoding? GDAL uses UTF-8 as its pivot output format, so garbled characters are expected when the output is displayed in that console (we should perhaps convert back to the console encoding when writing to it, but that's another matter). If you redirect the output to a file instead and open it with a UTF-8 compatible editor, the Text should appear correctly when you don't specify --config DXF_ENCODING "UTF-8", since it will have been converted from CP936 to UTF-8.

At least on Linux with a UTF-8 character set, ogrinfo without DXF_ENCODING specified displays the Text attribute correctly, but not ExtendedEntity.

  • There is probably a bug where ExtendedEntity strings are kept in the source encoding CP936 instead of being converted to UTF-8.
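The two effects described above can be reproduced without GDAL. The following is only a sketch in plain Python, not the driver's code: it assumes the Text value is stored as CP936 bytes in the file, that the recoded output is UTF-8, and that a field left unconverted is passed through byte for byte. All variable names are made up for illustration.

```python
# Sketch of the encoding mismatch discussed in this ticket (pure Python, no GDAL).
# Assumption: the DXF file stores the Text value as CP936 bytes.
original = "正在施工"
cp936_bytes = original.encode("cp936")

# Assumed driver behaviour for Text: recode from CP936 to UTF-8.
utf8_bytes = cp936_bytes.decode("cp936").encode("utf-8")

# A CP936 console interprets those UTF-8 bytes as CP936, which gives mojibake.
garbled_on_cp936_console = utf8_bytes.decode("cp936")

# Bytes passed through unconverted look fine on a CP936 console ...
passthrough_on_cp936_console = cp936_bytes.decode("cp936")

# ... but are invalid UTF-8, so a UTF-8 terminal shows replacement characters.
raw_on_utf8_terminal = cp936_bytes.decode("utf-8", errors="replace")

print(garbled_on_cp936_console)
print(passthrough_on_cp936_console)
print(raw_on_utf8_terminal)
```

The first printed value corresponds to the garbled Text seen on the reporter's console. The last one illustrates why an unconverted ExtendedEntity would not display correctly on a UTF-8 system.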

comment:2 Changed 4 years ago by Even Rouault

Milestone: 1.11.1

comment:3 Changed 12 months ago by Even Rouault

Resolution: fixed
Status: new → closed

In 40196:

DXF: apply DXF codepage encoding while decoding ExtendedEntity field (fixes #5626)
