Opened 8 years ago

Last modified 6 years ago

#3155 new defect

inconsistent newlines handling on Windows

Reported by: annakrat
Owned by: grass-dev@…
Priority: normal
Milestone: 7.6.2
Component: LibVector
Version: svn-trunk
Keywords: newline
Cc:
CPU: Unspecified
Platform: MSWindows 8

Description

Function Vect_write_ascii uses:

fprintf(ascii, "%s", HOST_NEWLINE);

where HOST_NEWLINE is defined as "\r\n" on Windows. Other modules, such as v.info, use a plain "\n":

fprintf(stdout, "north=%s\n", tmp1);

This is a problem when you process the output in Python:

>>> output_points = grass.read_command("v.out.ascii", input='hospitals', format="point", separator=",").strip()
>>> repr(output_points)
'697237.5638615,182012.65540056,1\r\r\n704563.1878615,183568.90640056,2\r\r\n371094.1568615,274897.21940056,3\r\r\n640607.6248615,224673.12440056,4\r\r\n640952.4998615,224309.56340056,5\r\r\n670385.0618615,229717.53040056,6\r\r\n643879.5628615,230462.43740056,7\r\r\n636274.0638615,229165.37540056,8\r\r\n627390.7508615,202760.56240056,9\r\r\n646823.5628615,226063.35940056,10\r\r\n206065.7348615,195917.60840056,11\r\r\n444744.4678615,280199.56340056,12\r\r\n501025.1878615,179551.84440056,13\r\r\n566335.3748615,111190.14840056,14\r\r\n566327.6258615,111195.04740056,15\r\r\n671357.9988615,139627.76540056,16...

but:

>>> x = grass.read_command('v.info', map='hospitals', flags='g')
>>> repr(x)
'north=308097.937400562\r\nsouth=20235.5644005626\r\neast=914347.8748615\r\nwest=156998.1718615\r\ntop=0.000000\r\nbottom=0.000000\r\n'

so you have '\r\r\n' instead of '\r\n' which results in problems:

>>> output_points.splitlines()
['640607.6248615,224673.12440056,4', '', '640952.4998615,224309.56340056,5', ' ', ...]
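The effect can be reproduced without GRASS. The following is only a minimal sketch: the sample string is taken from the first two records of the output above, and normalize_newlines is a hypothetical helper (not part of the GRASS Python API) showing one possible workaround on the Python side until the C code is fixed.

```python
import re

# First two records from the v.out.ascii output quoted in this ticket.
sample = (
    "697237.5638615,182012.65540056,1\r\r\n"
    "704563.1878615,183568.90640056,2\r\r\n"
)

def normalize_newlines(text):
    # Hypothetical helper: collapse any run of CRs followed by LF
    # (CR CR LF, CR LF) and any lone CR into a single LF.
    return re.sub(r"\r+\n|\r", "\n", text)

# splitlines() treats the first CR as one line break and the following
# CR LF as another, so an empty string appears after every record.
broken = sample.splitlines()

# After normalization each record occupies exactly one line.
fixed = normalize_newlines(sample).splitlines()

print(broken)
print(fixed)
```

This only hides the symptom in the calling script; the inconsistency between modules remains.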

Should we stop using 'HOST_NEWLINE' then?

Change History (8)

comment:2 by annakrat, 8 years ago

Milestone: 7.2.0 → 7.2.1

comment:3 by wenzeslaus, 8 years ago

It actually sounds more like http://stackoverflow.com/questions/11497376/new-line-python, specifically Nate C-K's comment to Charlie Martin's answer, except that fprintf(ascii, "%s", HOST_NEWLINE); is in C.

It seems that HOST_NEWLINE, being \r\n on Windows, is passed to fprintf(), which in text mode automatically replaces \n with \r\n, resulting in \r\r\n in the file (which cannot be parsed correctly later in Python).

ISO/IEC 9899:TC3 says:

5.2.2 Character display semantics

Alphabetic escape sequences representing nongraphic characters in the execution character set are intended to produce actions on display devices as follows:

  • ...
  • \n (new line) Moves the active position to the initial position of the next line.
  • \r (carriage return) Moves the active position to the initial position of the current line.
  • ...

Wikipedia says:

The C standard only guarantees two things:

  • Each of these escape sequences maps to a unique implementation-defined number that can be stored in a single char value.
  • When writing a file in text mode, '\n' is transparently translated to the native newline sequence used by the system, which may be longer than one character. When reading in text mode, the native newline sequence is translated back to '\n'. In binary mode, no translation is performed, and the internal representation produced by '\n' is output directly.

The behavior fits what these two resources say, so it seems that HOST_NEWLINE can be changed to always mean \n and/or not used at all.
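The text-mode translation can be illustrated on any platform with a small Python sketch. This is an emulation under stated assumptions, not the C code path itself: Python's newline="\r\n" setting stands in for a Windows text-mode stream, WINDOWS_HOST_NEWLINE mirrors the value of HOST_NEWLINE reported in this ticket, and emulate_text_mode_write is a hypothetical helper name.

```python
import io

# Value of HOST_NEWLINE on Windows as reported in this ticket.
WINDOWS_HOST_NEWLINE = "\r\n"

def emulate_text_mode_write(chunks):
    # Hypothetical helper: write the chunks through a text layer that
    # translates every "\n" to "\r\n", as a Windows text-mode stream does,
    # and return the bytes that would end up in the file.
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    for chunk in chunks:
        stream.write(chunk)
    stream.flush()
    return raw.getvalue()

# Writing the host newline explicitly: the "\n" inside it is translated
# again, so each record ends in CR CR LF.
doubled = emulate_text_mode_write(["1" + WINDOWS_HOST_NEWLINE, "2" + WINDOWS_HOST_NEWLINE])

# Writing a plain "\n" and letting the text layer translate it yields
# the expected CR LF.
plain = emulate_text_mode_write(["1\n", "2\n"])

print(doubled)  # b'1\r\r\n2\r\r\n'
print(plain)    # b'1\r\n2\r\n'
```

Under this emulation, always writing "\n" gives the native line ending, which is consistent with the suggestion to make HOST_NEWLINE mean \n or to drop it.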

HOST_NEWLINE seems to be used only in lib/vector/Vlib/ascii.c, i.cluster, and v.out.ascii.

comment:4 by martinl, 8 years ago

Milestone: 7.2.1 → 7.2.2

comment:5 by neteler, 7 years ago

Milestone: 7.2.2 → 7.2.3

Ticket retargeted after milestone closed

comment:6 by martinl, 7 years ago

Milestone: 7.2.3

Ticket retargeted after milestone closed

comment:7 by martinl, 7 years ago

Milestone: 7.2.4

comment:8 by hellik, 6 years ago

Milestone: 7.2.4 → 7.6.2