Opened 8 years ago
Last modified 6 years ago
#3155 new defect
inconsistent newlines handling on Windows
Reported by: | annakrat | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 7.6.2 |
Component: | LibVector | Version: | svn-trunk |
Keywords: | newline | Cc: | |
CPU: | Unspecified | Platform: | MSWindows 8 |
Description
Function Vect_write_ascii uses:
fprintf(ascii, "%s", HOST_NEWLINE);
where HOST_NEWLINE is defined as '\r\n' on Windows. Other modules like v.info use:
fprintf(stdout, "north=%s\n", tmp1);
This is a problem when you process the output in Python:
>>> output_points = grass.read_command("v.out.ascii", input='hospitals', format="point", separator=",").strip()
>>> repr(output_points)
'697237.5638615,182012.65540056,1\r\r\n704563.1878615,183568.90640056,2\r\r\n371094.1568615,274897.21940056,3\r\r\n640607.6248615,224673.12440056,4\r\r\n640952.4998615,224309.56340056,5\r\r\n670385.0618615,229717.53040056,6\r\r\n643879.5628615,230462.43740056,7\r\r\n636274.0638615,229165.37540056,8\r\r\n627390.7508615,202760.56240056,9\r\r\n646823.5628615,226063.35940056,10\r\r\n206065.7348615,195917.60840056,11\r\r\n444744.4678615,280199.56340056,12\r\r\n501025.1878615,179551.84440056,13\r\r\n566335.3748615,111190.14840056,14\r\r\n566327.6258615,111195.04740056,15\r\r\n671357.9988615,139627.76540056,16...
but:
>>> x = grass.read_command('v.info', map='hospitals', flags='g')
>>> repr(x)
'north=308097.937400562\r\nsouth=20235.5644005626\r\neast=914347.8748615\r\nwest=156998.1718615\r\ntop=0.000000\r\nbottom=0.000000\r\n'
So v.out.ascii produces '\r\r\n' instead of '\r\n', which causes problems when splitting lines:
>>> output_points.splitlines()
['640607.6248615,224673.12440056,4', '', '640952.4998615,224309.56340056,5', ' ', ...]
Should we stop using 'HOST_NEWLINE' then?
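Until the writer side is changed, the symptom can be reproduced and worked around on the Python side. The following is only a sketch: the sample string is a shortened copy of the output above, and `split_records` is a hypothetical helper, not part of the GRASS Python API.

```python
# Shortened sample mimicking the v.out.ascii output reported above.
raw = (
    "697237.5638615,182012.65540056,1\r\r\n"
    "704563.1878615,183568.90640056,2\r\r\n"
)

# str.splitlines() treats the lone '\r' and the following '\r\n' as two
# separate line breaks, so an empty string appears after every record.
naive = raw.splitlines()


def split_records(text):
    # Hypothetical helper: collapse the doubled carriage return first,
    # then split and drop any lines that are still empty.
    normalized = text.replace("\r\r\n", "\n")
    return [line for line in normalized.splitlines() if line]


records = split_records(raw)
print(naive)
print(records)
```

This only hides the problem in the caller; it does not make the modules consistent with each other.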
Change History (8)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
Milestone: | 7.2.0 → 7.2.1 |
---|
comment:3 by , 8 years ago
It actually sounds more like http://stackoverflow.com/questions/11497376/new-line-python, specifically Nate C-K's comment to Charlie Martin's answer, except that fprintf(ascii, "%s", HOST_NEWLINE); is in C.
It seems that HOST_NEWLINE, being \r\n on Windows, is passed to fprintf(), which on a text-mode stream seems to automatically replace \n with \r\n, resulting in \r\r\n in the file (which is what cannot be parsed correctly later in Python).
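The double translation described here can be simulated on any platform. The sketch below uses Python's `io.TextIOWrapper` with `newline="\r\n"` as a stand-in for a Windows text-mode stream. It is an emulation, not the C runtime itself, and `write_text_mode` is a hypothetical helper name.

```python
import io


def write_text_mode(payload, native_newline="\r\n"):
    # Emulate a text-mode stream: every '\n' written is translated
    # to native_newline, as a Windows text-mode stream would do.
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding="ascii", newline=native_newline)
    stream.write(payload)
    stream.flush()
    data = buffer.getvalue()
    stream.detach()  # keep the wrapper from closing the buffer on cleanup
    return data


# Writing a Windows-style HOST_NEWLINE: the '\n' inside it is translated again.
with_host_newline = write_text_mode("north=1" + "\r\n")

# Writing a plain '\n': the stream produces the native sequence exactly once.
with_plain_newline = write_text_mode("north=1" + "\n")

print(repr(with_host_newline))
print(repr(with_plain_newline))
```

Under this emulation, only the plain `\n` variant yields a single `\r\n`, which matches what v.info produces.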
ISO/IEC 9899:TC3 says:
5.2.2 Character display semantics
Alphabetic escape sequences representing nongraphic characters in the execution character set are intended to produce actions on display devices as follows:
- ...
- \n (new line) Moves the active position to the initial position of the next line.
- \r (carriage return) Moves the active position to the initial position of the current line.
- ...
Wikipedia says:
The C standard only guarantees two things:
- Each of these escape sequences maps to a unique implementation-defined number that can be stored in a single char value.
- When writing a file in text mode, '\n' is transparently translated to the native newline sequence used by the system, which may be longer than one character. When reading in text mode, the native newline sequence is translated back to '\n'. In binary mode, no translation is performed, and the internal representation produced by '\n' is output directly.
The behavior fits what these two resources say, so it seems that HOST_NEWLINE can be changed to always mean \n and/or not used at all.
HOST_NEWLINE seems to be used only in lib/vector/Vlib/ascii.c, i.cluster, and v.out.ascii.
comment:4 by , 8 years ago
Milestone: | 7.2.1 → 7.2.2 |
---|
comment:7 by , 7 years ago
Milestone: | → 7.2.4 |
---|
comment:8 by , 6 years ago
Milestone: | 7.2.4 → 7.6.2 |
---|
Sounds like https://stackoverflow.com/questions/4599936/handling-r-n-vs-n-newlines-in-python-on-mac-vs-windows