Opened 11 years ago
Closed 11 years ago
#2087 closed defect (fixed)
grass64 man page: missing words
Reported by: | hamish | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | 6.4.4 |
Component: | Docs | Version: | 6.4.3 |
Keywords: | g.html2man | Cc: | |
CPU: | All | Platform: | Linux |
Description
Hi,
the current build of GRASS 6.x is losing important information in the man page due to a g.html2man error.
the source file is lib/init/grass6.html
the html looks like:
<h2>SYNOPSIS</h2> <b>grass64</b> [<b>-</b>] [<b>-v</b>] [<b>-h | -help | --help</b>] [<b>-text | -gui | -tcltk | -oldtcltk | -wxpython | -wx]</b>] [[[<b>&lt;GISDBASE&gt;/</b>]<b>&lt;LOCATION_NAME&gt;/</b>] <b>&lt;MAPSET&gt;</b>]
the resulting man page looks like:
.SH SYNOPSIS \fBgrass64\fR [\fB-\fR] [\fB-v\fR] [\fB-h | -help | --help\fR] [\fB-text | -gui | -tcltk | -oldtcltk | -wxpython | -wx]\fR] [[[\fB/\fR]\fB/\fR] \fB\fR]
i.e. html:
SYNOPSIS grass64 [-] [-v] [-h | -help | --help] [-text | -gui | -tcltk | -oldtcltk | -wxpython | -wx]] [[[<GISDBASE>/]<LOCATION_NAME>/] <MAPSET>]
and man:
SYNOPSIS grass64 [-] [-v] [-h | -help | --help] [-text | -gui | -tcltk | -oldtcltk | -wxpython | -wx]] [[[/]/] ]
... the [[<GISDBASE>/]<LOCATION_NAME>/] <MAPSET>]
part has lost its words even though &gt;, &lt; were used and not something which could be mistaken for a <html tag>. Is the DoEscape
subroutine converting '&lt;' to '<' before any unknown html tags are thrown away? If so it should be moved to after that; see lines 136 and 141:
https://trac.osgeo.org/grass/browser/grass/branches/develbranch_6/tools/g.html2man/g.html2man#L110
?
thanks, Hamish
Attachments (1)
Change History (12)
follow-up: 2 comment:1 by , 11 years ago
comment:2 by , 11 years ago
Replying to neteler:
Also in GRASS 7, the final part which is in HTML
... -wxpython | -wx]] [[[<GISDBASE>/]<LOCATION_NAME>/]
becomes in MAN:
... -wxpython | -wx]] [[[/]/] ]
I cannot confirm this. With a freshly checked out and compiled grass_trunk, I get:
grass71 [-h | -help | --help] [-v | --version] [-c | -c geofile | -c EPSG:code] [-text | -gtext | -gui] [[[<GISDBASE>/]<LOCATION_NAME>/] <MAPSET>]
I can confirm it for grass64_release, though:
grass64 [-] [-v] [-h | -help | --help] [-text | -gui | -tcltk | -oldtcltk | -wxpython | -wx]] [[[/]/] ]
Moritz
follow-up: 4 comment:3 by , 11 years ago
The problem seems to be in the function DoLine, lines 136ff:
&DoEscape($_); &DoPara($_); if (! $preformat) { if (m/^$/) {return 0}; s#^[ \t]*##; s#<[^>]*>##g;
DoEscape is called first, which replaces the &lt; and &gt; entities by the respective symbols, and then, in the last line of DoLine, these symbols and everything between them are replaced by an empty string. Commenting out the last line, i.e. s#<[^>]*>##g;, solves the problem for grass6.html, but I don't know what other effects this has.
Moritz
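The ordering problem described above can be reproduced outside the Perl script. What follows is only a minimal Python sketch, not the g.html2man code: do_escape and strip_tags are hypothetical stand-ins for the entity-decoding step and for the final s#<[^>]*>##g substitution, and it assumes the HTML source writes the placeholders with &lt; and &gt; entities.

```python
import re

def do_escape(line):
    # Hypothetical stand-in for the entity-decoding step:
    # turn &lt; / &gt; into literal < / >.
    return line.replace("&lt;", "<").replace("&gt;", ">")

def strip_tags(line):
    # Python equivalent of the Perl substitution s#<[^>]*>##g.
    return re.sub(r"<[^>]*>", "", line)

html = ("[[[<b>&lt;GISDBASE&gt;/</b>]<b>&lt;LOCATION_NAME&gt;/</b>] "
        "<b>&lt;MAPSET&gt;</b>]")

# Decoding first makes the placeholders look like tags,
# so the tag-stripping step removes them as well.
buggy = strip_tags(do_escape(html))
print(buggy)
```

The printed line matches the truncated synopsis reported in the ticket, which supports the idea that the order of the two steps is what drops the words.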
follow-up: 6 comment:4 by , 11 years ago
Replying to mlennert:
The problem seems to be in the function DoLine, lines 136ff:
&DoEscape($_); &DoPara($_); if (! $preformat) { if (m/^$/) {return 0}; s#^[ \t]*##; s#<[^>]*>##g;
DoEscape is called first, which replaces the &lt; and &gt; entities by the respective symbols, and then, in the last line of DoLine, these symbols and everything between them are replaced by an empty string. Commenting out the last line solves the problem for grass6.html, but I don't know what other effects this has.
It leaves in a series of HTML tags. So the art will be to erase all these tags, without erasing the <> around the variable names. This said, do we really need those ?
Moritz
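One way to erase the tags without erasing the <> around the variable names would be to swap the two steps, as suggested in the ticket description: strip real tags while the placeholders are still entities, and decode the entities afterwards. The Python sketch below only illustrates that idea. The function names are hypothetical, and it ignores whatever the real script does between the two steps (DoPara, bold markup and so on), so the actual Perl change may not be this simple.

```python
import re

def strip_tags(line):
    # Python equivalent of the Perl substitution s#<[^>]*>##g.
    return re.sub(r"<[^>]*>", "", line)

def do_escape(line):
    # Hypothetical stand-in for the entity-decoding step.
    return line.replace("&lt;", "<").replace("&gt;", ">")

def convert(line):
    # Remove real tags first, while the placeholders are still
    # written as &lt;...&gt;, and decode the entities only afterwards.
    return do_escape(strip_tags(line))

html = ("[[[<b>&lt;GISDBASE&gt;/</b>]<b>&lt;LOCATION_NAME&gt;/</b>] "
        "<b>&lt;MAPSET&gt;</b>]")
print(convert(html))
```

With this order the placeholders survive, because at the time the tags are removed they do not yet contain any literal < or > characters.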
by , 11 years ago
Attachment: | g.html2man.diff added |
---|
follow-up: 7 comment:5 by , 11 years ago
I've attached a very quick and dirty hack that solves this specific issue for me. I don't find it particularly elegant, though. Maybe someone with more perl/regex foo can find a better solution.
Moritz
comment:6 by , 11 years ago
Replying to mlennert:
without erasing the <> around the variable names. This said, do we really need those ?
I would say no. We are still following man pages formatting in HTML and I don't think that < and > are part of it. For example, this is my man grep:
grep [OPTIONS] PATTERN [FILE...]
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]
By the way, I'm still not sure if parsing whole HTML pages is a good idea. If I were to start from scratch, I would probably use the module's HTML stub and XML interface description, because XML is easier to parse than HTML tag soup (but since we already have the parsing, and the Makefiles are also designed for parsing whole HTML, it is probably not worth trying).
Even more by the way, Moritz, you can mark and escape inline code here using backticks `#forexample` or even {{{ and }}} should work inline. New Trac has automatic preview, so we have hope (http://trac.edgewall.org/ticket/8855 and http://trac.edgewall.org/ticket/8721).
follow-up: 8 comment:7 by , 11 years ago
Replying to mlennert:
Maybe someone with more perl/regex foo can find a better solution.
Does using g.html2man.py from GRASS 7 qualify?
The main drawbacks are that it makes Python a build-time dependency (but eliminates the Perl dependency), and may require some clean-up of the HTML files (the Python version will fail hard on invalid HTML).
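To illustrate why a parser-based converter is less exposed to this kind of bug, here is a small sketch using Python's standard html.parser module. It is not the GRASS 7 g.html2man.py and makes no claim about how that script works. TextCollector and to_roff are hypothetical names, and only the <b> tag is handled.

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Hypothetical minimal converter: <b> becomes roff bold, text is kept."""

    def __init__(self):
        # convert_charrefs=True decodes &lt; / &gt; before handle_data is called.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.parts.append("\\fB")

    def handle_endtag(self, tag):
        if tag == "b":
            self.parts.append("\\fR")

    def handle_data(self, data):
        # Text arrives here already decoded and is never re-scanned for tags.
        self.parts.append(data)

def to_roff(fragment):
    parser = TextCollector()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)

print(to_roff("[<b>&lt;MAPSET&gt;</b>]"))
```

Because tags and text reach the converter through separate callbacks, a decoded <MAPSET> in the text can no longer be mistaken for markup.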
follow-up: 9 comment:8 by , 11 years ago
Replying to glynn:
Replying to mlennert:
Maybe someone with more perl/regex foo can find a better solution.
Does using g.html2man.py from GRASS 7 qualify?
The main drawbacks are that it makes Python a build-time dependency (but eliminates the Perl dependency), and may require some clean-up of the HTML files (the Python version will fail hard on invalid HTML).
I would just remove the problematic < and > and leave GRASS 6 (core) without Python (build) dependency. (We have two versions of GRASS, let's keep them different from each other.)
follow-up: 10 comment:9 by , 11 years ago
Replying to wenzeslaus:
Replying to glynn:
Replying to mlennert:
Maybe someone with more perl/regex foo can find a better solution.
Does using g.html2man.py from GRASS 7 qualify?
The main drawbacks are that it makes Python a build-time dependency (but eliminates the Perl dependency), and may require some clean-up of the HTML files (the Python version will fail hard on invalid HTML).
I would just remove the problematic < and > and leave GRASS 6 (core) without Python (build) dependency.
As there were no objections to this, I took the liberty to just erase these symbols from the file. The resulting html page and man file appear easily readable to me and I don't think that this issue warrants changing g.html2man.
Leaving this ticket open for now in case anyone objects now or in case someone sees the same problem in another man page.
Moritz
comment:10 by , 11 years ago
Replying to mlennert:
Replying to wenzeslaus:
Replying to glynn:
Replying to mlennert:
Maybe someone with more perl/regex foo can find a better solution.
Does using g.html2man.py from GRASS 7 qualify?
The main drawbacks are that it makes Python a build-time dependency (but eliminates the Perl dependency), and may require some clean-up of the HTML files (the Python version will fail hard on invalid HTML).
I would just remove the problematic < and > and leave GRASS 6 (core) without Python (build) dependency.
As there were no objections to this, I took the liberty to just erase these symbols from the file.
Forgot to mention: r60237 for develbranch_6 and r60238 for releasebranch_6_4.
comment:11 by , 11 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Closing as no one has objected to the solution.
Moritz