Opened 10 years ago

Last modified 9 months ago

#151 assigned enhancement

make documentation be full text searchable: use sphinx

Reported by: timmie
Owned by: epatton
Priority: major
Milestone: 7.4.0
Component: Docs
Version: unspecified
Keywords: JavaScript, Sphinx, Website
Cc: grass-dev@…, timmie
CPU: Unspecified
Platform: Unspecified

Description

The current HTML documentation consists of separate HTML-formatted man pages linked together, which offer good help for the experienced user. But it would be an advantage to have a full text search on the documentation:

Use case: A user wants to remove a mapset or georeference a file but doesn't know which commands to use.

Good example for a full text searchable documentation: http://docs.python.org/dev/

Change History (46)

comment:1 Changed 10 years ago by epatton

Component: changed from default to Website
Owner: changed from grass-dev@… to epatton
Status: changed from new to assigned

I notice that on http://grass.itc.it/gdp/general.php there is a link to 'Manual Pages search engine', but when clicked, the linked page displays only Google search engines for the user and developer mailing lists and the osgeo.org site, and nothing specific to the GRASS man pages.

Did this functionality once exist on this page and was later changed? How hard would it be to add a man page search engine on http://grass.itc.it/searchgrass.php ?

~ Eric.

comment:2 Changed 10 years ago by neteler

This could be easily realized if "htdig" was installed on grass.osgeo.org. Personally, I don't have the resources currently to set it up (no time). Once it is there, we can fix this issue in a few minutes (it used to work on grass.itc.it).

Markus

comment:3 Changed 8 years ago by timmie

CPU: Unspecified
Platform: Unspecified

Please check Sphinx: http://sphinx.pocoo.org/ It has a standalone JavaScript-based search engine.

Very good!

comment:4 Changed 8 years ago by hamish

Cc: grass-dev@… added

comment:5 Changed 8 years ago by neteler

Summary: changed from "make documentation be full text searchable" to "make documentation be full text searchable: use sphinx"

comment:6 Changed 8 years ago by neteler

I have locally converted most pages (using html2rest.py by Gerard Flanagan at http://bazaar.launchpad.net/~grflanagan/python-rattlebag/trunk/annotate/head:/src/html2rest.py ), but a set of the GRASS HTML files fails with problems like

reST markup error:
/home/neteler/grass65/dist.x86_64-unknown-linux-gnu/docs/html/rst/source/r.coin.rst:66: (SEVERE/4) Title level inconsistent:

:
:
make: *** [html] Error 1

or

reST markup error:
/home/neteler/grass65/dist.x86_64-unknown-linux-gnu/docs/html/rst/source/r.cost.rst:183: (SEVERE/4) Title level inconsistent:

Algorithm notes
```````````````
make: *** [html] Error 1

This indicates to some extent HTML errors in the original as well as Sphinx problems with the tags

<dt> ...
<dd> ...

So with some effort the HTML pages could be made Sphinx compliant (perfect power user job).

Here the list of failing HTML files in 6.5.svn: d.graph.rst, d.his.rst, d.linegraph.rst, d.mapgraph.rst, d.menu.rst, d.out.file.rst, d.text.freetype.rst, g.gisenv.rst, g.message.rst, grass6.rst, g.region.rst, i.ortho.photo.rst, m.proj.rst, ps.map.rst, r.category.rst, r.coin.rst, r.cost.rst, r.distance.rst, r.in.gdal.rst, r.in.xyz.rst, r.mfilter.fp.rst, r.mfilter.rst, r.out.gdal.rst, r.proj.rst, r.ros.rst, r.spreadpath.rst, r.spread.rst, r.terraflow.rst, r.tileset.rst, r.watershed.rst, r.what.rst, v.label.rst, v.lidar.correction.rst, v.lidar.edgedetection.rst, v.lidar.growing.rst, v.outlier.rst, v.reclass.rst, v.segment.rst, v.surf.bspline.rst.

I've put everything online to give you an impression (yes, partially messy but not so bad...):

http://grass.osgeo.org/grass65/manuals/sphinx/

Markus

comment:7 Changed 8 years ago by neteler

Here is the procedure:

cd dist.x86_64-unknown-linux-gnu/docs/html/

# convert HTML to reST (quoting keeps odd file names intact):
mkdir rst
cd rst
for i in ../*.html ; do echo "$i:"; html2rest.py < "$i" > "$(basename "$i" .html).rst" ; done

sphinx-quickstart

# to avoid name conflict or define better in sphinx-quickstart:
mv index.rst oldindex.rst
mv *.rst source/

# convert with sphinx
make html

The resulting Sphinx-HTML manual is stored in the build/ directory.

Markus

PS: once the Wiki is back this should go there

comment:8 in reply to:  6 Changed 8 years ago by hamish

Replying to neteler: ...

This indicates to some extent HTML errors in the original as well as Sphinx problems with the tags

<dt> ...
<dd> ...

So with some effort the HTML pages could be made Sphinx compliant (perfect power user job).

Here the list of failing HTML files in 6.5.svn: d.graph.rst, d.his.rst, d.linegraph.rst, d.mapgraph.rst, d.menu.rst, d.out.file.rst, d.text.freetype.rst, g.gisenv.rst, g.message.rst, grass6.rst, g.region.rst, i.ortho.photo.rst, m.proj.rst, ps.map.rst, r.category.rst, r.coin.rst, r.cost.rst, r.distance.rst, r.in.gdal.rst, r.in.xyz.rst, r.mfilter.fp.rst, r.mfilter.rst, r.out.gdal.rst, r.proj.rst, r.ros.rst, r.spreadpath.rst, r.spread.rst, r.terraflow.rst, r.tileset.rst, r.watershed.rst, r.what.rst, v.label.rst, v.lidar.correction.rst, v.lidar.edgedetection.rst, v.lidar.growing.rst, v.outlier.rst, v.reclass.rst, v.segment.rst, v.surf.bspline.rst.

all of the above should (now) be html bug-free, as checked by dillo's lint verifier.

if that is so and all is valid HTML, remaining problems are for the sphinx people to fix IMO.

I've put everything online to give you an impression (yes, partially messy but not so bad...): http://grass.osgeo.org/grass65/manuals/sphinx/

specifically, bolds and newlines need work.

Hamish

ps- reST is good stuff.

comment:9 Changed 8 years ago by neteler

If HTML is bugfree then it depends on http://bazaar.launchpad.net/~grflanagan/python-rattlebag/trunk/annotate/head:/src/html2rest.py which perhaps needs some tweaks to write clean reST.

comment:10 Changed 8 years ago by hamish

FWIW, reStructuredText (reST) docs: http://docutils.sourceforge.net/rst.html

comment:11 Changed 7 years ago by neteler

Came across another HTML to reST (Sphinx) converter:

http://johnmacfarlane.net/pandoc/

Online try (throw in GRASS HTML file): http://johnmacfarlane.net/pandoc/try

comment:13 Changed 7 years ago by hamish

Hi,

after extensive use of reST + sphinx for the osgeo LiveDVD* (live.osgeo.org) documentation and website over the last year+, I am now of the opinion that GRASS's current html-source man pages are far superior to what would be accomplished by reST-source man pages; both in terms of expressibility and aggravation. The critical thing is to get it into a stable mark-up language; once there, there's little reason (besides the usual bugs) why html2rest or some any2pdf style program couldn't translate between them and make a search index. Maybe wikimedia is a bit easier markup language than html, but if you are reading this you are highly likely to be smart enough to learn that <b> means bold and we don't actually do much complicated with it. I think we forget how simple stock HTML really is, and that when it comes to documentation, the steak is much more important than the sizzle.

[*] https://trac.osgeo.org/osgeo/browser/livedvd/gisvm/trunk/doc/

best, Hamish

comment:14 Changed 7 years ago by hamish

i.e. to say, I'd rather invest the time in helping to debug htDig.

comment:16 Changed 5 years ago by lucadelu

Some improvements: I obtained a working version of the documentation with sphinx. I really like it, but there are some things to fix. Here is the procedure (use a recent version of pandoc; older versions were buggy for me):

cd dist.x86_64-unknown-linux-gnu/docs/
mkdir rst
# enter the new directory (implied by the ../html paths used below)
cd rst
# convert html to rst
for i in ../html/*.html; do pandoc -s -c ../html/grassdocs.css -r html "$i" -w rst -o "$(basename "$i" .html).rst"; done
# move other files
cp ../html/*.png ../html/*.jpg .
cp ../html/grassdocs.css .
cp ../html/grass_logo.txt .
cp -rf ../html/icons/ .
# start sphinx
sphinx-quickstart
# move all to source directory
mv *.rst *.png *.jpg icons/ grass* source/
# create html documentation
make html

In the coming weeks I'll try to study sphinx a little bit to fix some problems.

comment:17 Changed 5 years ago by lucadelu

In r52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the pandoc software.
You can simply run

 make restdocs
 cd dist.XXXX/doc/rest
 make html

to create the documentation in reST format and convert it to beautiful HTML using sphinx. Some issues are still open, listed in order of importance (if someone skilled with the Makefile system wants to help me, it would be really appreciated):

  • when launching plain "make", the reStructuredText documentation should not be created, but some documents are created anyway;
  • I cannot convert helptext.html and the wxgui documentation due to some Make problems;
  • some documents have bad indentation because pandoc converts the <br> tag incorrectly; the solution should be to remove a white space if the second character is not another white space, but some problems could remain;
  • some other problems remain (special chars, formatting) in the new reST pages.

Once solved, the resulting HTML pages could replace the current manual pages (since also search is provided).

comment:18 in reply to:  17 ; Changed 5 years ago by hamish

Replying to lucadelu:

Once solved, the resulting HTML pages could replace the current manual pages (since also search is provided).

erhm, once solved and building in parallel, discussion on whether that should happen could begin. Personally I am not in favour of throwing away all the strongly marked up work we have done in the description.html files in favour of the rather erratic and obscure markup of reSt for those pages. Perhaps 'finicky' is a better word. I'm happy to see the help pages get pretty, and yes reSt+sphinx-alikes is very pretty, but would like it to be in parallel, and reSt translated from our existing HTML docs automatically (ie the description.html parts), in the same way (or better) than the man pages are now.

I don't think lack of a working htDig install*, or reliance on "site:" google search, is a fatal blow for html.

[*] (is that still the case? if so as offered earlier, I'm happy to spend a little time on it)

thanks, Hamish

comment:19 in reply to:  17 ; Changed 5 years ago by hellik

Replying to lucadelu:

In r52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the pandoc software.

does this mean that pandoc would be another external dependency to get the docs?

on Windows an extra step would be needed to install pandoc (http://johnmacfarlane.net/pandoc/installing.html).

Helmut

comment:20 Changed 5 years ago by hamish

[slight follow up]

Hamish wrote:

, but would like it to be in parallel, and reSt translated from our existing HTML docs automatically

I am glad to see that is indeed the case, but <br> -> <br /> and <br> -> <p> in all the html files?! if the converter is broken, fix the converter! "<br>" is not a tough one to parse..

thanks, Hamish

comment:21 in reply to:  18 Changed 5 years ago by lucadelu

Replying to hamish:

Replying to lucadelu:

Once solved, the resulting HTML pages could replace the current manual pages (since also search is provided).

erhm, once solved and building in parallel discussion on if that should happen could begin. Personally I am not in favour of throwing away all the strongly marked up work we have done in the description.html files in favour of the rather erratic and obscure markup of reSt for those pages. Perhaps 'finicky' is a better word. I'm happy to see the help pages get pretty, and yes reSt+sphinx-alikes is very pretty, but would like it to be in parallel, and reSt translated from our existing HTML docs automatically (ie the description.html parts), in the same way (or better) than the man pages are now.

yes no problem for me to keep both versions

thanks, Hamish

best Luca

comment:22 in reply to:  19 Changed 5 years ago by lucadelu

Replying to hellik:

Replying to lucadelu:

In r52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the pandoc software.

does this mean that pandoc would be another external dependency to get the docs?

Right now I have only tested on Linux. If pandoc is missing, an error is returned, but it is not reported at the end of the make process. In the future I hope to fix the compilation issue so that the reStructuredText is compiled only with make restdocs, and not with plain make as it is now. If someone can help with the Make configuration, it would be really appreciated.

Could you test compilation on windows please?

Helmut

best Luca

comment:23 in reply to:  18 ; Changed 5 years ago by neteler

Replying to hamish: ...

I don't think lack of a working htDig install*, or reliance on "site:" google search, is a fatal blow for html.

[*] (is that still the case? if so as offered earlier, I'm happy to spend a little time on it)

htDig has been missing for many years now (unfortunately) and google search is poor (unfortunately). A better solution is definitely needed, and sphinx seems to provide it, as it does for many OSGeo projects.

comment:24 in reply to:  20 Changed 5 years ago by neteler

Replying to hamish:

I am glad to see that is indeed the case, but <br> -> <br /> and <br> -> <p> in all the html files?! if the converter is broken, fix the converter! "<br>" is not a tough one to parse..

While I agree, we have many places where <br> is probably abused, i.e. in <li> lists and other odd places. pandoc is basically complaining about these suboptimal use cases.

comment:25 in reply to:  17 ; Changed 5 years ago by glynn

Replying to lucadelu:

In r52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the pandoc software.

I don't see what problem this is trying to solve.

comment:26 in reply to:  25 ; Changed 5 years ago by neteler

Replying to glynn:

Replying to lucadelu:

In r52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the pandoc software.

I don't see what problem this is trying to solve.

Besides a more modern look, it offers an included search engine for the manual which even works offline in local GRASS GIS installations.

comment:27 in reply to:  26 Changed 5 years ago by glynn

Replying to neteler:

I don't see what problem this is trying to solve.

Besides a more modern look, it offers an included search engine for the manual which even works offline in local GRASS GIS installations.

First, bear in mind that an important function of the HTML files is as the source for Unix (nroff) manual pages. Anything which interferes with that isn't acceptable.

Beyond that, I don't really see the point of adding another dependency. Or ReST, for that matter. The output from --html-description is only a fragment of the final HTML; the rest is in HTML, and that isn't going to change (HTML is far better known and supported than ReST).

If you think that there are specific problems with the current HTML, the appropriate solution would be to change the parser-generated HTML and/or the guidelines for the manually-generated HTML.

comment:28 in reply to:  23 ; Changed 5 years ago by hamish

Replying to 22: @Luca: sorry I'm not much of a Makefile expert, but does the command exiting with an error not break out of the 'make' job right away? A "make restdocs" would be nice.

Replying to neteler:

Replying to hamish: ...

I don't think lack of a working htDig install*, or reliance on "site:" google search, is a fatal blow for html.

[*] (is that still the case? if so as offered earlier, I'm happy to spend a little time on it)

htDig has been missing for many years now (unfortunately) and google search is poor (unfortunately). A better solution is definitely needed

It seems like a problem solved over and over again in the mid 90s (which really shows in htDig's cosmetics). There must be a better local site search package available.... we could pour hours of time into getting htDig working but at the end of the day it's still htDig, which I'm not sure of others' impressions of but I never really found too visually pleasing. Ideally there would be some tool which we could configure to also search the grass5 docs etc, but move those results all the way down to the end of page 17, with the grass 6.4 hits returning first.
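The ordering wished for here (current-release hits first, grass5 hits pushed to the very end) can be sketched in a few lines. This is only an illustration under assumptions: the version labels such as "grass64" and "grass5" and the (version, page) tuple layout are hypothetical and do not come from any existing search tool.

```python
def rank_hits(hits, preferred=("grass64",), demoted=("grass5",)):
    """Order (version, page) hits: preferred versions first, demoted ones last.

    The version labels are hypothetical placeholders. sorted() is stable,
    so hits inside one bucket keep the order the search engine gave them.
    """
    def bucket(hit):
        version, _page = hit
        if version in preferred:
            return 0
        if version in demoted:
            return 2
        return 1

    return sorted(hits, key=bucket)
```

Because the sort is stable, whatever relevance order the underlying search produced is preserved within each version group.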

and sphinx seems to provide it as it does for many OSGeo projects.

I'd enjoy seeing sphinx in parallel with the html docs and available, they look great, but they do take up more space and subtle things like two spaces in front of ".." instead of three, or not enough whitespace around "*" bullet points can cause your next paragraph to silently not display, with no error logged in the build messages (something I was fighting with two days ago, after a work-day of fighting with whitespace in fortran77 code). I just think we should be careful with the word "replace" for the html docs at this point. As mentioned earlier, my other concern is to throw away all the strong markup and hand crafting (including <br>s) that has gone into the html description.htmls, as ReSt's markup is by design much looser and sensitive. Keeping (valid!) html as the source for those and converting to ReSt automatically with pandoc (IIUC how this is intended to work) would be great. The more the merrier. And pandoc -> LaTeX -> a better PDF booklet while we're at it.

Replying to neteler:

While I agree, we have many places where <br> is probably abused, i.e. in <li> lists and other odd places. pandoc is basically complaining about these suboptimal use cases.

do the pages pass proper html validation checks? I typically set my GRASS_HTML_BROWSER to dillo with the htmlbug validation tool turned on to test as I work.

Or is pandoc not fully supporting valid html &/or brittle in how it does? if so, perhaps a sed -e 's+<br>+<br />+g' pre-processing step (etc) piped in as passing the files to pandoc would work around that deficiency in pandoc, until such time as pandoc is fixed.
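For anyone who prefers to do that pre-processing inside a build script rather than with sed, here is a minimal sketch of the same idea in Python. The function name normalize_br is made up for illustration, and the pattern deliberately handles only attribute-free <br> tags; it has not been tested against pandoc.

```python
import re

# Matches <br>, <BR>, <br/> and <br /> (attribute-free tags only; assumption).
_BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def normalize_br(html: str) -> str:
    """Rewrite every bare line-break tag as the XHTML-style '<br />'."""
    return _BR_TAG.sub("<br />", html)
```

Unlike the plain sed expression, this also catches upper-case <BR> and already self-closed variants, so running it twice gives the same result as running it once.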

Replying to glynn:

Beyond that, I don't really see the point of adding another dependency.

we have an optional make pdfdocs, why not an optional make restdocs too and host them somewhere? If it all works well & is self contained we can look at bundling the same with the binary installers, e.g. as with the new grass-dev-doc package for debian which ships the programmers' manual.

Again, we should be careful with our use of the word "replace"; perhaps "augment" the current offerings is a better term for now? Local off-line search of the help pages is a nice goal, e.g. for use from a laptop in the field. (perhaps there is some python-html grepping library we could use?)
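As a rough idea of what such "python-html grepping" could look like with nothing but the standard library, here is a small sketch. It is not an existing GRASS tool: search_manual and page_text are hypothetical names, and matching is a plain case-insensitive substring test with no ranking.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML page, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Note: text inside <script>/<style> would be collected too.
        self.chunks.append(data)


def page_text(html: str) -> str:
    """Return the page text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def search_manual(html_dir, query):
    """Return sorted names of *.html pages whose text contains every query word."""
    words = query.lower().split()
    hits = []
    for path in sorted(Path(html_dir).glob("*.html")):
        text = page_text(path.read_text(encoding="utf-8", errors="replace")).lower()
        if all(word in text for word in words):
            hits.append(path.name)
    return hits
```

Something like search_manual("dist.<arch>/docs/html", "remove mapset") would then work offline, at the cost of re-reading every page on each query.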

shrug, Hamish

comment:29 in reply to:  28 ; Changed 5 years ago by neteler

Replying to hamish:

Replying to neteler:

...

It seems like a problem solved over and over again in the mid 90s (which really shows in htDig's cosmetics). There must be a better local site search package available.

Maybe, but I spent too much lifetime on this already.

...

and sphinx seems to provide it as it does for many OSGeo projects.

I'd enjoy seeing sphinx in parallel with the html docs and available, they look great,

There seems to be a misunderstanding. The proposal is to *keep* the current HTML docs since the new sphinx mechanism uses them as input.

The point is to offer the resulting HTML pages on the server as well as locally to the user; there is no problem in having two offerings here ("classical" HTML pages and the "new" ones).

As mentioned earlier, my other concern is to throw away all the strong markup and hand crafting (including <br>s) that has gone into the html description.htmls,

Sure, nobody said this. I just pointed out that there are HTML errors in the current HTML pages which need to be fixed anyway (and which will make pandoc more happy). I am surprised that some of these pass the W3 validator.

...

And pandoc -> LaTeX -> a better PDF booklet while we're at it.

Yes, as long as they don't become a 1000-page manual which then gets printed and kills the rain forest.

Replying to neteler:

While I agree, we have many places where <br> is probably abused, i.e. in <li> lists and other odd places. pandoc is basically complaining about these suboptimal use cases.

do the pages pass proper html validation checks?

Strangely yes. See for example r52667 for changes which improve the current HTML and which help pandoc as well.

Or is pandoc not fully supporting valid html &/or brittle in how it does? if so, perhaps a sed -e 's+<br>+<br />+g' pre-processing step (etc) piped in as passing the files to pandoc would work around that deficiency in pandoc, until such time as pandoc is fixed.

... your suggestion needs to be tested.

Local off-line search of the help pages is a nice goal, e.g. for use from a laptop in the field.

Also: not all people in the world are always online... a searchable user manual is a must have nowadays. Especially when offering 400 modules.

Markus

comment:30 in reply to:  29 ; Changed 5 years ago by glynn

Replying to neteler:

There seems to be a misunderstanding. The proposal is to *keep* the current HTML docs since the new sphinx mechanism uses them as input.

In which case, what is the point of the --rest-description option? Also, Rest.make, restdir target, etc? IOW, why does the ReST generation require anything other than the generated HTML files in dist.<arch>/docs/html/*.html?

comment:31 in reply to:  30 ; Changed 5 years ago by wenzeslaus

Replying to glynn:

In which case, what is the point of the --rest-description option? Also, Rest.make, restdir target, etc? IOW, why does the ReST generation require anything other than the generated HTML files in dist.<arch>/docs/html/*.html?

The generated HTML files (in dist.<arch>/docs/html/*.html) are not enough because, for example, the parameter list is represented as an HTML description list, while ReST has its own representation of a parameter list (the ReST option list). Once the module description is converted to HTML, the information on whether a given description list is the module parameter list or some general list in the hand-written module description (module.html file) is lost.

The conversion of the complete generated HTML files (in dist.<arch>/docs/html/*.html) is possible, but there are only two options. The first one is to use a generic converter (as is now used for module.html files), but then no clever standard formatting in ReST can be used. The second one is to create a custom (context-aware) transformation which uses both HTML markup and contents (e.g. contents of the <h2> tag), but this can be a lot of work (I've tried it using XSLT but I gave up). Another option would be to use XSLT and the generated XML description, but direct generation of the ReST description seems like a less complex solution to me.

comment:32 in reply to:  31 ; Changed 5 years ago by glynn

Replying to wenzeslaus:

Once the module description is converted to HTML, the information on whether a given description list is the module parameter list or some general list in the hand-written module description (module.html file) is lost.

The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.
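To make the idea of a context-aware extraction concrete, here is a minimal sketch that uses only the standard library rather than tools/g.html2man/html.py (whose API is not shown in this ticket). It labels each <dl> by the closest preceding <h2>; the heading texts "Parameters" and "Flags" are assumptions about the generated markup, not a confirmed format.

```python
from html.parser import HTMLParser


class ListClassifier(HTMLParser):
    """Record, for every <dl>, the text of the closest preceding <h2>."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.current_heading = ""
        self.lists = []  # one heading text per <dl>, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.current_heading = ""
        elif tag == "dl":
            self.lists.append(self.current_heading.strip())

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2:
            self.current_heading += data


def classify_lists(html, option_headings=("Parameters", "Flags")):
    """Label each <dl> as 'option-list' or 'generic-list' (headings are assumed)."""
    parser = ListClassifier()
    parser.feed(html)
    parser.close()
    return [
        "option-list" if heading in option_headings else "generic-list"
        for heading in parser.lists
    ]
```

A converter could then emit a ReST option list for the first kind and an ordinary definition list for the second, without any extra switch in the parser.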

The second one is to create a custom (context aware) transformation which uses both HTML markup and contents (e.g. contents of <h2> tag)

This is what I'm proposing.

but this can be a lot of work (I've tried it using XSLT but I gave it up).

IMHO, it's preferable to cluttering up the build system with ReST-specific features. The nroff manual pages are generated without requiring anything beyond one rule in Html.make and one in man/Makefile. An added advantage is that any errors which occur while generating them result in the corresponding module being listed in the error.log file.

As it stands, I'm inclined to revert most of r52656 (other than the fixes to v.in.ogr.html, which should have been a separate commit). Also r52459, unless there's some other use for it.

comment:33 in reply to:  32 ; Changed 5 years ago by neteler

Replying to glynn:

The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.

While well defined, it is much easier and not that invasive to directly render the module parameters/flags descriptions in ReST.

IMHO, it's preferable to cluttering up the build system with ReST-specific features.

A Makefile guru may well see that the current approach could be simplified (rather than ditched).

The usage of Sphinx offers capabilities we cannot achieve in a different way from the current HTML documentation. And the current HTML core pages will remain as before, just an additional output is rendered:

HTML core page (as present) --+
                              |
g.parser --> HTML ------------+---> HTML as currently


HTML core page (as present) --+
                              |
g.parser --> REST ------------+---> pandoc ---> Sphinx ---> additional alternative
                                                            HTML with search

comment:34 in reply to:  33 Changed 5 years ago by hellik

Replying to neteler:

Replying to glynn:

The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.

While well defined, it is much easier and not that invasive to directly render the module parameters/flags descriptions in ReST.

maybe related:

http://lists.osgeo.org/pipermail/grass-commit/2012-August/023889.html

Log: Make --html-description output easier to parse; add ReST generator

comment:35 in reply to:  33 Changed 5 years ago by glynn

Replying to neteler:

The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.

While well defined, it is much easier and not that invasive to directly render the module parameters/flags descriptions in ReST.

IMHO, it's preferable to cluttering up the build system with ReST-specific features.

A Makefile guru may well see that the current approach could be simplified (rather than ditched).

The current approach mirrors the mechanism used to generate the HTML files, which is significantly more involved than the mechanism used to generate the manual pages from the completed HTML files. If we can generate the ReST files directly from the completed HTML files (and there's no fundamental reason why we can't), it would simplify the build process somewhat.

In r52956, I've modified the --html-description output to make it easier to parse (adding DIV tags around various sections) and added a script to generate ReST output from the completed HTML pages.

comment:36 in reply to:  32 ; Changed 5 years ago by glynn

Replying to glynn:

As it stands, I'm inclined to revert most of r52656

Done in r53240.

I've kept the v.in.ogr.html fixes, as well as the various Python scripts (which aren't being used), but reverted the Makefile changes.

If you need help on doing this correctly (i.e. like how the manual pages are built), or additional changes to the HTML format, please ask.

comment:37 in reply to:  36 Changed 5 years ago by neteler

Replying to glynn:

I've kept the v.in.ogr.html fixes, as well as the various Python scripts (which aren't being used), but reverted the Makefile changes.

For the record, the topics have been reinstated in r53525 and r53526.

If you need help on doing this correctly (i.e. like how the manual pages are built), or additional changes to the HTML format, please ask.

The topics page (http://grass.osgeo.org/grass70/manuals/html70_user/topics.html) should become two or three column...

comment:38 Changed 5 years ago by timmie

Cc: timmie added

So where are we now?

  • Keeping the HTML docs is apparently the consensus
  • The current docs are not searchable
  • So can the technology behind the Sphinx search be used for GRASS?
  • What about other approaches like Whoosh [1]?

Ideally, the search would be on the website but also in the wxGui.

[1] http://pythonhosted.org/Whoosh/intro.html

comment:39 Changed 5 years ago by timmie

Component: changed from Website to Docs

comment:40 in reply to:  38 Changed 4 years ago by wenzeslaus

Replying to timmie:

So where are we now?

I'm interested too, the last commit linked here is a revert (r53240).

Keeping the HTML docs is apparently the consensus

Of course, this is the format in which it is stored, and the resulting pages could even use some JavaScript or some additional processing during compilation (this wasn't really explored so far).

The current docs are not searchable

Using Sphinx would be a nice workaround to buy time to solve our custom search.

So can the technology behind the Sphinx search be used for GRASS?

Wouldn't this be much harder than using Sphinx and our HTML together? Sphinx can still be better for Python developers while HTML would be for other users?

I guess that using Sphinx parts would be more difficult than using some standalone package (but really just guessing).

What about other approaches like Whoosh?

And what about some JavaScript solutions, at least for keywords, labels, descriptions and names?

Ideally, the search would be on the website but also in the wxGui.

There is a different search in the wxGUI in Layer Manager, Search modules tab. You can search for a module according to keywords, label, description and name (all at once). To get the documentation you currently have to open the module dialog/form and go to the Manual tab. A better way would be to open the manual page directly from the Search modules tab. A similar thing is implemented in the extension manager/addons installer. As for searchable manual pages in the GUI, I'm not sure what would be the easiest way to implement this.

PS: [1] is interpreted by Trac as changeset link while [http://abc.org Abc] is interpreted as link with text. And note that Trac syntax for bullet lists is "space-star-space":

 * dsd
 * sdsasd

It would be really great to have http://trac.osgeo.org/osgeo/ticket/592 solved, so we would see the live preview of the ticket (instead of pressing Preview button at the bottom of the page).

comment:41 Changed 3 years ago by wenzeslaus

Keywords: JavaScript Sphinx Website added

As I was saying, perhaps some JavaScript which would go through some JSON or XML file would be enough? Search could be graphically incorporated in the same way as TOC. The JSON/XML file would be generated during build and would contain name, label, description and keywords for each module. This wouldn't be full text but it is good enough. It works well (enough) in Search modules tab in wxGUI.

This search could be attached to each page but there could be also a separate page. However, this wouldn't work so well I think (it would be less prominent). The file with the metadata would be quite large but with today's web and loading on demand it could work. Some special care would have to be taken for the local pages.
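A rough sketch of the metadata file idea follows. The build step would naturally live in the (Python-based) build scripts; the lookup is shown in Python as well only to keep the example in one language, whereas in the proposal it would run as JavaScript in the browser. The field names and both function names are assumptions, not an agreed format.

```python
import json


def build_index(modules):
    """Serialise module metadata to a JSON string, one record per module.

    Each input item is a dict with at least "name"; "label", "description"
    and "keywords" are optional (hypothetical field names).
    """
    records = [
        {
            "name": m["name"],
            "label": m.get("label", ""),
            "description": m.get("description", ""),
            "keywords": list(m.get("keywords", [])),
        }
        for m in modules
    ]
    return json.dumps(records, indent=2, sort_keys=True)


def search_index(index_json, term):
    """Return names of modules whose metadata contains the term (case-insensitive)."""
    term = term.lower()
    names = []
    for record in json.loads(index_json):
        haystack = " ".join(
            [record["name"], record["label"], record["description"], *record["keywords"]]
        ).lower()
        if term in haystack:
            names.append(record["name"])
    return names
```

With one small record per module, the file stays compact compared to a full text index, which is the trade-off described above.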

Somebody interested in some JavaScript development?

comment:42 Changed 19 months ago by martinl

Milestone: changed from 7.0.0 to 7.0.5

comment:43 Changed 18 months ago by martinl

Milestone: changed from 7.0.5 to 7.2.0

There were some attempts for sphinx support, changing milestone to 7.2

comment:44 Changed 11 months ago by neteler

Milestone: changed from 7.2.0 to 7.2.1

Ticket retargeted after milestone closed

comment:45 Changed 9 months ago by wenzeslaus

Milestone: changed from 7.2.1 to 7.4.0

The current state:

As mentioned above, the Modules tab supports search in keywords, names and descriptions. In trunk (for 7.4), there is also an Advanced search button which opens G7:g.search.modules, which has been available since 7.2 and can do full text search in manual pages.

Also, there is a Google search on the main website, which needs a change from http:// to https:// and which searches through the website and all documentation.

https://grass.osgeo.org/documentation/search-engine

Sphinx is used for Python documentation and has search (which has full text):

https://grass.osgeo.org/grass72/manuals/libpython

Doxygen is used for C documentation which also has its own search (which doesn't have full text):

https://grass.osgeo.org/programming7

comment:46 in reply to:  45 Changed 9 months ago by neteler

Replying to wenzeslaus:

Also, there is a Google search on the main website, which needs a change from http:// to https:// and which searches through the website and all documentation.

Done.

https://grass.osgeo.org/documentation/search-engine
