Changeset 66518


Timestamp:
Oct 16, 2015, 12:14:06 PM (9 years ago)
Author:
wenzeslaus
Message:

get the page description for addons index also from meta comment and text

When the page does not have the standard format or content, try to
use meta in HTML comments and when this fails try to extract first
sentence from the text.
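
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the committed code: the helper names `desc_from_meta_comment`, `desc_from_first_sentence` and `fallback_description` are hypothetical, and the sketch assumes the meta comment sits on one line and that a sentence ends with a dot followed by whitespace or the end of the text.

```python
import re

# Assumed format of the manually added comment, as used in this changeset:
# <!-- meta page description: Some text-->
META_RE = re.compile(r"<!-- meta page description:(.*?)-->")


def desc_from_meta_comment(text):
    """Return the description from a meta comment, or None if there is none."""
    match = META_RE.search(text)
    return match.group(1).strip() if match else None


def desc_from_first_sentence(text):
    """Return the text up to the first dot followed by whitespace or end of text."""
    # Non-capturing group, so re.split returns only the pieces between matches.
    sentence = re.split(r"\.(?:\s|$)", text, maxsplit=1)[0]
    # Strip leading whitespace and put back the dot removed by the split.
    return sentence.lstrip() + "."


def fallback_description(page_text, section_text):
    """Try the meta comment first, then the first sentence of the section."""
    return desc_from_meta_comment(page_text) or desc_from_first_sentence(section_text)
```

Like the function in the changeset, the first-sentence helper returns the whole text with a dot appended when no sentence end is found, so callers may want to limit how much text they pass in.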

Also, don't search for the description line in the whole file when
the Keywords section is missing.
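
One way to picture the bounded search is the sketch below. It is an assumption-laden illustration rather than the actual loop from `get_page_description.py`: the function name `find_description_line` and the way the text after ` - ` is extracted are hypothetical, while the three patterns mirror the ones in the diff.

```python
import re

BLOCK_START = re.compile(r"NAME")
# Incomplete pages may have NAME followed directly by DESCRIPTION,
# so either heading closes the block.
BLOCK_END = re.compile(r"KEYWORDS|DESCRIPTION")
DESC_LINE = re.compile(r" - ")


def find_description_line(lines):
    """Return the text after ' - ' found between NAME and the block end, or None."""
    in_block = False
    for line in lines:
        if BLOCK_START.search(line):
            in_block = True
        elif BLOCK_END.search(line):
            in_block = False
        # Only lines inside the NAME block are candidates.
        if in_block and DESC_LINE.search(line):
            return line.split(" - ", 1)[1].strip()
    return None
```

With a page that lacks the Keywords section, a stray line such as ` - This line is supposed to look like description line` placed after the DESCRIPTION heading is ignored, because the block has already been closed.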

Location:
grass-addons/tools/addons
Files:
4 edited
1 copied

  • grass-addons/tools/addons/get_page_description.py

    r66517 r66518  
    2424
    2525
     26def get_desc_from_comment_meta_line(text):
     27    """
     28    >>> get_desc_from_comment_meta_line("<!-- meta page description: Abc abc-->")
     29    'Abc abc'
     30    """
     31    text = text.split("<!-- meta page description:", 1)[1]
     32    text = text.split("-->", 1)[0]
     33    return text.strip()
     34
     35
     36def get_desc_from_desc_text(text):
     37    r"""Get description defined as first sentence in the given text.
     38
     39    Sentence is defined as text which ends with dot and space.
     40    The string is expected to contain this. The other case is not handled.
     41
     42    >>> get_desc_from_desc_text("Abc abc.abc abc.")
     43    'Abc abc.abc abc.'
     44    >>> get_desc_from_desc_text("Abc abc.abc abc. ")
     45    'Abc abc.abc abc.'
     46    >>> get_desc_from_desc_text("Abc abc.abc\n abc.\n")
     47    'Abc abc.abc\n abc.'
     48    """
     49    # this matches the sentence but gives also whole string even if it
     50    # is not the sentence
     51    text = re.split(r"\.(\s|$)", text, 1)[0]
     52    # strip spaces at the beginning and add the stripped dot back
     53    return text.lstrip() + '.'
     54
     55
    2656def main(filename):
    2757    with open(filename) as page_file:
    2858        desc = None
    2959        in_desc_block = False
     60        in_desc_section = False
     61        desc_section = ''
     62        desc_section_num_lines = 0
    3063        desc_block_start = re.compile(r'NAME')
    31         desc_block_end = re.compile(r'KEYWORDS')
     64        # the incomplete manual pages have NAME followed by DESCRIPTION
     65        desc_block_end = re.compile(r'KEYWORDS|DESCRIPTION')
     66        desc_section_start = re.compile(r'DESCRIPTION')
    3267        desc_line = re.compile(r' - ')
     68        comment_meta_desc_line = re.compile(r'<!-- meta page description:.*-->')
    3369        for line in page_file:
    3470            line = line.rstrip()  # remove '\n' at end of line
     
    4076                if desc_line.search(line):
    4177                    desc = get_desc_from_manual_page_line(line)
     78            # if there was nothing in the generated section of the page
     79            # try to find manually added meta comments which are placed
     80            # at the beginning of the manually edited part of the page
     81            if not desc and comment_meta_desc_line.search(line):
     82                desc = get_desc_from_comment_meta_line(line)
     83            # if there was nothing else, last thing to try is to get the first
     84            # sentence from the description section (which is also the last
     85            # item in the file from all things we are trying)
     86            if in_desc_section:
     87                desc_section += line + "\n"
     88                desc_section_num_lines += 1
     89                if desc_section_num_lines > 4:
     90                    in_desc_section = False
     91            if not desc and desc_section_start.search(line):
     92                in_desc_section = True
     93        if not desc and desc_section:
     94            desc = get_desc_from_desc_text(desc_section)
    4295        if not desc:
    4396            desc = "(incomplete manual page, please fix)"
  • grass-addons/tools/addons/test/data/g.broken.example.html

    r66517 r66518  
    1414<h2>DESCRIPTION</h2>
    1515
    16 This is a test page which should be emulate a broken manual page.
     16This is a test page which should emulate a broken manual page.
    1717This can happen for example, when module cannot generate a proper
    1818description (broken imports, not using parser, etc.).
    19 
    2019
    2120<h2>SEE ALSO</h2>
  • grass-addons/tools/addons/test/data/g.no.keywords.html

    r66517 r66518  
    1010
    1111<h2>NAME</h2>
    12 <em><b>r.broken.example</b></em> <h2>KEYWORDS</h2>
     12<em><b>r.broken.example</b></em>
    1313
    1414<h2>DESCRIPTION</h2>
    1515
    16 This is a test page which should be emulate a broken manual page.
     16This is a test page which should emulate a broken manual page
     17without keywords section.
    1718This can happen for example, when module cannot generate a proper
    1819description (broken imports, not using parser, etc.).
    1920
     21 - This line is supposed to look like description line but it is in a wrong place.
    2022
    2123<h2>SEE ALSO</h2>
  • grass-addons/tools/addons/test/data/wxGUI.example.html

    r66517 r66518  
    1515<h2>DESCRIPTION</h2>
    1616
    17 This is a test page which should be similar to a wxGUI manual pages.
     17This is a test page which should be similar to wxGUI manual pages.
    1818
    1919
  • grass-addons/tools/addons/test/test_description_extraction.sh

    r66517 r66518  
    99../get_page_description.py data/wxGUI.example.html
    1010../get_page_description.py data/g.broken.example.html
     11../get_page_description.py data/g.no.keywords.html