
Version 3 (modified by wenzeslaus, 6 years ago)

more reasoning for Unicode (or not)

Python 3 support in GRASS

Python versions

  • keep compatibility with 2.7 (may still work with 2.6, but we don't care)
  • port to work with 3.5

Python components include:

  • Python Scripting Library
  • PyGRASS
  • Temporal Library
  • ctypes
  • wxGUI

Python Scripting Library

What to consider:

  • The API is used not only by the GRASS Development Team (core devs) but in general, e.g. by writing addons or custom user scripts.
    • Maybe the core devs can be convinced to follow certain special practices for the core modules, but it doesn't seem realistic that addon contributors will follow them if they are too distant from what is standard for the language (a less serious example is requiring PEP8 conventions versus some custom ones).
    • The purpose of the API is to make it simple for people to use and extend GRASS GIS.
  • Trained (and even non-trained) Python 3 programmers will expect the API to behave in the same way as the standard library and the language in general.
    • One writes os.environ['PATH'], not os.environ[b'PATH'] nor os.environ[u'PATH'].
    • The GUI needs Unicode in the end.
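The point about os.environ can be checked with a short sketch under Python 3. The variable name GRASS_EXAMPLE_VAR is made up for this illustration and is not a variable GRASS uses:

```python
import os

# Hypothetical variable name, set here only so that the lookup succeeds.
os.environ["GRASS_EXAMPLE_VAR"] = "value"

# The standard library expects a plain (Unicode) string key and returns str.
value = os.environ["GRASS_EXAMPLE_VAR"]

# In Python 3, a bytes key is rejected with TypeError instead of being
# silently treated as the same key.
try:
    os.environ[b"GRASS_EXAMPLE_VAR"]
    bytes_key_accepted = True
except TypeError:
    bytes_key_accepted = False
```

Users who are used to this behavior will likely expect the same from the GRASS API.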

Possible approach:

  • functions need to accept Unicode and return Unicode
  • functions wrapping the Python Popen class (read_command, run_command, ...) will have an encoding parameter
    • encoding=None means the function expects and returns bytes (the current state)
    • encoding='default' means it uses the current encoding obtained from utils._get_encoding()
    • any other value, e.g. encoding='utf-8', means it uses the encoding the user specifies
    • this is similar to the Popen class in Python 3.6
    • the default is encoding='default', so that users get the behavior they expect; the following example shows what happens in Python 3 if we keep using bytes instead of Unicode:
# returns bytes
ret = read_command('r.what', encoding=None, ...)

for item in ret.splitlines():
    line = item.split('|')[3:]

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'str'

# we would have to use a bytes separator:
for item in ret.splitlines():
    line = item.split(b'|')[3:]
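The three encoding modes described above could be sketched roughly as follows. This is not the actual GRASS implementation: read_command_sketch and get_default_encoding are hypothetical names, locale.getpreferredencoding() is only an assumed stand-in for utils._get_encoding(), and a small Python subprocess stands in for a GRASS module such as r.what.

```python
import locale
import subprocess
import sys


def get_default_encoding():
    """Assumed stand-in for utils._get_encoding(); the real helper may differ."""
    return locale.getpreferredencoding(False) or "utf-8"


def read_command_sketch(args, encoding="default"):
    """Run a command and return its standard output.

    encoding=None      -> return raw bytes
    encoding='default' -> decode using the current (locale) encoding
    any other value    -> decode using that encoding, e.g. 'utf-8'
    """
    process = subprocess.Popen(args, stdout=subprocess.PIPE)
    stdout, _ = process.communicate()
    if encoding is None:
        return stdout
    if encoding == "default":
        encoding = get_default_encoding()
    return stdout.decode(encoding)


# Hypothetical stand-in for a module that prints '|'-separated values.
command = [sys.executable, "-c", "print('x|y|label|1|2')"]

text = read_command_sketch(command)                # str, the proposed default
raw = read_command_sketch(command, encoding=None)  # bytes, the current state

# With str output, ordinary string literals work as separators.
text_fields = [line.split("|")[3:] for line in text.splitlines()]
# With bytes output, the separator has to be bytes as well.
raw_fields = [line.split(b"|")[3:] for line in raw.splitlines()]
```

With the default, user code can keep using ordinary string literals; only callers that explicitly ask for encoding=None need to handle bytes.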

Making Unicode the default type in the API, for keys but also for many values, is consistent with Unicode being the default string literal type in Python 3. API users will expect that expressions such as the hypothetical computation_region['north'] will work. Unlike Python 2, Python 3 distinguishes between computation_region[u'north'] and computation_region[b'north']. Compare the dictionary behavior in Python 2 and 3:

# Python 2
>>> d = {'a': 1, b'b': 2}
>>> d['b']
2
>>> d[u'b']
2
>>> # i.e. no difference between u'' and b'' keys
>>> # and that also applies when creating the dictionary:
>>> d = {u'a': 1, b'a': 2}
>>> d['a']
2
>>> # because
>>> d
{u'a': 2}

# Python 3
>>> # unlike in 2, we get now two entries:
>>> d = {'a': 1, b'a': 2}
>>> d
{b'a': 2, 'a': 1}
>>> d['a']
1
>>> d[b'a']
2
>>> # it becomes a little confusing when we combine unicode and bytes keys
>>> d = {'a': 1, b'b': 2}
>>> d['a']
1
>>> d['b']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'b'
>>> d[b'b']
2
>>> # in other words, the user needs to know that the key is bytes and specify it as such
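One way an API could avoid bytes keys is to decode the output before building the dictionary. The following is a minimal sketch under that assumption: decode and parse_key_value are hypothetical helpers written for this illustration, and the sample output is made up.

```python
def decode(value, encoding="utf-8"):
    """Return str for both bytes and str input (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def parse_key_value(output, separator="="):
    """Parse key=value lines into a dictionary with str keys and values."""
    result = {}
    for line in decode(output).splitlines():
        if separator not in line:
            continue  # skip lines that do not contain a key-value pair
        key, value = line.split(separator, 1)
        result[key.strip()] = value.strip()
    return result


# Made-up sample output, given as bytes the way a subprocess would return it.
raw_output = b"north=228500\nsouth=215000\n"

computation_region = parse_key_value(raw_output)
north = computation_region["north"]  # a plain string key works
```

Because the keys are decoded once at the boundary, users never have to know whether the underlying output was bytes.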
