Opened 2 years ago

Last modified 13 months ago

#692 new defect

Python 3.9 on v2 installer breaks part of stdlib

Reported by: akominlsfi
Owned by: osgeo4w-dev@…
Priority: major
Component: Installer
Version:
Keywords:
Cc:

Description

Python 3.9 (with v2 installer) breaks stdlib usage: ssl and sqlite, or venv, depending on the way python is executed.

Our current use case is creating a venv for QGIS plugin development on Windows machines. We run python-qgis-ltr -m venv --system-site-packages, add a .pth file to the created venv that points to the qgis python module directory (since it is not in site-packages), and add the necessary dll directories from the osgeo install to PATH. That allows installing development modules into the virtual environment while keeping the osgeo install's site-packages clean.
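The .pth step of this workflow could be sketched roughly as follows. This is a minimal illustration, not part of the osgeo4w tooling: the helper name write_pth and the default file name qgis.pth are assumptions, and the venv itself is assumed to have been created already with the command above.

```python
from pathlib import Path


def write_pth(site_packages, extra_dir, name="qgis.pth"):
    """Write a .pth file so that extra_dir is added to sys.path at startup.

    Hypothetical helper: site_packages is the venv's site-packages directory,
    extra_dir is the directory that contains the qgis python module.
    """
    pth_file = Path(site_packages) / name
    # The site module reads each line of a .pth file found in site-packages
    # and appends it to sys.path if the path exists.
    pth_file.write_text(str(extra_dir) + "\n", encoding="utf-8")
    return pth_file
```

For the setup described in this ticket it would be called with the venv's site-packages directory and the qgis python module directory of the install, both of which depend on the local install paths.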

For the current v2 qgis-ltr package, python-qgis-ltr.bat points to the exe in apps/python39/python.exe, while the ssl and sqlite dlls are in bin, so neither of them can be resolved by default. Creating a venv works, but the venv has the same problem with ssl and sqlite.

When this fix is made, the bat will point to the exe in bin/python.exe, and the ssl and sqlite dlls can be resolved by default. This will, however, break the venv module: the PYTHONHOME that is set up by the bat cannot be baked into the venv, so the venv will try to find the standard library paths relative to the original executable, e.g. bin/DLLS, bin/lib.

It seems that previously (in v1) no dlls were duplicated to apps/python37/dlls, since adding bin to PATH was enough to resolve the sqlite and ssl dlls. The current v2 installer duplicates only the ssl dlls to the apps/python39/dlls directory, but there seems to be a problem: libcrypto-1_1.dll and libssl-1_1.dll are found next to the _ssl.pyd module, yet the import still seems to fail because libssl-1_1.dll depends on another dll found only in bin, libcrypto-1_1_x64.dll. After copying that dll to apps/python39/dlls, using the original bat pointing to the exe in apps/python39/python.exe, and starting an interactive console, import ssl now succeeds. The same goes for the sqlite3 module: import sqlite3 works after copying sqlite3.dll from bin to apps/python39/dlls. There is this change on how the ssl dlls work, but it seems the problem with x64 vs non-x64 named dlls has resurfaced?
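A quick way to reproduce the check described above, instead of typing the imports into an interactive console, might look like this. It is only a diagnostic sketch; the function name check_imports is made up for illustration.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return {name: error message or None}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            # On Windows a missing dependent dll typically surfaces here as an
            # ImportError ("DLL load failed while importing ...").
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in check_imports(["ssl", "sqlite3"]).items():
        print(name, "OK" if error is None else "FAILED: " + error)
```

Running it once through the wrapper bat and once directly with apps/python39/python.exe should show which of the two modules fails in which setup.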

Some of this has also been documented in this QGIS issue, when I looked into this previously.

Is duplicating the dlls required by the stdlib next to the relevant .pyd modules a valid fix? Since the ssl dlls are duplicated already (partially, the x64-labeled one is missing), the sqlite dll is the only other one I found relevant by looking at the .pyd files with Dependency Walker. This would help somewhat, but it would still leave a problem with all the osgeo-installed python modules that have their dlls in bin and their .pyds somewhere in site-packages. The bat fix pointing to the different exe works for those, but breaks the venv module.

Another option I just thought of could be to patch the importlib module of the osgeo-installed python standard library. It could be possible to add all the dll paths of the osgeo install tree in the importlib init. This is currently done by the qgis and PyQt5 modules to find their own requirements. This could help to preserve the original bat style (pointing to apps/python39/python.exe) with no duplication necessary for the stdlib dlls: import ssl and import sqlite3 would work due to the patch, and venv would work since the original executable directory structure is as expected (and the importlib patch is also valid for imports in the venv, since that comes from the shared stdlib).

Change History (7)

comment:1 by akominlsfi, 2 years ago

Regarding the patching of dll directories, I found PEP 648 for site customizations, which seems to be something osgeo4w packaging could benefit a lot from (each package that gets installed and needs, for example, shared dlls could add its own script to the directory).

Since that PEP is still a draft, sitecustomize or one or more custom .pth files (using the site module) could be the way to go for now regarding the patching of dll directories? These could be located in site-packages or next to apps/python37/python.exe. Creating a venv without the --system-site-packages flag needs the patching anyway if the stdlib dlls are not duplicated (for ssl and sqlite3 to work), so the customization could always be done by using the exe path.

This change should preserve the current usage (osgeo4w shell, python-qgis.bats, anything else?) and also the python environment would work now as a standalone installation directly from the exe that just happens to include a lot of stuff in system site-packages without any special preset environment needed before execution.


Side note: It could also help if the QGIS install added a .pth file to site-packages. The module is currently resolved by the wrapper bat extending PYTHONPATH to include apps/qgis/python, but since import qgis now works whenever the module is found at all, even if the wrapper bat is not used (fixed here), it could be used as if it were a normal site-package in the python environment with the help of a .pth file pointing to the module path.

comment:2 by akominlsfi, 2 years ago

I tested this theory on a fresh v2 network install with only qgis-ltr-full (3.16.11) selected.

Creating a file <osgeo-install>/apps/python39/sitecustomize.py with these contents:

import os

os.add_dll_directory('<osgeo-install>/bin')
os.add_dll_directory('<osgeo-install>/apps/qgis-ltr/bin')
os.add_dll_directory('<osgeo-install>/apps/Qt5/bin')

and a file <osgeo-install>/apps/python39/lib/site-packages/qgis.pth with these contents:

<osgeo-install>/apps/qgis-ltr/python

Then executing python directly from <osgeo-install>/apps/python39/python.exe works with all the imports I could think of testing (qgis, PyQt5, osgeo.gdal, osgeo.ogr, scipy, numpy, pandas, matplotlib, psycopg2, pyproj), which have .pyds and may have required dlls somewhere else in the install tree. I am not sure what the best way is to bake in the env variables that would be set up by the <osgeo-install>/etc/ini/*.bat files (e.g. GDAL_DATA, PROJ_LIB), which might be necessary for the environment to function in some cases. Could that also be done in the customization script: set each variable if it is not already present?
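The "set variable if not already present" idea could be sketched like this. The helper name set_default_env is hypothetical, and the actual values would have to come from the etc/ini/*.bat files, which are not reproduced here.

```python
import os


def set_default_env(defaults, environ=os.environ):
    """Set each variable only if it is not already present.

    Returns a dict of the variables that were actually added, so values set
    beforehand by a wrapper bat or the osgeo4w shell are left untouched.
    """
    added = {}
    for key, value in defaults.items():
        if key not in environ:
            environ[key] = value
            added[key] = value
    return added
```

In a sitecustomize.py this would be called with a mapping such as GDAL_DATA and PROJ_LIB to their install-specific directories.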

Also with this setup creating a venv works both without system site packages (qgis etc. not importable as would be expected), and with system site packages (qgis etc. importable), since the dll path patch is found also when executing the venv python exe.

in reply to:  2 ; comment:3 by jef, 2 years ago

Replying to akominlsfi:

I tested this theory on a fresh v2 network install with only qgis-ltr-full (3.16.11) selected.

Creating a file <osgeo-install>/apps/python39/sitecustomize.py with these contents:

os.add_dll_directory('<osgeo-install>/apps/qgis-ltr/bin')

we have multiple versions of qgis.

in reply to:  3 ; comment:4 by akominlsfi, 2 years ago

Replying to jef:

we have multiple versions of qgis.

Well yes, that was specifically for my install since the package name and the paths would anyway need templating at install time. I assume something like a pre-post install hook could be useful to add/remove necessary lines in sitecustomize.py or to create/remove necessary pth files?

It's not necessarily specific to QGIS (it just appeared when using the python-qgis wrapper bat): installing only python+gdal has the same issue, with the python module in site-packages and the dlls in bin. venvs are broken if using the bin exe, or dll paths need to be set up if using the apps exe.

Does this sound like a viable idea? It would help development a lot if venvs could be supported. If you can give some pointers on how this could be implemented in the scripts, we could possibly make a draft pull request of the proposed change later; we'd need this when updating to the next QGIS LTR later this year. As it is possible to patch things on one's own machine after install, it's not really a blocker, but I think this could benefit osgeo4w as well?

in reply to:  4 ; comment:5 by jef, 2 years ago

Replying to akominlsfi:

Replying to jef:

we have multiple versions of qgis.

Well yes, that was specifically for my install since the package name and the paths would anyway need templating at install time. I assume something like a pre-post install hook could be useful to add/remove necessary lines in sitecustomize.py or to create/remove necessary pth files?

How would that work for multiple installed versions of QGIS?

in reply to:  5 comment:6 by akominlsfi, 2 years ago

Replying to jef:

How would that work for multiple installed versions of QGIS?

Oh I see, it's possible to select multiple QGIS versions with different suffixes for the same install tree (qgis, qgis-ltr, qgis-ltr-dev etc.)? It seems adding the pth file and dll path for qgis can't work properly in that case, so there is probably no reason for different behaviour when only one qgis exists. That can easily be patched manually in venvs or directly in the install if necessary, since the correct dll paths will be added by the qgis module itself and only a pth file is necessary for the module to be found.

How about the bin path added always as a dll path, and qt if installed?

Since currently neither the bin nor the apps python exe works without some environment setup beforehand, the dll path setup should allow direct execution of the apps exe and still keep any osgeo shell or qgis bat wrapped calls to the bin exe working, since those set PYTHONHOME.

PyQt5 sets up its own dll paths only if the dlls are found in PATH, so it could benefit from a sitecustomize, so that PATH does not necessarily need to contain the qt dlls.
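A sitecustomize.py along these lines (bin always, Qt only if installed) might look roughly like the sketch below. It assumes the layout seen in this ticket, with the base interpreter living in <osgeo-install>/apps/python39 and the Qt dlls in apps/Qt5/bin; the function names are made up for illustration.

```python
import os
import sys
from pathlib import Path


def candidate_dll_dirs(base_prefix):
    """Return the dll directories of the install tree that actually exist."""
    # Assumed layout: base_prefix is <osgeo-install>/apps/python39,
    # so the install root is two levels up.
    root = Path(base_prefix).resolve().parent.parent
    candidates = [root / "bin", root / "apps" / "Qt5" / "bin"]
    return [d for d in candidates if d.is_dir()]


def register_dll_dirs(dirs):
    """Register directories with os.add_dll_directory where that API exists."""
    # os.add_dll_directory is only available on Windows (Python 3.8+).
    if not hasattr(os, "add_dll_directory"):
        return []
    return [os.add_dll_directory(str(d)) for d in dirs]


# sys.base_prefix points at the base installation even inside a venv,
# so the same directories are found when the venv python exe is executed.
register_dll_dirs(candidate_dll_dirs(sys.base_prefix))
```

Because nothing QGIS-specific is hardcoded, this would not depend on which of the qgis, qgis-ltr or qgis-ltr-dev packages is installed.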

comment:7 by Andreas Müller, 13 months ago

I created the file sitecustomize.py with this code:

import os
for p in os.getenv("PATH").split(";"):
    if os.path.exists(p):
        os.add_dll_directory(p)

and it seems to fix issues I had with an IDE I use (PyScripter using PyQt5), plus using pip in a virtual environment (where I got an ssl issue).
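A slightly more defensive variant of the same idea is sketched below. It is not the code from the comment above: it additionally skips empty, relative and duplicate PATH entries, tolerates an unset PATH, and guards the call so the file also loads on platforms without os.add_dll_directory. The helper name existing_path_dirs is hypothetical.

```python
import os


def existing_path_dirs(path_value):
    """Split a PATH-style string and keep absolute entries that are directories."""
    dirs = []
    for entry in path_value.split(os.pathsep):
        entry = entry.strip('"')
        if entry and os.path.isabs(entry) and os.path.isdir(entry) and entry not in dirs:
            dirs.append(entry)
    return dirs


# os.add_dll_directory exists only on Windows (Python 3.8+), so guard the call.
if hasattr(os, "add_dll_directory"):
    for directory in existing_path_dirs(os.environ.get("PATH", "")):
        try:
            os.add_dll_directory(directory)
        except OSError:
            # Skip entries Windows refuses to register instead of aborting startup.
            pass
```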
