Segmentation fault when importing PUDL installed from conda-forge #2426
Comments
I am able to reproduce this. Seems very bad! What the heck?! I doubt it's related, but I get the same behavior (albeit faster!) if I |
I found a suggestion on stackoverflow to use From the full results which are below, it looks like the problem is happening in Confirmed in issue uber/h3-py#313.
import faulthandler
faulthandler.enable()
import timezonefinder
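One way to narrow down which import crashes, without losing the interactive session each time, is to run the import in a child interpreter with faulthandler switched on. This is a minimal sketch and not what was run in this thread: the helper name `try_import` is hypothetical, and it assumes the module under test is installed for the same interpreter (`sys.executable`).

```python
import subprocess
import sys


def try_import(module_name, timeout=120.0):
    """Import `module_name` in a child interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX a negative return code means the
    child was killed by a signal (for example -11 for SIGSEGV).
    """
    result = subprocess.run(
        # -X faulthandler is the command-line equivalent of faulthandler.enable()
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Module names taken from this thread; adjust to whatever is being debugged.
    for name in ("timezonefinder", "h3"):
        code, err = try_import(name)
        print(f"{name}: exit code {code}")
        if code != 0:
            print(err)
```

If the child segfaults, the traceback that faulthandler writes ends up in the captured stderr, so the parent process can report it and keep going.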
|
Thank you for that sleuthing! It's disturbing that the issue has been open for a month with no resolution. |
@arengel - FYI, for internal planning purposes I'm going to edit this issue to include a 'scope' and 'next steps' section. (Thanks for bringing this up!) |
It looks like there was some work on the H3 conda-forge feedstock to try and fix this but it petered out. |
We'd like to work on this but not necessarily blow up our whole sprint working on this.
Scope
Out of scope:
Next steps
|
I'm interested in helping but am on WSL, not mac... |
Replicated on a new M2 MacBook Air, will play around with this a bit... |
Seems like it's probably a bunch of dependency futzing. We've considered using conda-lock to lock exact versions of all of our dependencies so this kind of drift after the fact can't happen, if that's something you feel like exploring! |
Locking dependencies makes total sense and is why I've always used a pinned |
We have some annoying to build dependencies that need C/C++/Fortran extensions (e.g. geospatial stuff) and some non-python dependencies that |
(also I have a PR (#2479) open right now to move all our project metadata / build stuff into |
I ran into some problems with conda-lock not playing nicely with our
git-pinned pip requirements. But I think we may have gotten rid of some
and/or conda-lock may have gotten better.
On Mon, Apr 3, 2023 at 6:01 PM, Zane Selvans ***@***.***> wrote:
… (also I have a PR (#2479) open right now
to move all our project metadata / build stuff into pyproject.toml and
get rid of setup.py finally.
|
All right, I got something working! I also implemented conda-lock, which I think will be a great way to better track dependencies. However, mamba is so much faster that it might make sense to let users decide whether they prefer more deterministic locking or faster, sloppier locking.
Using pure mamba and environment.yml
# mamba does not support creating from an environment.yml file, so create and then update:
mamba env create --name test
mamba env update -n test --file environment.yml
Using conda-lock to create lockfiles
Using conda-lock.yml to install from scratch 😄
|
… silicon Signed-off-by: Nelson Auner <[email protected]>
This should hopefully be fixed by v2022.11.30.post1 which I will get up on |
I may have spoken too soon here. Unfortunately the only version of
$ mamba search h3-py
# Name Version Build Channel
h3-py 3.7.4 py310h0f1eb42_1 conda-forge
h3-py 3.7.4 py311ha397e9f_1 conda-forge
h3-py 3.7.4 py38h2b1e499_1 conda-forge
h3-py 3.7.4 py39h23fbdae_1 conda-forge |
Maybe h3-py 3.7.4 only works on Python 3.10? My next guess would be to try to reproduce this issue in 3.10 vs. 3.11... https://github.com/uber/h3-py/blob/master/CHANGELOG.md#374---2022-04-14 |
For those following along, this is still an issue given that h3-py is still not updated on conda (but is on pip). It can be "patched" simply by using pip to update h3-py after running the conda install:
# Install and activate environment
➜ mamba create --name test python=3.10 catalystcoop.pudl
➜ conda activate test
# Load python and attempt to import `pudl`:
(test) ➜ python
Python 3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:12:31) [Clang 14.0.6 ] on darwin
>>> import pudl
....
[1] 96656 segmentation fault python
# Show that we're using virtual env pip and update the h3 module:
(test) ➜ pudl git:(pandas-2.0) ✗ which pip
/Users/nelsonauner/mambaforge/envs/test/bin/pip
(test) ➜ pudl git:(pandas-2.0) ✗ pip install "h3>=3.7.6,<3.8"
Collecting h3<3.8,>=3.7.6
Using cached h3-3.7.6-cp310-cp310-macosx_11_0_arm64.whl (904 kB)
Installing collected packages: h3
Attempting uninstall: h3
Found existing installation: h3 3.7.4
Uninstalling h3-3.7.4:
Successfully uninstalled h3-3.7.4
Successfully installed h3-3.7.6
# Now it works |
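The workaround above can be double-checked from Python before importing `pudl`. This is a minimal sketch under assumptions: the threshold 3.7.6 is simply the version that worked in this thread (3.7.4 crashed), not a documented compatibility boundary, and the helper names `version_tuple` and `h3_needs_upgrade` are hypothetical.

```python
from importlib import metadata
from itertools import takewhile

# Assumption taken from this thread: 3.7.4 segfaults, 3.7.6 works.
MIN_WORKING = (3, 7, 6)


def version_tuple(version):
    """Turn '3.7.6' into (3, 7, 6), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def h3_needs_upgrade(installed=None):
    """Return True if the given (or currently installed) h3 is older than MIN_WORKING."""
    if installed is None:
        try:
            # Reads package metadata only; does not import h3 itself.
            installed = metadata.version("h3")
        except metadata.PackageNotFoundError:
            return False  # h3 is not installed, so there is nothing to upgrade
    return version_tuple(installed) < MIN_WORKING


if __name__ == "__main__":
    if h3_needs_upgrade():
        print('h3 is older than 3.7.6; try: pip install "h3>=3.7.6,<3.8"')
    else:
        print("h3 version looks fine (or h3 is not installed)")
```

Because the check only reads package metadata, it is safe to run even in an environment where importing h3 would crash the interpreter.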
Great news - https://anaconda.org/conda-forge/h3-py was updated over the weekend to 3.7.6 and now works totally fine. Zane - can you close this if everything here looks fine? At some point I would love to pick up the conversation on versioning we had in #2497, but that is lower priority than upgrading to pandas 2.0.
(base) ➜ pudl git:(dev) ✗ mamba create --name test-update python=3.10 catalystcoop.pudl
# ... + h3-py 3.7.6 py310h0f1eb42_0 conda-forge/osx-arm64 298kB
(base) ➜ pudl git:(dev) ✗ conda activate test-update
(test-update) ➜ pudl git:(dev) ✗ python
Python 3.10.10 | packaged by conda-forge | (main, Mar 24 2023, 20:12:31) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pudl
/Users/nelsonauner/mambaforge/envs/test-update/lib/python3.10/site-packages/pudl/analysis/spatial.py:7: UserWarning: Shapely 2.0 is installed, but because PyGEOS is also installed, GeoPandas will still use PyGEOS by default for now. To force to use and test Shapely 2.0, you have to set the environment variable USE_PYGEOS=0. You can do this before starting the Python process, or in your code before importing geopandas:
import os
os.environ['USE_PYGEOS'] = '0'
import geopandas
In a future release, GeoPandas will switch to using Shapely by default. If you are using PyGEOS directly (calling PyGEOS functions on geometries from GeoPandas), this will then stop working and you are encouraged to migrate from PyGEOS to Shapely 2.0 (https://shapely.readthedocs.io/en/latest/migration_pygeos.html).
import geopandas as gpd
>>> print("it works, hooray!!")
it works, hooray!! |
Oh great! Does this mean that the |
@zaneselvans Yes, no longer necessary! |
Okay great. Sounds like this is fixed. Hopefully we don't have to deal with any more of this software dependency stuff going forward with the bigger data distributions! |
Describe the bug
I just re-created a new conda environment where I installed PUDL from conda-forge. When I import PUDL in that environment, it results in a segmentation fault.
The last time I installed PUDL from conda-forge, this didn't happen, though I can't say for sure when that would have been.
Bug Severity
How badly is this bug affecting you?
To Reproduce
Create the new environment, install PUDL, activate the environment, and launch a python console.
In the console, import PUDL
The console then produces this output and exits.
Software Environment?
mamba create --name test python=3.10 catalystcoop.pudl
Full set of packages and versions:
Additional context
Add any other context about the problem here.
========
edit from @jdangerx
Scope
Next steps