Package and testing updates and fixes #435
jmcvey3 wants to merge 11 commits into MHKiT-Software:develop from
Conversation
And the wind-hindcast-cache tests have a dtype error. Something was changed and now the data returns as float64 instead of float32.
@jmcvey3 thank you for breaking this out into a separate PR. There is a non-obvious "test" here where we are verifying that the conda-forge install works. The original environment.yml is the same as this: https://github.com/conda-forge/mhkit-feedstock/blob/main/recipe/meta.yaml, which is what conda-forge uses to build the conda version of mhkit. Also important is that conda-forge uses this line to build the conda-forge package. TBH this is a bit of Python packaging magic that I don't fully understand. In lieu of understanding, the tests build mhkit using a similar workflow to conda-forge and basically "verify" whether conda-forge will work or not. I think we should maintain this pattern and restore the environment.yml to be the same as the conda-forge version. Having to define the same versions in two different files, pyproject.toml and environment.yml, is really annoying, and maintaining a third file in another repo is even more annoying. I haven't seen any solutions for this in Python, but maybe something better exists. I think it is possible to autogenerate environment.yml from pyproject.toml to keep versions the same, so maybe that helps here. Or some other tool altogether. Or maybe we need some other packaging solution that handles the H5/HDF5 binaries better. I'd be happy to try to pick this up and try to get the tests working. Let us know what makes sense here.
@simmsa Thank you for the background here, this is starting to ring a bell. So to be clear, the only reason the extra dependencies are in environment.yml is to test and make sure the conda-forge meta.yaml will function properly? If so, should we move the original environment.yml file into something like "mhkit/tests/conda-forge/environment.yml" and leave a barebones environment.yml file in the main repo directory? That way we can refer to the former in the "conda-ubuntu-latest" tests and the latter in the "pip-ubuntu-latest" tests and separate them properly?
@@ -54,8 +56,6 @@ river = [
]

dolfyn = [
I also saw that the h5py libraries are set here, but dolfyn doesn't actually require any of the h5-related libraries. These might belong under the wave module dependencies.
@jmcvey3 every netCDF file I have worked with is an h5 file: https://docs.unidata.ucar.edu/netcdf-c/4.9.3/interoperability_hdf5.html
Historically for mhkit it has been beneficial to control the HDF5 library versions, because it is typical for different libraries to use different versions; see #351, Issue #303, Issue #177.
@simmsa I'm just commenting that they were listed under the dolfyn module and don't need to be. I learned when refactoring dolfyn that netCDF4 can incorporate h5 capabilities if needed, but it doesn't have to. The Python h5 packages are bulky, so I wrote dolfyn to write netCDF4 files using only the Python netCDF4 dependencies.
I support moving them to the wave dependencies then. Part of the original motivation of the optional-install set-up was to allow for small installs, and keeping the h5 packages in dolfyn will force the module to be larger than necessary.
Def not the only reason. environment.yml in the root allows developers to use conda to install mhkit packages. It seems valuable to support multiple installation methods for developers, which is why we define the dependencies twice.
I don't understand - you can still use conda with a smaller environment.yml file. If you're downloading from conda-forge, then you'll get the packages from conda-forge. There's nothing different between packages downloaded from conda-forge and those downloaded from pip, assuming the versions are the same.
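As an illustration of the smaller file being proposed, a pared-down root environment.yml could keep only the binary-sensitive pieces and let pip resolve everything else. The names, pins, and extras here are hypothetical, not MHKiT's actual ones:

```yaml
# Hypothetical minimal root environment.yml: conda provides only the
# interpreter and the HDF5 binaries; pip resolves the rest from pyproject.toml.
name: mhkit-dev
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pip
  - hdf5
  - pip:
      - -e .  # extras like .[all] could be added if the project defines them
```

The full conda-forge-mirroring file would then live only under the test path discussed above.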
@jmcvey3 one other thing that you may be fighting is the optional dependencies settings. We set those to allow pip install "mhkit[]", which allows module-level installs, but I don't think any of us realized the implications and complications of module-level installs. I pushed a couple of fixes: #421, #422, #423 and #424 when we figured that out; see lines 129 to 142 in 6f806c8 and https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#finding-namespace-packages. Do you think the above failures have anything to do with the module changes?
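For context, the module-level extras mechanism being discussed looks roughly like this in pyproject.toml. The package lists below are illustrative only, not MHKiT's actual pins (and per the review comments above, the h5 packages arguably belong under `wave`, not `dolfyn`):

```toml
# Illustrative [project.optional-dependencies] layout; actual MHKiT pins differ.
[project.optional-dependencies]
wave = ["netCDF4", "h5py", "h5pyd"]
dolfyn = ["netCDF4"]
```

Users can then do module-level installs such as `pip install "mhkit[dolfyn]"`, pulling in only that module's extras on top of the core dependencies.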
@simmsa I don't think I'm fighting the optional dependency installs; those seem to be working fine. I do remember having an issue with the package initialization process running into errors trying to import optional packages, but I think that was resolved.
akeeste left a comment:
A couple clarifying questions on this PR:

- Why the `--no-deps` change in the Actions workflows? What problem was this flag causing?
- `numexpr` and `pyarrow` are added to the `pyproject.toml`, but I don't see them imported anywhere. Where are these two packages used?
And to confirm my understanding of the environment.yml files:
So now the all-inclusive tests/conda-forge/environment.yml serves the purpose of testing a pure-conda installation of MHKiT (plus the local GitHub version of MHKiT) used on a given PR, while mhkit/environment.yml is used in the pip installation of MHKiT and only installs python, pip, and hdf5 with conda before setting up the rest of the environment with pip.
Lastly, my test_cdip.py is not working locally. I need to evaluate this further to see what is going on.
For the "--no-deps" changes: the "pip-ubuntu-latest" tests were installing all dependencies via the conda environment file, rather than letting pip do the package dependency handling. This basically means conda installed all the packages, and we weren't testing whether the versions that pip wants to install also function. Sometimes conda-approved versions are behind those on pip. Good point, we should check if we still need to set "numexpr" and "pyarrow". They're dependencies of other packages, and I think we have them set because of some problematic package with a bad version release. Yes, your understanding is correct. We can probably also test whether we need hdf5 installed via conda, in case there is a user out there who wants to use venv or some such. Yeah, the "test_cdip.py" issues were what launched this whole escapade. Make sure you clear your mhkit cache locally first: the first time you run the test it writes the cache, and every run after that reads the cache.
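Sketched as CI steps, the contrast between the two jobs described above would look roughly like this. Job names match the discussion, but the exact step commands and file paths are assumptions, not the actual workflow files:

```yaml
# conda-ubuntu-latest (sketch): conda resolves every dependency, mimicking the
# conda-forge feedstock build; mhkit itself is installed with --no-deps so pip
# cannot override the conda-resolved versions.
- run: conda env update --file mhkit/tests/conda-forge/environment.yml
- run: pip install --no-deps -e .

# pip-ubuntu-latest (sketch): conda provides only python/pip/hdf5; pip does the
# full dependency resolution, exercising the versions pip would actually pick.
- run: conda env update --file environment.yml
- run: pip install -e .
```

This is the distinction the `--no-deps` flag enforces: without it, the pip job's earlier conda step had already satisfied the dependencies, so pip's resolver was never really tested.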
A general PR to cover tests that have been breaking recently due to Python and dependency updates. Also contains general cleanup of the package.