-
Notifications
You must be signed in to change notification settings - Fork 350
feat(GeoDataFrame support): geopandas support for shapefile exporting #2671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…frame()` routines
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #2671 +/- ##
===========================================
+ Coverage 55.5% 72.3% +16.7%
===========================================
Files 644 667 +23
Lines 124135 130272 +6137
===========================================
+ Hits 68947 94231 +25284
+ Misses 55188 36041 -19147
🚀 New features to boost your workflow:
|
|
@wpbonelli it looks like the smoke tests are failing on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to be able to lean on geopandas for shapefile export. I guess we should look at using xarray for netcdf export similarly
General comment re: deprecations, we usually put an expected removal version in the deprecation warning which I think is typically ~2 minor releases ahead?
| @property | ||
| def geo_dataframe(self): | ||
| """ | ||
| Returns a geopandas GeoDataFrame of the model grid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the running convention is to add deprecation notes to docstrings as well as the warning in the code itself
| filename : str or PathLike | ||
| path of the shapefile to write | ||
| mg : flopy.discretization.grid.Grid object | ||
| modelgrid : flopy.discretization.grid.Grid object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest keeping/deprecating mg just making it optional and introducing modelgrid alongside it to avoid breakage. or, since this function will go away soon, is it worth renaming this parameter at all?
|
|
||
| def model_attributes_to_shapefile( | ||
| path: Union[str, PathLike], | ||
| filename: Union[str, PathLike], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as mg/modelgrid above
| Parameters | ||
| ---------- | ||
| shpname : str or PathLike | ||
| filename : str or PathLike |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above
| geoms, | ||
| shpname: Union[str, PathLike] = "recarray.shp", | ||
| mg=None, | ||
| filename: Union[str, PathLike] = "recarray.shp", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
|
||
| return gdf | ||
|
|
||
| def export(self, f, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we would ideally deprecate export() now that the geodataframe approach is the recommended way to export to shapefile, but since export() also handles netcdf we have to wait til there is a similar netcdf export capability via e.g. xarray? working in the same way as the geopandas approach in this PR like to_xarray().to_netcdf()?
| self.repeating = True | ||
| self.empty_keys = {} | ||
|
|
||
| def to_geodataframe(self, gdf=None, kper=0, sparse=False, truncate_attrs=False, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the use case you have in mind for sparse=False? Since stress period data is sparse by default (at least in 3.x) it seems reasonable to return a sparse geodataframe too, either exclusively or at least by default?
Kind of like what you're doing in shapefile_export_example.py for WEL.
I guess I'm wondering when someone calling to_geodataframe() on stress period data would want one containing the whole grid, rather than just the relevant boundary conditions. In that case couldn't they get the grid geodataframe and .merge() it manually with the SPD frame or pass the grid frame into the gdf parameter here.
Performance wise, it seems expensive to always do the full grid then dropna().
|
|
||
| @property | ||
| def geo_dataframe(self): | ||
| def geodataframe(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why to_geodataframe... elsewhere and not here, other than the fact the old property didn't have to_...?
suggest also deprecating the old one to avoid breakage?
| optional attribute name, default uses util2d name | ||
| forgive : bool | ||
| optional flag to continue if data shape not compatible with GeoDataFrame | ||
| truncate_attrs : bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think GeoDataFrame.to_file() already truncates for you when writing to shapefile, is this parameter necessary? It seems like having our own API for this would be useful if we let the user customize the truncation but otherwise redundant?
| geoms = GeoSpatialCollection(geoms, "Point") | ||
| gdf = gpd.GeoDataFrame.from_features(geoms) | ||
|
|
||
| for col in list(welldata): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of looping to add columns, could this just be
gdf = gpd.GeoDataFrame(welldata, geometry=geoms)
This PR supersedes the draft PR #2573 and continues the cleanup of legacy shapefile exporting code and integration of geopandas GeoDataFrame support.