Xarray brings the power of pandas-style data frames to gridded data sets. This notebook provides a few examples of using xarray to store and manipulate grids of topography and related data such as slope gradient and aspect. In order to read a topography data set and generate some derived grids, we'll use the GDAL geospatial library.
(Greg Tucker, University of Colorado, for Computational Tools class, Oct 2018)
First, some imports:
import numpy as np
import xarray as xr
from osgeo import gdal
from matplotlib import pyplot as plt
%matplotlib inline
This example uses a lidar-derived digital elevation model (DEM) of a portion of the Boulder Creek Critical Zone Observtory in the Colorado Front Range.
Navigate a browser to http://criticalzone.org/boulder/data/dataset/2915/
Select Betasso (Snow off - filtered) - 1m Filtered DSM
Save the zip file, and double-click to unzip it. Inside the folder img
you will see a file called czo_1m_bt1.img
. This is a 1 m resolution lidar-derived DEM of a stretch of Boulder Creek Canyon, with the small Betasso tributary catchment located roughly in the center.
We'll set a variable to identify the relative path name to the file. The default line below assumes the folder img
is in your current working directory; change it if needed.
betasso_dem_name = 'img/czo_1m_bt1.img'
The following code uses the GDAL function Open
to open and read the contents of the file into a variable geo
, and then assign the gridded elevation data to a numpy array called zb
. Later on, we'll want the coordinates of the lower left corner of the DEM, so we call the GetGeoTransform()
. This function provides (among other things) the coordinates of the upper-left corner. For completeness, the example here shows how to obtain the lower right corner. From these, we find the x coordinate of the left side and the y coordinate of the bottom, which we'll use shortly.
geo = gdal.Open(betasso_dem_name)
zb = geo.ReadAsArray()
ulx, xres, xskew, uly, yskew, yres = geo.GetGeoTransform()
print((ulx, xres, xskew, uly, yskew, yres))
lrx = ulx + (geo.RasterXSize * xres)
lry = uly + (geo.RasterYSize * yres)
left_x = ulx
bottom_y = lry
print((left_x, bottom_y))
As is typical of a DEM, there are some cells whose values are undefined. These are indicated by a "no data" value, in this case a large negative value. Because these cells can complicate plotting and analysis, we will identify the lowest valid elevation in the grid, and then assign it to the no-data cells.
This isn't necessarily an ideal solution. An alternative would be to assign these as "nan" (not a number). But that can cause its own set of problems, so for now we'll stick with this workaround.
zb_min = np.amin(zb[np.where(zb>0.0)[0],np.where(zb>0.0)[1]])
print('Minimum elevation is ' + str(zb_min))
zb[np.where(zb<0.0)[0],np.where(zb<0.0)[1]] = zb_min
Let's look at our DEM:
plt.imshow(zb)
We'll define a little function to calculate slope gradient from a given 2D elevation array:
def slope_gradient(z):
"""Calculate absolute slope gradient elevation array.
Parameters
----------
z : ndarray (2D)
Array of elevation values
Returns
-------
ndarray (2D) : array of gradient values of the same dimensions
"""
x, y = np.gradient(z)
slope = np.sqrt(x*x + y*y)
return slope
Apply our function to make a slope grid:
sb = slope_gradient(zb)
Plot the result (and calculate the median gradient while we're at it):
plt.imshow(sb, vmin=0.0, vmax=1.0, cmap='pink')
print(np.median(sb))
Now let's do the same thing for aspect:
def aspect(z):
"""Calculate aspect from DEM."""
x, y = np.gradient(z)
return np.arctan2(-x, y)
ab = aspect(zb)
plt.imshow(ab)
We can look at the histogram of aspect values:
abdeg = (180./np.pi)*ab # convert to degrees
n, bins, patches = plt.hist(abdeg.flatten(), 50, density=True)
plt.xlabel('Slope Aspect (degrees)')
plt.ylabel('Relative frequency')
Finally, a hillshade. Start with a (borrowed) function, then apply it and plot the result.
def hillshade(z, azimuth=315.0, angle_altitude=45.0):
"""Generate a hillshade image from DEM.
Notes: adapted from example on GeoExamples blog,
published March 24, 2014, by Roger Veciana i Rovira.
"""
x, y = np.gradient(z)
slope = np.pi/2. - np.arctan(np.sqrt(x*x + y*y))
aspect = np.arctan2(-x, y)
azimuthrad = azimuth*np.pi / 180.
altituderad = angle_altitude*np.pi / 180.
shaded = (np.sin(altituderad) * np.sin(slope)
+ np.cos(altituderad) * np.cos(slope)
* np.cos(azimuthrad - aspect))
return 255*(shaded + 1)/2
hb = hillshade(zb)
plt.imshow(hb, cmap='gray')
We're going to create a DataArray
to store our elevation grid along with information about its coordinate values. To do this, we'll create two 1D numpy arrays to represent the east-west coordinates and north-south coordinates, respectively.
The DataArray takes as arguments: the gridded data as a numpy array (zb
), labels for its two dimensions, and arrays to indicate values that accompany each of the two axes.
(Note that dist_north
has decreasing values; this is because the first pixel is actually the top left---northwest---rather than bottom left.)
dist_north = np.arange(zb.shape[0] - 1, -1, -1) + bottom_y
dist_east = np.arange(zb.shape[1]) + left_x
elev_da = xr.DataArray(zb, dims=['y', 'x'],
coords={'y' : dist_north,
'x' : dist_east})
elev_da
DataArray
has a built-in plot function:
elev_da.plot()
plt.axis('equal')
Notice that the plot above shows the actual geographic coordinates associated with the grid values, rather than simply the array indices.
We can add information about the units of our data, and give the data a standard name (here from the CSDMS Standard Names). These attributes live in the DataArray
's attrs
property:
elev_da.attrs['units'] = 'm'
elev_da.attrs['standard_name'] = 'topographic__elevation'
elev_da
To create an xarray Dataset, we pass:
(1) a dictionary of data_vars
containing, for each grid, a name for the grid as the key, and a tuple with names for its axes and the numpy array that holds the grid's data.
(2) information on the coordinates, as before.
betasso = xr.Dataset(
data_vars={'elevation': (('y', 'x'), zb),
'slope': (('y', 'x'), sb),
'aspect': (('y', 'x'), ab),
'hillshade': (('y', 'x'), hb)},
coords={'y': dist_north,
'x': dist_east})
betasso
Individual grids can be accessed by:
<name of Dataset>.<name of variable>
One can access array elements and slices as with any numpy array. For example, if we want to plot an elevation profile along row 1000:
betasso.elevation[1000].plot()
...or a profile of aspect values along column 1000:
betasso.aspect[:, 1000].plot()
Operations with the individual numpy arrays work in the usual way. For example, convert elevation values to feet:
elev_ft = betasso.elevation / 0.3048
elev_ft.plot()
plt.axis('equal')
We can do averages along either grid dimension using the mean
function. Here's an example of an east-west topographic profile showing, at each point, the north-south average elevation in the elevation grid. Notice that the result of applying mean
is a new Dataset.
betasso_NS_mean = betasso.mean('y')
betasso_NS_mean
betasso_NS_mean.elevation.plot()
After all that hard work, it's useful to be able to store our Dataset in a file for later use, or access by another application:
betasso.to_netcdf('betasso_lidar_topo.nc')
fin.