geoist.gridder package¶
Subpackages¶
Submodules¶
geoist.gridder.base module¶
Base classes for all gridders.
-
class
geoist.gridder.base.
BaseGridder
[源代码]¶ 基类:
sklearn.base.BaseEstimator
Base class for gridders.
Most methods of this class requires the implementation of a
predict
method. The data returned by it should be a 1d or 2d numpy array for scalar data or a tuple with 1d or 2d numpy arrays for each component of vector data.The
filter
method requires the implementation of afit
method to fit the gridder model to data.Doesn't define any new attributes.
This is a subclass of
sklearn.base.BaseEstimator
and must abide by the same rules of the scikit-learn classes. Mainly:__init__
must only assign values to attributes based on the parameters it receives. All parameters must have default values. Parameter checking should be done infit
.Estimated parameters should be stored as attributes with names ending in
_
.
实际案例
Let's create a class that interpolates by attributing the mean value of the data to every single point (it's not a very good interpolator).
>>> import geoist.gidder as vd >>> import numpy as np >>> from sklearn.utils.validation import check_is_fitted >>> class MeanGridder(vd.base.BaseGridder): ... "Gridder that always produces the mean of all data values" ... def __init__(self, multiplier=1): ... # Init should only assign the parameters to attributes ... self.multiplier = multiplier ... def fit(self, coordiantes, data): ... # Argument checking should be done in fit ... if self.multiplier <= 0: ... raise ValueError('Invalid multiplier {}' ... .format(self.multiplier)) ... self.mean_ = data.mean()*self.multiplier ... # fit should return self so that we can chain operations ... return self ... def predict(self, coordinates): ... # We know the gridder has been fitted if it has the mean ... check_is_fitted(self, ['mean_']) ... return np.ones_like(coordinates[0])*self.mean_ >>> # Try it on some synthetic data >>> synthetic = vd.datasets.CheckerBoard(region=(0, 5, -10, 8)) >>> data = synthetic.scatter() >>> print('{:.4f}'.format(data.scalars.mean())) -32.2182 >>> # Fit the gridder to our synthetic data >>> grd = MeanGridder().fit((data.easting, data.northing), data.scalars) >>> grd MeanGridder(multiplier=1) >>> # Interpolate on a regular grid >>> grid = grd.grid(region=(0, 5, -10, -8), shape=(30, 20)) >>> type(grid) <class 'xarray.core.dataset.Dataset'> >>> np.allclose(grid.scalars, -32.2182) True >>> # Interpolate along a profile >>> profile = grd.profile(point1=(0, -10), point2=(5, -8), size=10) >>> type(profile) <class 'pandas.core.frame.DataFrame'> >>> print(', '.join(['{:.2f}'.format(i) for i in profile.distance])) 0.00, 0.60, 1.20, 1.80, 2.39, 2.99, 3.59, 4.19, 4.79, 5.39 >>> print(', '.join(['{:.1f}'.format(i) for i in profile.scalars])) -32.2, -32.2, -32.2, -32.2, -32.2, -32.2, -32.2, -32.2, -32.2, -32.2
-
filter
(coordinates, data, weights=None)[源代码]¶ Filter the data through the gridder and produce residuals.
Calls
fit
on the data, evaluates the residuals (data - predicted data), and returns the coordinates, residuals, and weights.No very useful by itself but this interface makes gridders compatible with other processing operations and is used by
geoist.gidder.Chain
to join them together (for example, so you can fit a spline on the residuals of a trend).- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
data (array or tuple of arrays) -- The data values of each data point. If the data has more than one component, data must be a tuple of arrays (one for each component).
weights (None or array or tuple of arrays) -- If not None, then the weights assigned to each data point. If more than one data component is provided, you must provide a weights array for each data component (if not None).
- 返回
The coordinates and weights are same as the input. Residuals are the input data minus the predicted data.
- 返回类型
coordinates, residuals, weights
-
fit
(coordinates, data, weights=None)[源代码]¶ Fit the gridder to observed data. NOT IMPLEMENTED.
This is a dummy placeholder for an actual method.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
data (array or tuple of arrays) -- The data values of each data point. If the data has more than one component, data must be a tuple of arrays (one for each component).
weights (None or array or tuple of arrays) -- If not None, then the weights assigned to each data point. If more than one data component is provided, you must provide a weights array for each data component (if not None).
- 返回
This instance of the gridder. Useful to chain operations.
- 返回类型
self
-
grid
(region=None, shape=None, spacing=None, dims=None, data_names=None, projection=None, **kwargs)[源代码]¶ Interpolate the data onto a regular grid.
The grid can be specified by either the number of points in each dimension (the shape) or by the grid node spacing. See
geoist.gidder.grid_coordinates
for details. Other arguments forgeoist.gidder.grid_coordinates
can be passed as extra keyword arguments (kwargs
) to this method.If the interpolator collected the input data region, then it will be used if
region=None
. Otherwise, you must specify the grid region.Use the dims and data_names arguments to set custom names for the dimensions and the data field(s) in the output
xarray.Dataset
. Default names will be provided if none are given.- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
shape (tuple = (n_north, n_east) or None) -- The number of points in the South-North and West-East directions, respectively.
spacing (tuple = (s_north, s_east) or None) -- The grid spacing in the South-North and West-East directions, respectively.
dims (list or None) -- The names of the northing and easting data dimensions, respectively, in the output grid. Defaults to
['northing', 'easting']
. NOTE: This is an exception to the "easting" then "northing" pattern but is required for compatibility with xarray.data_names (list of None) -- The name(s) of the data variables in the output grid. Defaults to
['scalars']
for scalar data,['east_component', 'north_component']
for 2D vector data, and['east_component', 'north_component', 'vertical_component']
for 3D vector data.projection (callable or None) -- If not None, then should be a callable object
projection(easting, northing) -> (proj_easting, proj_northing)
that takes in easting and northing coordinate arrays and returns projected northing and easting coordinate arrays. This function will be used to project the generated grid coordinates before passing them intopredict
. For example, you can use this to generate a geographic grid from a Cartesian gridder.
- 返回
grid -- The interpolated grid. Metadata about the interpolator is written to the
attrs
attribute.- 返回类型
xarray.Dataset
参见
geoist.gidder.grid_coordinates
Generate the coordinate values for the grid.
-
predict
(coordinates)[源代码]¶ Predict data on the given coordinate values. NOT IMPLEMENTED.
This is a dummy placeholder for an actual method.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
- 返回
data -- The data predicted at the give coordinates.
- 返回类型
array
-
profile
(point1, point2, size, dims=None, data_names=None, projection=None, **kwargs)[源代码]¶ Interpolate data along a profile between two points.
Generates the profile along a straight line assuming Cartesian distances. Point coordinates are generated by
geoist.gidder.profile_coordinates
. Other arguments for this function can be passed as extra keyword arguments (kwargs
) to this method.Use the dims and data_names arguments to set custom names for the dimensions and the data field(s) in the output
pandas.DataFrame
. Default names are provided.Includes the calculated Cartesian distance from point1 for each data point in the profile.
- 参数
point1 (tuple) -- The easting and northing coordinates, respectively, of the first point.
point2 (tuple) -- The easting and northing coordinates, respectively, of the second point.
size (int) -- The number of points to generate.
dims (list or None) -- The names of the northing and easting data dimensions, respectively, in the output dataframe. Defaults to
['northing', 'easting']
. NOTE: This is an exception to the "easting" then "northing" pattern but is required for compatibility with xarray.data_names (list of None) -- The name(s) of the data variables in the output dataframe. Defaults to
['scalars']
for scalar data,['east_component', 'north_component']
for 2D vector data, and['east_component', 'north_component', 'vertical_component']
for 3D vector data.projection (callable or None) -- If not None, then should be a callable object
projection(easting, northing) -> (proj_easting, proj_northing)
that takes in easting and northing coordinate arrays and returns projected northing and easting coordinate arrays. This function will be used to project the generated profile coordinates before passing them intopredict
. For example, you can use this to generate a geographic profile from a Cartesian gridder.
- 返回
table -- The interpolated values along the profile.
- 返回类型
pandas.DataFrame
-
scatter
(region=None, size=300, random_state=0, dims=None, data_names=None, projection=None, **kwargs)[源代码]¶ Interpolate values onto a random scatter of points.
Point coordinates are generated by
geoist.gidder.scatter_points
. Other arguments for this function can be passed as extra keyword arguments (kwargs
) to this method.If the interpolator collected the input data region, then it will be used if
region=None
. Otherwise, you must specify the grid region.Use the dims and data_names arguments to set custom names for the dimensions and the data field(s) in the output
pandas.DataFrame
. Default names are provided.- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
size (int) -- The number of points to generate.
random_state (numpy.random.RandomState or an int seed) -- A random number generator used to define the state of the random permutations. Use a fixed seed to make sure computations are reproducible. Use
None
to choose a seed automatically (resulting in different numbers with each run).dims (list or None) -- The names of the northing and easting data dimensions, respectively, in the output dataframe. Defaults to
['northing', 'easting']
. NOTE: This is an exception to the "easting" then "northing" pattern but is required for compatibility with xarray.data_names (list of None) -- The name(s) of the data variables in the output dataframe. Defaults to
['scalars']
for scalar data,['east_component', 'north_component']
for 2D vector data, and['east_component', 'north_component', 'vertical_component']
for 3D vector data.projection (callable or None) -- If not None, then should be a callable object
projection(easting, northing) -> (proj_easting, proj_northing)
that takes in easting and northing coordinate arrays and returns projected northing and easting coordinate arrays. This function will be used to project the generated scatter coordinates before passing them intopredict
. For example, you can use this to generate a geographic scatter from a Cartesian gridder.
- 返回
table -- The interpolated values on a random set of points.
- 返回类型
pandas.DataFrame
-
score
(coordinates, data, weights=None)[源代码]¶ Score the gridder predictions against the given data.
Calculates the R^2 coefficient of determination of between the predicted values and the given data values. A maximum score of 1 means a perfect fit. The score can be negative.
If the data has more than 1 component, the scores of each component will be averaged.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
data (array or tuple of arrays) -- The data values of each data point. If the data has more than one component, data must be a tuple of arrays (one for each component).
weights (None or array or tuple of arrays) -- If not None, then the weights assigned to each data point. If more than one data component is provided, you must provide a weights array for each data component (if not None).
- 返回
score -- The R^2 score
- 返回类型
float
-
geoist.gridder.base.
check_fit_input
(coordinates, data, weights, unpack=True)[源代码]¶ Validate the inputs to the fit method of gridders.
Checks that the coordinates, data, and weights (if given) all have the same shape. Weights arrays are raveled.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
data (array or tuple of arrays) -- The data values of each data point. Data can have more than one component. In such cases, data should be a tuple of arrays.
weights (None or array) -- If not None, then the weights assigned to each data point. Typically, this should be 1 over the data uncertainty squared. If the data has multiple components, the weights have the same number of components.
unoack (bool) -- If False, data and weights will be tuples always. If they are single arrays, then they will be returned as a 1-element tuple. If True, will unpack the tuples if there is only 1 array in each.
- 返回
The validated inputs in the same order. If weights are given, will ravel the array before returning.
- 返回类型
validated_inputs
-
geoist.gridder.base.
get_data_names
(data, data_names)[源代码]¶ Get default names for data fields if none are given based on the data.
实际案例
>>> import numpy as np >>> east, north, up = [np.arange(10)]*3 >>> get_data_names((east,), data_names=None) ('scalars',) >>> get_data_names((east, north), data_names=None) ('east_component', 'north_component') >>> get_data_names((east, north, up), data_names=None) ('east_component', 'north_component', 'vertical_component') >>> get_data_names((up, north), data_names=('ringo', 'george')) ('ringo', 'george')
-
geoist.gridder.base.
get_dims
(dims)[源代码]¶ Get default dimension names.
实际案例
>>> get_dims(dims=None) ('northing', 'easting') >>> get_dims(dims=('john', 'paul')) ('john', 'paul')
geoist.gridder.blockreduce module¶
Classes for reducing/aggregating data in blocks.
-
class
geoist.gridder.blockreduce.
BlockMean
(spacing, region=None, adjust='spacing', center_coordinates=False, uncertainty=False)[源代码]¶ 基类:
geoist.gridder.blockreduce.BlockReduce
Apply a (weighted) mean to the data in blocks/windows.
Returns the mean data value for each block along with the associated coordinates and weights. Coordinates can be determined through the mean of the data coordinates or as the center of each block. Weights can be calculated in three ways:
Using the variance of the data:
weights=1/variance
. This is the only possible option when no input weights are provided.Using the uncertainty of the weighted mean propagated from the uncertainties in the data:
weights=1/uncertainty**2
. In this case, we assume that the input weights are also1/uncertainty**2
. Do not normalize or scale the weights if using uncertainty propagation.Using the weighted variance of the data:
1/weighted_variance
. In this case, we make no assumptions about the nature of the weights.
For all three options, the output weights are scaled to the range [0, 1].
This class always outputs weights. If you want to calculate a blocked mean and not output any weights, use
geoist.gidder.BlockReduce
withnumpy.average
instead.Using the propagated uncertainties may be more adequate if your data is smooth in each block but have very different uncertainties. The propagation will preserve a low weight for data that have large uncertainties but don't vary much inside the block.
The weighted variance should be used when the data vary a lot in each block but have very similar uncertainties. This is also the best choice if your input weights aren't
1/uncertainty**2
but are a relative importance of the data instead.If a data region to be divided into blocks is not given, it will be the bounding region of the data. When using this class to decimate data before gridding, it's best to use the same region and spacing as the desired grid.
If the given region is not divisible by the spacing (block size), either the region or the spacing will have to be adjusted. By default, the spacing will be rounded to the nearest multiple. Optionally, the East and North boundaries of the region can be adjusted to fit the exact spacing given.
Blocks without any data are omitted from the output.
Implements the
filter
method so it can be used withgeoist.gidder.Chain
. Only acts during data fitting and is ignored during prediction.- 参数
spacing (float, tuple = (s_north, s_east), or None) -- The block size in the South-North and West-East directions, respectively. A single value means that the size is equal in both directions.
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
adjust ({'spacing', 'region'}) -- Whether to adjust the spacing or the region if required. Ignored if shape is given instead of spacing. Defaults to adjusting the spacing.
center_coordinates (bool) -- If True, then the returned coordinates correspond to the center of each block. Otherwise, the coordinates are calculated by applying the same reduction operation to the input coordinates.
uncertainty (bool) -- If True, the blocked weights will be calculated by uncertainty propagation of the data uncertainties. If this is case, then the input weights must be
1/uncertainty**2
. Do not normalize the input weights. If False, then the blocked weights will be calculated as1/variance
and no assumptions are made of the input weights (so they can be normalized).
参见
block_split
Split a region into blocks and label points accordingly.
BlockReduce
Apply the mean in blocks. Will output weights.
geoist.gidder.Chain
Apply filter operations successively on data.
-
filter
(coordinates, data, weights=None)[源代码]¶ Apply the blocked mean to the given data.
Returns the reduced data value for each block along with the associated coordinates and weights. See the class docstring for details.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
data (array or tuple of arrays) -- The data values at each point. If you want to reduce more than one data component, pass in multiple arrays as elements of a tuple. All arrays must have the same shape.
weights (None or array or tuple of arrays) -- If not None, then the weights assigned to each data point. If more than one data component is provided, you must provide a weights array for each data component (if not None). If calculating the output weights through uncertainty propagation, then weights must be
1/uncertainty**2
.
- 返回
blocked_coordinates (tuple of arrays) -- (easting, northing) arrays with the coordinates of each block that contains data.
blocked_mean (array or tuple of arrays) -- The block averaged data values.
blocked_weights (array or tuple of arrays) -- The weights calculated for the blocked data values.
-
class
geoist.gridder.blockreduce.
BlockReduce
(reduction, spacing, region=None, adjust='spacing', center_coordinates=False)[源代码]¶ 基类:
sklearn.base.BaseEstimator
Apply a reduction/aggregation operation to the data in blocks/windows.
Returns the reduced data value for each block along with the associated coordinates, which can be determined through the same reduction applied to the coordinates or as the center of each block.
If a data region to be divided into blocks is not given, it will be the bounding region of the data. When using this class to decimate data before gridding, it's best to use the same region and spacing as the desired grid.
If the given region is not divisible by the spacing (block size), either the region or the spacing will have to be adjusted. By default, the spacing will be rounded to the nearest multiple. Optionally, the East and North boundaries of the region can be adjusted to fit the exact spacing given.
Blocks without any data are omitted from the output.
Implements the
filter
method so it can be used withgeoist.gidder.Chain
. Only acts during data fitting and is ignored during prediction.- 参数
reduction (function) -- A reduction function that takes an array and returns a single value (e.g.,
np.mean
,np.median
, etc).spacing (float, tuple = (s_north, s_east), or None) -- The block size in the South-North and West-East directions, respectively. A single value means that the size is equal in both directions.
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
adjust ({'spacing', 'region'}) -- Whether to adjust the spacing or the region if required. Ignored if shape is given instead of spacing. Defaults to adjusting the spacing.
center_coordinates (bool) -- If True, then the returned coordinates correspond to the center of each block. Otherwise, the coordinates are calculated by applying the same reduction operation to the input coordinates.
参见
block_split
Split a region into blocks and label points accordingly.
BlockMean
Apply the mean in blocks. Will output weights.
geoist.gidder.Chain
Apply filter operations successively on data.
-
filter
(coordinates, data, weights=None)[源代码]¶ Apply the blocked aggregation to the given data.
Returns the reduced data value for each block along with the associated coordinates, which can be determined through the same reduction applied to the coordinates or as the center of each block.
If weights are given, the reduction function must accept a
weights
keyword argument. The weights are passed in to the reduction but we have no generic way aggregating the weights or reporting uncertainties. For that, look to the specialized classes likegeoist.gidder.BlockMean
.- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
data (array or tuple of arrays) -- The data values at each point. If you want to reduce more than one data component, pass in multiple arrays as elements of a tuple. All arrays must have the same shape.
weights (None or array or tuple of arrays) -- If not None, then the weights assigned to each data point. If more than one data component is provided, you must provide a weights array for each data component (if not None).
- 返回
blocked_coordinates (tuple of arrays) -- (easting, northing) arrays with the coordinates of each block that contains data.
blocked_data (array) -- The block reduced data values.
-
geoist.gridder.blockreduce.
attach_weights
(reduction, weights)[源代码]¶ Create a partial application of reduction with the proper weights attached.
Makes a function that calls reduction and gives it the weights corresponding to the index of the particular values it receives. Meant for used in a groupby aggregation of a pandas.DataFrame. See class BlockReduce.
geoist.gridder.chain module¶
Class for chaining gridders.
-
class
geoist.gridder.chain.
Chain
(steps)[源代码]¶ 基类:
geoist.gridder.base.BaseGridder
Chain filtering operations to fit on each subsequent output.
The
filter
method of each element of the set is called with the outputs of the previous one. For gridders and trend estimators this means that each element fits the residuals (input data minus predicted data) of the previous one.When predicting data, the predictions of each step in the chain are added together. Steps that don't implement a
predict
method are ignored.This provides a convenient way to chaining operations like trend estimation to a given gridder.
- 参数
steps (list) -- A list of
('name', step)
pairs wherestep
is any geoist.gidder class that implements afilter(coordinates, data, weights)
method (includingChain
itself).
-
region_
¶ The boundaries (
[W, E, S, N]
) of the data used to fit the chain. Used as the default region for thegrid
andscatter
methods.- Type
tuple
-
named_steps
¶ A dictionary version of steps where the
'name'
strings are keys and the estimator/gridder/processor objects are the values.- Type
dict
参见
geoist.gidder.Vector
Fit an estimator to each component of multi-component vector data
-
fit
(coordinates, data, weights=None)[源代码]¶ Fit the chained estimators to the given data.
Each estimator in the chain is fitted to the residuals of the previous estimator. The coordinates are preserved. Only the data values are changed.
The data region is captured and used as default for the
grid
andscatter
methods.All input arrays must have the same shape.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
data (array or tuple of arrays) -- the data values of each data point. if the data has more than one component, data must be a tuple of arrays (one for each component).
weights (none or array or tuple of arrays) -- if not none, then the weights assigned to each data point. if more than one data component is provided, you must provide a weights array for each data component (if not none).
- 返回
Returns this estimator instance for chaining operations.
- 返回类型
self
-
property
named_steps
¶ A dictionary version of steps.
-
predict
(coordinates)[源代码]¶ Evaluates the data predicted by the chain on the given set of points.
Predictions from each step in the chain are added together. Any step that doesn't implement a
predict
method is ignored.Requires a fitted gridder (see
fit
).- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
- 返回
data -- The data values predicted on the given points.
- 返回类型
array
geoist.gridder.coordinates module¶
Functions for generating and manipulating coordinates.
-
geoist.gridder.coordinates.
block_split
(coordinates, spacing, adjust='spacing', region=None)[源代码]¶ Split a region into blocks and label points according to where they fall.
The labels are integers corresponding to the index of the block. The same index is used for the coordinates of each block.
注解
If installed, package
pykdtree
will be used instead ofscipy.spatial.cKDTree
for better performance.- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
spacing (float, tuple = (s_north, s_east), or None) -- The block size in the South-North and West-East directions, respectively. A single value means that the size is equal in both directions.
adjust ({'spacing', 'region'}) -- Whether to adjust the spacing or the region if required. Ignored if shape is given instead of spacing. Defaults to adjusting the spacing.
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates. If not region is given, will use the bounding region of the given points.
- 返回
block_coordinates (tuple of arrays) -- (easting, northing) arrays with the coordinates of the center of each block.
labels (array) -- integer label for each data point. The label is the index of the block to which that point belongs.
参见
BlockReduce
Apply a reduction operation to the data in blocks (windows).
实际案例
>>> from geoist.gidder import grid_coordinates >>> coords = grid_coordinates((-5, 0, 5, 10), spacing=1) >>> block_coords, labels = block_split(coords, spacing=2.5) >>> for coord in block_coords: ... print(', '.join(['{:.2f}'.format(i) for i in coord])) -3.75, -1.25, -3.75, -1.25 6.25, 6.25, 8.75, 8.75 >>> print(labels.reshape(coords[0].shape)) [[0 0 0 1 1 1] [0 0 0 1 1 1] [0 0 0 1 1 1] [2 2 2 3 3 3] [2 2 2 3 3 3] [2 2 2 3 3 3]]
-
geoist.gridder.coordinates.
check_region
(region)[源代码]¶ Check that the given region dimensions are valid.
For example, the west limit should not be greater than the east and there must be exactly 4 values given.
- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
- 引发
ValueError -- If the region doesn't have exactly 4 entries, W > E, or S > N.
-
geoist.gridder.coordinates.
get_region
(coordinates)[源代码]¶ Get the bounding region of the given coordinates.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
- 返回
region -- The boundaries of a given region in Cartesian or geographic coordinates.
- 返回类型
tuple = (W, E, S, N)
实际案例
>>> coords = grid_coordinates((0, 1, -10, -6), shape=(10, 10)) >>> print(get_region(coords)) (0.0, 1.0, -10.0, -6.0)
-
geoist.gridder.coordinates.
grid_coordinates
(region, shape=None, spacing=None, adjust='spacing', pixel_register=False, extra_coords=None)[源代码]¶ Generate the coordinates for each point on a regular grid.
The grid can be specified by either the number of points in each dimension (the shape) or by the grid node spacing.
If the given region is not divisible by the desired spacing, either the region or the spacing will have to be adjusted. By default, the spacing will be rounded to the nearest multiple. Optionally, the East and North boundaries of the region can be adjusted to fit the exact spacing given. See the examples below.
- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
shape (tuple = (n_north, n_east) or None) -- The number of points in the South-North and West-East directions, respectively.
spacing (float, tuple = (s_north, s_east), or None) -- The grid spacing in the South-North and West-East directions, respectively. A single value means that the spacing is equal in both directions.
adjust ({'spacing', 'region'}) -- Whether to adjust the spacing or the region if required. Ignored if shape is given instead of spacing. Defaults to adjusting the spacing.
pixel_register (bool) -- If True, the coordinates will refer to the center of each grid pixel instead of the grid lines. In practice, this means that there will be one less element per dimension of the grid when compared to grid line registered. Default is False.
extra_coords (None, scalar, or list) -- If not None, then value(s) of extra coordinate arrays to be generated. These extra arrays will have the same shape as the others but will contain a constant value. Will generate an extra array per value given in extra_coords. Use this to generate arrays of constant heights or times, for example, that might be needed to evaluate a gridder.
- 返回
coordinates -- Arrays with coordinates of each point in the grid. Each array contains values for a dimension in the order: easting, northing, vertical, and any extra dimensions given in extra_coords. All arrays will have the specified shape.
- 返回类型
tuple of arrays
实际案例
>>> east, north = grid_coordinates(region=(0, 5, 0, 10), shape=(5, 3)) >>> print(east.shape, north.shape) (5, 3) (5, 3) >>> # Lower printing precision to shorten this example >>> import numpy as np; np.set_printoptions(precision=1, suppress=True) >>> print(east) [[0. 2.5 5. ] [0. 2.5 5. ] [0. 2.5 5. ] [0. 2.5 5. ] [0. 2.5 5. ]] >>> print(north) [[ 0. 0. 0. ] [ 2.5 2.5 2.5] [ 5. 5. 5. ] [ 7.5 7.5 7.5] [10. 10. 10. ]] >>> # The grid can also be specified using the spacing between points >>> # instead of the shape. >>> east, north = grid_coordinates(region=(0, 5, 0, 10), spacing=2.5) >>> print(east.shape, north.shape) (5, 3) (5, 3) >>> print(east) [[0. 2.5 5. ] [0. 2.5 5. ] [0. 2.5 5. ] [0. 2.5 5. ] [0. 2.5 5. ]] >>> print(north) [[ 0. 0. 0. ] [ 2.5 2.5 2.5] [ 5. 5. 5. ] [ 7.5 7.5 7.5] [10. 10. 10. ]] >>> # Generate arrays for other coordinates that have a constant value. >>> east, north, height = grid_coordinates(region=(0, 5, 0, 10), spacing=2.5, ... extra_coords=57) >>> print(east.shape, north.shape, height.shape) (5, 3) (5, 3) (5, 3) >>> print(height) [[57. 57. 57.] [57. 57. 57.] [57. 57. 57.] [57. 57. 57.] [57. 57. 57.]] >>> east, north, height, time = grid_coordinates(region=(0, 5, 0, 10), spacing=2.5, ... extra_coords=[57, 0.1]) >>> print(east.shape, north.shape, height.shape, time.shape) (5, 3) (5, 3) (5, 3) (5, 3) >>> print(height) [[57. 57. 57.] [57. 57. 57.] [57. 57. 57.] [57. 57. 57.] [57. 57. 57.]] >>> print(time) [[0.1 0.1 0.1] [0.1 0.1 0.1] [0.1 0.1 0.1] [0.1 0.1 0.1] [0.1 0.1 0.1]] >>> # The spacing can be different for northing and easting, respectively >>> east, north = grid_coordinates(region=(-5, 1, 0, 10), spacing=(2.5, 1)) >>> print(east.shape, north.shape) (5, 7) (5, 7) >>> print(east) [[-5. -4. -3. -2. -1. 0. 1.] [-5. -4. -3. -2. -1. 0. 1.] [-5. -4. -3. -2. -1. 0. 1.] [-5. -4. -3. -2. -1. 0. 1.] [-5. -4. -3. -2. -1. 0. 1.]] >>> print(north) [[ 0. 0. 0. 0. 0. 0. 0. ] [ 2.5 2.5 2.5 2.5 2.5 2.5 2.5] [ 5. 5. 5. 5. 5. 5. 5. ] [ 7.5 7.5 7.5 7.5 7.5 7.5 7.5] [10. 10. 10. 10. 10. 10. 10. ]] >>> # If the region can't be divided into the desired spacing, the spacing >>> # will be adjusted to conform to the region >>> east, north = grid_coordinates(region=(-5, 0, 0, 5), spacing=2.6) >>> print(east.shape, north.shape) (3, 3) (3, 3) >>> print(east) [[-5. -2.5 0. ] [-5. -2.5 0. ] [-5. -2.5 0. ]] >>> print(north) [[0. 0. 0. ] [2.5 2.5 2.5] [5. 5. 5. ]] >>> east, north = grid_coordinates(region=(-5, 0, 0, 5), spacing=2.4) >>> print(east.shape, north.shape) (3, 3) (3, 3) >>> print(east) [[-5. -2.5 0. ] [-5. -2.5 0. ] [-5. -2.5 0. ]] >>> print(north) [[0. 0. 0. ] [2.5 2.5 2.5] [5. 5. 5. ]] >>> # You can also choose to adjust the East and North boundaries of the >>> # region instead. >>> east, north = grid_coordinates(region=(-5, 0, 0, 5), spacing=2.6, ... adjust='region') >>> print(east.shape, north.shape) (3, 3) (3, 3) >>> print(east) [[-5. -2.4 0.2] [-5. -2.4 0.2] [-5. -2.4 0.2]] >>> print(north) [[0. 0. 0. ] [2.6 2.6 2.6] [5.2 5.2 5.2]] >>> east, north = grid_coordinates(region=(-5, 0, 0, 5), spacing=2.4, ... adjust='region') >>> print(east.shape, north.shape) (3, 3) (3, 3) >>> print(east) [[-5. -2.6 -0.2] [-5. -2.6 -0.2] [-5. -2.6 -0.2]] >>> print(north) [[0. 0. 0. ] [2.4 2.4 2.4] [4.8 4.8 4.8]] >>> # We can optionally generate coordinates for the center of each grid >>> # pixel instead of the corner (default) >>> east, north = grid_coordinates(region=(0, 5, 0, 10), spacing=2.5, ... pixel_register=True) >>> # Lower printing precision to shorten this example >>> import numpy as np; np.set_printoptions(precision=2, suppress=True) >>> print(east.shape, north.shape) (4, 2) (4, 2) >>> print(east) [[1.25 3.75] [1.25 3.75] [1.25 3.75] [1.25 3.75]] >>> print(north) [[1.25 1.25] [3.75 3.75] [6.25 6.25] [8.75 8.75]]
参见
scatter_points
Generate the coordinates for a random scatter of points
profile_coordinates
Coordinates for a profile between two points
-
geoist.gridder.coordinates.
inside
(coordinates, region)[源代码]¶ Determine which points fall inside a given region.
Points at the boundary are counted as being outsize.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
- 返回
are_inside -- An array of booleans with the same shape as the input coordinate arrays. Will be
True
if the respective coordinates fall inside the area,False
otherwise.- 返回类型
array of booleans
实际案例
>>> import numpy as np >>> east = np.array([1, 2, 3, 4, 5, 6]) >>> north = np.array([10, 11, 12, 13, 14, 15]) >>> region = [2.5, 5.5, 12, 15] >>> print(inside((east, north), region)) [False False True True True False] >>> # This also works for 2D-arrays >>> east = np.array([[1, 1, 1], ... [2, 2, 2], ... [3, 3, 3]]) >>> north = np.array([[5, 7, 9], ... [5, 7, 9], ... [5, 7, 9]]) >>> region = [0.5, 2.5, 6, 9] >>> print(inside((east, north), region)) [[False True True] [False True True] [False False False]]
-
geoist.gridder.coordinates.
pad_region
(region, pad)[源代码]¶ Extend the borders of a region by the given amount.
- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
pad (float or tuple = (pad_north, pad_east)) -- The amount of padding to add to the region. If it's a single number, add this to all boundaries of region equally. If it's a tuple of numbers, then will add different padding to the North-South and East-West dimensions.
- 返回
padded_region -- The padded region.
- 返回类型
list = [W, E, S, N]
实际案例
>>> pad_region((0, 1, -5, -3), 1) (-1, 2, -6, -2) >>> pad_region((0, 1, -5, -3), (3, 2)) (-2, 3, -8, 0)
-
geoist.gridder.coordinates.
profile_coordinates
(point1, point2, size, extra_coords=None)[源代码]¶ Coordinates for a profile along a straight line between two points.
- 参数
point1 (tuple or list) --
(easting, northing)
West-East and South-North coordinates of the first point, respectively.point2 (tuple or list) --
(easting, northing)
West-East and South-North coordinates of the second point, respectively.size (int) -- Number of points to sample along the line.
extra_coords (None, scalar, or list) -- If not None, then value(s) of extra coordinate arrays to be generated. These extra arrays will have the same size as the others but will contain a constant value. Will generate an extra array per value given in extra_coords. Use this to generate arrays of constant heights or times, for example, that might be needed to evaluate a gridder.
- 返回
coordinates, distances -- The coordinates of points along the straight line and the distances from the first point.
- 返回类型
tuple and 1d array
实际案例
>>> (east, north), dist = profile_coordinates((1, 10), (1, 20), size=11) >>> print('easting:', ', '.join('{:.1f}'.format(i) for i in east)) easting: 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0 >>> print('northing:', ', '.join('{:.1f}'.format(i) for i in north)) northing: 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0 >>> print('distance:', ', '.join('{:.1f}'.format(i) for i in dist)) distance: 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0 >>> (east, north, height), dist = profile_coordinates( ... (1, 10), (1, 20), size=11, extra_coords=35) >>> print(height) [35. 35. 35. 35. 35. 35. 35. 35. 35. 35. 35.] >>> (east, north, height, time), dist = profile_coordinates( ... (1, 10), (1, 20), size=11, extra_coords=[35, 0.1]) >>> print(height) [35. 35. 35. 35. 35. 35. 35. 35. 35. 35. 35.] >>> print(time) [0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]
参见
scatter_points
Generate the coordinates for a random scatter of points
grid_coordinates
Generate coordinates for each point on a regular grid
-
geoist.gridder.coordinates.
project_region
(region, projection)[源代码]¶ Calculate the bounding box of a region in projected coordinates.
- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
projection (callable or None) -- If not None, then should be a callable object (like a function)
projection(easting, northing) -> (proj_easting, proj_northing)
that takes in easting and northing coordinate arrays and returns projected northing and easting coordinate arrays.
- 返回
proj_region -- The bounding box of the projected region.
- 返回类型
list = [W, E, S, N]
实际案例
>>> def projection(x, y): ... return (2*x, -1*y) >>> project_region((3, 5, -9, -4), projection) (6.0, 10.0, 4.0, 9.0)
-
geoist.gridder.coordinates.
scatter_points
(region, size, random_state=None, extra_coords=None)[源代码]¶ Generate the coordinates for a random scatter of points.
The points are drawn from a uniform distribution.
- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
size (int) -- The number of points to generate.
random_state (numpy.random.RandomState or an int seed) -- A random number generator used to define the state of the random permutations. Use a fixed seed to make sure computations are reproducible. Use
None
to choose a seed automatically (resulting in different numbers with each run).extra_coords (None, scalar, or list) -- If not None, then value(s) of extra coordinate arrays to be generated. These extra arrays will have the same size as the others but will contain a constant value. Will generate an extra array per value given in extra_coords. Use this to generate arrays of constant heights or times, for example, that might be needed to evaluate a gridder.
- 返回
coordinates -- Arrays with coordinates of each point in the grid. Each array contains values for a dimension in the order: easting, northing, vertical, and any extra dimensions given in extra_coords. All arrays will have the specified size.
- 返回类型
tuple of arrays
实际案例
>>> # We'll use a seed value will ensure that the same will be generated >>> # every time. >>> easting, northing = scatter_points((0, 10, -2, -1), 4, random_state=0) >>> print(', '.join(['{:.4f}'.format(i) for i in easting])) 5.4881, 7.1519, 6.0276, 5.4488 >>> print(', '.join(['{:.4f}'.format(i) for i in northing])) -1.5763, -1.3541, -1.5624, -1.1082 >>> easting, northing, height = scatter_points((0, 10, -2, -1), 4, random_state=0, ... extra_coords=12) >>> print(height) [12. 12. 12. 12.] >>> easting, northing, height, time = scatter_points( ... (0, 10, -2, -1), 4, random_state=0, extra_coords=[12, 1986]) >>> print(height) [12. 12. 12. 12.] >>> print(time) [1986. 1986. 1986. 1986.]
参见
grid_coordinates
Generate coordinates for each point on a regular grid
profile_coordinates
Coordinates for a profile between two points
-
geoist.gridder.coordinates.
spacing_to_shape
(region, spacing, adjust)[源代码]¶ Convert the grid spacing to a grid shape.
Adjusts the spacing or the region if the desired spacing is not a multiple of the grid dimensions.
- 参数
region (list = [W, E, S, N]) -- The boundaries of a given region in Cartesian or geographic coordinates.
spacing (float, tuple = (s_north, s_east), or None) -- The grid spacing in the South-North and West-East directions, respectively. A single value means that the spacing is equal in both directions.
adjust ({'spacing', 'region'}) -- Whether to adjust the spacing or the region if required. Ignored if shape is given instead of spacing. Defaults to adjusting the spacing.
- 返回
shape, region -- The calculated shape and region that best fits the desired spacing. Spacing or region may be adjusted.
- 返回类型
tuples
geoist.gridder.genpnt module¶
Name : genpnt.py Created on : 2018/11/24 08:57 Author : Steve Chen <chenshi@cea-igp.ac.cn> Affiliation : Institute of Geophysics, CEA. Version : 0.1.0 Copyright : Copyright (C) 2018-2020 GEOIST Development Team. All Rights Reserved. License : Distributed under the MIT License. See LICENSE.txt for more info. Github : https://igp-gravity.github.io/ Description : Generate points on a map as regular grids or points scatters.
-
geoist.gridder.genpnt.
circular_scatter
(area, n, z=None, random=False, seed=None)[源代码]¶ Generate a set of n points positioned in a circular array.
The diameter of the circle is equal to the smallest dimension of the area
Parameters:
- arealist = [x1, x2, y1, y2]
Area inside of which the points are contained
- nint
Number of points
- zfloat or 1d-array
Optional. z coordinate of the points. If given, will return an array with the value z.
- randomTrue or False
If True, positions of the points on the circle will be chosen at random
- seedNone or int
Seed used to generate the pseudo-random numbers if random==True. If None, will use a different seed every time. Use the same seed to generate the same random sequence.
Returns:
[x, y]
Numpy arrays with the x and y coordinates of the points
[x, y, z]
If z given. Arrays with the x, y, and z coordinates of the points
-
geoist.gridder.genpnt.
regular
(area, shape, z=None)[源代码]¶ Create a regular grid.
The x directions is North-South and y East-West. Imagine the grid as a matrix with x varying in the lines and y in columns.
Returned arrays will be flattened to 1D with
numpy.ravel
.警告
As of version 0.4, the
shape
argument was corrected to beshape = (nx, ny)
instead ofshape = (ny, nx)
.Parameters:
- area
(x1, x2, y1, y2)
: Borders of the grid
- shape
Shape of the regular grid, ie
(nx, ny)
.
- z
Optional. z coordinate of the grid points. If given, will return an array with the value z.
Returns:
[x, y]
Numpy arrays with the x and y coordinates of the grid points
[x, y, z]
If z given. Numpy arrays with the x, y, and z coordinates of the grid points
Examples:
>>> x, y = regular((0, 10, 0, 5), (5, 3)) >>> print(x) [ 0. 0. 0. 2.5 2.5 2.5 5. 5. 5. 7.5 7.5 7.5 10. 10. 10. ] >>> print(x.reshape((5, 3))) [[ 0. 0. 0. ] [ 2.5 2.5 2.5] [ 5. 5. 5. ] [ 7.5 7.5 7.5] [ 10. 10. 10. ]] >>> print(y.reshape((5, 3))) [[ 0. 2.5 5. ] [ 0. 2.5 5. ] [ 0. 2.5 5. ] [ 0. 2.5 5. ] [ 0. 2.5 5. ]] >>> x, y = regular((0, 0, 0, 5), (1, 3)) >>> print(x.reshape((1, 3))) [[ 0. 0. 0.]] >>> print(y.reshape((1, 3))) [[ 0. 2.5 5. ]] >>> x, y, z = regular((0, 10, 0, 5), (5, 3), z=-10) >>> print(z.reshape((5, 3))) [[-10. -10. -10.] [-10. -10. -10.] [-10. -10. -10.] [-10. -10. -10.] [-10. -10. -10.]]
-
geoist.gridder.genpnt.
scatter
(area, n, z=None, seed=None)[源代码]¶ Create an irregular grid with a random scattering of points.
Parameters:
- area
(x1, x2, y1, y2)
: Borders of the grid
- n
Number of points
- z
Optional. z coordinate of the points. If given, will return an array with the value z.
- seedNone or int
Seed used to generate the pseudo-random numbers. If None, will use a different seed every time. Use the same seed to generate the same random points.
Returns:
[x, y]
Numpy arrays with the x and y coordinates of the points
[x, y, z]
If z given. Arrays with the x, y, and z coordinates of the points
Examples:
>>> # Passing in a seed value will ensure that scatter will return the same >>> # values given the same input. Use seed=None if you don't want this. >>> x, y = scatter((0, 10, 0, 2), 4, seed=0) >>> # Small function to print the array values with 4 decimal places >>> pprint = lambda arr: print(', '.join('{:.4f}'.format(i) for i in arr)) >>> pprint(x) 5.4881, 7.1519, 6.0276, 5.4488 >>> pprint(y) 0.8473, 1.2918, 0.8752, 1.7835
>>> # scatter can take the z argument as well >>> x2, y2, z2 = scatter((-10, 1, 1245, 3556), 6, z=-150, seed=2) >>> pprint(x2) -5.2041, -9.7148, -3.9537, -5.2115, -5.3760, -6.3663 >>> pprint(y2) 1717.9430, 2676.1352, 1937.5020, 1861.6378, 2680.4403, 2467.8474 >>> pprint(z2) -150.0000, -150.0000, -150.0000, -150.0000, -150.0000, -150.0000
geoist.gridder.interpolation module¶
Name : interpolation.py Created on : 2018/11/24 08:57 Author : Steve Chen <chenshi@cea-igp.ac.cn> Affiliation : Institute of Geophysics, CEA. Version : 0.1.0 Copyright : Copyright (C) 2018-2020 GEOIST Development Team. All Rights Reserved. License : Distributed under the MIT License. See LICENSE.txt for more info. Github : https://igp-gravity.github.io/ Description : 2D interpolation, griding, and profile extraction.
-
geoist.gridder.interpolation.
fill_nans
(x, y, v, xp, yp, vp)[源代码]¶ " Fill in the NaNs or masked values on interpolated points using nearest neighbors.
警告
Operation is performed in place. Replaces the NaN or masked values of the original array!
Parameters:
- x, y1D arrays
Arrays with the x and y coordinates of the original data points (not interpolated).
- v1D array
Array with the scalar value assigned to the data points (not interpolated).
- xp, yp1D arrays
Points where the data values were interpolated.
- vp1D array
Interpolated data values (the one that has NaNs or masked values to replace).
-
geoist.gridder.interpolation.
interp
(x, y, v, shape, area=None, algorithm='cubic', extrapolate=False)[源代码]¶ Interpolate spacial data onto a regular grid.
Utility function that generates a regular grid with
regular
and callsinterp_at
on the generated points.Parameters:
- x, y1D arrays
Arrays with the x and y coordinates of the data points.
- v1D array
Array with the scalar value assigned to the data points.
- shapetuple = (nx, ny)
Shape of the interpolated regular grid, ie (nx, ny).
- areatuple = (x1, x2, y1, y2)
The are where the data will be interpolated. If None, then will get the area from x and y.
- algorithmstring
Interpolation algorithm. Either
'cubic'
,'nearest'
,'linear'
(see scipy.interpolate.griddata).
- extrapolateTrue or False
If True, will extrapolate values outside of the convex hull of the data points.
Returns:
[x, y, v]
Three 1D arrays with the interpolated x, y, and v
-
geoist.gridder.interpolation.
interp_at
(x, y, v, xp, yp, algorithm='cubic', extrapolate=False)[源代码]¶ Interpolate spacial data onto specified points.
Wraps
scipy.interpolate.griddata
.Parameters:
- x, y1D arrays
Arrays with the x and y coordinates of the data points.
- v1D array
Array with the scalar value assigned to the data points.
- xp, yp1D arrays
Points where the data values will be interpolated
- algorithmstring
Interpolation algorithm. Either
'cubic'
,'nearest'
,'linear'
(see scipy.interpolate.griddata)
- extrapolateTrue or False
If True, will extrapolate values outside of the convex hull of the data points.
Returns:
- v1D array
1D array with the interpolated v values.
-
geoist.gridder.interpolation.
profile
(x, y, v, point1, point2, size, algorithm='cubic')[源代码]¶ Extract a profile between 2 points from spacial data.
Uses interpolation to calculate the data values at the profile points.
Parameters:
- x, y1D arrays
Arrays with the x and y coordinates of the data points.
- v1D array
Array with the scalar value assigned to the data points.
- point1, point2lists = [x, y]
Lists the x, y coordinates of the 2 points between which the profile will be extracted.
- sizeint
Number of points along the profile.
- algorithmstring
Interpolation algorithm. Either
'cubic'
,'nearest'
,'linear'
(see scipy.interpolate.griddata).
Returns:
- [xp, yp, distances, vp]1d arrays
xp
andyp
are the x, y coordinates of the points along the profile.distances
are the distances of the profile points frompoint1
.vp
are the data points along the profile.
geoist.gridder.mask module¶
Mask grid points based on different criteria.
-
geoist.gridder.mask.
distance_mask
(data_coordinates, maxdist, coordinates=None, grid=None, projection=None)[源代码]¶ Mask grid points that are too far from the given data points.
Distances are Euclidean norms. If using geographic data, provide a projection function to convert coordinates to Cartesian before distance calculations.
Either coordinates or grid must be given:
If coordinates is not None, produces an array that is False when a point is more than maxdist from the closest data point and True otherwise.
If grid is not None, produces a mask and applies it to grid (an
xarray.Dataset
).
注解
If installed, package
pykdtree
will be used instead ofscipy.spatial.cKDTree
for better performance.- 参数
data_coordinates (tuple of arrays) -- Same as coordinates but for the data points.
maxdist (float) -- The maximum distance that a point can be from the closest data point.
coordinates (None or tuple of arrays) -- Arrays with the coordinates of each point that will be masked. Should be in the following order: (easting, northing, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
grid (None or
xarray.Dataset
) -- 2D grid with values to be masked. Will use the first two dimensions of the grid as northing and easting coordinates, respectively. The mask will be applied to grid using thexarray.Dataset.where
method.projection (callable or None) -- If not None, then should be a callable object
projection(easting, northing) -> (proj_easting, proj_northing)
that takes in easting and northing coordinate arrays and returns projected easting and northing coordinate arrays. This function will be used to project the given coordinates (or the ones extracted from the grid) before calculating distances.
- 返回
mask -- If coordinates was given, then a boolean array with the same shape as the elements of coordinates. If grid was given, then an
xarray.Dataset
with the mask applied to it.- 返回类型
array or
xarray.Dataset
实际案例
>>> from geoist.gidder import grid_coordinates >>> region = (0, 5, -10, -4) >>> spacing = 1 >>> coords = grid_coordinates(region, spacing=spacing) >>> mask = distance_mask((2.5, -7.5), maxdist=2, coordinates=coords) >>> print(mask) [[False False False False False False] [False False True True False False] [False True True True True False] [False True True True True False] [False False True True False False] [False False False False False False] [False False False False False False]] >>> # Mask an xarray.Dataset directly >>> import xarray as xr >>> coords_dict = {"easting": coords[0][0, :], "northing": coords[1][:, 0]} >>> data_vars = {"scalars": (["northing", "easting"], np.ones(mask.shape))} >>> grid = xr.Dataset(data_vars, coords=coords_dict) >>> masked = distance_mask((3.5, -7.5), maxdist=2, grid=grid) >>> print(masked.scalars.values) [[nan nan nan nan nan nan] [nan nan nan 1. 1. nan] [nan nan 1. 1. 1. 1.] [nan nan 1. 1. 1. 1.] [nan nan nan 1. 1. nan] [nan nan nan nan nan nan] [nan nan nan nan nan nan]]
geoist.gridder.model_selection module¶
Functions for automating model selection through cross-validation.
Supports using a dask.distributed.Client object for parallelism. The DummyClient is used as a serial version of the parallel client.
-
class
geoist.gridder.model_selection.
DummyClient
[源代码]¶ 基类:
object
Dummy client to mimic a dask.distributed.Client for immediate local execution.
>>> client = DummyClient() >>> client.submit(sum, (1, 2, 3)) 6
-
geoist.gridder.model_selection.
cross_val_score
(estimator, coordinates, data, weights=None, cv=None, client=None)[源代码]¶ Score an estimator/gridder using cross-validation.
Similar to
sklearn.model_selection.cross_val_score
but modified to accept spatial multi-component data with weights.By default, will use
sklearn.model_selection.KFold
to split the dataset. Another cross-validation class can be passed in through the cv argument.Can optionally run in parallel using dask. To do this, pass in a
dask.distributed.Client
as the client argument. Tasks in this function will be submitted to the dask cluster, which can be local. In this case, the resulting scores won't be the actual values butdask.distributed.Future
objects. Call their.result()
methods to get back the values or pass them along to other dask computations.- 参数
estimator (geoist.gidder gridder) -- Any geoist.gidder gridder class that has the
fit
andscore
methods.coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
data (array or tuple of arrays) -- the data values of each data point. If the data has more than one component, data must be a tuple of arrays (one for each component).
weights (none or array or tuple of arrays) -- if not none, then the weights assigned to each data point. If more than one data component is provided, you must provide a weights array for each data component (if not none).
cv (None or cross-validation generator) -- Any scikit-learn cross-validation generator. Defaults to
sklearn.model_selection.KFold
.client (None or dask.distributed.Client) -- If None, then computations are run serially. Otherwise, should be a dask
Client
object. It will be used to dispatch computations to the dask cluster.
- 返回
scores -- List of scores for each split of the cross-validation generator. If client is not None, then the scores will be futures.
- 返回类型
list
实际案例
>>> from geoist.gidder import grid_coordinates, Trend >>> coords = grid_coordinates((0, 10, -10, -5), spacing=0.1) >>> data = 10 - coords[0] + 0.5*coords[1] >>> # A linear trend should perfectly predict this data >>> scores = cross_val_score(Trend(degree=1), coords, data) >>> print(', '.join(['{:.2f}'.format(score) for score in scores])) 1.00, 1.00, 1.00, 1.00, 1.00
>>> # To run parallel, we need to create a dask.distributed Client. It will >>> # create a local cluster if no arguments are given so we can run the >>> # scoring on a single machine. >>> from dask.distributed import Client >>> client = Client() >>> # The scoring will now only submit tasks to our local cluster >>> scores = cross_val_score(Trend(degree=1), coords, data, client=client) >>> # The scores are not the actual values but Futures >>> type(scores[0]) <class 'distributed.client.Future'> >>> # We need to call .result() to get back the actual value >>> print('{:.2f}'.format(scores[0].result())) 1.00
-
geoist.gridder.model_selection.
fit_score
(estimator, train_data, test_data)[源代码]¶ Fit an estimator on the training data and then score it on the testing data
-
geoist.gridder.model_selection.
select
(arrays, index)[源代码]¶ Index each array in a tuple of arrays.
If the arrays tuple contains a
None
, the entire tuple will be returned as is.- 参数
arrays (tuple of arrays) --
index (array) -- An array of indices to select from arrays.
- 返回
indexed_arrays
- 返回类型
tuple of arrays
实际案例
>>> import numpy as np >>> select((np.arange(5), np.arange(-3, 2, 1)), [1, 3]) (array([1, 3]), array([-2, 0])) >>> select((None, None, None, None), [1, 2]) (None, None, None, None)
-
geoist.gridder.model_selection.
train_test_split
(coordinates, data, weights=None, **kwargs)[源代码]¶ Split a dataset into a training and a testing set for cross-validation.
Similar to
sklearn.model_selection.train_test_split
but is tuned to work on multi-component spatial data with optional weights.Extra keyword arguments will be passed to
sklearn.model_selection.ShuffleSplit
, except forn_splits
which is always 1.- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...).
data (array or tuple of arrays) -- the data values of each data point. If the data has more than one component, data must be a tuple of arrays (one for each component).
weights (none or array or tuple of arrays) -- if not none, then the weights assigned to each data point. If more than one data component is provided, you must provide a weights array for each data component (if not none).
- 返回
train, test -- Each is a tuple = (coordinates, data, weights) generated by separating the input values randomly.
- 返回类型
tuples
实际案例
>>> import numpy as np >>> # Split 2-component data with weights >>> data = (np.array([1, 3, 5, 7]), np.array([0, 2, 4, 6])) >>> coordinates = (np.arange(4), np.arange(-4, 0)) >>> weights = (np.array([1, 1, 2, 1]), np.array([1, 2, 1, 1])) >>> train, test = train_test_split(coordinates, data, weights, ... random_state=0) >>> print("Coordinates:", train[0], test[0], sep='\n ') Coordinates: (array([3, 1, 0]), array([-1, -3, -4])) (array([2]), array([-2])) >>> print("Data:", train[1], test[1], sep='\n ') Data: (array([7, 3, 1]), array([6, 2, 0])) (array([5]), array([4])) >>> print("Weights:", train[2], test[2], sep='\n ') Weights: (array([1, 1, 1]), array([1, 2, 1])) (array([2]), array([1])) >>> # Split single component data without weights >>> train, test = train_test_split(coordinates, data[0], None, ... random_state=0) >>> print("Coordinates:", train[0], test[0], sep='\n ') Coordinates: (array([3, 1, 0]), array([-1, -3, -4])) (array([2]), array([-2])) >>> print("Data:", train[1], test[1], sep='\n ') Data: (array([7, 3, 1]),) (array([5]),) >>> print("Weights:", train[2], test[2], sep='\n ') Weights: (None,) (None,)
geoist.gridder.padding module¶
Name : padding.py Created on : 2018/11/24 08:57 Author : Steve Chen <chenshi@cea-igp.ac.cn> Affiliation : Institute of Geophysics, CEA. Version : 0.1.0 Copyright : Copyright (C) 2018-2020 GEOIST Development Team. All Rights Reserved. License : Distributed under the MIT License. See LICENSE.txt for more info. Github : https://igp-gravity.github.io/ Description : Apply padding to data grids using different algorithms for the filling.
-
geoist.gridder.padding.
pad_array
(a, npd=None, padtype='OddReflectionTaper')[源代码]¶ Return a padded array of arbitrary dimension.
The function takes an array of arbitrary dimension and pads it either to the dimensions given by the tuple npd, or to the next power of 2 if npd is not given.
An odd reflection with a cosine taper (
padtype='OddReflectionTaper'
) is the preferred method of padding for Fourier Transform operations. The odd reflection optimally preserves the frequency content while adding minimal sharp inflections.注解
Requires gridded data of the same dimension as the desired output (i.e. no flattened arrays; use reshape).
注解
This function returns a deep copy of the original array.
Parameters:
- aarray
Array (N-D) to be padded
- npdtuple (optional)
Desired shape of new padded array. If not provided, the nearest power of 2 will be used.
- padtypestring (optional)
What method will be used to pad the new values. Can be lower or upper case. Options:
oddreflectiontaper: Generates odd reflection then tapers to the mean using a cosine function (Default)
oddreflection: Pads with the odd reflection, with no taper
reflection: Pads with simple reflection
lintaper: Linearly tapers to the mean
value: Numeric value (e.g.,
'10.4'
). Input a float or integer directly.edge: Uses the edge value as a constant pad
mean: Uses the mean of the vector along each axis
Returns:
- apnumpy array
Padded array. The array core is a deep copy of the original array
- npslist
List of tuples containing the number of elements padded onto each dimension.
Examples:
>>> import numpy as np >>> z = np.array([3, 4, 4, 5, 6]) >>> zpad, nps = pad_array(z) >>> print(zpad) [ 4.4 3.2 3. 4. 4. 5. 6. 4.4] >>> print(nps) [(2, 1)]
>>> shape = (5, 6) >>> z = np.ones(shape) >>> zpad, nps = pad_array(z, padtype='5') >>> zpad array([[ 5., 5., 5., 5., 5., 5., 5., 5.], [ 5., 5., 5., 5., 5., 5., 5., 5.], [ 5., 1., 1., 1., 1., 1., 1., 5.], [ 5., 1., 1., 1., 1., 1., 1., 5.], [ 5., 1., 1., 1., 1., 1., 1., 5.], [ 5., 1., 1., 1., 1., 1., 1., 5.], [ 5., 1., 1., 1., 1., 1., 1., 5.], [ 5., 5., 5., 5., 5., 5., 5., 5.]]) >>> print(nps) [(2, 1), (1, 1)]
-
geoist.gridder.padding.
pad_coords
(xy, shape, nps)[源代码]¶ Apply padding to coordinate vectors.
Designed to be used in concert with
pad_array
, this function takes a list of coordinate vectors and pads them using the same number of elements as the padding of the data array.注解
This function returns a list of arrays in the same format as, for example,
regular
. It is a list of flattenednp.meshgrid
for each vector in the same order as was input through argument xy.Parameters:
- xylist
List of arrays of coordinates
- shapetuple
Size of original array
- npslist
List of tuples containing the number of elements padded onto each dimension (use the output from
pad_array
).
Returns:
- coordspadlist
List of padded coordinate arrays
Examples:
>>> import numpy as np >>> from fatiando.gridder import regular >>> shape = (5, 6) >>> x, y, z = regular((-10, 10, -20, 0), shape, z=-25) >>> gz = np.zeros(shape) >>> gzpad, nps = pad_array(gz) >>> print(x.reshape(shape)[:, 0]) [-10. -5. 0. 5. 10.] >>> print(y.reshape(shape)[0, :]) [-20. -16. -12. -8. -4. 0.] >>> xy = [x, y] >>> N = pad_coords(xy, shape, nps) >>> print(N[0].reshape(gzpad.shape)[:, 0]) [-20. -15. -10. -5. 0. 5. 10. 15.] >>> print(N[1].reshape(gzpad.shape)[0, :]) [-24. -20. -16. -12. -8. -4. 0. 4.]
-
geoist.gridder.padding.
unpad_array
(a, nps)[源代码]¶ Remove padding from an array.
This function takes a padded array and removes the padding from both sides. Designed to use the output of
pad_array
.注解
Unlike
pad_array
, this function returns a slice of the input array. Therefore, any changes to the padded array will be reflected in the unpadded array!Parameters:
- aarray
Array to be un-padded. Can be of arbitrary dimension.
- npslist
List of tuples giving the min and max indices for the cutoff. Use the value returned by
pad_array
.
Returns:
- barray
Array of same dimension as a, with padding removed
Examples:
>>> import numpy as np >>> z = np.array([3, 4, 4, 5, 6]) >>> zpad, nps = pad_array(z) >>> print(zpad) [ 4.4 3.2 3. 4. 4. 5. 6. 4.4] >>> zunpad = unpad_array(zpad, nps) >>> print(zunpad) [ 3. 4. 4. 5. 6.]
geoist.gridder.scipygridder module¶
A gridder that uses scipy.interpolate as the backend.
-
class
geoist.gridder.scipygridder.
ScipyGridder
(method='cubic', extra_args=None)[源代码]¶ 基类:
geoist.gridder.base.BaseGridder
A scipy.interpolate based gridder for scalar Cartesian data.
Provides a geoist.gidder gridder interface to the scipy interpolators
scipy.interpolate.LinearNDInterpolator
,scipy.interpolate.NearestNDInterpolator
, andscipy.interpolate.CloughTocher2DInterpolator
(cubic).- 参数
method (str) -- The interpolation method. Either
'linear'
,'nearest'
, or'cubic'
.extra_args (None or dict) -- Extra keyword arguments to pass to the scipy interpolator class. See the documentation for each interpolator for a list of possible arguments.
-
interpolator_
¶ An instance of the corresponding scipy interpolator class.
- Type
scipy interpolator class
-
region_
¶ The boundaries (
[W, E, S, N]
) of the data used to fit the interpolator. Used as the default region for thegrid
andscatter
methods.- Type
tuple
-
fit
(coordinates, data, weights=None)[源代码]¶ Fit the interpolator to the given data.
Any keyword arguments passed as the
extra_args
attribute will be used when instantiating the scipy interpolator.The data region is captured and used as default for the
grid
andscatter
methods.- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
data (array) -- The data values that will be interpolated.
weights (None or array) -- If not None, then the weights assigned to each data point. Typically, this should be 1 over the data uncertainty squared. Ignored for this interpolator. Only present for compatibility with other gridder.
- 返回
self -- Returns this gridder instance for chaining operations.
- 返回类型
geoist.gidder.ScipyGridder
-
predict
(coordinates)[源代码]¶ Interpolate data on the given set of points.
Requires a fitted gridder (see
fit
).- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
- 返回
data -- The data values interpolated on the given points.
- 返回类型
array
geoist.gridder.slicing module¶
Name : slicing.py Created on : 2018/11/24 08:57 Author : Steve Chen <chenshi@cea-igp.ac.cn> Affiliation : Institute of Geophysics, CEA. Version : 0.1.0 Copyright : Copyright (C) 2018-2020 GEOIST Development Team. All Rights Reserved. License : Distributed under the MIT License. See LICENSE.txt for more info. Github : https://igp-gravity.github.io/ Description : Functions for segmenting spacial data (windowing, cutting, etc).
-
geoist.gridder.slicing.
cut
(x, y, scalars, area)[源代码]¶ Return a subsection of a grid.
The returned subsection is not a copy! In technical terms, returns a slice of the numpy arrays. So changes made to the subsection reflect on the original grid. Use numpy.copy to make copies of the subsections and avoid this.
Parameters:
- x, y
Arrays with the x and y coordinates of the data points.
- scalars
List of arrays with the scalar values assigned to the grid points.
- area
(x1, x2, y1, y2)
: Borders of the subsection
Returns:
[subx, suby, subscalars]
Arrays with x and y coordinates and scalar values of the subsection.
Examples:
>>> import numpy as np >>> x = np.array([1, 2, 3, 4, 5, 6]) >>> y = np.array([10, 11, 12, 13, 14, 15]) >>> data = np.array([42, 65, 92, 24, 135, 456]) >>> area = [2.5, 5.5, 12, 15] >>> xs, ys, [datas] = cut(x, y, [data], area) >>> print(xs) [3 4 5] >>> print(ys) [12 13 14] >>> print(datas) [ 92 24 135] >>> # This also works for 2D-arrays >>> x = np.array([[1, 1, 1], ... [2, 2, 2], ... [3, 3, 3]]) >>> y = np.array([[5, 7, 9], ... [5, 7, 9], ... [5, 7, 9]]) >>> data = np.array([[12, 84, 53], ... [43, 79, 29], ... [45, 27, 10]]) >>> area = [0.5, 2.5, 7, 9] >>> xs, ys, [datas] = cut(x, y, [data], area) >>> print(xs) [1 1 2 2] >>> print(ys) [7 9 7 9] >>> print(datas) [84 53 79 29]
-
geoist.gridder.slicing.
inside
(x, y, area)[源代码]¶ Tell which indices of an array fall inside an area.
Parameters:
- x, yndarrays
The x and y coordinate vectors.
- arealist = [xmin, xmax, ymin, ymax]
x and y limits of the area.
Returns:
- is_insidendarray of booleans
An array of booleans. Will be
True
if the respective coordinates fall inside the area,False
otherwise.
Examples:
>>> import numpy as np >>> x = np.array([1, 2, 3, 4, 5, 6]) >>> y = np.array([10, 11, 12, 13, 14, 15]) >>> area = [2.5, 5.5, 12, 15] >>> is_inside = inside(x, y, area) >>> print(is_inside) [False False True True True False] >>> # This also works for 2D-arrays >>> x = np.array([[1, 1, 1], ... [2, 2, 2], ... [3, 3, 3]]) >>> y = np.array([[5, 7, 9], ... [5, 7, 9], ... [5, 7, 9]]) >>> area = [0.5, 2.5, 7, 9] >>> is_inside = inside(x, y, area) >>> print(is_inside) [[False True True] [False True True] [False False False]]
geoist.gridder.spline module¶
Biharmonic splines in 2D.
-
geoist.gridder.spline.
GREENS_FUNC_JIT
(east, north, mindist)¶ Calculate the Green's function for the Bi-Harmonic Spline
-
class
geoist.gridder.spline.
Spline
(mindist=1e-05, damping=None, force_coords=None, engine='auto')[源代码]¶ 基类:
geoist.gridder.base.BaseGridder
Biharmonic spline interpolation using Green's functions.
This gridder assumes Cartesian coordinates.
Implements the 2D splines of [Sandwell1987]. The Green's function for the spline corresponds to the elastic deflection of a thin sheet subject to a vertical force. For an observation point at the origin and a force at the coordinates given by the vector \(\mathbf{x}\), the Green's function is:
\[g(\mathbf{x}) = \|\mathbf{x}\|^2 \left(\log \|\mathbf{x}\| - 1\right)\]In practice, this function is not defined for data points that coincide with a force. To prevent this, a fudge factor is added to \(\|\mathbf{x}\|\).
The interpolation is performed by estimating forces that produce deflections that fit the observed data (using least-squares). Then, the interpolated points can be evaluated at any location.
By default, the forces will be placed at the same points as the input data given to
fit
. This configuration provides an exact solution on top of the data points. However, this solution can be unstable for certain configurations of data points.Approximate (and more stable) solutions can be obtained by applying damping regularization to smooth the estimated forces (and interpolated values) or by not using the data coordinates to position the forces (use the force_coords parameter).
Data weights can be used during fitting but only have an any effect when using the approximate solutions.
Before fitting, the Jacobian (design, sensitivity, feature, etc) matrix for the spline is normalized using
sklearn.preprocessing.StandardScaler
without centering the mean so that the transformation can be undone in the estimated forces.- 参数
mindist (float) -- A minimum distance between the point forces and data points. Needed because the Green's functions are singular when forces and data points coincide. Acts as a fudge factor.
damping (None or float) -- The positive damping regularization parameter. Controls how much smoothness is imposed on the estimated forces. If None, no regularization is used.
force_coords (None or tuple of arrays) -- The easting and northing coordinates of the point forces. If None (default), then will be set to the data coordinates the first time
fit
is called.engine (str) -- Computation engine for the Jacobian matrix and prediction. Can be
'auto'
,'numba'
, or'numpy'
. If'auto'
, will use numba if it is installed or numpy otherwise. The numba version is multi-threaded and usually faster, which makes fitting and predicting faster.
-
force_
¶ The estimated forces that fit the observed data.
- Type
array
-
region_
¶ The boundaries (
[W, E, S, N]
) of the data used to fit the interpolator. Used as the default region for thegrid
andscatter
methods.- Type
tuple
-
fit
(coordinates, data, weights=None)[源代码]¶ Fit the biharmonic spline to the given data.
The data region is captured and used as default for the
grid
andscatter
methods.All input arrays must have the same shape.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
data (array) -- The data values of each data point.
weights (None or array) -- If not None, then the weights assigned to each data point. Typically, this should be 1 over the data uncertainty squared.
- 返回
Returns this estimator instance for chaining operations.
- 返回类型
self
-
jacobian
(coordinates, force_coords, dtype='float64')[源代码]¶ Make the Jacobian matrix for the 2D biharmonic spline.
Each column of the Jacobian is the Green's function for a single force evaluated on all observation points [Sandwell1987].
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
force_coords (tuple of arrays) -- Arrays with the coordinates for the forces. Should be in the same order as the coordinate arrays.
dtype (str or numpy dtype) -- The type of the Jacobian array.
- 返回
jacobian -- The (n_data, n_forces) Jacobian matrix.
- 返回类型
2D array
-
predict
(coordinates)[源代码]¶ Evaluate the estimated spline on the given set of points.
Requires a fitted estimator (see
fit
).- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
- 返回
data -- The data values evaluated on the given points.
- 返回类型
array
-
geoist.gridder.spline.
greens_func
(east, north, mindist)[源代码]¶ Calculate the Green's function for the Bi-Harmonic Spline
-
geoist.gridder.spline.
jacobian_numba
(east, north, force_east, force_north, mindist, jac)[源代码]¶ Calculate the Jacobian matrix using numba to speed things up.
-
geoist.gridder.spline.
jacobian_numpy
(east, north, force_east, force_north, mindist, jac)[源代码]¶ Calculate the Jacobian using numpy broadcasting.
-
geoist.gridder.spline.
predict_numba
(east, north, force_east, force_north, mindist, forces, result)[源代码]¶ Calculate the predicted data using numba to speed things up.
geoist.gridder.synthetic module¶
Generators of synthetic datasets.
-
class
geoist.gridder.synthetic.
CheckerBoard
(amplitude=1000, region=0, 5000, - 5000, 0, w_east=None, w_north=None)[源代码]¶ 基类:
geoist.gridder.base.BaseGridder
Generate synthetic data in a checkerboard pattern.
The mathematical model is:
\[f(e, n) = a \sin\left(\frac{2\pi}{w_e} e\right) \cos\left(\frac{2\pi}{w_n} n\right)\]in which \(e\) is the easting coordinate, \(n\) is the northing coordinate, \(a\) is the amplitude, and \(w_e\) and \(w_n\) are the wavelengths in the east and north directions, respectively.
- 参数
amplitude (float) -- The amplitude of the checkerboard undulations.
region (tuple) -- The boundaries (
[W, E, S, N]
) of the region used to generate the synthetic data.w_east (float) -- The wavelength in the east direction. Defaults to half of the West-East size of the evaluating region.
w_north (float) -- The wavelength in the north direction. Defaults to half of the South-North size of the evaluating region.
实际案例
>>> synth = CheckerBoard() >>> # Default values for the wavelengths are selected automatically >>> print(synth.w_east_, synth.w_north_) 2500.0 2500.0 >>> # Checkerboard.grid produces an xarray.Dataset with data on a regular grid >>> grid = synth.grid(shape=(11, 6)) >>> type(grid) <class 'xarray.core.dataset.Dataset'> >>> # scatter and profile generate pandas.DataFrame objects >>> table = synth.scatter() >>> print(sorted(table.columns)) ['easting', 'northing', 'scalars'] >>> profile = synth.profile(point1=(0, 0), point2=(2500, -2500), size=100) >>> print(sorted(profile.columns)) ['distance', 'easting', 'northing', 'scalars']
-
predict
(coordinates)[源代码]¶ Evaluate the checkerboard function on a given set of points.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
- 返回
data -- The evaluated checkerboard function.
- 返回类型
array
-
property
region_
¶ Used to fool the BaseGridder methods
-
property
w_east_
¶ Use half of the E-W extent
-
property
w_north_
¶ Use half of the N-S extent
geoist.gridder.trend module¶
2D polynomial trends.
-
class
geoist.gridder.trend.
Trend
(degree)[源代码]¶ 基类:
geoist.gridder.base.BaseGridder
Fit a 2D polynomial trend to spatial data.
The trend is estimated through weighted least-squares regression.
The Jacobian (design, sensitivity, feature, etc) matrix for the regression is normalized using
sklearn.preprocessing.StandardScaler
without centering the mean so that the transformation can be undone in the estimated coefficients.- 参数
degree (int) -- The degree of the polynomial. Must be >= 1.
-
coef_
¶ The estimated polynomial coefficients that fit the observed data.
- Type
array
-
region_
¶ The boundaries (
[W, E, S, N]
) of the data used to fit the interpolator. Used as the default region for thegrid
andscatter
methods.- Type
tuple
实际案例
>>> from geoist.gidder import grid_coordinates >>> coordinates = grid_coordinates((1, 5, -5, -1), shape=(5, 5)) >>> data = 10 + 2*coordinates[0] - 0.4*coordinates[1] >>> trend = Trend(degree=1).fit(coordinates, data) >>> print(', '.join(['{:.1f}'.format(i) for i in trend.coef_])) 10.0, 2.0, -0.4 >>> import numpy as np >>> np.allclose(trend.predict(coordinates), data) True >>> # Use weights to account for outliers >>> data_out = data.copy() >>> data_out[2, 2] += 500 >>> weights = np.ones_like(data) >>> weights[2, 2] = 1e-10 >>> trend_out = Trend(degree=1).fit(coordinates, data_out, weights) >>> print(', '.join(['{:.1f}'.format(i) for i in trend_out.coef_])) 10.0, 2.0, -0.4 >>> residual = data_out - trend_out.predict(coordinates) >>> print('{:.2f}'.format(residual[2, 2])) 500.00
-
fit
(coordinates, data, weights=None)[源代码]¶ Fit the trend to the given data.
The data region is captured and used as default for the
grid
andscatter
methods.All input arrays must have the same shape.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
data (array) -- The data values of each data point.
weights (None or array) -- If not None, then the weights assigned to each data point. Typically, this should be 1 over the data uncertainty squared.
- 返回
Returns this estimator instance for chaining operations.
- 返回类型
self
-
jacobian
(coordinates, dtype='float64')[源代码]¶ Make the Jacobian matrix for a 2D polynomial.
Each column of the Jacobian is
easting**i * northing**j
for each (i, j) pair in the polynomial.- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
dtype (str or numpy dtype) -- The type of the output Jacobian numpy array.
- 返回
jacobian -- The (n_data, n_coefficients) Jacobian matrix.
- 返回类型
2D array
实际案例
>>> import numpy as np >>> east = np.linspace(0, 4, 5) >>> north = np.linspace(-5, -1, 5) >>> print(Trend(degree=1).jacobian((east, north), dtype=np.int)) [[ 1 0 -5] [ 1 1 -4] [ 1 2 -3] [ 1 3 -2] [ 1 4 -1]] >>> print(Trend(degree=2).jacobian((east, north), dtype=np.int)) [[ 1 0 -5 0 0 25] [ 1 1 -4 1 -4 16] [ 1 2 -3 4 -6 9] [ 1 3 -2 9 -6 4] [ 1 4 -1 16 -4 1]]
-
predict
(coordinates)[源代码]¶ Evaluate the polynomial trend on the given set of points.
Requires a fitted estimator (see
fit
).- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
- 返回
data -- The trend values evaluated on the given points.
- 返回类型
array
-
geoist.gridder.trend.
polynomial_power_combinations
(degree)[源代码]¶ Combinations of powers for a 2D polynomial of a given degree.
Produces the (i, j) pairs to evaluate the polynomial with
x**i*y**j
.- 参数
degree (int) -- The degree of the 2D polynomial. Must be >= 1.
- 返回
combinations -- A tuple with
(i, j)
pairs.- 返回类型
tuple
实际案例
>>> print(polynomial_power_combinations(1)) ((0, 0), (1, 0), (0, 1)) >>> print(polynomial_power_combinations(2)) ((0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)) >>> # This is a long polynomial so split it in two lines >>> print(" ".join([str(c) for c in polynomial_power_combinations(3)])) (0, 0) (1, 0) (0, 1) (2, 0) (1, 1) (0, 2) (3, 0) (2, 1) (1, 2) (0, 3)
geoist.gridder.utils module¶
General utilities.
-
geoist.gridder.utils.
check_data
(data)[源代码]¶ Check the data argument and make sure it's a tuple. If the data is a single array, return it as a tuple with a single element.
This is the default format accepted and used by all gridders and processing functions.
实际案例
>>> check_data([1, 2, 3]) ([1, 2, 3],) >>> check_data(([1, 2], [3, 4])) ([1, 2], [3, 4])
-
geoist.gridder.utils.
dummy_jit
(**kwargs)[源代码]¶ Replace numba.jit if not installed with a function that raises RunTimeError.
Use as a decorator.
- 参数
function -- A function that you would decorate with
numba.jit
.- 返回
A function that raises
RunTimeError
warning that numba isn't installed.- 返回类型
function
-
geoist.gridder.utils.
grid_to_table
(grid)[源代码]¶ Convert a grid to a table with the values and coordinates of each point.
Takes a 2D grid as input, extracts the coordinates and runs them through
numpy.meshgrid
to create a 2D table. Works for 2D grids and any number of variables. Use cases includes passing gridded data to functions that expect data in XYZ format, such asgeoist.gidder.BlockReduce
- 参数
grid (
xarray.Dataset
) -- A 2D grid with one or more data variables.- 返回
table -- Table with coordinates and variable values for each point in the grid.
- 返回类型
pandas.DataFrame
实际案例
>>> import xarray as xr >>> import numpy as np >>> # Create a sample grid with a single data variable >>> temperature = xr.DataArray( ... np.arange(20).reshape((4, 5)), ... coords=(np.arange(4), np.arange(5, 10)), ... dims=['northing', 'easting'] ... ) >>> grid = xr.Dataset({"temperature": temperature}) >>> table = grid_to_table(grid) >>> list(sorted(table.columns)) ['easting', 'northing', 'temperature'] >>> print(table.northing.values) [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3] >>> print(table.easting.values) [5 6 7 8 9 5 6 7 8 9 5 6 7 8 9 5 6 7 8 9] >>> print(table.temperature.values) [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19] >>> # Grids with multiple data variables will have more columns. >>> wind_speed = xr.DataArray( ... np.arange(20, 40).reshape((4, 5)), ... coords=(np.arange(4), np.arange(5, 10)), ... dims=['northing', 'easting'] ... ) >>> grid['wind_speed'] = wind_speed >>> table = grid_to_table(grid) >>> list(sorted(table.columns)) ['easting', 'northing', 'temperature', 'wind_speed'] >>> print(table.northing.values) [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3] >>> print(table.easting.values) [5 6 7 8 9 5 6 7 8 9 5 6 7 8 9 5 6 7 8 9] >>> print(table.temperature.values) [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19] >>> print(table.wind_speed.values) [20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39]
-
geoist.gridder.utils.
maxabs
(*args)[源代码]¶ Calculate the maximum absolute value of the given array(s).
Use this to set the limits of your colorbars and center them on zero.
- 参数
args -- One or more arrays. If more than one are given, a single maximum will be calculated across all arrays.
- 返回
maxabs -- The maximum absolute value across all arrays.
- 返回类型
float
实际案例
>>> maxabs((1, -10, 25, 2, 3)) 25 >>> maxabs((1, -10.5, 25, 2), (0.1, 100, -500), (-200, -300, -0.1, -499)) 500.0
-
geoist.gridder.utils.
n_1d_arrays
(arrays, n)[源代码]¶ Get the first n elements from a tuple/list, make sure they are arrays, and ravel.
- 参数
arrays (tuple of arrays) -- The arrays. Can be lists or anything that can be converted to a numpy array (including numpy arrays).
n (int) -- How many arrays to return.
- 返回
1darrays -- The converted 1D numpy arrays.
- 返回类型
tuple of arrays
实际案例
>>> import numpy as np >>> arrays = [np.arange(4).reshape(2, 2)]*3 >>> n_1d_arrays(arrays, n=2) (array([0, 1, 2, 3]), array([0, 1, 2, 3]))
-
geoist.gridder.utils.
parse_engine
(engine)[源代码]¶ Choose the best engine available and check if it's valid.
- 参数
engine (str) -- The name of the engine. If
"auto"
will favor numba if it's available.- 返回
engine -- The name of the engine that should be used.
- 返回类型
str
-
geoist.gridder.utils.
variance_to_weights
(variance, tol=1e-15, dtype='float64')[源代码]¶ Converts data variances to weights for gridding.
Weights are defined as the inverse of the variance, scaled to the range [0, 1], i.e.
variance.min()/variance
.Any variance that is smaller than tol will automatically receive a weight of 1 to avoid zero division or blown up weights.
- 参数
variance (array or tuple of arrays) -- An array with the variance of each point. If there are multiple arrays in a tuple, will calculated weights for each of them separately. Can have NaNs but they will be converted to zeros and therefore receive a weight of 1.
tol (float) -- The tolerance, or cutoff threshold, for small variances.
dtype (str or numpy dtype) -- The type of the output weights array.
- 返回
weights -- Data weights in the range [0, 1] with the same shape as variance. If more than one variance array was provided, then this will be a tuple with the weights corresponding to each variance array.
- 返回类型
array or tuple of arrays
实际案例
>>> print(variance_to_weights([0, 2, 0.2, 1e-16])) [1. 0.1 1. 1. ] >>> print(variance_to_weights([0, 0, 0, 0])) [1. 1. 1. 1.] >>> for w in variance_to_weights(([0, 1, 10], [2, 4.0, 8])): ... print(w) [1. 1. 0.1] [1. 0.5 0.25]
geoist.gridder.vector module¶
Classes for dealing with vector data.
-
geoist.gridder.vector.
GREENS_FUNC_2D_JIT
(east, north, mindist, poisson)¶ Calculate the Green's functions for the 2D elastic case.
-
class
geoist.gridder.vector.
Vector
(components)[源代码]¶ 基类:
geoist.gridder.base.BaseGridder
Fit an estimator to each component of multi-component vector data.
Provides a convenient way of fitting and gridding vector data using scalar gridders and estimators.
Each data component provided to
fit
is fitted to a separated estimator. Methods likegrid
andpredict
will operate on the multiple components simultaneously.警告
Never pass code like this as input to this class:
[vd.Trend(1)]*3
. This creates 3 references to the same instance ofTrend
, which means that they will all get the same coefficients after fitting. Use a list comprehension instead:[vd.Trend(1) for i in range(3)]
.- 参数
components (tuple or list) -- A tuple or list of the estimator/gridder instances used for each component. The estimators will be applied for each data component in the same order that they are given here.
-
components
¶ Tuple of the fitted estimators on each component of the data.
- Type
tuple
-
region_
¶ The boundaries (
[W, E, S, N]
) of the data used to fit the interpolator. Used as the default region for thegrid
andscatter
methods.- Type
tuple
参见
geoist.gidder.Chain
Chain filtering operations to fit on each subsequent output.
-
fit
(coordinates, data, weights=None)[源代码]¶ Fit the estimators to the given multi-component data.
The data region is captured and used as default for the
grid
andscatter
methods.All input arrays must have the same shape. If weights are given, there must be a separate array for each component of the data.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
data (tuple of array) -- The data values of each component at each data point. Must be a tuple.
weights (None or tuple of array) -- If not None, then the weights assigned to each data point of each data component. Typically, this should be 1 over the data uncertainty squared.
- 返回
Returns this estimator instance for chaining operations.
- 返回类型
self
-
predict
(coordinates)[源代码]¶ Evaluate each data component on a set of points.
Requires a fitted estimator (see
fit
).- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
- 返回
data -- The values for each vector component evaluated on the given points. The order of components will be the same as was provided to
fit
.- 返回类型
tuple of array
-
class
geoist.gridder.vector.
VectorSpline2D
(poisson=0.5, mindist=10000.0, damping=None, force_coords=None, engine='auto')[源代码]¶ 基类:
geoist.gridder.base.BaseGridder
Elastically coupled interpolation of 2-component vector data.
This gridder assumes Cartesian coordinates.
Uses the Green's functions based on elastic deformation from [SandwellWessel2016]. The interpolation is done by estimating point forces that generate an elastic deformation that fits the observed vector data. The deformation equations are based on a 2D elastic sheet with a constant Poisson's ratio. The data can then be predicted at any desired location.
The east and north data components are coupled through the elastic deformation equations. This coupling is controlled by the Poisson's ratio, which is usually between -1 and 1. The special case of Poisson's ratio -1 leads to an uncoupled interpolation, meaning that the east and north components don't interfere with each other.
The point forces are traditionally placed under each data point. The force locations are set the first time
fit
is called. Subsequent calls will fit using the same force locations as the first call. This configuration results in an exact prediction at the data points but can be unstable.[SandwellWessel2016] stabilize the solution using Singular Value Decomposition but we use ridge regression instead. The regularization can be controlled using the damping argument. Alternatively, you can specify the position of the forces manually using the force_coords argument. Regularization or forces not coinciding with data points will result in a least-squares estimate, not an exact solution. Note that the least-squares solution is required for data weights to have any effect.
Before fitting, the Jacobian (design, sensitivity, feature, etc) matrix for the spline is normalized using
sklearn.preprocessing.StandardScaler
without centering the mean so that the transformation can be undone in the estimated forces.- 参数
poisson (float) -- The Poisson's ratio for the elastic deformation Green's functions. Default is 0.5. A value of -1 will lead to uncoupled interpolation of the east and north data components.
mindist (float) -- A minimum distance between the point forces and data points. Needed because the Green's functions are singular when forces and data points coincide. Acts as a fudge factor. A good rule of thumb is to use the average spacing between data points.
damping (None or float) -- The positive damping regularization parameter. Controls how much smoothness is imposed on the estimated forces. If None, no regularization is used.
force_coords (None or tuple of arrays) -- The easting and northing coordinates of the point forces. If None (default), then will be set to the data coordinates the first time
fit
is called.engine (str) -- Computation engine for the Jacobian matrix and predictions. Can be
'auto'
,'numba'
, or'numpy'
. If'auto'
, will use numba if it is installed or numpy otherwise. The numba version is multi-threaded and usually faster, which makes fitting and predicting faster.
-
force_
¶ The estimated forces that fit the observed data.
- Type
array
-
region_
¶ The boundaries (
[W, E, S, N]
) of the data used to fit the interpolator. Used as the default region for thegrid
andscatter
methods.- Type
tuple
-
fit
(coordinates, data, weights=None)[源代码]¶ Fit the gridder to the given 2-component vector data.
The data region is captured and used as default for the
grid
andscatter
methods.All input arrays must have the same shape.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
data (tuple of array) -- A tuple
(east_component, north_component)
of arrays with the vector data values at each point.weights (None or tuple array) -- If not None, then the weights assigned to each data point. Must be one array per data component. Typically, this should be 1 over the data uncertainty squared.
- 返回
Returns this estimator instance for chaining operations.
- 返回类型
self
-
jacobian
(coordinates, force_coords, dtype='float64')[源代码]¶ Make the Jacobian matrix for the 2D coupled elastic deformation.
The Jacobian is segmented into 4 parts, each relating a force component to a data component [SandwellWessel2016]:
| J_ee J_ne |*|f_e| = |d_e| | J_ne J_nn | |f_n| |d_n|
The forces and data are assumed to be stacked into 1D arrays with the east component on top of the north component.
- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
force_coords (tuple of arrays) -- Arrays with the coordinates for the forces. Should be in the same order as the coordinate arrays.
dtype (str or numpy dtype) -- The type of the Jacobian array.
- 返回
jacobian -- The (n_data*2, n_forces*2) Jacobian matrix.
- 返回类型
2D array
-
predict
(coordinates)[源代码]¶ Evaluate the fitted gridder on the given set of points.
Requires a fitted estimator (see
fit
).- 参数
coordinates (tuple of arrays) -- Arrays with the coordinates of each data point. Should be in the following order: (easting, northing, vertical, ...). Only easting and northing will be used, all subsequent coordinates will be ignored.
- 返回
data -- A tuple
(east_component, north_component)
of arrays with the predicted vector data values at each point.- 返回类型
tuple of arrays
-
geoist.gridder.vector.
greens_func_2d
(east, north, mindist, poisson)[源代码]¶ Calculate the Green's functions for the 2D elastic case.
-
geoist.gridder.vector.
jacobian_2d_numba
(east, north, force_east, force_north, mindist, poisson, jac)[源代码]¶ Calculate the Jacobian matrix using numba to speed things up.
-
geoist.gridder.vector.
jacobian_2d_numpy
(east, north, force_east, force_north, mindist, poisson, jac)[源代码]¶ Calculate the Jacobian matrix using numpy broadcasting.
Module contents¶
Create and operate on data grids, scatters, and profiles.