Masks
Donut Masking
donut(gdf, low, high, container=None, distribution='uniform', seed=None, snap_to_streets=False)
Apply donut masking to a GeoDataFrame, randomly displacing each point by a distance between a minimum and a maximum. The advantages of this mask are speed and simplicity, though it does not handle highly varied population densities well.
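The displacement described above can be sketched in plain Python. This is an illustration of the general technique only, not the code in maskmypy/masks/donut.py: the helper name `donut_displace` is hypothetical, and it assumes that a uniform distribution means the distance is drawn uniformly between low and high and paired with a random bearing.

```python
import math
import random


def donut_displace(x, y, low, high, rng):
    """Move (x, y) a random distance in [low, high] in a random direction."""
    distance = rng.uniform(low, high)        # assumption: uniform in distance
    angle = rng.uniform(0.0, 2.0 * math.pi)  # random bearing in radians
    return x + distance * math.cos(angle), y + distance * math.sin(angle)


# A seeded generator makes the displacement reproducible.
origin = (500000.0, 5400000.0)
masked = donut_displace(*origin, low=50.0, high=500.0, rng=random.Random(42))
offset = math.dist(origin, masked)  # how far the point actually moved
```

Because the generator is seeded, calling the helper again with `random.Random(42)` returns the same masked coordinates, which is the behaviour the seed parameter is meant to provide.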
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gdf | GeoDataFrame | GeoDataFrame containing sensitive points. | required |
low | float | Minimum distance to displace points. Unit must match that of the | required |
high | float | Maximum distance to displace points. Unit must match that of the | required |
container | GeoDataFrame | A GeoDataFrame containing polygons within which intersecting sensitive points should remain after masking. This works by masking a point, checking whether it still intersects the polygon it intersected before masking, and retrying until it does. Useful for preserving statistical relationships, such as census tract, or for ensuring that points are not displaced into impossible locations, such as the ocean. CRS must match that of | None |
distribution | str | The distribution used to determine masking distances. | 'uniform' |
seed | int | Used to seed the random number generator so that masked datasets are reproducible. Randomly generated if left undefined. | None |
snap_to_streets | bool | If True, points are snapped to the nearest node on the OSM street network after masking. This can reduce the chance of false attribution. | False |
Returns:
Type | Description |
---|---|
GeoDataFrame | A GeoDataFrame containing masked points. |
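The mask-check-retry behaviour described for the container parameter can be sketched as follows. Everything here is hypothetical and self-contained: `region_of` stands in for a real polygon lookup by using square grid cells, and `max_tries` is an added safety limit that the documentation does not mention.

```python
import math
import random


def displace(x, y, low, high, rng):
    """Move (x, y) a random distance in [low, high] in a random direction."""
    distance = rng.uniform(low, high)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return x + distance * math.cos(angle), y + distance * math.sin(angle)


def region_of(x, y, cell=1000.0):
    """Hypothetical stand-in for a polygon lookup: index of a square grid cell."""
    return (math.floor(x / cell), math.floor(y / cell))


def displace_within_region(x, y, low, high, rng, max_tries=1000):
    """Retry the displacement until the masked point stays in its original region."""
    home = region_of(x, y)
    for _ in range(max_tries):
        candidate = displace(x, y, low, high, rng)
        if region_of(*candidate) == home:
            return candidate
    raise RuntimeError("no valid displacement found; try smaller distances")


# Distances above 500 can leave the cell, so some attempts are rejected.
rng = random.Random(7)
masked = displace_within_region(500.0, 500.0, low=100.0, high=700.0, rng=rng)
```

Retrying changes the effective distribution of distances near region boundaries, since long displacements are rejected more often than short ones.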
Source code in maskmypy/masks/donut.py
Street Masking
street(gdf, low, high, max_length=1000, seed=None, padding=0.2)
Apply street masking to a GeoDataFrame, displacing points along the OpenStreetMap street
network. This helps account for variations in population density, and reduces the likelihood
of false attribution as points are always displaced to the street network. Each point is
snapped to the nearest node on the network, then displaced along the surrounding network to a
node between low and high nodes away.
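As a rough illustration of displacing a point a number of nodes along a network, the sketch below runs a breadth-first search over a toy adjacency dictionary and picks a random node whose hop count falls between low and high. The function names and the toy graph are hypothetical, treating "nodes away" as a hop count is an assumption, and the library's own traversal of the OSM network may differ.

```python
import random
from collections import deque


def hops_from(graph, start):
    """Breadth-first search returning the hop count from start to each reachable node."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def street_displace(graph, start, low, high, rng):
    """Pick a random node between low and high hops from start."""
    hops = hops_from(graph, start)
    candidates = sorted(n for n, h in hops.items() if low <= h <= high)
    if not candidates:
        return start  # assumption: keep the snapped node when nothing is in range
    return rng.choice(candidates)


# Toy network: a straight street A-B-C-D-E with a side street C-F.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "F"],
    "D": ["C", "E"],
    "E": ["D"],
    "F": ["C"],
}
masked_node = street_displace(graph, "A", low=2, high=3, rng=random.Random(1))
```

Starting from A, the nodes C, D and F are two or three hops away, so one of those is selected. In dense areas the same hop range covers a shorter physical distance, which is how this style of mask adapts to population density.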
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gdf | GeoDataFrame | GeoDataFrame containing sensitive points. | required |
low | int | Minimum number of nodes along the OSM street network to traverse. | required |
high | int | Maximum number of nodes along the OSM street network to traverse. | required |
max_length | float | When locating the closest node to each point on the street network, MaskMyPy verifies that its immediate neighbours are no more than | 1000 |
seed | int | Used to seed the random number generator so that masked datasets are reproducible. Randomly generated if left undefined. | None |
padding | float | OSM network data is retrieved based on the bounding box of the sensitive GeoDataFrame. Padding is used to expand this bounding box slightly to reduce unwanted edge effects. A value of | 0.2 |
Returns:
Type | Description |
---|---|
GeoDataFrame | A GeoDataFrame containing masked points. |
Source code in maskmypy/masks/street.py
Location Swapping
locationswap(gdf, low, high, address, seed=None, snap_to_streets=False)
Apply location swapping to a GeoDataFrame, displacing each point to a randomly selected address that lies between a minimum and maximum distance from the original point. While address data is the most common source of eligible swap locations, other point-based datasets may be used.
Note: If a sensitive point has no address points within range, the point is displaced to (0,0).
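The selection step can be sketched in plain Python as below. The function name `location_swap` and the sample coordinates are hypothetical, and treating both distance bounds as inclusive is an assumption; the (0, 0) fallback mirrors the note above.

```python
import math
import random


def location_swap(point, addresses, low, high, rng):
    """Swap point to a random address whose distance lies in [low, high]."""
    candidates = [a for a in addresses if low <= math.dist(point, a) <= high]
    if not candidates:
        return (0.0, 0.0)  # documented fallback when no address is in range
    return rng.choice(candidates)


sensitive = (100.0, 100.0)
# Distances from the sensitive point: 10, 60, 80 and 500 units.
addresses = [(110.0, 100.0), (100.0, 160.0), (180.0, 100.0), (100.0, 600.0)]
rng = random.Random(3)

swapped = location_swap(sensitive, addresses, low=50.0, high=100.0, rng=rng)
no_match = location_swap(sensitive, addresses, low=1000.0, high=2000.0, rng=rng)
```

Only the addresses 60 and 80 units away are eligible in the first call. The second call finds nothing in range and returns (0, 0), so it is worth checking masked output for points at the origin.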
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gdf | GeoDataFrame | GeoDataFrame containing sensitive points. | required |
low | float | Minimum distance to displace points. Unit must match that of the | required |
high | float | Maximum distance to displace points. Unit must match that of the | required |
address | GeoDataFrame | GeoDataFrame containing points that sensitive locations may be swapped to. While addresses are most common, other point-based data may be used as well. | required |
seed | int | Used to seed the random number generator so that masked datasets are reproducible. Randomly generated if left undefined. | None |
snap_to_streets | bool | If True, points are snapped to the nearest node on the OSM street network after masking. This can reduce the chance of false attribution. | False |
Source code in maskmypy/masks/locationswap.py
Voronoi Masking
voronoi(gdf, snap_to_streets=False)
Apply Voronoi masking to a GeoDataFrame, displacing points to the nearest edges of a Voronoi diagram. Note: because Voronoi masking lacks any randomization, snapping to streets is recommended for this mask to provide an additional level of obfuscation.
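To illustrate why this mask is deterministic, the sketch below relies on a geometric fact: if the Voronoi diagram is built from the sensitive points themselves and is not clipped to a boundary (both assumptions here), the closest point on a site's cell boundary is the midpoint between that site and its nearest neighbour. The function name is hypothetical, the points are assumed to be distinct, and the library may construct and query the diagram differently.

```python
import math


def nearest_voronoi_edge_point(index, points):
    """Return the point on the Voronoi diagram closest to points[index]."""
    px, py = points[index]
    others = [p for i, p in enumerate(points) if i != index]
    qx, qy = min(others, key=lambda q: math.dist((px, py), q))
    # The closest boundary point is the midpoint to the nearest neighbour.
    return ((px + qx) / 2.0, (py + qy) / 2.0)


points = [(0.0, 0.0), (4.0, 0.0), (0.0, 10.0)]
masked = [nearest_voronoi_edge_point(i, points) for i in range(len(points))]
```

The first two points are each other's nearest neighbour, so both land on the same midpoint, and rerunning the code always gives the same result. That absence of randomness is why an extra step such as snapping to streets is suggested.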
Parameters:
Name | Type | Description | Default |
---|---|---|---|
gdf | GeoDataFrame | GeoDataFrame containing sensitive points. | required |
snap_to_streets | bool | If True, points are snapped to the nearest node on the OSM street network after masking. This can reduce the chance of false attribution. | False |
Returns:
Type | Description |
---|---|
GeoDataFrame | A GeoDataFrame containing masked points. |