Open GeoHub

Spatial Data Anonymization Made Easy

What is Open GeoHub (OGH)?

OGH’s mission is to bring you GIS tools that make everyday spatial analysis easier. As a first step, we have added tools that make spatial data anonymization accessible to everyone. The aim is to promote open science and transparency and to facilitate the publication of open datasets for research and practice.

Using the anonymization tools available on Open GeoHub you can easily anonymize your spatial datasets. You may need to anonymize your datasets for a number of reasons. For example:

It is easy to anonymize data by adding noise or through other methods. However, a blind anonymization will either fail to provide robust enough protection or result in a dataset that is no longer usable. Finding the subtle balance between safety and maintaining data quality is the key to a successful anonymization process.

On this website you will find two methods for anonymizing your datasets (currently only for point data):

  1. donut anonymizer
  2. advanced anonymizer

 

1. Donut anonymization:

In this method each point is randomly relocated to a location that is at least a and at most b units of distance away from its original location. The point can therefore end up anywhere within the donut-shaped area (ring) bounded by those two distances.

...

This is a fast and efficient process which will provide sufficient protection for most datasets.
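The relocation step described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind the Open GeoHub tool: the function name donut_displace is hypothetical, coordinates are assumed to be in a projected (planar) system with the same units as a and b, and the new location is assumed to be drawn uniformly over the donut's area, which the description above does not specify.

```python
import math
import random


def donut_displace(x, y, a, b, rng=random):
    """Move (x, y) to a random location between a and b units away.

    Assumption: planar coordinates, and a location drawn uniformly
    over the area of the donut (not uniformly over the distance).
    """
    if not 0 <= a <= b:
        raise ValueError("require 0 <= a <= b")
    # Random direction
    theta = rng.uniform(0.0, 2.0 * math.pi)
    # Sampling the squared distance uniformly gives an area-uniform point
    r = math.sqrt(rng.uniform(a * a, b * b))
    return x + r * math.cos(theta), y + r * math.sin(theta)


if __name__ == "__main__":
    # Example: relocate one point by 50 to 150 map units
    print(donut_displace(1000.0, 2000.0, a=50.0, b=150.0))
```

Sampling the squared distance, rather than the distance itself, keeps relocated points from clustering near the inner edge of the donut. If a distance-uniform relocation is preferred, r can instead be drawn directly with rng.uniform(a, b).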

2. Advanced anonymization:

This method uses a more sophisticated, context-based approach to anonymize datasets using a customized Gaussian function. The method was developed by Kamyar Hasanzadeh, PhD, and his colleagues, and it is fully described in this paper.

This method takes contextual factors into consideration in the anonymization. Specifically, the population density and number of neighbors will be taken into account during this process. This is a more resource-hungry process which is suitable for more sensitive data. An example of this is home location data collected as points through a map-based survey.

...
...

K-anonymity

With this method you can also estimate the K-anonymity. A feature is K-anonymous if it cannot be distinguished from at least K−1 other features. For example, instead of sharing points representing individuals' homes, one would share the neighborhood, the grid cell, the region, or any other larger corresponding spatial unit. Such techniques help prevent a data subject from being singled out by grouping them with at least K−1 other individuals.
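The grid-cell example above can be checked with a short Python sketch. It is an illustration of the general idea only, not part of the Open GeoHub tools: the function names grid_cell and cells_below_k are hypothetical, and points are assumed to be (x, y) pairs in planar coordinates.

```python
from collections import Counter


def grid_cell(x, y, cell_size):
    """Return the (column, row) index of the square grid cell containing (x, y)."""
    return (int(x // cell_size), int(y // cell_size))


def cells_below_k(points, cell_size, k):
    """Return the grid cells that hold fewer than k points, with their counts.

    A point shared only as its grid cell is K-anonymous when the cell
    contains at least k points, so any cell returned here fails the check.
    """
    counts = Counter(grid_cell(x, y, cell_size) for x, y in points)
    return {cell: n for cell, n in counts.items() if n < k}


if __name__ == "__main__":
    sample = [(1, 1), (2, 2), (3, 3), (15, 15)]
    # The cell holding (15, 15) contains a single point and fails K = 3
    print(cells_below_k(sample, cell_size=10, k=3))
```

Cells that fail the check could be merged with neighbors or reported at a coarser spatial unit until every group reaches the chosen K.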

In this tool, we estimate the expected level of K-anonymity for each individual point by multiplying the local population density by a circular ring area approximation of the Gaussian probability distribution function.
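One possible reading of this estimate is sketched below in Python. It should not be taken as the formula used by the tool: the function name expected_k, the parameters, and the modeling choices are all assumptions made for illustration. The sketch assumes an isotropic two-dimensional Gaussian displacement with standard deviation sigma, a uniform local population density (people per square map unit), and that a point displaced by distance r is hidden among the people living within a circle of radius r. The Gaussian is split into thin concentric rings, and each ring contributes its probability times the population inside it.

```python
import math


def expected_k(density, sigma, n_rings=2000, max_radius_sigmas=8.0):
    """Estimate expected K as density times the expected displacement-disc area.

    Assumptions (not taken from the tool): isotropic 2D Gaussian
    displacement with standard deviation sigma, uniform density.
    """
    r_max = max_radius_sigmas * sigma
    dr = r_max / n_rings
    k = 0.0
    for i in range(n_rings):
        r = (i + 0.5) * dr  # mid-radius of ring i
        # Probability that the displacement distance falls in this ring
        # (radial density of a 2D Gaussian, times the ring width)
        p_ring = (r / sigma**2) * math.exp(-(r**2) / (2 * sigma**2)) * dr
        # Population within distance r under uniform density
        k += p_ring * density * math.pi * r**2
    return k


if __name__ == "__main__":
    # 1000 people per square km, expressed per square metre; sigma in metres
    print(expected_k(density=0.001, sigma=100.0))
```

Under these assumptions the ring sum converges to the closed form 2 · π · sigma² · density, which gives a quick sanity check on the numerical result.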