De-identifying sensitive API data

Description

APIs often manage sensitive data, but in many cases that data must be partially masked or entirely anonymized before it is presented to the client. Such situations include:

  • Displaying sensitive PII (such as health-related data) to individuals lacking permission to view the entire record.
  • Sending data to machine learning models.
  • Sending customer data to third-party marketing research firms.
  • Presenting valuable intellectual property in a public forum (e.g., hiding manufacturing details pertaining to a particular composite or chemical).

DreamFactory developers have several options for de-identifying data. If the API in question retrieves data from a database, you can attach a post-process event handler to the endpoint and use PHP, Python (version 2 or 3), or NodeJS to intercept and manipulate the response before it's returned to the client. The examples that follow use this representative record:

{
  "emp_id": "111AD87BY",
  "birth_date": "1953-09-02",
  "first_name": "Steve",
  "last_name": "Smith",
  "hire_date": "1986-06-26"
}

Applying a Data Mask to JSON Response

Sometimes you want to present only part of a particular value, such as a phone number or employee ID. This allows a customer service representative to verify the employee's identity without having full access to a potentially sensitive piece of information. Here is an example of how that can be done using a post-process event handler:

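// Content of the response about to be returned to the client
$responseBody = $event['response']['content'];

foreach ($responseBody['resource'] as $n => $record) {
    // Replace the last three characters of the employee ID with asterisks
    $record["emp_id"] = substr_replace($record["emp_id"], '***', -3);
    $responseBody['resource'][$n] = $record;
}

// Write the modified records back to the response
$event['response']['content'] = $responseBody;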

Once enabled, the records returned from this endpoint would look like this:

{
  "emp_id": "111AD8***",
  "birth_date": "1953-09-02",
  "first_name": "Steve",
  "last_name": "Smith",
  "hire_date": "1986-06-26"
}
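
Real data is rarely as tidy as the sample record: a value may be shorter than the mask, or the field may be missing entirely. The following is a minimal sketch of a more defensive mask, not a DreamFactory feature. The maskTail() helper is a hypothetical name introduced here, and the $event array is simulated so the snippet runs on its own; inside an actual handler DreamFactory supplies $event, and the 'resource' wrapper is assumed to match the examples above. The helper counts bytes, so it assumes ASCII identifiers.

```php
<?php
// Hypothetical helper (not part of DreamFactory): mask the last $count
// characters of a value. Values no longer than $count are masked entirely.
// Uses byte-based string functions, so ASCII input is assumed.
function maskTail(string $value, int $count = 3, string $maskChar = '*'): string
{
    $length = strlen($value);
    if ($length <= $count) {
        return str_repeat($maskChar, $length);
    }
    return substr($value, 0, $length - $count) . str_repeat($maskChar, $count);
}

// Simulated event payload for illustration only.
$event = [
    'response' => [
        'content' => [
            'resource' => [
                ['emp_id' => '111AD87BY', 'first_name' => 'Steve'],
                ['emp_id' => 'A1', 'first_name' => 'Short'],
                ['first_name' => 'NoId'],
            ],
        ],
    ],
];

$responseBody = $event['response']['content'];

foreach ($responseBody['resource'] as $n => $record) {
    // Leave records without an emp_id untouched instead of raising a warning.
    if (isset($record['emp_id'])) {
        $responseBody['resource'][$n]['emp_id'] = maskTail((string) $record['emp_id']);
    }
}

$event['response']['content'] = $responseBody;
```

With the simulated payload, the first record's emp_id becomes "111AD8***", the two-character ID is masked in full, and the record without an emp_id passes through unchanged.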

Removing Data from a JSON Response

Sometimes data masking won't be enough to properly de-identify a sensitive record; you might have to remove the data altogether. This is easily accomplished using PHP's unset() function in conjunction with a DreamFactory post-process event handler:

$responseBody = $event['response']['content'];

foreach ($responseBody['resource'] as $n => $record) {
    // Drop the sensitive fields from each record
    unset($record["emp_id"]);
    unset($record["birth_date"]);
    $responseBody['resource'][$n] = $record;
}

$event['response']['content'] = $responseBody;
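
When the list of fields to strip grows, it may be easier to keep it in one array rather than adding an unset() call per field. The sketch below shows one way to do that with PHP's array_diff_key() and array_flip(). The removeFields() helper and the $sensitiveFields list are hypothetical names introduced for illustration, and the $event array is simulated using the representative record from earlier.

```php
<?php
// Hypothetical helper: return a copy of $record without the listed fields.
// array_flip() turns the field names into keys so array_diff_key() can drop them.
function removeFields(array $record, array $fields): array
{
    return array_diff_key($record, array_flip($fields));
}

// Simulated event payload for illustration only.
$event = [
    'response' => [
        'content' => [
            'resource' => [
                [
                    'emp_id'     => '111AD87BY',
                    'birth_date' => '1953-09-02',
                    'first_name' => 'Steve',
                    'last_name'  => 'Smith',
                    'hire_date'  => '1986-06-26',
                ],
            ],
        ],
    ],
];

// Assumed list of fields to remove; adjust to match your own schema.
$sensitiveFields = ['emp_id', 'birth_date'];

$responseBody = $event['response']['content'];

foreach ($responseBody['resource'] as $n => $record) {
    $responseBody['resource'][$n] = removeFields($record, $sensitiveFields);
}

$event['response']['content'] = $responseBody;
```

After this runs, the sample record contains only first_name, last_name, and hire_date. Fields named in $sensitiveFields that are absent from a record are simply ignored.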
