HIPAA Data Classification

The HIPAA data classification identifies personal medical and healthcare data covered by the United States' Health Insurance Portability and Accountability Act (HIPAA). Customers can use its identifiers to support HIPAA compliance within their organizations.

For more details on each of these built-in data classifications, see Built-In Data Classifications. You can create custom data classifications based on any of these built-in data classifications. Simply copy the desired built-in classification, make changes to the copy, and save it as a custom data classification.

The building blocks for data classifications are data identifiers. The data identifiers selected for a data classification determine the type of data that rules using that classification scan for. Built-in data classifications have data identifiers already selected for them; you can produce your own customized versions of these classifications by removing selected identifiers or adding other identifiers.

Using the inclusion and exclusion options on the Data Classification page, you can fine-tune your classification to be more precise and reduce false positives. You can exclude specific terms and regular expression (regex) patterns by creating a custom identifier in the exclusion area, or by excluding a pre-existing built-in identifier.

The exclusion applies only to the content that's been matched, not to every document that meets the exclusion criteria.

For example, consider a data classification that targets the built-in identifier Health Condition and Person Name (US). You want to block instances of "John Smith cancer" while allowing instances of "John Smith cancer fundraising." To achieve this, you can craft a custom data identifier for "cancer fundraising" and set up an exclusion for this identifier within your data classification. As a result, matches for the Health Condition and Person Name (US) identifier will be flagged, except when the phrase "cancer fundraising" is present.
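The match-level behavior in this example can be sketched in Python. This is only an illustration, not the product's implementation: the two regular expressions are hypothetical stand-ins for the built-in identifier and the custom "cancer fundraising" identifier, and the function name is made up for this sketch.

```python
import re

# Hypothetical stand-ins for the identifiers in the example above.
# The real built-in identifier uses richer dictionaries and patterns.
INCLUDE_PATTERN = re.compile(r"\bJohn Smith\s+cancer\b", re.IGNORECASE)
EXCLUDE_PATTERN = re.compile(r"\bcancer fundraising\b", re.IGNORECASE)


def find_flagged_matches(text):
    """Return included matches that do not overlap an excluded span."""
    excluded_spans = [m.span() for m in EXCLUDE_PATTERN.finditer(text)]
    flagged = []
    for match in INCLUDE_PATTERN.finditer(text):
        start, end = match.span()
        # Drop only the match that overlaps excluded content;
        # other matches in the same document are still flagged.
        overlaps = any(
            start < ex_end and ex_start < end
            for ex_start, ex_end in excluded_spans
        )
        if not overlaps:
            flagged.append(match.group(0))
    return flagged
```

In this sketch, a document that contains both "John Smith cancer fundraising" and, elsewhere, "John Smith cancer treatment" still yields one flagged match, which mirrors the rule that an exclusion applies to matched content rather than to the whole document.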

The system compares data identifiers selected for exclusion against both the keywords and the proximity terms of included data identifiers. If the Data Loss Prevention Report reveals that a particular rule or identifier is generating false positives, consider excluding terms or identifiers to remedy the situation.

If you select a data identifier for both inclusion and exclusion, exclusion will take precedence.
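The precedence rule can be pictured with a small Python sketch. The function name and the role labels are hypothetical and exist only for this illustration; the identifier names are taken from the example earlier in this topic.

```python
def resolve_roles(included, excluded):
    """Map each identifier name to its effective role.

    Hypothetical helper: an identifier listed on both sides ends up
    as "exclude", because exclusion takes precedence.
    """
    roles = {name: "include" for name in included}
    for name in excluded:
        roles[name] = "exclude"  # overrides any earlier "include"
    return roles


roles = resolve_roles(
    included=["Health Condition and Person Name (US)", "cancer fundraising"],
    excluded=["cancer fundraising"],
)
```

Here "cancer fundraising" appears in both lists, so its effective role is "exclude", while "Health Condition and Person Name (US)" remains included.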

The system offers two types of built-in data identifiers you can choose to remove from or add to your customized version of a built-in data classification:

Built-In Identifiers
These identify data using pattern matching and dictionary lookups. The descriptions shown in the GUI provide details about the type of data they match. For more information, see Built-In Data Identifiers.
Machine Learning Identifiers
These identify data based on AI analysis of example documents. For example, the identifier for Patent Files has been trained to recognize documents that are likely patent applications. For more information, see Built-In Data Identifiers.

The system also offers three types of data identifiers that you can create yourself, each applying a different method of data analysis. You can add these to your customized version of a built-in data classification:

Custom Identifiers
You can create custom identifiers to match specific terms and pattern expressions of your choosing. See Create a Custom Identifier.
Exact Data Match Identifiers
Exact Data Match Identifiers use fingerprinting to identify data in structured documents that match criteria you define. See Create an Exact Data Match Identifier for more information.
Indexed Document Match Identifiers
Indexed Document Match Identifiers use fingerprinting to identify data in unstructured documents that match criteria you define. See Create an Indexed Document Match Identifier for more information.
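As a rough illustration of the fingerprinting idea behind Exact Data Match, the following Python sketch hashes the cell values of a structured data set and then checks scanned text against those hashes. It rests on several assumptions: the product's actual fingerprinting method is not described in this topic, so the use of SHA-256, the normalization step, the token-level comparison, and all function names are hypothetical.

```python
import hashlib
import re


def normalize(value):
    """Strip punctuation and case so formatting differences do not matter."""
    return re.sub(r"\W+", "", value).lower()


def fingerprint(value):
    """Hash a normalized value (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(normalize(value).encode("utf-8")).hexdigest()


def build_index(records):
    """Fingerprint every non-empty cell of the structured source data."""
    return {
        fingerprint(cell)
        for row in records
        for cell in row
        if normalize(cell)
    }


def find_exact_matches(text, index):
    """Return tokens in the text whose fingerprint is in the index."""
    tokens = re.findall(r"[\w-]+", text)
    return [
        tok for tok in tokens
        if normalize(tok) and fingerprint(tok) in index
    ]


# Made-up sample values, used only to exercise the sketch.
index = build_index([["MRN-0012345", "A1B2C3"]])
hits = find_exact_matches("Patient record mrn-0012345 was sent.", index)
```

The point of this design is that the index stores only hashes, so the sensitive source values themselves are not needed at scan time. This sketch compares single tokens only; matching multi-word values or whole unstructured documents would need a different approach.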

To delete or edit a data classification, see Delete or Edit a Classification.
