Create a Data Classification
You can create a data classification to help you monitor content with specific characteristics. Custom data classifications can be used in real-time rules, SaaS API rules, and discovery scans.
The building blocks for data classifications are data identifiers. The data identifiers that you choose for a data classification determine the type of data that rules using that classification scan for.
Using the inclusion and exclusion options on the Data Classification page, you can fine-tune your classification to be more precise and reduce false positives. You can exclude specific terms and regular expression (regex) patterns by creating a custom identifier in the exclusion area, or by excluding a pre-existing built-in identifier.
The exclusion applies only to the content that's been matched, not to every document that meets the exclusion criteria.
For example, consider a data classification that targets the built-in identifier Health Condition and Person Name (US). You want to block instances of "John Smith cancer" while allowing instances of "John Smith cancer fundraising." To achieve this, you can craft a custom data identifier for "cancer fundraising" and set up an exclusion for this identifier within your data classification. As a result, matches for the Health Condition and Person Name (US) identifier will be flagged, except when the phrase "cancer fundraising" is present.
The system compares data identifiers selected for exclusion against both the keywords and the proximity terms of included data identifiers. If the Data Loss Prevention Report reveals that a particular rule or identifier is generating false positives, consider excluding terms or identifiers to remedy the situation.
Note: If you select a data identifier for both inclusion and exclusion, exclusion will take precedence.
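The exclusion behavior described above can be illustrated with a minimal sketch. This is not the product's implementation or API: the Classification class, the identifier names, and the regex patterns are hypothetical, and real identifiers are more sophisticated than single regexes. The sketch only models two rules from this section: an exclusion suppresses the matched content it overlaps rather than the whole document, and an identifier selected for both inclusion and exclusion is treated as excluded.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Classification:
    """Hypothetical model: identifier name -> regex pattern."""
    include: Dict[str, str]
    exclude: Dict[str, str] = field(default_factory=dict)

    def matches(self, text: str) -> List[Tuple[str, str]]:
        # Collect the spans of text covered by any exclusion pattern.
        excluded_spans = [
            m.span()
            for pattern in self.exclude.values()
            for m in re.finditer(pattern, text, re.IGNORECASE)
        ]
        hits = []
        for name, pattern in self.include.items():
            # Assumption: an identifier selected for both inclusion and
            # exclusion is skipped, because exclusion takes precedence.
            if name in self.exclude:
                continue
            for m in re.finditer(pattern, text, re.IGNORECASE):
                # Drop only the matches that overlap an excluded span;
                # other matches in the same document are still reported.
                overlaps = any(
                    m.start() < end and start < m.end()
                    for start, end in excluded_spans
                )
                if not overlaps:
                    hits.append((name, m.group(0)))
        return hits


classification = Classification(
    include={"health_condition": r"\bcancer\b"},
    exclude={"cancer_fundraising": r"\bcancer fundraising\b"},
)

flagged = classification.matches("John Smith cancer")
allowed = classification.matches("John Smith cancer fundraising")
mixed = classification.matches(
    "John Smith cancer. Also: John Smith cancer fundraising."
)
```

In this sketch, flagged contains one match, allowed is empty, and mixed still contains one match: the excluded phrase suppresses only the occurrence it overlaps, not every match in the document.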
The system offers two types of built-in data identifiers you can choose from: