Prerequisites
- Full Admin user role. For more information, see Manage Accounts.
- Java Virtual Machine (JVM) version 17 or later.
- You must download the indexer using the Secure Access GUI. See Step 6 in Create an Exact Data Match Identifier.
- Before running the DLP indexer, you must generate an API Key and Secret using the Secure Access GUI. See Step 8 in Create an Exact Data Match Identifier.
  Note: If you already have a key and secret generated for a previous run of the DLP indexer, you may use those.
- The indexer supports source files with up to 55 million records. The exact record limit depends on the total number of columns and on how many of those columns are of alphanumeric type. The indexer reports the exact limit when you attempt to load a file that exceeds it. If your dataset is larger than the limit, split the records into multiple files. For errors received when indexing a large file, see Memory Tuning for DLP Exact Data Matching Indexer.
- The source data CSV file you index must meet the following requirements:
  - A multi-term (multi-word) field can contain a maximum of 6 space-separated words.
  - The data file must contain only 1-byte or 2-byte UTF-8 encoded characters.
  - The first row of data must have between 1 and 50 fields and each row must have the same number of fields.
  - The first row of data must specify the name of each field, and each value must be unique.
  - Data in the second and ensuing rows must comply with the EDM field types and supported formats (see Exact Data Match Field Types).
  - The field names in the sample data template must match the field names in the actual data source file. The field names must appear in the same order in both files.
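The structural requirements above can be checked before you run the indexer. The following is a minimal Python sketch, not the indexer's own validation. The names `check_edm_csv`, `MAX_FIELDS`, and `MAX_WORDS` are hypothetical, and the sketch assumes that the 6-word limit can be applied to every field by counting whitespace-separated words. It does not check EDM field types or formats, and it does not compare the file against the sample data template.

```python
import csv

MAX_FIELDS = 50  # documented maximum number of fields in the first row
MAX_WORDS = 6    # documented maximum words in a multi-term field


def check_edm_csv(lines):
    """Return a list of (record_number, message) for structural problems.

    `lines` is any iterable of text lines, for example an open file.
    Record numbers are 1-based and count the header row as record 1.
    """
    problems = []
    field_count = None
    for record_no, row in enumerate(csv.reader(lines), start=1):
        if field_count is None:
            # The first row names the fields and fixes the expected width.
            field_count = len(row)
            if not 1 <= field_count <= MAX_FIELDS:
                problems.append(
                    (record_no, f"first row has {field_count} fields"))
            if len(set(row)) != len(row):
                problems.append((record_no, "field names are not unique"))
        elif len(row) != field_count:
            problems.append(
                (record_no,
                 f"expected {field_count} fields, found {len(row)}"))
        for value in row:
            if len(value.split()) > MAX_WORDS:
                problems.append(
                    (record_no, f"more than {MAX_WORDS} words in a field"))
            # Code points above U+07FF need 3 or 4 bytes in UTF-8.
            if any(ord(ch) > 0x07FF for ch in value):
                problems.append(
                    (record_no, "character needs more than 2 UTF-8 bytes"))
    return problems
```

To check a file, open it with `open(path, encoding="utf-8", newline="")` and pass the file object to `check_edm_csv`. An empty result only means that none of these particular checks failed.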
Note: Do not create, edit, or view the source data CSV file using Microsoft Excel, as this may corrupt the file. Use a text editor.
Note: If any value in the source file fails validation against the supported format, the DLP indexer skips that record and continues indexing the remaining records. The indexer behaves the same way for records that exceed the template-defined fields, for empty rows, and for records with empty primary values. The DLP indexer generates messages reporting the position of any skipped records in the file.
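For a dataset that exceeds the record limit described in the prerequisites, splitting the records into multiple files could look like the following Python sketch. The function names, the `part_N.csv` output naming, and the choice of `max_records` are all hypothetical; use the limit that the indexer reports for your file. The sketch assumes that each output file needs its own copy of the header row, because the first row of every source file must name the fields.

```python
import csv


def split_records(lines, max_records):
    """Yield lists of rows, each starting with the header row and
    holding at most `max_records` data records."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to split
    chunk = [header]
    for row in reader:
        chunk.append(row)
        if len(chunk) - 1 == max_records:
            yield chunk
            chunk = [header]
    if len(chunk) > 1:
        yield chunk  # final, partially filled chunk


def write_chunks(source_path, max_records, prefix="part"):
    """Write each chunk to <prefix>_<n>.csv and return the paths."""
    paths = []
    with open(source_path, encoding="utf-8", newline="") as src:
        chunks = split_records(src, max_records)
        for n, chunk in enumerate(chunks, start=1):
            path = f"{prefix}_{n}.csv"
            with open(path, "w", encoding="utf-8", newline="") as out:
                csv.writer(out).writerows(chunk)
            paths.append(path)
    return paths
```

Because the rows are parsed and rewritten with the `csv` module, quoting in the output files may differ from the original file even though the field values are unchanged.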