Batch Upload File Format
Once you’ve determined what data you’ll be including and, if necessary, which audience you’ll be uploading it to, review the sections below (including those that pertain to your file type) to make sure that the data is formatted in a way that Surfside can accept.
Once you’ve made sure that all of the data is formatted correctly, finalize the file for uploading.
File Formatting Guidelines Highlights
Make sure to properly format your files so that we can process them quickly. It’s especially important to keep the following key guidelines in mind:
- All files must contain an audience key. If you aren’t able to provide a client customer ID (CCID) as the audience key for your records, we will choose the PII identifier with the largest fill rate (the percentage of records with a value), and that identifier will be used as the audience key for all files in that audience.
- Once you decide on a format for your files, the audience key and the column headers for the identifier columns that we use for matching must stay exactly consistent across files.
- If you’ll be sending files with different audience keys and/or identifier types (for example, if you’ll be sending both files with PII and files with mobile device IDs), let your Surfside Customer Success manager know. Otherwise, we will assume that all files use the same audience key and contain the same type of identifiers.
- Unlike with known identifiers, Surfside can accept only one type of anonymous identifier in a file: cookies, mobile device IDs, custom IDs (CIDs), or LiveRamp IdentityLinks. Also, unlike with PII identifiers, the file should contain only one column of anonymous identifiers.
- A given field in an audience can have no more than 100 distinct values.
- A given segment in an audience must have a minimum of 25 unique records in it. If you send a segment that has fewer than 25 records, it will not be imported into your audience.
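The fill-rate rule and the two limits above can be checked locally before a file is uploaded. The following is a minimal sketch, assuming the records are already loaded as dictionaries keyed by column name. The function names are hypothetical and are not part of any Surfside tooling.

```python
MAX_DISTINCT_VALUES = 100   # limit on distinct values per field
MIN_SEGMENT_RECORDS = 25    # minimum unique records per segment


def fill_rate(records, column):
    """Return the fraction of records with a non-empty value in `column`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(column))
    return filled / len(records)


def best_identifier(records, candidate_columns):
    """Pick the candidate column with the largest fill rate."""
    return max(candidate_columns, key=lambda c: fill_rate(records, c))


def fields_over_limit(records, columns):
    """Return the columns holding more than MAX_DISTINCT_VALUES distinct values."""
    over = []
    for column in columns:
        distinct = {r.get(column) for r in records if r.get(column)}
        if len(distinct) > MAX_DISTINCT_VALUES:
            over.append(column)
    return over


def undersized_segments(records, key_column, segment_column):
    """Return segments with fewer than MIN_SEGMENT_RECORDS unique audience keys."""
    keys_by_segment = {}
    for r in records:
        segment, key = r.get(segment_column), r.get(key_column)
        if segment and key:
            keys_by_segment.setdefault(segment, set()).add(key)
    return sorted(s for s, keys in keys_by_segment.items()
                  if len(keys) < MIN_SEGMENT_RECORDS)
```

Any segment that `undersized_segments` returns would not be imported, so it may be worth merging or dropping those segments before you finalize the file.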
Formatting Guidelines for All Files
The automated process supports data files that include both customer-identifying information and mobile Device IDs within a single file.
Customer data should NOT contain any of the following:
- Social Security numbers, passport numbers, driver’s license numbers, or other sensitive information
- Credit card numbers, bank account numbers, or other financial identifiers
- Medical information regarding physical or mental status, prescriptions, or other private information
- Any other information that could be considered sensitive
Uploaded data should consist only of the files required for use within our partnership. No other files should be part of the uploaded data.
Data File Format
Customer data must be formatted as text files, with one line per record. Character encoding should be UTF-8. Lines should be terminated with the newline character (0x0A). Carriage return characters (0x0D) are permitted and can act as part of the line termination.
The delimiter between columns is a comma (,). If a column can have multiple valid values, delimit those values with a pipe (|) character. The recommended Device ID format is raw and unhashed. If hashed values are provided, the hashing algorithm must be either MD5 or SHA-1.
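As a rough illustration of the delimiter and hashing rules above, the sketch below hashes a Device ID and serializes one record. The helper names are hypothetical. Hashing the UTF-8 bytes of the ID exactly as given is an assumption, since this document does not specify any normalization. Values that themselves contain a comma or a pipe are not handled.

```python
import hashlib

FIELD_DELIMITER = ","   # between columns
VALUE_DELIMITER = "|"   # between multiple values in one column


def hash_device_id(device_id, algorithm="md5"):
    """Hash a raw Device ID with MD5 or SHA-1 and return the hex digest."""
    if algorithm not in ("md5", "sha1"):
        raise ValueError("hashing algorithm must be md5 or sha1")
    return hashlib.new(algorithm, device_id.encode("utf-8")).hexdigest()


def format_record(fields):
    """Serialize one record: lists become pipe-delimited, None becomes empty."""
    cells = []
    for value in fields:
        if value is None:
            cells.append("")
        elif isinstance(value, (list, tuple)):
            cells.append(VALUE_DELIMITER.join(str(v) for v in value))
        else:
            cells.append(str(value))
    return FIELD_DELIMITER.join(cells)
```

For example, `format_record(["x", None, ["a", "b"], 1])` produces `x,,a|b,1`. When writing lines to disk, opening the file with `encoding="utf-8"` and `newline="\n"` keeps the encoding and line termination consistent with the rules above.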
To maximize data accuracy, each record must contain the following:
- Unique Device ID, which represents the device and persists across files. This ID is used to identify differences in uploaded data as compared to previous data sets and to enable linking of the audience data.
- PII and geo data, such as email, lat/lon, IP address, etc.
If data for a column is unavailable, pass an empty string for that column.
Examples:
001dcbfdfd21a1b0b7f09886fa5c1886,md5,Android,33.754937,-117.992655,,,,,276,Crescent,ST,Apt,123,My City,nyc,20155,4476,,md5,1,,,,,,
039d319d28972c06ee6eca689db5e9fd,md5,iOS,32.307253,-106.800895,,,,,1751,SAMPLE,AVE,LAS SAMPLE,NM,88005,,9a5fc87842f54eec35f84d485f630b757|01d1d4bb6886104b78c74c79527e7efa|5a9b6aeb3d6b13a09597b87023f9939e,md5,1,,,,,,
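To sanity-check a file before upload, it can help to read records back the same way a consumer might. The sketch below splits one line on commas, treats empty columns as missing, and expands pipe-delimited columns into lists. The function name is hypothetical, and because this section does not list the column order, the sketch returns columns by position only.

```python
def parse_record(line):
    """Split one comma-delimited record into a list of column values.

    Empty columns become None; pipe-delimited columns become lists.
    """
    columns = []
    # Strip the newline and any carriage return that precedes it.
    for cell in line.rstrip("\r\n").split(","):
        if cell == "":
            columns.append(None)
        elif "|" in cell:
            columns.append(cell.split("|"))
        else:
            columns.append(cell)
    return columns
```

Applied to the first example record, `parse_record(line)[:3]` gives `['001dcbfdfd21a1b0b7f09886fa5c1886', 'md5', 'Android']`.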
File Naming
Uploaded file names should consist of the partner name, a hyphen (-), the date (as YYYYMMDD), another hyphen (-), and a data set name that distinguishes among multiple uploaded sets of data.
A file suffix indicating the file type, for example .txt or .csv, should be used.
Note: Do not use hyphens in the partner name or periods in the data set name.
Example:
DataRepo-20200704-purchasers.txt
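The naming convention can be applied programmatically so that every upload follows the same pattern. This is a minimal sketch with a hypothetical helper name. It enforces only the two restrictions stated in the note above.

```python
from datetime import date


def build_file_name(partner, day, data_set, extension="txt"):
    """Return '<partner>-<YYYYMMDD>-<data_set>.<extension>'."""
    if "-" in partner:
        raise ValueError("partner name must not contain hyphens")
    if "." in data_set:
        raise ValueError("data set name must not contain periods")
    return f"{partner}-{day:%Y%m%d}-{data_set}.{extension}"
```

Calling `build_file_name("DataRepo", date(2020, 7, 4), "purchasers")` reproduces the example file name above.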
Compression
It is advised that large files be compressed using gzip compression in order to conserve space and reduce transfer time. Each data file should be compressed individually. A .gz file suffix component should be added if the file is compressed. Example:
DataRepo-20200704-purchasers.txt.gz
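One way to compress each data file individually is with Python's standard `gzip` module, as sketched below. The function name is hypothetical. The sketch writes the compressed copy next to the original and appends the .gz suffix.

```python
import gzip
import shutil


def gzip_file(path):
    """Compress `path` into `path + '.gz'` and return the new file name."""
    compressed_path = path + ".gz"
    with open(path, "rb") as source, gzip.open(compressed_path, "wb") as target:
        # Stream the bytes so large files are not loaded into memory at once.
        shutil.copyfileobj(source, target)
    return compressed_path
```

The original file is left in place, so you can remove it after confirming that the compressed copy is readable.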
Refresh Frequency
Surfside recommends a daily data refresh; however, we also support weekly, monthly, and one-time-only data ingestion. Surfside supports both full data uploads and delta feeds. Delta feeds include only the records that are new or changed compared with the last uploaded data. To maximize data accuracy, Surfside recommends that you provide Device IDs in raw format and always include time stamps as specified in the “File Format” section in this document.
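A delta feed can be derived by comparing the current data set with the last one uploaded. The sketch below assumes that each data set is held in memory as a mapping from Device ID to that record's full line of text. The function name and this in-memory representation are illustrative assumptions, not a description of how Surfside computes differences.

```python
def build_delta(previous, current):
    """Return the records that are new or changed since the previous upload.

    Both arguments map a Device ID to that record's full line of text.
    """
    return {
        device_id: line
        for device_id, line in current.items()
        if previous.get(device_id) != line
    }
```

Records that exist only in the previous data set are not included, which matches the definition of a delta feed above as changed and new records only.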
Empty Values
Some data records may not have values assigned. Empty values will be ignored and no segment will be assigned for a Device ID when there is no value present.