
Custom Export

In cases where sharing via data partner isn’t possible or preferable, data can also be custom exported from your internal systems and shared with us via a secured S3 bucket.

File Requirements



File Naming



Each file uploaded to S3 must include two components in the file name:

The category of data (marketing or events)
A .csv or .csv.gz (if GZIP compression is used) extension

Other than the requirements listed in this section, files can have any name and prefix (folder) structure. Once an import is complete, the file is automatically renamed but remains in the same S3 location. Do not restore the file's original name or extension, since doing so will cause the file to be imported again.

Single App



Single app files must also include the app's store ID (on Android, a package name like com.company.app and on iOS, a numeric ID like 123456789) in the file name.

For example, a marketing data file for an Android app with store ID “com.company.app” could be named marketing_com.company.app_2021_01.csv. An app events data file for an iOS app with store ID “123456789” could be named events_123456789_2021_02_01.csv.

Multiple Apps



Files that contain data for multiple apps must instead include the text __multi__ (leading and trailing double underscores) anywhere in the file name. Each app's store ID must be specified in each line under a column called "Store ID" within the CSV file's content. See Custom Marketing Data and Custom App Events Data for more information specific to each data category.

For example, a marketing data file containing data for multiple apps could be named marketing_2021_01__multi__.csv. Similarly, an app events file containing data for multiple apps could be named events_2021_02_01__multi__.csv.
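The naming rules above can be sketched as a small helper. This is only an illustration: the function name build_file_name, its parameters, and the position of the period label within the name are assumptions, since the rules only require the category, the store ID or __multi__ marker, and the extension to be present.

```python
VALID_CATEGORIES = {"marketing", "events"}


def build_file_name(category, period, store_id=None, compressed=False):
    """Build a file name that satisfies the naming rules described above.

    `period` is a free-form label such as "2021_01". Its format and position
    are a hypothetical choice; the rules do not prescribe them.
    """
    if category not in VALID_CATEGORIES:
        raise ValueError(f"category must be one of {sorted(VALID_CATEGORIES)}")
    extension = ".csv.gz" if compressed else ".csv"
    if store_id is None:
        # Multi-app file: the __multi__ marker may appear anywhere in the name.
        return f"{category}_{period}__multi__{extension}"
    # Single-app file: the app's store ID must appear in the name.
    return f"{category}_{store_id}_{period}{extension}"
```

Called with the values from the examples above, the helper reproduces the same names, such as marketing_com.company.app_2021_01.csv for a single app and marketing_2021_01__multi__.csv for multiple apps.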

File Size



Each file (both marketing and app events) must be no larger than 1 GB after optional compression. If the data set is larger, split it into smaller files by time period so that each file fits within the maximum file size. If you are splitting a data set into more than one file, be sure you understand the overwriting rules.

Compression



Files may be compressed using the GZIP format. Any such files must end with a .csv.gz extension (see File Naming for more information on naming requirements).
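As a rough sketch, a file could be compressed and checked against the size limit with Python's standard library before upload. The helper names are hypothetical, and the byte value used for "1 GB" (10**9 bytes) is a conservative assumption rather than a documented figure.

```python
import gzip
import os
import shutil

# Assumption: "1 GB" is read conservatively as 10**9 bytes.
MAX_FILE_BYTES = 10**9


def gzip_csv(csv_path):
    """Compress csv_path to csv_path + '.gz' and return the new path."""
    if not csv_path.endswith(".csv"):
        raise ValueError("expected a path ending in .csv")
    gz_path = csv_path + ".gz"  # yields the required .csv.gz extension
    with open(csv_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path


def fits_size_limit(path, limit=MAX_FILE_BYTES):
    """Return True if the file at path is no larger than limit bytes."""
    return os.path.getsize(path) <= limit
```

If fits_size_limit returns False for the compressed file, the data set would need to be split by time period as described under File Size.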


Data Requirements



Formatting



Custom exported files must adhere to the following CSV formatting requirements:

Records must be separated by a new line
Values must be separated by a comma
The first line must be the header row
Text values must be double-quoted when they contain commas
Decimal values must use a dot (.) as the decimal separator
Missing values must be represented by a blank (empty value)
Double-quotes must be escaped by a second double-quote character when the text value contains a double-quote character
Dates must be formatted according to the ISO 8601 standard (e.g., 2021-01-01 for January 1, 2021)
Countries must be formatted according to ISO 3166-1 alpha-2 standard (e.g., US for United States)
File encoding must be UTF-8 without a byte order mark (BOM)
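Most of these rules are covered by a standard CSV writer. The following is a minimal sketch using Python's csv module. The function name and the column names in the usage note are hypothetical (the real columns are defined in the Marketing and App Events articles), and the use of "\n" as the line separator is an assumption.

```python
import csv
import io


def rows_to_csv_bytes(header, rows):
    """Serialize rows to CSV bytes following the formatting rules above."""
    buffer = io.StringIO(newline="")
    # QUOTE_MINIMAL quotes only values that contain commas, double-quotes,
    # or line breaks, and doubles any embedded double-quote characters.
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)  # the first line is the header row
    for row in rows:
        # Missing values (None) are written as empty fields.
        writer.writerow(["" if value is None else value for value in row])
    # "utf-8" (unlike "utf-8-sig") encodes without a byte order mark.
    return buffer.getvalue().encode("utf-8")
```

For instance, passing datetime.date values, a country code such as "US", and a campaign name that contains commas and double-quotes produces ISO 8601 dates, a quoted campaign field with doubled inner quotes, and dot-separated decimals. Country codes are written as given, so they must already be in ISO 3166-1 alpha-2 form.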

Duplicates



There must be no duplicate rows in the CSV files. Rows with matching dimensional columns (e.g., date, channel, campaign, country, etc.) are considered duplicates, even if their metric values differ. If a CSV file contains duplicates, the import process will not fail, but the imported metrics will likely be too low, since only one row within each set of duplicates is actually imported.
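A file can be checked for duplicates before upload. The sketch below counts rows per combination of dimensional columns; the function name and the column names you pass in are assumptions and should be replaced with the dimensional columns of your data category.

```python
import csv
import io
from collections import Counter


def find_duplicate_keys(csv_text, dimension_columns):
    """Return the dimension-value tuples that occur on more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        tuple(row[column] for column in dimension_columns) for row in reader
    )
    return [key for key, count in counts.items() if count > 1]
```

An empty result means no two rows share the same dimensional values. Any returned tuple identifies a set of rows of which only one would be imported.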

Specifications



The precise specifications of the data will differ based on the data category. More information and examples are contained in the articles below.

Marketing
App Events


Import Process



S3 Upload



We grant you write access to a secured, encrypted S3 bucket and provide you the S3 URI, AWS region, and programmatic credentials (access key ID and a secret access key). You can use the AWS S3 CLI or API to upload your files to the bucket and prefix (folder) using the provided credentials. Your credentials also allow you to list and download files. AWS Management Console access is disabled for security reasons.

If you are unfamiliar with S3, please see our detailed AWS CLI S3 upload instructions for step-by-step guidance.
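For uploads through the API rather than the CLI, one option is the boto3 library for Python. The sketch below is an illustration, not our official tooling: the helper name upload_export is hypothetical, and it accepts any client object that offers boto3's upload_file(Filename, Bucket, Key) method. The bucket, prefix, region, and credentials are placeholders for the values we provide to you.

```python
import os


def upload_export(s3_client, local_path, bucket, prefix=""):
    """Upload local_path under the given prefix and return the object key.

    s3_client is expected to behave like a boto3 S3 client, i.e. to offer
    upload_file(Filename, Bucket, Key).
    """
    file_name = os.path.basename(local_path)
    clean_prefix = prefix.strip("/")
    key = f"{clean_prefix}/{file_name}" if clean_prefix else file_name
    s3_client.upload_file(local_path, bucket, key)
    return key


# Usage sketch (requires `pip install boto3`; all values are placeholders):
#
#   import boto3
#   client = boto3.client(
#       "s3",
#       region_name="<provided AWS region>",
#       aws_access_key_id="<provided access key ID>",
#       aws_secret_access_key="<provided secret access key>",
#   )
#   upload_export(client, "marketing_2021_01__multi__.csv", "<bucket>", "<prefix>")
```

Passing the client in as an argument keeps the helper independent of how the credentials are loaded.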

Time Period



The historical export should ideally cover 1 year, ending on the most recent date for which complete data is available (normally yesterday). If that isn’t possible, the export should cover the full time period that is available.

For ongoing data exports, only the new data since the last export should be included. Therefore, if the ongoing sync frequency is daily, each export should only contain the new data from the previous day. This works differently for marketing and app events files.

For marketing files, new data from the previous day only includes a single date of data. For app events files, new data from the previous day can include cohorted metrics for many install dates. Therefore, ongoing app events exports must start with the install date of the latest incomplete cohort in the previous export. To calculate this date, subtract the highest cohort day you are importing from the latest install date in the previous export. For example, if you’re importing metrics up to cohort day 30 and the previous export ended on January 31, the next export must include app events for install dates from January 1 onward (January 31 minus 30 days).
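The start date calculation for app events exports can be sketched in a few lines of Python. The function and parameter names are illustrative only.

```python
from datetime import date, timedelta


def next_events_export_start(previous_latest_install_date, max_cohort_day):
    """Return the first install date the next app events export must include."""
    # Go back by the highest imported cohort day from the latest install
    # date covered by the previous export.
    return previous_latest_install_date - timedelta(days=max_cohort_day)
```

With a previous export ending on January 31, 2021 and a highest cohort day of 30, next_events_export_start(date(2021, 1, 31), 30) returns date(2021, 1, 1), matching the example above.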

Frequency



The frequency included in your pricing plan determines not only how often your models are refreshed, but also how often new input data is imported. New data can be uploaded to S3 at any time, but it will only be imported at the frequency specified in your pricing plan. See the Queuing section below for more details.

The only time your pricing plan does not dictate the frequency of data imports is during the initial data import phase, before the first model is trained. During this phase, the primary goal is to ensure the input data is in order so that the model resulting from that data is reliable. For that reason, data will be imported daily until the data is validated and the initial model is trained.

You can change the days of the week that imports take place by contacting your Customer Success Manager. The following defaults are used.

Frequency      Default Days of the Week
Weekly         Monday
2x Per Week    Monday and Thursday
3x Per Week    Monday, Wednesday, and Friday


Overwriting



If an import includes dates that overlap with previously imported data, the new data will overwrite the existing data for the overlapping dates. This is useful when a previous import was incomplete or incorrect: the flawed data can be replaced simply by uploading a corrected file that contains the same dates.

Queuing



If multiple files of the same data category are uploaded for a single app, they will be imported one at a time in the order in which they were uploaded. Therefore, due to overwriting rules, it is important that any files of the same data category that contain overlapping dates are uploaded in the order in which you wish them to be imported. Usually, this means you should upload files containing earlier dates first.

This is most relevant when the frequency included in your pricing plan is less frequent than daily. In that case, several files uploaded over a number of days can be pending import at the same time. They will always be imported one at a time in the order in which they were uploaded, each overwriting any dates that overlap with previous files according to the overwriting rules.
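The interaction between queuing and overwriting can be illustrated with a toy model. This is a simplified sketch, not a description of the actual import implementation: it assumes each file is represented as a mapping from date to a single value, and the function name apply_imports is hypothetical.

```python
def apply_imports(uploads):
    """Apply files in upload order and return the resulting data by date.

    uploads is a list of (file_name, {date_string: value}) pairs, ordered
    by upload time.
    """
    imported = {}
    for _file_name, rows_by_date in uploads:
        # A later file overwrites earlier data for any overlapping dates.
        imported.update(rows_by_date)
    return imported
```

If a file covering January 1 and 2 is uploaded first and a corrected file covering January 2 and 3 is uploaded second, the result keeps January 1 from the first file and takes January 2 and 3 from the second. Uploading the two files in the opposite order would leave the uncorrected January 2 data in place.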

Updated on: 31/08/2022
