Report Configuration¶
Overview¶
This document details the configuration options available to customize the behavior of individual reports. These configurations provide the flexibility to set up different functionalities and presentations tailored to the specific needs of each report type or for each account. You can manage these configuration keys directly through the reports page on the platform.
```mermaid
flowchart TB
    Platform --> ProjectService["Project Service"]
    ProjectService --> Vortex["Vortex (vt_extractor_properties)"]
    Microservices <--> Vortex
    classDef service fill:#c9f,stroke:#333;
    classDef data fill:#f96,stroke:#333;
    class ProjectService,Microservices service;
    class Vortex data;
```
Available Configurations¶
ACTIVE_CHILD_TASK_LIMIT¶
- Default Value: 250
- Description: Limits the maximum number of concurrent child processor workflows a single crawler can spawn. The default limit of 250 manages resource consumption within the Kubernetes environment and prevents infrastructure overload caused by an excessive number of pods. This configuration controls the resource load generated by each report/crawler, safeguarding the stability of the underlying infrastructure. While the default is 250, the value can be overridden on a per-report basis when a higher degree of parallelism is required due to client priority or report complexity. For instance, for the report identified as 1m, this limit has been increased to 500.
- Possible Values: any numeric value
- Component Affected: Task Queue Processor Workflow (consumed directly within Temporal). No separate microservices consume this configuration.
ADDITIONAL_DESTINATION¶
- Default Value: null
- Description: Enables additional email-based delivery of the same dataset to multiple recipients while ensuring data isolation. This is especially useful when multiple clients require identical data at the same intervals, but only one of them actively uses the platform. By using this feature, the system avoids duplicating projects and maintains delivery confidentiality: recipients added through this setting will not appear in platform activity logs. Currently, this configuration is used only in the FINRA BrokerCheck project.
- Possible Values: valid email address
- Component Affected: Phoenix Delivery Service
ALLOW_SUPPORTING_FILES_DIR¶
- Default Value: false
- Description: When enabled, this setting preserves the original folder structure of supporting files as it existed during data crawling. By default, supporting files are exported in a flat hierarchy, but this option allows nested directories to be retained and reflected in the final delivery to the customer. This feature is used in select reports under the PattonAllen account.
- Possible Values: true or false
- Component Affected: Phoenix Delivery Service
CRAWLER_ANOMALY¶
- Default Value: null
- Description: Allows customization of the crawler's anomaly detection behavior on datasets. Accepts a JSON input to override default detection rules, enabling fine-grained control such as ignoring row count discrepancies or excluding specific fields from anomaly checks. This is useful for reports that require tailored anomaly handling based on known data patterns or client-specific needs.
- Possible Values: JSON
- Component Affected: Phoenix Data Quality Service
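The exact JSON shape accepted by this key is defined by the Phoenix Data Quality Service. As an illustrative sketch only (the key names below are hypothetical, not confirmed field names), an override that ignores row count discrepancies and skips certain fields might look like:

```json
{
  "ignore_row_count": true,
  "excluded_fields": ["last_crawled_at", "source_url"]
}
```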
CUSTOM_ZIP_FILE_NAME¶
- Default Value: null
- Description: Overrides the default naming convention for the exported ZIP file. By default, the name of the first page in a multi-page report is used as the ZIP file name. This setting allows specifying a custom name instead, providing more control over file naming for delivery and archival purposes.
- Possible Values: desired file name
- Component Affected: Phoenix Export Service, Export Workflow
SUPPORTING_FILES_ZIP_NAME¶
- Default Value: null
- Description: Similar to CUSTOM_ZIP_FILE_NAME, but applies to supporting-files ZIP archives.
- Possible Values: desired file name
- Component Affected: Phoenix Export Service, Export Workflow
DATA_PROFILER¶
- Default Value: {"auto_profile": true}
- Description: Enables or disables the automatic generation of data profiles for exported reports. Profiling is performed using DataPrep after each dataset is exported. This setting is useful for selectively turning off profiling for reports where it is not required.
- Possible Values: JSON ({"auto_profile": true})
- Component Affected: Phoenix Data Quality Service
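For example, to turn off automatic profiling for a report that does not need it, the flag shown in the default value can be inverted:

```json
{"auto_profile": false}
```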
DATA_PROFILE_ANOMALY¶
- Default Value: null
- Description: Provides a customizable configuration for detecting anomalies within the data profiler output. Accepts a JSON input to define report-specific anomaly detection rules, such as ignoring certain fields, setting thresholds, or excluding known variances. This allows anomaly detection to be fine-tuned to the unique characteristics of each report.
- Possible Values: JSON
- Component Affected: Phoenix Data Quality Service
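As an illustrative sketch only (the key names below are hypothetical; the actual schema is defined by the Phoenix Data Quality Service), a report-specific rule set excluding a field and setting a threshold might look like:

```json
{
  "excluded_fields": ["notes"],
  "null_ratio_threshold": 0.25
}
```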
DEFAULT_IMAGE¶
- Default Value: null
- Description: Specifies the default backend image to be used during report execution. While crawlers typically use the standard Vortex image, this setting allows you to override it with a different image for specific reports that require custom environments or dependencies.
- Possible Values: selected from a dropdown of available images
- Component Affected: Phoenix Scheduler Service
DELIVERY_DESTINATION¶
- Default Value: false
- Description: Controls whether delivery destinations can be configured for a report. When disabled, the platform prevents users from setting or modifying delivery destinations, effectively restricting output distribution through standard delivery channels.
- Possible Values: true or false
- Component Affected: Phoenix Delivery Service
DELIVERY_FAILURE_EMAILS¶
- Default Value: null
- Description: Specifies a list of email addresses to notify in the event of a delivery failure. If delivery to any configured destination fails, alerts are sent to the addresses in this list to ensure timely awareness and resolution.
- Possible Values: list of email addresses
- Component Affected: Phoenix Delivery Service
EXPORT_CRITERIA¶
- Default Value: null
- Description: Defines JSON-based rules that determine whether a dataset should be exported. The rules consist of column-value conditions, and data is exported only if the specified criteria are met. This allows conditional exports based on dynamic dataset content, enabling greater control over report generation. This feature is used by Buildfax.
- Possible Values: JSON
- Component Affected: Phoenix Export Service
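As an illustrative sketch only (the exact JSON shape is defined by the Phoenix Export Service; the key and column names below are hypothetical), a column-value condition might look like:

```json
{
  "conditions": [
    {"column": "status", "value": "Completed"}
  ]
}
```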
EXPORT_FETCH_NUMBER_ROWS¶
- Default Value: null
- Description: Controls the number of rows fetched per batch when reading a dataset during export.
- Possible Values: numeric value
- Component Affected: Phoenix Export Service
META_FIELDS¶
- Default Value: null
- Description: Specifies a JSON list of columns to include during export, enabling partial exports. Only the fields defined in this configuration are included in the exported file, allowing customized, minimal datasets tailored to specific use cases.
- Possible Values: JSON list of column names
- Component Affected: Phoenix Export Service, Export Workflow
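For example, a partial export limited to three columns (the column names here are illustrative only) could be configured as:

```json
["order_id", "customer_name", "order_date"]
```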
PARALLEL_API_RUNS_ALLOWED¶
- Default Value: 10
- Description: Determines the number of parallel runs allowed when initiating reports or crawlers via API calls. If the number of active runs exceeds the configured limit, additional requests are queued until a slot becomes available. This helps manage system load and ensures controlled execution of concurrent tasks.
- Possible Values: numeric value
- Component Affected: Crawler Lifecycle Workflow
PARQUET_SCHEMA¶
- Default Value: null
- Description: Allows overriding the default data types in exported Parquet datasets. By default, all columns are treated as strings, but this configuration accepts a JSON schema that maps specific columns to their desired data types. This ensures accurate type representation and compatibility with downstream data processing tools.
- Possible Values: JSON
- Component Affected: Phoenix Export Service
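As an illustrative sketch (the column names and type spellings below are hypothetical; consult the Phoenix Export Service for the supported type names), a schema override mapping columns to non-string types might look like:

```json
{
  "order_id": "int64",
  "total_amount": "double",
  "created_at": "timestamp"
}
```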
UNARCHIVAL_BATCH_SIZE¶
- Default Value: 30
- Description: Controls the batch size used when unarchiving datasets from Avro files stored in S3. Datasets are typically archived after 3 days, and when clients request older data, the system restores it by inserting records back into MongoDB. For reports with a large number of columns or very large row sizes, reducing the batch size helps prevent memory exhaustion during this process.
- Possible Values: any numeric value
- Component Affected: Crawldata Service
ZIP_ARC_FILES_REMOVE_TIMESTAMP¶
- Default Value: 0
- Description: Determines whether to exclude the timestamp prefix from the names of files within the exported ZIP archive. When enabled, exported files retain cleaner, static names without timestamp prefixes. This is useful for scenarios such as email deliveries where consistent filenames are preferred, or for replacing an existing file at a delivery destination such as S3, Google Cloud Storage, Azure, or FTP.
- Possible Values: 0 or 1
- Component Affected: Phoenix Export Service
How to Apply Configurations¶
To apply or modify report configurations:

1. Log in to the platform.
2. Navigate to the Reports page.
3. Locate the desired report and click the three-dot menu (...) on the right side of the row.
4. Select Report Configuration from the submenu.

A sidebar will open where you can view or update configuration attributes as needed.
Important Considerations¶
- Changes to these configurations may require a refresh or regeneration of the specific report.
- Incorrect configuration values might lead to unexpected behavior or errors in report generation.
- Always test configuration changes in a non-production environment first.