Qore DataProvider Module Reference  1.0.3
DataProvider Module

Introduction to the DataProvider Module

The DataProvider module provides APIs for describing, querying, introspecting, and updating hierarchical data structures from arbitrary sources. It also supports data providers with request-reply semantics, such as REST schemas or SOAP messaging.

The DataProvider module supports high-performance reading (searching) and writing, as well as record creation, upserting, and transaction management, where the underlying data provider implementation supports them.

The Qore command-line program qdp provides a user-friendly interface to data provider functionality.

This module provides the following primary classes:

The following supporting type classes are also provided:

Data Provider Modules

This module uses the "QORE_DATA_PROVIDERS" environment variable to register data provider modules. Each data provider registration module must provide one of the following two public functions.

Data Provider Dynamic Discovery

Implement a public function with the following signature to support dynamic discovery of data providers:

# returns a hash of connection provider factory names to module names
public hash<string, string> sub get_data_provider_map() { ... }

Data Provider Type Dynamic Discovery

Implement a public function with the following signature to support dynamic discovery of data provider types:

# returns a hash of type prefix paths (ex: "qore/sftp") to module names
public hash<string, string> sub get_type_provider_map() { ... }

Data provider registration modules declared in the "QORE_DATA_PROVIDERS" environment variable must be separated by the platform-specific PathSep character as in the following examples:

Unix Example
export QORE_DATA_PROVIDERS=MyConnectionProvider:OtherConnectionProvider
Windows CMD.EXE Example
set QORE_DATA_PROVIDERS=MyConnectionProvider;OtherConnectionProvider
Windows PowerShell Example
$env:QORE_DATA_PROVIDERS="MyConnectionProvider;OtherConnectionProvider"
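
As an illustration of the separator rule above, the following minimal Python sketch splits a "QORE_DATA_PROVIDERS"-style value into module names. It is not part of the DataProvider module; the function name parse_provider_modules is hypothetical, and skipping empty entries is an assumption of this sketch, not documented behavior.

```python
import os


def parse_provider_modules(value, sep=os.pathsep):
    """Split a QORE_DATA_PROVIDERS-style string into module names.

    Hypothetical helper: empty entries are skipped (an assumption).
    """
    return [name for name in value.split(sep) if name]


# Separators are passed explicitly so the result does not depend on the host OS
unix_modules = parse_provider_modules(
    "MyConnectionProvider:OtherConnectionProvider", sep=":")
windows_modules = parse_provider_modules(
    "MyConnectionProvider;OtherConnectionProvider", sep=";")
```

When no separator is given, os.pathsep supplies the platform-specific value (":" on Unix, ";" on Windows).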

Data Provider Pipelines

Data provider pipelines allow for efficient processing of record or other data in multiple streams and, if desired, in multiple threads.

Pipeline data can be any data type except a list, as list values are interpreted as multiple output values in pipeline processor objects.
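
The list rule can be pictured with a small Python sketch. This is a conceptual model only, not the module's Qore API; the function name enqueue_outputs and the plain-list queue are hypothetical.

```python
def enqueue_outputs(result, queue):
    """Add a processor result to the queue.

    A list is treated as several separate output values; any other
    value is a single output. (Conceptual model, hypothetical names.)
    """
    if isinstance(result, list):
        queue.extend(result)
    else:
        queue.append(result)


queue = []
enqueue_outputs({"id": 1}, queue)               # one output value
enqueue_outputs([{"id": 2}, {"id": 3}], queue)  # two output values
```

This is why a list cannot itself be a single pipeline value: it would be indistinguishable from several outputs.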

Data Provider Pipeline Bulk Processing

Bulk processing is the processing of record data in "hash of lists" form: a single hash where each key's value is a list of values for that key. Records are formed by taking each hash key and using the list values in order, so that the value at position N of every list belongs to record N. If a key is assigned a single value instead of a list, that value is interpreted as a constant for all records. Note that the value of the first key in bulk data must be a list in order for the bulk data to be detected properly.
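
The "hash of lists" layout is easier to see with concrete data. The following Python sketch models it with a dict; it illustrates the rules described above and is not the module's implementation. The names is_bulk and bulk_to_records and the sample keys are hypothetical.

```python
def is_bulk(data):
    """Detect bulk data: a non-empty dict whose first key's value is a list."""
    if not isinstance(data, dict) or not data:
        return False
    first_value = next(iter(data.values()))
    return isinstance(first_value, list)


def bulk_to_records(data):
    """Expand "hash of lists" data into a list of per-record dicts.

    List values supply one value per record; a non-list value is
    treated as a constant for all records.
    """
    size = len(next(iter(data.values())))
    return [
        {key: (value[i] if isinstance(value, list) else value)
         for key, value in data.items()}
        for i in range(size)
    ]


bulk = {
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "source": "import",  # single value: constant for every record
}
records = bulk_to_records(bulk)
```

In this model, a dict such as {"source": "import", "id": [1, 2]} is not detected as bulk data, because the value of its first key is not a list.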

Each pipeline processor element must declare if it is compatible with bulk processing by implementing the AbstractDataProcessor::supportsBulkApiImpl() method.

If a processor does not support the bulk API but bulk data is submitted, the bulk data is iterated and each record is passed to the processor separately, which incurs a significant performance penalty when processing large volumes of data.
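
The fallback can be sketched in Python as follows. This is a conceptual model under stated assumptions, not the module's Qore code: the class RecordingProcessor, its supports_bulk attribute (standing in for the value reported by supportsBulkApiImpl()), and the function submit_bulk are all hypothetical.

```python
class RecordingProcessor:
    """Hypothetical processor that records every call made to submit()."""

    def __init__(self, supports_bulk):
        self.supports_bulk = supports_bulk
        self.calls = []

    def submit(self, data):
        self.calls.append(data)


def iter_records(bulk):
    """Yield one dict per record from "hash of lists" data."""
    size = len(next(iter(bulk.values())))
    for i in range(size):
        yield {key: (value[i] if isinstance(value, list) else value)
               for key, value in bulk.items()}


def submit_bulk(processor, bulk):
    """Pass bulk data through unchanged if supported, else record by record."""
    if processor.supports_bulk:
        processor.submit(bulk)        # a single call for all records
    else:
        for record in iter_records(bulk):
            processor.submit(record)  # one call per record


bulk = {"id": [1, 2, 3], "source": "import"}
bulk_proc = RecordingProcessor(supports_bulk=True)
row_proc = RecordingProcessor(supports_bulk=False)
submit_bulk(bulk_proc, bulk)
submit_bulk(row_proc, bulk)
```

In this sketch the bulk-capable processor is called once, while the other is called once per record, which is where the per-record overhead comes from.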

Release Notes

DataProvider v1.0.4

  • fixed type-handling bugs in the handling of data provider options (issue 4062)

DataProvider v1.0.3

DataProvider v1.0.2

DataProvider v1.0.1

DataProvider v1.0

  • initial release of the module