Creating a CDA
Data Connection & Access in one file ..
CDA (Community Data Access) is a CTools component that provides data abstraction for diverse data sources through web services. While it was initially developed to serve as a bridge between data connections and the Community Dashboard Framework (CDF), its functionality has expanded.
Now, it can also be integrated with Report Designer to incorporate data into third-party applications.
Key features of CDA include:
• Multiple configurable output formats
• Performance optimization through customizable caching
• Server-side sorting and pagination capabilities
CDA's data flow works as follows:
When a dashboard (CDF/CDE) or external application requests data through CDA's endpoints:
- CDA first checks if caching is enabled
- If enabled, it verifies:
• Whether results for this specific query and parameters exist in cache
• If cached results are still valid (not expired)
• If cache keys match
- Only if no valid cached data is found does CDA query the underlying data sources
This architecture makes CDA an efficient middleware layer that minimizes unnecessary database queries while providing flexible data access to various front-end applications.
As we can see in the diagram, the available data sources for CDA are:
• SQL over JDBC or JNDI.
• MDX queries over Mondrian or olap4j.
• MQL queries over a Pentaho metadata connection.
• Kettle transformations.
• Scripting (only Beanshell and JavaScript are currently supported).
• XPath over XML files.
• Compound queries.
CDA examples: /public/CTools-Dashboard/CDA
There are multiple ways to create CDA data sources. One of the ways is to use CDE, where no code or XML is needed, and we will cover this later in the CDE workshop.
There is another way, the hard way, which is editing the file by hand.
CDA files are XML files with a .cda extension. Pentaho recognizes the extension and lets you preview the results or edit the file.
A CDA file begins with an XML declaration, followed by structured elements that define connections and data access configurations. The primary element is CDADescriptor, which contains data source definitions that can be shared across multiple queries.
Rather than repeatedly defining database connections within individual queries, we centralize connection settings in the data source element. This makes sense since multiple queries often use the same connection parameters.
Each data access definition requires specific attributes to be set. Here's how the structure works:
• XML declaration at the top
• CDADescriptor as the root element
• Data source elements containing shared connection settings
• Individual data access configurations with their required attributes
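The layout above can be sketched as a minimal skeleton. The element names follow the style of the CDA sample files; the ids, the JNDI name SampleData, and the query itself are illustrative assumptions, not values taken from this workshop.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root element of a .cda file -->
<CDADescriptor>
  <!-- Shared connection settings, defined once and reused by queries -->
  <DataSources>
    <Connection id="sampleConnection" type="sql.jndi">
      <!-- Assumed JNDI name; use one that is defined on your server -->
      <Jndi>SampleData</Jndi>
    </Connection>
  </DataSources>

  <!-- One DataAccess per query; it points at a connection by id -->
  <DataAccess id="customerNames" connection="sampleConnection"
              type="sql" access="public">
    <!-- Hypothetical query against an assumed CUSTOMERS table -->
    <Query>SELECT CUSTOMERNAME FROM CUSTOMERS</Query>
  </DataAccess>
</CDADescriptor>
```

Keeping the connection in DataSources means a change to the database location is made in one place, however many queries use it.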
ID
• Defines a unique identifier
• Used to reference specific connections in queries
• Must be unique across all connections
Type
• Specifies the connection type
• Determines required internal elements
• In CDE: Automatically configured based on selected data source
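As a sketch of how id and type shape a connection, the fragment below defines two SQL connections, one over JNDI and one over JDBC. The type value decides which child elements are needed. The ids, JNDI name, driver class, URL, and credentials are placeholders chosen for illustration.

```xml
<!-- Fragment: belongs inside the CDADescriptor root element -->
<DataSources>
  <!-- type="sql.jndi": only the JNDI name is needed -->
  <Connection id="viaJndi" type="sql.jndi">
    <Jndi>SampleData</Jndi>
  </Connection>

  <!-- type="sql.jdbc": driver, URL, and credentials are given inline -->
  <Connection id="viaJdbc" type="sql.jdbc">
    <Driver>org.hsqldb.jdbcDriver</Driver>
    <Url>jdbc:hsqldb:hsql://localhost:9001/sampledata</Url>
    <User>example_user</User>
    <Pass>example_password</Pass>
  </Connection>
</DataSources>
```

Each id is unique within the file, so a query can reference either connection without ambiguity.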
Once connections are established, proceed to create data access queries. Each query should follow this structure:
id
This is used to define the data access identifier that will be used in the components.
connection
This is the identifier of the connection created previously. Several data accesses can share the same connection id.
access
This defines whether the data access is visible. It takes one of two values: private or public. A private data access is not visible; a public one is. You may want to make a data access private when it is only used inside compound queries to create unions or joins between queries.
cache
This defines whether the results of the query will be cached. Possible values are true and false. The default value is true.
cacheDuration
This defines the cache duration in seconds. Once that time has passed, the next request executes the query again and the new results are cached. This attribute is ignored when cache is set to false. The default value is 3600, which is one hour expressed in seconds.
type
Like a connection, a data access needs a type. The two serve different purposes, though, so the query type is specified here separately from the connection type.
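The attributes described above might be combined as in the following sketch. It assumes a connection with the id sampleConnection is defined elsewhere in the file; the data access ids, table names, and queries are hypothetical.

```xml
<CDADescriptor>
  <!-- DataSources omitted; assumes a connection with id "sampleConnection" -->

  <!-- Public, cached for one hour (3600 seconds) -->
  <DataAccess id="ordersByStatus"
              connection="sampleConnection"
              type="sql"
              access="public"
              cache="true"
              cacheDuration="3600">
    <Query>SELECT STATUS, COUNT(*) AS TOTAL FROM ORDERS GROUP BY STATUS</Query>
  </DataAccess>

  <!-- Private and uncached: meant only as input to a compound query -->
  <DataAccess id="helperQuery"
              connection="sampleConnection"
              type="sql"
              access="private"
              cache="false">
    <Query>SELECT STATUS FROM ORDERS</Query>
  </DataAccess>
</CDADescriptor>
```

Both data accesses share one connection id, and cacheDuration is left off the second because it is ignored when cache is false.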
Here's an outline of a connection and data access using a JSON query.
Common Properties
There are some common properties that should or can be used when defining a Data Access. These properties are:
• Cache: Besides the cache and cacheDuration attributes, the cache can be defined as an element of the Data Access. When defining the cache as an element, specify its two attributes, duration and enabled. The duration attribute sets how long the results stay cached after the last execution. The enabled attribute is set to true or false, depending on whether you want caching on or off.
• Name: This is the friendly name of the data access being defined.
• Columns: This element changes the output by renaming columns or by adding new ones as calculated columns. To rename a column, specify the column's idx, starting from 0, and the desired name.
To create a calculated column, specify the name of the new column and the formula to be used. Formulas should follow the OpenFormula specification.
• Query: Almost every Data Access makes use of this element. Refer to each of the data source types listed earlier for more information.
• Parameters: These are the parameters to be sent/used in the query.
• Output: This element lets us define a different output other than the one defined in the queries.
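The common properties above could look like this inside a single data access. This is a sketch in the style of the CDA sample files: the id, the connection id, the table and column names, and the parameter are assumptions made for illustration.

```xml
<!-- Fragment: belongs inside the CDADescriptor root element -->
<DataAccess id="salesByYear" connection="sampleConnection"
            type="sql" access="public">
  <!-- Friendly name of the data access -->
  <Name>Sales by year and status</Name>

  <!-- Cache defined as an element, with its two attributes -->
  <Cache enabled="true" duration="3600"/>

  <!-- ${status} is filled in from the parameter declared below -->
  <Query>
    SELECT YEAR_ID, STATUS, SUM(TOTALPRICE) AS PRICE
    FROM ORDERFACT
    WHERE STATUS = ${status}
    GROUP BY YEAR_ID, STATUS
  </Query>

  <Parameters>
    <Parameter name="status" type="String" default="Shipped"/>
  </Parameters>

  <Columns>
    <!-- Rename the first column (idx starts at 0) -->
    <Column idx="0">
      <Name>Year</Name>
    </Column>
    <!-- Add a new column computed from an existing one -->
    <CalculatedColumn>
      <Name>PriceInK</Name>
      <Formula>=[PRICE]/1000</Formula>
    </CalculatedColumn>
  </Columns>
</DataAccess>
```

The query result has three columns; the Columns element renames the first and appends a fourth that is derived from PRICE.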