Building Semantic Models

Creating a Semantic Model

Create a new model

Navigate to Semantic Models in the sidebar and click New Model. Enter a name that describes the domain this model covers (e.g., “E-Commerce Analytics” or “Product Usage”). Select the database connection it will query against and set the default database.

Add entities from the schema browser

Click Add Entity to open the schema browser. Browse your database tables and select the ones you want to include. Each table becomes an entity - a business object like “Orders”, “Customers”, or “Products”.Give each entity a business-friendly name. For example, rename your raw table names to readable labels like “Customers” and “Orders”.

Configure columns

For each entity, configure its columns to make them meaningful for the AI and your team.

Define relationships

Connect your entities so the AI can write multi-table queries with correct JOINs.

Create measures

Define reusable aggregation formulas that the AI and visual builder can reference by name.

Configuring Columns

Each column in an entity can be customized with the following properties:

Property	Description
Name	Business-friendly display name (e.g., `cust_nm` becomes “Customer Name”)
Data type	`string`, `number`, `date`, `boolean`, `array`, or `object`
Column role	`dimension` (for grouping), `measure` (for aggregation), or `key` (for joining)
Description	Plain language explanation - this is what the AI reads to understand the column
Display format	Formatting rules like `$#,###.00` for currencies or `#.##%` for percentages
Hidden	Hide columns from the AI’s suggestions while keeping them available for joins and filters

Column descriptions are critical for AI accuracy. A column named status could mean anything - adding a description like “Order fulfillment status: pending, processing, shipped, delivered, cancelled” helps the AI generate correct filters.

Virtual Columns

Virtual columns use a SQL expression instead of referencing a source column. They are useful for:

Extracting nested values: JSONExtractString(metadata, 'campaign_source') to pull a value from a JSON field
Date transformations: toStartOfMonth(created_at) to bucket dates
Computed values: price * quantity to calculate line item totals

To create a virtual column, click Add Column on an entity and enter the SQL expression in the Expression field instead of selecting a source column.

Column Roles

Assigning the correct role to each column helps the AI choose appropriate query patterns:

Dimension: Columns used for grouping, filtering, and labeling (e.g., country, product name, status). These appear in GROUP BY clauses.
Measure: Columns used for aggregation (e.g., amount, quantity, price). These appear inside SUM, AVG, COUNT, and other aggregate functions.
Key: Columns used for joining entities (e.g., customer_id, order_id). These are used in ON clauses when the AI writes JOINs.

Defining Relationships

Relationships connect entities to enable multi-table queries. Each relationship requires:

From entity and column - the foreign key side (e.g., Orders.customer_id)
To entity and column - the primary key side (e.g., Customers.id)
Cardinality - the type of relationship

Cardinality	Example
One-to-one	A customer has one profile
One-to-many	A customer has many orders
Many-to-one	Many orders belong to one customer
Many-to-many	Products belong to many categories, and categories contain many products

Relationship Validation

After defining a relationship, click Validate to sample your actual data and verify the relationship integrity. The validation report includes:

Match rate - percentage of foreign key values that found a match in the target entity
Matched count - number of successfully joined rows
Orphaned count - foreign key values with no matching primary key
Null count - rows where the join column is null
Sample matches - a preview of successfully joined rows
Semantic warnings - potential issues like low match rates or high null counts

Validation status is tracked as one of:

Status	Meaning
`pending`	Not yet validated
`valid`	High match rate, relationship is reliable
`warning`	Some issues detected (e.g., moderate orphaned records)
`invalid`	Low match rate or significant data quality issues

Relationships with invalid status may cause the AI to generate queries that return incomplete or incorrect results. Fix the underlying data quality issue or adjust the relationship definition before using the model.

Creating Measures

Measures are reusable aggregation formulas that ensure consistent metric definitions across all queries, dashboards, and reports.

Simple Measures

Define a single aggregation expression:

SUM(amount) - Total Revenue
COUNT(DISTINCT customer_id) - Unique Customers
AVG(price) - Average Price
MIN(created_at) - First Order Date
MAX(updated_at) - Last Update

Each measure has a data type (number, currency, or percent) and an optional display format to control how values appear in results.

Compound Measures

Compound measures reference other measures by their IDs through dependent measure references. This lets you build derived calculations:

Average Order Value = Total Revenue / Total Orders
Profit Margin = (Total Revenue - Total Cost) / Total Revenue
Conversion Rate = Completed Orders / Total Sessions

Define measures for any metric your team asks about frequently. When everyone uses the same “Total Revenue” measure, you avoid discrepancies caused by different people writing slightly different aggregation queries.

Best Practices

Write detailed column descriptions

The AI relies on column descriptions to understand your data. Vague names like type or value need clear descriptions. Include possible values for categorical columns (e.g., “Payment method: credit_card, debit_card, paypal, wire_transfer”).

Hide internal columns

Mark implementation-specific columns (internal IDs, audit fields, denormalized caches) as hidden. This reduces noise in AI suggestions without removing the columns from the model.

Use virtual columns for JSON fields

If your tables store semi-structured data in JSON columns, create virtual columns that extract the commonly queried fields. This makes them available as first-class dimensions and measures.

Validate all relationships

Always run validation after defining a relationship. A relationship with a low match rate or high orphan count will lead to incorrect query results. Fix data quality issues at the source before relying on the relationship.

Keep models focused

Create separate semantic models for distinct domains (e.g., “Sales Analytics” and “Product Usage”) rather than one large model with every table. This improves AI accuracy by reducing the search space.

​Creating a Semantic Model

​Configuring Columns

​Virtual Columns

​Column Roles

​Defining Relationships

​Relationship Validation

​Creating Measures

​Simple Measures

​Compound Measures

​Best Practices

Creating a Semantic Model

Configuring Columns

Virtual Columns

Column Roles

Defining Relationships

Relationship Validation

Creating Measures

Simple Measures

Compound Measures

Best Practices