Skip to main content

Creating a Semantic Model

1

Create a new model

Navigate to Semantic Models in the sidebar and click New Model. Enter a name that describes the domain this model covers (e.g., “E-Commerce Analytics” or “Product Usage”). Select the database connection it will query against and set the default database.
2

Add entities from the schema browser

Click Add Entity to open the schema browser. Browse your database tables and select the ones you want to include. Each table becomes an entity - a business object like “Orders”, “Customers”, or “Products”.Give each entity a business-friendly name. For example, rename your raw table names to readable labels like “Customers” and “Orders”.
3

Configure columns

For each entity, configure its columns to make them meaningful for the AI and your team.
4

Define relationships

Connect your entities so the AI can write multi-table queries with correct JOINs.
5

Create measures

Define reusable aggregation formulas that the AI and visual builder can reference by name.

Configuring Columns

Each column in an entity can be customized with the following properties:
PropertyDescription
NameBusiness-friendly display name (e.g., cust_nm becomes “Customer Name”)
Data typestring, number, date, boolean, array, or object
Column roledimension (for grouping), measure (for aggregation), or key (for joining)
DescriptionPlain language explanation - this is what the AI reads to understand the column
Display formatFormatting rules like $#,###.00 for currencies or #.##% for percentages
HiddenHide columns from the AI’s suggestions while keeping them available for joins and filters
Column descriptions are critical for AI accuracy. A column named status could mean anything - adding a description like “Order fulfillment status: pending, processing, shipped, delivered, cancelled” helps the AI generate correct filters.

Virtual Columns

Virtual columns use a SQL expression instead of referencing a source column. They are useful for:
  • Extracting nested values: JSONExtractString(metadata, 'campaign_source') to pull a value from a JSON field
  • Date transformations: toStartOfMonth(created_at) to bucket dates
  • Computed values: price * quantity to calculate line item totals
To create a virtual column, click Add Column on an entity and enter the SQL expression in the Expression field instead of selecting a source column.

Column Roles

Assigning the correct role to each column helps the AI choose appropriate query patterns:
  • Dimension: Columns used for grouping, filtering, and labeling (e.g., country, product name, status). These appear in GROUP BY clauses.
  • Measure: Columns used for aggregation (e.g., amount, quantity, price). These appear inside SUM, AVG, COUNT, and other aggregate functions.
  • Key: Columns used for joining entities (e.g., customer_id, order_id). These are used in ON clauses when the AI writes JOINs.

Defining Relationships

Relationships connect entities to enable multi-table queries. Each relationship requires:
  • From entity and column - the foreign key side (e.g., Orders.customer_id)
  • To entity and column - the primary key side (e.g., Customers.id)
  • Cardinality - the type of relationship
CardinalityExample
One-to-oneA customer has one profile
One-to-manyA customer has many orders
Many-to-oneMany orders belong to one customer
Many-to-manyProducts belong to many categories, and categories contain many products

Relationship Validation

After defining a relationship, click Validate to sample your actual data and verify the relationship integrity. The validation report includes:
  • Match rate - percentage of foreign key values that found a match in the target entity
  • Matched count - number of successfully joined rows
  • Orphaned count - foreign key values with no matching primary key
  • Null count - rows where the join column is null
  • Sample matches - a preview of successfully joined rows
  • Semantic warnings - potential issues like low match rates or high null counts
Validation status is tracked as one of:
StatusMeaning
pendingNot yet validated
validHigh match rate, relationship is reliable
warningSome issues detected (e.g., moderate orphaned records)
invalidLow match rate or significant data quality issues
Relationships with invalid status may cause the AI to generate queries that return incomplete or incorrect results. Fix the underlying data quality issue or adjust the relationship definition before using the model.

Creating Measures

Measures are reusable aggregation formulas that ensure consistent metric definitions across all queries, dashboards, and reports.

Simple Measures

Define a single aggregation expression:
  • SUM(amount) - Total Revenue
  • COUNT(DISTINCT customer_id) - Unique Customers
  • AVG(price) - Average Price
  • MIN(created_at) - First Order Date
  • MAX(updated_at) - Last Update
Each measure has a data type (number, currency, or percent) and an optional display format to control how values appear in results.

Compound Measures

Compound measures reference other measures by their IDs through dependent measure references. This lets you build derived calculations:
  • Average Order Value = Total Revenue / Total Orders
  • Profit Margin = (Total Revenue - Total Cost) / Total Revenue
  • Conversion Rate = Completed Orders / Total Sessions
Define measures for any metric your team asks about frequently. When everyone uses the same “Total Revenue” measure, you avoid discrepancies caused by different people writing slightly different aggregation queries.

Best Practices

The AI relies on column descriptions to understand your data. Vague names like type or value need clear descriptions. Include possible values for categorical columns (e.g., “Payment method: credit_card, debit_card, paypal, wire_transfer”).
Mark implementation-specific columns (internal IDs, audit fields, denormalized caches) as hidden. This reduces noise in AI suggestions without removing the columns from the model.
If your tables store semi-structured data in JSON columns, create virtual columns that extract the commonly queried fields. This makes them available as first-class dimensions and measures.
Always run validation after defining a relationship. A relationship with a low match rate or high orphan count will lead to incorrect query results. Fix data quality issues at the source before relying on the relationship.
Create separate semantic models for distinct domains (e.g., “Sales Analytics” and “Product Usage”) rather than one large model with every table. This improves AI accuracy by reducing the search space.