Tag: Copilot Data Preparation

  • How to Prepare Your Data Model for Copilot in Power BI: The Analyst’s Checklist

    The single biggest factor in Copilot in Power BI output quality is not the AI model — it is your data model. A well-structured data model with clear naming and rich descriptions produces accurate, useful Copilot responses. A poorly structured model produces hallucinated metrics, wrong aggregations, and confused narratives that erode trust in the tool before it has a chance to prove its value.

    This checklist covers every data model preparation step required before enabling Copilot on your Power BI workspaces. Complete these items and Copilot becomes a reliable analyst assistant. Skip them and you will spend more time correcting Copilot than doing the work yourself.

    Why Data Model Quality Determines Copilot Quality

    Copilot in Power BI reads your data model the way a new analyst reads your documentation. It uses table names, column names, measure descriptions, relationships, and data types to understand what your data represents and how to answer questions about it. If your model is ambiguous, Copilot’s answers will be ambiguous.

    The difference is stark. In a well-prepared model, asking Copilot “What was total revenue by region last quarter?” returns an accurate table with correct aggregations. In a poorly prepared model, the same question might aggregate the wrong column, use the wrong date table, or return a number that nobody recognizes because it summed a column that should have been averaged.

    The 10-Point Pre-Copilot Data Model Audit

    1. Validate Star Schema Structure

    Copilot works best with star schema models — a central fact table surrounded by dimension tables. Flat, denormalized tables with dozens of columns confuse Copilot because it cannot distinguish between attributes for grouping and values for aggregating.

    What to check: Identify your fact tables (transactions, events, measures) and dimension tables (products, customers, dates, regions). Every fact table should connect to dimension tables through foreign key relationships. If you have a single flat table with 50+ columns, refactor it into a proper star schema before enabling Copilot.

    2. Fix Table and Column Naming

    Copilot reads names literally. A column named “Amt” means nothing to the AI. A column named “Sales Amount” is immediately understood.

    Naming rules for Copilot:

    • Use full, descriptive names: “Customer Name” not “CustNm”
    • Use spaces in display names, not underscores or camelCase
    • Prefix fact table columns with the metric type: “Total Sales,” “Count of Orders,” “Average Deal Size”
    • Name dimension tables as nouns: “Customers,” “Products,” “Dates”
    • Avoid abbreviations that are not universally known in your organization

    Renaming columns in Power BI Desktop does not affect your source queries — it only changes the display name in the model. This is a low-risk, high-impact change.

    3. Write Measure Descriptions

    This is the single highest-impact preparation step. Measure descriptions are natural language explanations of what each measure calculates, and Copilot uses them directly to understand your metrics.

    Where to add them: In Power BI Desktop, select a measure in the model view, then look at the Properties pane. The Description field accepts free text up to 500 characters.

    How to write them:

    • Bad: “Revenue” (just the name repeated)
    • Better: “Total revenue from all product sales”
    • Best: “Sum of net revenue from completed product sales, excluding returns and cancellations. Calculated from the Sales Amount column in the Sales fact table, filtered to Order Status = Completed. Currency: USD.”

    A good description tells Copilot what the measure calculates, which columns and filters it uses, and any important context about units or exclusions. Write descriptions for every measure, not just the complex ones — even simple SUM measures benefit from descriptions that specify the business meaning.

    4. Define Relationships Correctly

    Copilot uses relationships to understand how tables connect. Ambiguous or missing relationships cause Copilot to either guess (often wrong) or fail to answer cross-table questions.

    What to check:

    • Every fact-to-dimension relationship should be many-to-one (many rows in the fact table to one row in the dimension)
    • Avoid bidirectional cross-filtering unless absolutely necessary — it confuses Copilot’s aggregation logic
    • Remove inactive relationships that serve no current purpose
    • Ensure every dimension table has a single primary key column with no duplicates
    • If you have role-playing dimensions (e.g., Order Date and Ship Date both pointing to a Date table), use relationship management to clarify which is active

    5. Set Correct Data Types

    Copilot uses data types to determine how to display and aggregate values. A date stored as text will not support time intelligence. A currency stored as a plain decimal will not format correctly in Copilot responses.

    Critical data type checks:

    • Dates must be Date or DateTime type (not text strings like “2026-01-15”)
    • Currency values should use the Currency/Fixed Decimal type
    • Percentages should be formatted as percentages in the model (not just decimals that happen to represent percentages)
    • Integer IDs should be Whole Number type, not text
    • Boolean flags should be True/False type, not 0/1 integers

    6. Create a Proper Date Table

    Copilot’s time intelligence capabilities depend entirely on having a proper date table marked as a date table in the model.

    Requirements:

    • A dedicated date dimension table (not just a date column in your fact table)
    • Marked as a date table in Power BI (Table tools → Mark as date table)
    • Contains a continuous date range with no gaps
    • Includes standard calendar hierarchy columns: Year, Quarter, Month, Week
    • If your business uses a fiscal calendar, include fiscal year, fiscal quarter, and fiscal month columns

    Without a proper date table, questions like “What was revenue last quarter?” or “Show me the year-over-year trend” will fail or return incorrect results.

    7. Configure Summarization Defaults

    Every numeric column in your model has a default summarization (Sum, Average, Count, Min, Max, None). Copilot uses these defaults when a user asks a question without specifying the aggregation type.

    Common mistakes:

    • ID columns defaulting to Sum (Copilot will sum customer IDs if asked about customers)
    • Price columns defaulting to Sum instead of Average
    • Quantity columns defaulting to Count instead of Sum

    Review every numeric column and set the default summarization to match the most common business use. Set ID columns and non-aggregatable numbers to “Don’t summarize.”

    8. Organize with Display Folders

    Display folders help Copilot understand which measures and columns belong together conceptually. A model with 200 measures in a flat list is harder for Copilot to navigate than one organized into folders like “Revenue Metrics,” “Customer Metrics,” and “Operational KPIs.”

    In Power BI Desktop, select measures or columns and set the Display Folder property in the Properties pane. Use a clear, descriptive folder hierarchy.

    9. Test with Row-Level Security

    If your model uses Row-Level Security (RLS), test Copilot responses under each RLS role. Copilot respects RLS filters, which means different users may get different answers to the same question. This is correct behavior but can be confusing if not anticipated.

    Key considerations:

    • Copilot responses are filtered by the current user’s RLS role — a regional manager asking about “total revenue” will see only their region’s revenue
    • Test edge cases: what happens when an RLS-filtered user asks about data outside their scope?
    • Document which RLS roles exist and how they affect Copilot responses

    10. Run a Copilot Smoke Test

    After completing items 1 through 9, enable Copilot on a test workspace and run a standard set of questions:

    1. “What was total [primary metric] last month?” — Tests basic aggregation and time intelligence
    2. “Show me [primary metric] by [top dimension]” — Tests cross-table relationships
    3. “Compare [metric A] and [metric B] over time” — Tests multi-measure queries
    4. “What is the trend in [metric] this year?” — Tests time intelligence and visualization
    5. “Summarize this report page” — Tests Copilot’s ability to read your visualizations

    If any of these return incorrect or confusing results, trace the issue back to one of the nine preparation items above. The fix is always in the model, never in the question.

    Common DAX Patterns That Affect Copilot

    Copilot generates and interprets DAX, so certain patterns in your existing measures affect how well Copilot can work with your model.

    Patterns Copilot handles well: CALCULATE with simple filters, SUMX and AVERAGEX over a single table, basic time intelligence (SAMEPERIODLASTYEAR, DATEADD), DIVIDE for safe division.

    Patterns that confuse Copilot: Nested CALCULATE with complex filter context, CROSSFILTER modifications, dynamic security patterns, measures that reference other measures through multiple levels of indirection.

    If you have complex measures, write descriptions that explain what they calculate in plain language. Copilot may not be able to generate equivalent DAX, but it can reference the existing measure correctly if the description is clear.

    Performance Considerations

    Copilot adds query load to your capacity. Each Copilot interaction generates one or more DAX queries against your model. For large models (over 10 GB or 100 million rows), consider these adjustments:

    • Enable aggregations for large fact tables to speed up common query patterns
    • Use composite models strategically — DirectQuery tables add latency to Copilot responses
    • Monitor capacity utilization after enabling Copilot to ensure query performance remains acceptable
    • Set appropriate query timeout limits in the workspace settings

    Frequently Asked Questions

    What is the most important step to prepare a data model for Copilot in Power BI?

    Writing measure descriptions is the single highest-impact preparation step. Copilot uses measure descriptions to understand what each metric calculates, which directly determines the accuracy of its responses to natural language questions.

    Does Copilot work with flat table data models?

    Copilot works best with star schema models. Flat, denormalized tables with many columns make it difficult for Copilot to distinguish between attributes for grouping and values for aggregating, leading to inaccurate responses.

    How do column names affect Copilot in Power BI?

    Copilot reads column names literally to understand your data. Abbreviated names like “CustNm” or “Amt” confuse the AI, while descriptive names like “Customer Name” and “Sales Amount” produce accurate responses. Renaming columns in Power BI Desktop is low-risk as it only changes the display name.

    Does row-level security affect Copilot responses?

    Yes. Copilot respects row-level security filters, so different users may receive different answers to the same question based on their RLS role. A regional manager asking about total revenue will see only their region’s data. Test Copilot under each RLS role before deployment.

    What data types should I set for Copilot compatibility?

    Dates must be Date or DateTime type (not text strings), currency values should use the Currency/Fixed Decimal type, percentages should be formatted as percentages in the model, and ID columns should be set to “Don’t summarize” to prevent Copilot from aggregating them.