Metadata scanning is a feature of Microsoft Fabric that allows you to quickly catalog and report on all the metadata of your organization’s Fabric items, such as lakehouses, warehouses, notebooks, SQL scripts, and more. Metadata scanning uses a set of Admin REST APIs that are collectively known as the scanner APIs. With the scanner APIs, you can extract information such as item name, owner, sensitivity label, endorsement status, and last refresh. For datasets (semantic models), you can also extract the metadata of some of the objects they contain, such as table and column names, measures, DAX expressions, mashup queries, and so forth. The metadata of these internal objects is referred to as subartifact metadata. For a more extensive list of the item and subartifact metadata that metadata scanning returns, see the documentation for the Admin – WorkspaceInfo GetScanResult API.
Metadata scanning can help you with various governance scenarios, such as:
- Data discovery and cataloging: You can use metadata scanning to discover and catalog all the data assets in your organization and their properties. You can also use tools such as Azure Purview or Azure Data Catalog to integrate with metadata scanning and provide a rich data governance experience.
- Data quality and compliance: You can use metadata scanning to monitor and validate the quality and compliance of your data assets. You can also use tools such as Azure Data Factory or Databricks to perform data quality checks or remediation actions based on the metadata scanning results.
- Data lineage and impact analysis: You can use metadata scanning to track and analyze the lineage and impact of your data assets. You can also use tools such as Azure Synapse Analytics or Power BI to visualize and explore the data lineage and impact using the metadata scanning results.
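As an illustration of the cataloging scenario, the sketch below flattens a scan result into one row per dataset column. The field names used here (workspaces, datasets, tables, columns, configuredBy) and the SAMPLE payload are assumptions about the general shape of a GetScanResult response, not a verified schema; check them against the API documentation before relying on them.

```python
from typing import Any, Dict, Iterator


def iter_catalog_rows(scan_result: Dict[str, Any]) -> Iterator[Dict[str, str]]:
    """Yield one catalog row per dataset column found in a scan result."""
    for workspace in scan_result.get("workspaces", []):
        for dataset in workspace.get("datasets", []):
            for table in dataset.get("tables", []):
                for column in table.get("columns", []):
                    yield {
                        "workspace": workspace.get("name", ""),
                        "dataset": dataset.get("name", ""),
                        "owner": dataset.get("configuredBy", ""),
                        "table": table.get("name", ""),
                        "column": column.get("name", ""),
                    }


# Hypothetical payload, hand-written for illustration only.
SAMPLE = {
    "workspaces": [
        {
            "name": "Sales",
            "datasets": [
                {
                    "name": "Orders model",
                    "configuredBy": "owner@example.com",
                    "tables": [
                        {
                            "name": "Orders",
                            "columns": [{"name": "OrderId"}, {"name": "Amount"}],
                        }
                    ],
                }
            ],
        }
    ]
}

rows = list(iter_catalog_rows(SAMPLE))
for row in rows:
    print(row)
```

Rows in this flat form can be written to a CSV file or loaded into whichever catalog tool you use. Tables and columns only appear in the result when the detailed metadata tenant setting described later in this article is enabled.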
Before you can use metadata scanning over your organization’s Fabric workspaces, a Fabric administrator needs to set it up. Setting up metadata scanning involves two steps:
- Enabling service principal authentication for read-only admin APIs.
- Enabling tenant settings for detailed dataset metadata scanning.
Enabling service principal authentication for read-only admin APIs
A service principal is an authentication method that lets an Azure AD application access Fabric APIs. With this authentication method, you don’t have to maintain a service account with an admin role. Instead, to allow your app to use the Admin APIs, you give your approval once as part of the tenant settings configuration.
To enable service principal access to read-only Admin APIs, follow these steps:
- Register an Azure AD application in your tenant using the Azure portal or PowerShell. For more information, see Register an application with Azure AD.
- Assign the application permissions for Fabric.Read.All using the Azure portal or PowerShell. For more information, see Add permissions.
- Grant admin consent for Fabric.Read.All using the Azure portal or PowerShell. For more information, see Grant tenant-wide admin consent.
- Go to Fabric portal > Settings > Admin portal > Tenant settings > Admin API settings.
- Enable Allow service principals to use read-only admin APIs.
- Enter the application ID of your Azure AD application in the Service principal IDs box.
If you don’t want to enable service principal authentication, metadata scanning can be performed with standard delegated admin access token authentication.
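Once the service principal is enabled, the application typically obtains a token with the OAuth 2.0 client credentials flow. The sketch below only builds the token request and does not send it. The scope value assumes that the scanner APIs are served by the Power BI service resource, and the tenant ID, application ID, and secret are placeholders; treat all of them as assumptions to confirm for your own tenant.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Assumption: the scanner APIs accept tokens issued for the Power BI resource.
ASSUMED_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": ASSUMED_SCOPE,
    }
    return Request(
        url,
        data=urlencode(form).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values; real ones come from your app registration.
example = build_token_request("contoso-tenant-id", "my-app-id", "my-secret")
print(example.get_method(), example.full_url)
print(sorted(parse_qs(example.data.decode("utf-8"))))
```

Passing the request to urllib.request.urlopen should return a JSON document whose access_token field is then sent as a Bearer token in the Authorization header of each scanner API call. In practice, keep the client secret in a secret store rather than in source code.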
Enabling tenant settings for metadata scanning
Two tenant settings control metadata scanning:
- Enhance admin APIs responses with detailed metadata: This setting turns on Model caching and enhances API responses with low-level dataset metadata (for example, name and description) for tables, columns, and measures.
- Enhance admin APIs responses with DAX and mashup expressions: This setting allows the API response to include DAX expressions and Mashup queries. This setting can only be enabled if the first setting is also enabled.
To enable these settings, go to Fabric portal > Settings > Admin portal > Tenant settings > Admin API settings.
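On the calling side, these two settings correspond to optional flags on the scan request. The sketch below builds a getInfo URL under the assumption that the query parameters are named lineage, datasourceDetails, datasetSchema, and datasetExpressions, and that the scanner APIs live under the api.powerbi.com base URL shown. The check that refuses expressions without detailed metadata is a choice made in this sketch to mirror the tenant setting dependency, not documented API behavior.

```python
from urllib.parse import urlencode

# Assumed base URL for the scanner APIs.
SCANNER_BASE = "https://api.powerbi.com/v1.0/myorg/admin/workspaces"


def build_get_info_url(
    detailed_metadata: bool,
    expressions: bool,
    lineage: bool = True,
    datasource_details: bool = True,
) -> str:
    """Build the URL for a scan request with the chosen detail flags."""
    if expressions and not detailed_metadata:
        # Mirrors the tenant settings: the second depends on the first.
        raise ValueError("expressions require detailed metadata to be requested")
    params = {
        "lineage": lineage,
        "datasourceDetails": datasource_details,
        "datasetSchema": detailed_metadata,
        "datasetExpressions": expressions,
    }
    query = urlencode({key: str(value) for key, value in params.items()})
    return f"{SCANNER_BASE}/getInfo?{query}"


url = build_get_info_url(detailed_metadata=True, expressions=False)
print(url)
```

Requesting a flag whose tenant setting is disabled does not necessarily fail; the corresponding metadata may simply be missing from the result, so it is worth confirming the settings with your Fabric admin.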
Model caching
Enhanced metadata scanning uses a caching mechanism so that capacity resources are not impacted. Getting low-level metadata requires that the model be available in memory. To avoid loading the model for every API call on Fabric shared or Premium capacity, the feature takes advantage of successful dataset refreshes and republishes: the model is already loaded into memory on those occasions, so a cache of it is created then. When enhanced metadata scanning takes place, API calls are made against the cached model, so the scan itself places no load on your capacity resources.
Caching happens on every successful dataset refresh and republish, but only if the following conditions are met:
- The Enhance admin APIs responses with detailed metadata tenant setting is enabled (see Enabling tenant settings for metadata scanning).
- There has been a call to the scanner APIs within the last 90 days.
If some metadata you expected to receive is not returned, check with your Fabric admin to make sure they have enabled all relevant admin switches.
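The two caching conditions can be expressed as a small check, which may help when diagnosing why detailed metadata is missing. This is only a model of the rule as stated above; the function name is hypothetical, and treating exactly 90 days as still inside the window is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SCANNER_CALL_WINDOW = timedelta(days=90)


def model_will_be_cached(
    detailed_metadata_enabled: bool,
    last_scanner_call: Optional[datetime],
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the next successful refresh or republish should cache the model."""
    if now is None:
        now = datetime.now(timezone.utc)
    if not detailed_metadata_enabled or last_scanner_call is None:
        return False
    # Assumption: a call exactly 90 days ago still counts as "within" the window.
    return now - last_scanner_call <= SCANNER_CALL_WINDOW


today = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(model_will_be_cached(True, today - timedelta(days=30), now=today))
print(model_will_be_cached(True, today - timedelta(days=120), now=today))
```

One practical consequence of the rule is that a tenant scanned for the first time, or after a long pause, may not return detailed metadata for a dataset until that dataset is next refreshed or republished.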
Next steps
After you have set up metadata scanning in your organization, you can use the scanner APIs to retrieve metadata from your organization’s items. For more information, see [Metadata scanning].
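A scan usually runs in three stages: request a scan for a set of workspaces, poll until it finishes, then fetch the result. The sketch below shows that loop with the HTTP transport passed in as a function so that it can be tried without network access. The endpoint paths (getInfo, scanStatus, scanResult), the "Succeeded" status value, and the limit of 100 workspaces per request are assumptions about the scanner APIs to verify against the reference documentation.

```python
import time
from typing import Any, Callable, Dict, Optional, Sequence

# Assumed base URL for the scanner APIs.
BASE = "https://api.powerbi.com/v1.0/myorg/admin/workspaces"

# Transport signature: (method, url, json_body) -> parsed JSON response.
HttpCall = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def run_scan(
    workspace_ids: Sequence[str],
    http: HttpCall,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a scan, wait for it to succeed, and return the scan result."""
    if not 1 <= len(workspace_ids) <= 100:
        # Assumption: one request accepts between 1 and 100 workspace IDs.
        raise ValueError("pass between 1 and 100 workspace IDs per scan")

    scan = http("POST", f"{BASE}/getInfo", {"workspaces": list(workspace_ids)})
    scan_id = scan["id"]

    for _ in range(max_polls):
        status = http("GET", f"{BASE}/scanStatus/{scan_id}", None)["status"]
        if status == "Succeeded":
            return http("GET", f"{BASE}/scanResult/{scan_id}", None)
        sleep(poll_seconds)

    raise TimeoutError(f"scan {scan_id} did not finish after {max_polls} polls")
```

A real transport would add the Authorization header with the access token and raise on HTTP errors. For tenants with more than 100 workspaces, split the workspace IDs into batches and call run_scan once per batch.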
You can also use tools such as Azure Purview or Azure Data Catalog to integrate with metadata scanning and provide a rich data governance experience. For more information, see [Integrate Azure Purview with Microsoft Fabric] and [Integrate Azure Data Catalog with Microsoft Fabric].
Conclusion
Metadata scanning is a feature of Microsoft Fabric that allows you to quickly catalog and report on all the metadata of your organization’s Fabric items. It can help you with various governance scenarios, such as data discovery, data quality, and data lineage.
However, before you can use metadata scanning, a Fabric administrator needs to set it up. Setup involves enabling service principal authentication for read-only admin APIs and enabling the tenant settings for detailed dataset metadata scanning.