Data Diagnostics

Dataface includes a diagnostics tool (face diagnose) that automatically profiles and visualizes tables, schemas, or dbt models. It helps you understand a dataset's structure, quality, and characteristics before you start building dashboards.

Note: The face diagnose command is currently in development. See the CLI Reference for available commands and plans/features/CLI_DIAGNOSTICS.md for the planned implementation.

Overview

The diagnostics feature generates visual dashboards that show:

  • Column Profiles: Null counts, distinct values, and data types.
  • Distributions: Histograms and frequency charts.
  • Data Quality: Identification of empty columns, high null rates, and potential issues.
  • Sample Data: Interactive previews of actual data rows.

Usage

Single Table Diagnostics

Profile a specific table in your warehouse:

face diagnose --table analytics.orders

Or profile a dbt model by name:

face diagnose --model orders

Output: Generates a YAML dashboard file (e.g., diagnostics/analytics_orders.yml) that you can view immediately in Dataface.

Schema-Level Diagnostics

Profile all tables in a specific schema:

face diagnose --schema analytics

Preview Mode

View a quick summary directly in your terminal without generating a file:

face diagnose --table analytics.orders --preview terminal

Diagnostic Metrics

The generated dashboard is organized into three sections:

1. Overview

  • Row and column counts
  • Table size
  • Summary of data quality alerts
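To make the idea of a data quality alert concrete, here is a minimal Python sketch of how empty columns and high null rates might be flagged. It is an illustration only, not Dataface's implementation: the function name `quality_alerts`, the 50% threshold, and the sample data are all assumptions made for this example.

```python
def quality_alerts(columns, null_rate_threshold=0.5):
    """Return (column, message) pairs for columns that look suspicious.

    `columns` maps a column name to its list of values, with None as null.
    The threshold is a hypothetical default, not a documented Dataface setting.
    """
    alerts = []
    for name, values in columns.items():
        total = len(values)
        nulls = sum(1 for v in values if v is None)
        if total == 0 or nulls == total:
            # Every value is null (or there are no rows at all).
            alerts.append((name, "empty column"))
        elif nulls / total > null_rate_threshold:
            alerts.append((name, f"high null rate ({nulls / total:.0%})"))
    return alerts


# Hypothetical sample data for illustration.
sample = {
    "order_id": [1, 2, 3, 4],
    "coupon_code": [None, None, None, "SAVE10"],
    "legacy_flag": [None, None, None, None],
}
alerts = quality_alerts(sample)
```

In this sample, `coupon_code` is flagged for a high null rate and `legacy_flag` is flagged as empty, while `order_id` produces no alert.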

2. Column Statistics

For each column, Dataface calculates:

  • Completeness: Percentage of non-null values.
  • Uniqueness: Count of distinct values.
  • Range: Min, max, mean, and median (for numeric fields).
  • Patterns: Detection of common formats (email, dates, etc.).
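The column statistics can be illustrated with a short Python sketch using only the standard library. This is an assumption-laden example rather than Dataface's actual code: `profile_column` is a hypothetical helper, nulls are represented as `None`, and distinct values are counted over non-null values only. Pattern detection is left out.

```python
from statistics import mean, median


def profile_column(values):
    """Compute illustrative profile statistics for one column (None = null)."""
    non_null = [v for v in values if v is not None]
    total = len(values)
    stats = {
        # Completeness: percentage of non-null values.
        "completeness_pct": 100.0 * len(non_null) / total if total else 0.0,
        # Uniqueness: count of distinct non-null values (an assumption).
        "distinct_count": len(set(non_null)),
    }
    # Range statistics apply only when every non-null value is numeric.
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if non_null and len(numeric) == len(non_null):
        stats.update(
            min=min(numeric),
            max=max(numeric),
            mean=mean(numeric),
            median=median(numeric),
        )
    return stats


# Hypothetical order totals with one missing value.
order_totals = [10.0, 20.0, None, 20.0, 50.0]
profile = profile_column(order_totals)
```

For the sample column, four of five values are present, so completeness is 80%, and there are three distinct values. A text column would receive only the completeness and distinct-count entries.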

3. Visualizations

  • Histograms: For numeric distributions.
  • Bar Charts: For categorical value frequencies.
  • Timelines: For date/time field distributions.
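The mapping from column type to chart type listed above can be sketched as a small Python function. The name `suggest_chart` and the returned labels are hypothetical and chosen for this example; how Dataface actually selects a visualization is not specified here.

```python
from datetime import date, datetime


def suggest_chart(values):
    """Pick a chart type from the non-null values of a column.

    Returns None when the column has no non-null values to plot.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None
    # datetime is a subclass of date, so this covers both.
    if all(isinstance(v, (date, datetime)) for v in non_null):
        return "timeline"
    # bool is a subclass of int, so exclude it from numeric columns.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
        return "histogram"
    # Everything else is treated as categorical.
    return "bar chart"


numeric_chart = suggest_chart([1.5, 2, None])
categorical_chart = suggest_chart(["shipped", "pending", "shipped"])
date_chart = suggest_chart([date(2024, 1, 1), date(2024, 1, 2)])
```

Treating any non-numeric, non-date column as categorical is a simplification; a high-cardinality text column would likely need different handling.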

Why Use Diagnostics?

  • Speed: Understand a new dataset quickly, without writing dozens of SQL queries.
  • Quality Assurance: Catch data quality issues (like empty columns or unexpected nulls) early in the modeling process.
  • Documentation: The generated dashboards serve as visual documentation for your dbt models.