SQL Queries

Testing

CLI based Ingestion

Install the Plugin

pip install 'acryl-datahub[sql-queries]'

Config Details

A dot (.) in a field name denotes a nested field in the YAML recipe; for example, usage.top_n_queries is the top_n_queries key under usage.
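To make the dotted notation concrete, here is a minimal sketch that expands dotted field names into the nested structure a YAML recipe would carry. The helper `expand_dotted` is hypothetical and not part of DataHub; the file path and the deny pattern are placeholder values chosen for illustration only.

```python
from typing import Any, Dict


def expand_dotted(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {'a.b': 1} into {'a': {'b': 1}}.

    Hypothetical helper for illustration; it is not a DataHub API.
    """
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        # Walk (or create) one dict level per dotted prefix.
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Field names as they appear in the config reference (values are placeholders).
flat_config = {
    "platform": "snowflake",
    "query_file": "./queries.json",  # placeholder path, not from the docs
    "usage.bucket_duration": "DAY",
    "usage.top_n_queries": 10,
    "usage.user_email_pattern.deny": [r".*@example\.com"],  # placeholder pattern
}

config = expand_dotted(flat_config)
```

The resulting `config` dict has `usage` as a nested mapping, with `user_email_pattern` nested one level further, which mirrors how the same keys would be indented in the YAML recipe.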

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| dialect | string | The SQL dialect of the queries, e.g. snowflake | |
| platform | string | The platform for which to generate data, e.g. snowflake | |
| query_file | string | Path to file to ingest | |
| default_db | string | The default database to use for unqualified table names | |
| default_schema | string | The default schema to use for unqualified table names | |
| platform_instance | string | The instance of the platform that all assets produced by this recipe belong to | |
| env | string | The environment that all assets produced by this connector belong to | PROD |
| usage | BaseUsageConfig | The usage config to use when generating usage statistics | {'bucket_duration': 'DAY', 'end_time': '2023-07-25... |
| usage.bucket_duration | Enum | Size of the time window to aggregate usage stats. | DAY |
| usage.end_time | string(date-time) | Latest date of usage to consider. | Current time in UTC |
| usage.format_sql_queries | boolean | Whether to format SQL queries. | False |
| usage.include_operational_stats | boolean | Whether to display operational stats. | True |
| usage.include_read_operational_stats | boolean | Whether to report read operational stats. Experimental. | False |
| usage.include_top_n_queries | boolean | Whether to ingest the top_n_queries. | True |
| usage.start_time | string(date-time) | Earliest date of usage to consider. | Last full day in UTC (or hour, depending on bucket_duration) |
| usage.top_n_queries | integer | Number of top queries to save to each table. | 10 |
| usage.user_email_pattern | AllowDenyPattern | Regex patterns for user emails to filter in usage. | {'allow': ['.*'], 'deny': [], 'ignoreCase': True} |
| usage.user_email_pattern.allow | array(string) | | |
| usage.user_email_pattern.deny | array(string) | | |
| usage.user_email_pattern.ignoreCase | boolean | Whether to ignore case sensitivity during pattern matching. | True |
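The allow, deny, and ignoreCase fields of usage.user_email_pattern can be read as a simple filter over user emails. The sketch below is an approximation under two stated assumptions: a deny match takes precedence over an allow match, and patterns are matched from the start of the string. The function `email_allowed` is hypothetical, and the actual AllowDenyPattern behavior in DataHub may differ in details.

```python
import re
from typing import List


def email_allowed(
    email: str,
    allow: List[str],
    deny: List[str],
    ignore_case: bool = True,
) -> bool:
    """Approximate an allow/deny regex filter (hypothetical, not DataHub's code)."""
    flags = re.IGNORECASE if ignore_case else 0
    # Assumption: any deny match rejects the email outright.
    if any(re.match(pattern, email, flags) for pattern in deny):
        return False
    # Otherwise the email must match at least one allow pattern.
    return any(re.match(pattern, email, flags) for pattern in allow)


# The documented defaults: allow everything, deny nothing, ignore case.
default_result = email_allowed("Alice@Corp.com", allow=[".*"], deny=[])

# A deny pattern filters out a matching address, regardless of letter case.
denied_result = email_allowed("BOT@corp.com", allow=[".*"], deny=["bot@.*"])

# With case sensitivity on, the same address no longer matches the deny pattern.
case_sensitive_result = email_allowed(
    "BOT@corp.com", allow=[".*"], deny=["bot@.*"], ignore_case=False
)
```

With the default pattern every user email passes, so in practice only the deny list (or a narrower allow list) changes which users appear in usage statistics.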

Code Coordinates

  • Class Name: datahub.ingestion.source.sql_queries.SqlQueriesSource
  • Browse on GitHub

Questions

If you've got any questions on configuring ingestion for SQL Queries, feel free to ping us on our Slack.