Merged
39 commits
5591965
Merge branch 'main' into 'enterprise'
aarthy-dk Aug 13, 2025
7208e8a
misc(connection): trim leading and trailing whitespaces
luis-dk Aug 12, 2025
f171aa3
Merge branch 'conn-details-trim' into 'enterprise'
Aug 13, 2025
bad261e
fix(test definitions): bugs in validate test query
aarthy-dk Aug 13, 2025
6ee7e37
Merge branch 'aarthy/validate-test' into 'enterprise'
Aug 13, 2025
3620af0
fix: display a show/hide icon for password fields
luis-dk Aug 12, 2025
40546bc
Merge branch 'password-fields' into 'enterprise'
Aug 14, 2025
603828c
feat: add partial name filter to table group list
luis-dk Aug 12, 2025
0923d65
refactor(ui): extend input component to support hidding password sugg…
luis-dk Aug 14, 2025
d3bb59a
fix(table groups) remove non-intended gap from fitlters section
luis-dk Aug 14, 2025
83cedfc
Merge branch 'table-groups-filter' into 'enterprise'
Aug 14, 2025
6a54f2e
fix(runs): duration display incorrect when > 24 hours
aarthy-dk Aug 14, 2025
defb105
Merge branch 'aarthy/run-duration' into 'enterprise'
Aug 14, 2025
f2dfe40
misc: Consolidate demo into a single quick-start call.
diogodk Aug 8, 2025
f968a2f
Merge branch 'TG-921' into 'enterprise'
Aug 15, 2025
d4d198f
feat(forms): improve field validation
aarthy-dk Aug 19, 2025
ab190bd
feat(radio-group): support vertical layout
aarthy-dk Aug 19, 2025
a5e2a06
fix(styles): misc css improvements
aarthy-dk Aug 19, 2025
b5757d4
refactor(auth): implement as plugin class
aarthy-dk Aug 19, 2025
1a55681
refactor: remove dead code
aarthy-dk Aug 19, 2025
61e2719
Merge branch 'aarthy/auth-plugin' into 'enterprise'
Aug 19, 2025
e4fbf51
Initial history update/Stale_Table test
cbloche Jul 28, 2025
51dba8c
Test logic fixes
cbloche Jul 30, 2025
b7f1378
Test SQL tweaks
cbloche Jul 30, 2025
35120c4
refactor: database service and param replace for process execution
aarthy-dk Jul 18, 2025
2cdffe6
feat: add history related fields to test definitions
luis-dk Aug 5, 2025
614747e
feat(tests): add log severity
luis-dk Aug 11, 2025
bc83dcc
feat(tests): customize visualization for stale table test type
luis-dk Aug 15, 2025
3b3d3f3
fix(tests): add missing result_signal column
luis-dk Aug 18, 2025
80c644b
fix(tests): add flavor specific stale table test generation
luis-dk Aug 19, 2025
2b56a40
Merge branch 'luis/history_fingerprint_tests' into 'enterprise'
Aug 21, 2025
b76bc22
fix(test definitions): id missing for user-defined tests
aarthy-dk Aug 21, 2025
44e6bd7
Merge branch 'aarthy/test-definition-id' into 'enterprise'
Aug 22, 2025
49ac837
fix(test types): rename "stale table" to "table freshness"
aarthy-dk Aug 25, 2025
f9e7476
fix(test definitions): remove unnecessary validation
aarthy-dk Aug 25, 2025
cddaf91
fix(test results): make binary bars wider
aarthy-dk Aug 25, 2025
1783886
fix(quick-start): adjust dates to 30 days ago
aarthy-dk Aug 25, 2025
98daf49
Merge branch 'aarthy/qa-fixes' into 'enterprise'
Aug 25, 2025
eaeb1f7
release: 4.20.4 -> 4.22.2
aarthy-dk Aug 25, 2025
6 changes: 1 addition & 5 deletions README.md
@@ -168,11 +168,7 @@ Verify that you can login to the UI with the `TESTGEN_USERNAME` and `TESTGEN_PAS
 The [Data Observability quickstart](https://docs.datakitchen.io/articles/open-source-data-observability/data-observability-overview) walks you through DataOps Data Quality TestGen capabilities to demonstrate how it covers critical use cases for data and analytic teams.

 ```shell
-testgen quick-start --delete-target-db
-testgen run-profile --table-group-id 0ea85e17-acbe-47fe-8394-9970725ad37d
-testgen run-test-generation --table-group-id 0ea85e17-acbe-47fe-8394-9970725ad37d
-testgen run-tests --project-key DEFAULT --test-suite-key default-suite-1
-testgen quick-start --simulate-fast-forward
+testgen quick-start
 ```

 In the TestGen UI, you will see that new data profiling and test results have been generated.
6 changes: 1 addition & 5 deletions docs/local_development.md
@@ -93,11 +93,7 @@ testgen setup-system-db --yes

 Seed the demo data.
 ```shell
-testgen quick-start --delete-target-db
-testgen run-profile --table-group-id 0ea85e17-acbe-47fe-8394-9970725ad37d
-testgen run-test-generation --table-group-id 0ea85e17-acbe-47fe-8394-9970725ad37d
-testgen run-tests --project-key DEFAULT --test-suite-key default-suite-1
-testgen quick-start --simulate-fast-forward
+testgen quick-start
 ```

 ### Run the Application
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -8,7 +8,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "dataops-testgen"
-version = "4.20.4"
+version = "4.22.2"
 description = "DataKitchen's Data Quality DataOps TestGen"
 authors = [
     { "name" = "DataKitchen, Inc.", "email" = "info@datakitchen.io" },
71 changes: 30 additions & 41 deletions testgen/__main__.py
@@ -117,10 +117,9 @@ def cli(ctx: Context, verbose: bool):
 @click.option(
     "-tg",
     "--table-group-id",
-    required=False,
+    required=True,
     type=click.STRING,
     help="The identifier for the table group used during a profile run. Use a table_group_id shown in list-table-groups.",
-    default=None,
 )
 def run_profile(configuration: Configuration, table_group_id: str):
     click.echo(f"run-profile with table_group_id: {table_group_id}")
@@ -136,16 +135,15 @@ def run_profile(configuration: Configuration, table_group_id: str):
     "-tg",
     "--table-group-id",
     help="The identifier for the table group used during a profile run. Use a table_group_id shown in list-table-groups.",
-    required=False,
+    required=True,
     type=click.STRING,
-    default=None,
 )
 @click.option(
     "-ts",
     "--test-suite-key",
     help="The identifier for a test suite. Use a test_suite_key shown in list-test-suites.",
-    required=False,
-    default=settings.DEFAULT_TEST_SUITE_KEY,
+    required=True,
+    type=click.STRING,
 )
 @click.option(
     "-gs",
@@ -339,27 +337,6 @@ def list_test_runs(configuration: Configuration, project_key: str, test_suite_ke


 @cli.command("quick-start", help="Use to generate sample target database, for demo purposes.")
-@click.option(
-    "--delete-target-db",
-    help="Will delete the current target database, if it exists",
-    is_flag=True,
-    default=False,
-)
-@click.option(
-    "--iteration",
-    "-i",
-    default=0,
-    required=False,
-    help="The monthly data increment snapshot. Can be 0, 1, 2 or 3. 0 is the initial data.",
-)
-@click.option(
-    "--simulate-fast-forward",
-    "-s",
-    default=False,
-    is_flag=True,
-    required=False,
-    help="For demo purposes, simulates that some time pass by and the target data is changing. This will call the iterations in order.",
-)
 @click.option(
     "--observability-api-url",
     help="Observability API url to be able to export TestGen data to Observability using the command 'export-observability'",
@@ -375,11 +352,10 @@ def list_test_runs(configuration: Configuration, project_key: str, test_suite_ke
     default="",
 )
 @pass_configuration
+@click.pass_context
 def quick_start(
+    ctx: Context,
     configuration: Configuration,
-    delete_target_db: bool,
-    iteration: int,
-    simulate_fast_forward: bool,
     observability_api_url: str,
     observability_api_key: str,
 ):
@@ -388,19 +364,32 @@ def quick_start(
     if observability_api_key:
         settings.OBSERVABILITY_API_KEY = observability_api_key

-    # Check if this is an increment or the initial state
-    if iteration == 0 and not simulate_fast_forward:
-        click.echo("quick-start command")
-        run_quick_start(delete_target_db)
+    click.echo("quick-start command")
+    run_quick_start(delete_target_db=True)

+    click.echo("loading initial data")
+    run_quick_start_increment(0)
+    minutes_offset = -30*24*60 # 1 month ago
+    table_group_id="0ea85e17-acbe-47fe-8394-9970725ad37d"
+
-    if not simulate_fast_forward:
+    click.echo(f"run-profile with table_group_id: {table_group_id}")
+    spinner = None
+    if not configuration.verbose:
+        spinner = MoonSpinner("Processing ... ")
+    message = run_profiling_queries(table_group_id, spinner=spinner, minutes_offset=minutes_offset)
+    click.echo("\n" + message)
+
+    LOG.info(f"run-test-generation with table_group_id: {table_group_id} test_suite: {settings.DEFAULT_TEST_SUITE_KEY}")
+    message = run_test_gen_queries(table_group_id, settings.DEFAULT_TEST_SUITE_KEY)
+    click.echo("\n" + message)
+
+    run_execution_steps(settings.PROJECT_KEY, settings.DEFAULT_TEST_SUITE_KEY, minutes_offset=minutes_offset)
+
+    for iteration in range(1, 4):
+        click.echo(f"Running iteration: {iteration} / 3")
+        minutes_offset = -10*24*60 * (3-iteration)
         run_quick_start_increment(iteration)
-    else:
-        for iteration in range(1, 4):
-            click.echo(f"Running iteration: {iteration} / 3")
-            minutes_offset = 2 * iteration
-            run_quick_start_increment(iteration)
-            run_execution_steps(settings.PROJECT_KEY, settings.DEFAULT_TEST_SUITE_KEY, minutes_offset=minutes_offset)
+        run_execution_steps(settings.PROJECT_KEY, settings.DEFAULT_TEST_SUITE_KEY, minutes_offset=minutes_offset)

     click.echo("Quick start has successfully finished.")
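Review note: the backdating schedule in this hunk can be worked out by hand. The sketch below is an illustration only, not TestGen code. It assumes a hypothetical helper `quick_start_offsets` and a fixed `now` value, and reuses the two expressions from the diff (`-30*24*60` for the initial run, `-10*24*60 * (3-iteration)` for iterations 1 to 3).

```python
from datetime import datetime, timedelta

MINUTES_PER_DAY = 24 * 60


def quick_start_offsets() -> dict[str, int]:
    """Return the minute offset for each quick-start step (hypothetical helper)."""
    # Initial profiling, test generation and first test run: 30 days back.
    offsets = {"initial": -30 * MINUTES_PER_DAY}
    # Iterations 1..3 step forward in 10-day increments, ending at "now".
    for iteration in range(1, 4):
        offsets[f"iteration {iteration}"] = -10 * MINUTES_PER_DAY * (3 - iteration)
    return offsets


if __name__ == "__main__":
    now = datetime(2025, 8, 25, 12, 0, 0)  # fixed timestamp, assumed for the example
    for step, minutes in quick_start_offsets().items():
        stamp = (now + timedelta(minutes=minutes)).strftime("%Y-%m-%d %H:%M:%S")
        print(f"{step}: {minutes} minutes -> {stamp}")
```

Under these assumptions the four runs land 30, 20, 10 and 0 days before the current time, which matches the "adjust dates to 30 days ago" commit in the list above.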

8 changes: 5 additions & 3 deletions testgen/commands/queries/execute_tests_query.py
@@ -143,9 +143,11 @@ def GetTestsNonCAT(self) -> tuple[str, dict]:
             query = CleanSQL(query)
         return query, params

-    def AddTestRecordtoTestRunTable(self) -> tuple[str, dict]:
-        # Runs on App database
-        return self._get_query("ex_write_test_record_to_testrun_table.sql")
+    def GetHistoricThresholdUpdate(self) -> tuple[str, dict]:
+        query, params = self._get_query("ex_update_history_threshold_last_n.sql")
+        if self._use_clean:
+            query = CleanSQL(query)
+        return query, params

     def PushTestRunStatusUpdateSQL(self) -> tuple[str, dict]:
         # Runs on App database
37 changes: 31 additions & 6 deletions testgen/commands/queries/generate_tests_query.py
@@ -2,7 +2,8 @@
 from typing import ClassVar, TypedDict

 from testgen.common import CleanSQL, date_service, read_template_sql_file
-from testgen.common.database.database_service import get_queries_for_command, replace_params
+from testgen.common.database.database_service import replace_params
+from testgen.common.read_file import get_template_files

 LOG = logging.getLogger("testgen")

@@ -67,11 +68,35 @@ def GetTestTypesSQL(self) -> tuple[str, dict]:

     def GetTestDerivationQueriesAsList(self, template_directory: str) -> list[tuple[str, dict]]:
         # Runs on App database
-        params = self._get_params()
-        queries = get_queries_for_command(template_directory, params)
-        if self._use_clean:
-            queries = [ CleanSQL(query) for query in queries ]
-        return [ (query, params) for query in queries ]
+        generic_template_directory = template_directory
+        flavor_template_directory = f"flavors.{self.sql_flavor}.{template_directory}"
+
+        query_templates = {}
+        try:
+            for query_file in get_template_files(r"^.*sql$", generic_template_directory):
+                query_templates[query_file.name] = generic_template_directory
+        except:
+            LOG.debug(
+                f"query template '{generic_template_directory}' directory does not exist",
+                exc_info=True,
+                stack_info=True,
+            )
+
+        try:
+            for query_file in get_template_files(r"^.*sql$", flavor_template_directory):
+                query_templates[query_file.name] = flavor_template_directory
+        except:
+            LOG.debug(
+                f"query template '{generic_template_directory}' directory does not exist",
+                exc_info=True,
+                stack_info=True,
+            )
+
+        queries = []
+        for filename, sub_directory in query_templates.items():
+            queries.append(self._get_query(filename, sub_directory=sub_directory))
+
+        return queries

     def GetTestQueriesFromGenericFile(self) -> tuple[str, dict]:
         # Runs on App database
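Review note: the new `GetTestDerivationQueriesAsList` body collects generic templates first and flavor-specific templates second, keyed by filename, so a flavor file with the same name replaces the generic one. The sketch below shows only that override rule with plain lists. The function `merge_templates`, the file names and the directory names are hypothetical and do not come from the repository. Separately, the second `LOG.debug` call in the hunk appears to interpolate `generic_template_directory` even though the failed lookup is for the flavor directory.

```python
def merge_templates(
    generic: list[str],
    flavor: list[str],
    generic_dir: str,
    flavor_dir: str,
) -> dict[str, str]:
    """Map each template filename to the directory it should be read from."""
    chosen: dict[str, str] = {}
    for name in generic:
        chosen[name] = generic_dir
    for name in flavor:
        # Same filename: the flavor-specific directory overrides the generic one.
        chosen[name] = flavor_dir
    return chosen


if __name__ == "__main__":
    result = merge_templates(
        generic=["gen_a.sql", "gen_b.sql"],  # hypothetical file names
        flavor=["gen_b.sql"],
        generic_dir="gen_tests",  # hypothetical directory names
        flavor_dir="flavors.postgresql.gen_tests",
    )
    for filename, directory in result.items():
        print(f"{directory}/{filename}")
```

Because a Python `dict` keeps insertion order, an overridden file stays at the position where the generic file was first seen.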
10 changes: 6 additions & 4 deletions testgen/commands/queries/profiling_query.py
@@ -51,16 +51,18 @@ class CProfilingSQL:
     contingency_columns = ""

     exception_message = ""
+    minutes_offset = 0

     _data_chars_sql: CRefreshDataCharsSQL = None
     _rollup_scores_sql: CRollupScoresSQL = None

-    def __init__(self, strProjectCode, flavor):
+    def __init__(self, strProjectCode, flavor, minutes_offset=0):
         self.flavor = flavor
         self.project_code = strProjectCode
         # Defaults
-        self.run_date = date_service.get_now_as_string()
-        self.today = date_service.get_now_as_string()
+        self.run_date = date_service.get_now_as_string_with_offset(minutes_offset)
+        self.today = date_service.get_now_as_string_with_offset(minutes_offset)
+        self.minutes_offset = minutes_offset

     def _get_data_chars_sql(self) -> CRefreshDataCharsSQL:
         if not self._data_chars_sql:
@@ -102,7 +104,7 @@ def _get_params(self) -> dict:
             "PROFILE_ID_COLUMN_MASK": self.profile_id_column_mask,
             "PROFILE_SK_COLUMN_MASK": self.profile_sk_column_mask,
             "START_TIME": self.today,
-            "NOW_TIMESTAMP": date_service.get_now_as_string(),
+            "NOW_TIMESTAMP": date_service.get_now_as_string_with_offset(minutes_offset=self.minutes_offset),
             "EXCEPTION_MESSAGE": self.exception_message,
             "SAMPLING_TABLE": self.sampling_table,
             "SAMPLE_SIZE": int(self.parm_sample_size),
6 changes: 5 additions & 1 deletion testgen/commands/run_execute_tests.py
@@ -68,6 +68,10 @@ def run_test_queries(
     clsExecute.process_id = process_service.get_current_process_id()

     try:
+        # Update Historic Test Thresholds
+        LOG.info("CurrentStep: Updating Historic Test Thresholds")
+        execute_db_queries([clsExecute.GetHistoricThresholdUpdate()])
+
         # Retrieve non-CAT Queries
         LOG.info("CurrentStep: Retrieve Non-CAT Queries")
         lstTestSet = fetch_dict_from_db(*clsExecute.GetTestsNonCAT())
@@ -123,7 +127,7 @@ def run_execution_steps_in_background(project_code, test_suite):
         empty_cache()
         background_thread = threading.Thread(
             target=run_execution_steps,
-            args=(project_code, test_suite, session.username),
+            args=(project_code, test_suite, session.auth.user_display),
         )
         background_thread.start()
     else:
6 changes: 3 additions & 3 deletions testgen/commands/run_profiling_bridge.py
@@ -211,7 +211,7 @@ def run_profiling_in_background(table_group_id):
         empty_cache()
         background_thread = threading.Thread(
             target=run_profiling_queries,
-            args=(table_group_id, session.username),
+            args=(table_group_id, session.auth.user_display),
         )
         background_thread.start()
     else:
@@ -221,7 +221,7 @@ def run_profiling_in_background(table_group_id):


 @with_database_session
-def run_profiling_queries(table_group_id: str, username: str | None = None, spinner: Spinner | None = None):
+def run_profiling_queries(table_group_id: str, username: str | None = None, spinner: Spinner | None = None, minutes_offset: int = 0):
     if table_group_id is None:
         raise ValueError("Table Group ID was not specified")

@@ -240,7 +240,7 @@ def run_profiling_queries(table_group_id: str, username: str | None = None, spin
     params = get_profiling_params(table_group_id)

     LOG.info("CurrentStep: Initializing Query Generator")
-    clsProfiling = CProfilingSQL(params["project_code"], connection.sql_flavor)
+    clsProfiling = CProfilingSQL(params["project_code"], connection.sql_flavor, minutes_offset=minutes_offset)

     # Set General Parms
     clsProfiling.table_groups_id = table_group_id
3 changes: 1 addition & 2 deletions testgen/common/date_service.py
@@ -17,8 +17,7 @@ def parse_now(value: str) -> datetime:

 def get_now_as_string_with_offset(minutes_offset):
     ret = datetime.utcnow()
-    if minutes_offset > 0:
-        ret = ret + timedelta(minutes=minutes_offset)
+    ret = ret + timedelta(minutes=minutes_offset)
     return ret.strftime("%Y-%m-%d %H:%M:%S")
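Review note: dropping the `minutes_offset > 0` guard is what lets the quick-start backdate runs, because negative offsets were previously ignored. The sketch below contrasts the two behaviours. It is not the TestGen function: the names `now_with_offset` and `now_with_offset_old` are hypothetical, and `now` is passed in as a parameter (an assumption made so the output is deterministic).

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def now_with_offset(minutes_offset: int, now: datetime) -> str:
    """Post-change logic: always apply the signed offset."""
    return (now + timedelta(minutes=minutes_offset)).strftime(TIMESTAMP_FORMAT)


def now_with_offset_old(minutes_offset: int, now: datetime) -> str:
    """Pre-change logic: zero and negative offsets leave the timestamp unchanged."""
    if minutes_offset > 0:
        now = now + timedelta(minutes=minutes_offset)
    return now.strftime(TIMESTAMP_FORMAT)


if __name__ == "__main__":
    fixed_now = datetime(2025, 8, 25, 12, 0, 0)  # assumed timestamp for the example
    thirty_days_back = -30 * 24 * 60
    print(now_with_offset(thirty_days_back, fixed_now))
    print(now_with_offset_old(thirty_days_back, fixed_now))
```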


2 changes: 1 addition & 1 deletion testgen/common/mixpanel_service.py
@@ -57,7 +57,7 @@ def send_event(self, event_name, include_usage=False, **properties):
         properties.setdefault("instance_id", self.instance_id)
         properties.setdefault("edition", settings.DOCKER_HUB_REPOSITORY)
         properties.setdefault("version", settings.VERSION)
-        properties.setdefault("username", session.username)
+        properties.setdefault("username", session.auth.user_display)
         properties.setdefault("distinct_id", self.get_distinct_id(properties["username"]))
         if include_usage:
             properties.update(self.get_usage())
3 changes: 2 additions & 1 deletion testgen/common/models/entity.py
@@ -7,13 +7,14 @@
 from sqlalchemy import delete, select
 from sqlalchemy.dialects import postgresql
 from sqlalchemy.orm import InstrumentedAttribute
-from sqlalchemy.sql.elements import BinaryExpression
+from sqlalchemy.sql.elements import BinaryExpression, BooleanClauseList

 from testgen.common.models import Base, get_current_session
 from testgen.utils import is_uuid4, make_json_safe

 ENTITY_HASH_FUNCS = {
     BinaryExpression: lambda x: str(x.compile(compile_kwargs={"literal_binds": True})),
+    BooleanClauseList: lambda x: str(x.compile(compile_kwargs={"literal_binds": True})),
     tuple: lambda x: [str(y) for y in x],
 }

4 changes: 2 additions & 2 deletions testgen/common/models/profiling_run.py
@@ -33,10 +33,10 @@ class ProfilingRunMinimal(EntityMinimal):
 class ProfilingRunSummary(EntityMinimal):
     profiling_run_id: UUID
     start_time: datetime
+    end_time: datetime
     table_groups_name: str
     status: ProfilingRunStatus
     process_id: int
-    duration: str
     log_message: str
     schema_name: str
     table_ct: int
@@ -177,10 +177,10 @@ def select_summary(
         )
         SELECT v_profiling_runs.profiling_run_id,
             v_profiling_runs.start_time,
+            v_profiling_runs.end_time,
             v_profiling_runs.table_groups_name,
             v_profiling_runs.status,
             v_profiling_runs.process_id,
-            v_profiling_runs.duration,
             v_profiling_runs.log_message,
             v_profiling_runs.schema_name,
             v_profiling_runs.table_ct,
8 changes: 6 additions & 2 deletions testgen/common/models/test_definition.py
@@ -2,7 +2,7 @@
 from dataclasses import dataclass
 from datetime import datetime
 from typing import Literal
-from uuid import UUID
+from uuid import UUID, uuid4

 import streamlit as st
 from sqlalchemy import (
@@ -65,6 +65,8 @@ class TestDefinitionSummary(EntityMinimal):
     match_groupby_names: str
     match_having_condition: str
     custom_query: str
+    history_calculation: str
+    history_lookback: int
     test_active: str
     test_definition_status: str
     severity: str
@@ -144,7 +146,7 @@ class TestType(Entity):
 class TestDefinition(Entity):
     __tablename__ = "test_definitions"

-    id: UUID = Column(postgresql.UUID(as_uuid=True))
+    id: UUID = Column(postgresql.UUID(as_uuid=True), default=uuid4)
     cat_test_id: int = Column(BigInteger, Identity(), primary_key=True)
     table_groups_id: UUID = Column(postgresql.UUID(as_uuid=True))
     profile_run_id: UUID = Column(postgresql.UUID(as_uuid=True))
@@ -177,6 +179,8 @@ class TestDefinition(Entity):
     match_subset_condition: str = Column(NullIfEmptyString)
     match_groupby_names: str = Column(NullIfEmptyString)
     match_having_condition: str = Column(NullIfEmptyString)
+    history_calculation: str = Column(NullIfEmptyString)
+    history_lookback: int = Column(ZeroIfEmptyInteger, default=0)
     test_mode: str = Column(String)
     custom_query: str = Column(QueryString)
     test_active: bool = Column(YNString, default="Y")
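Review note: the `id` column now receives `default=uuid4`, that is, the function itself rather than the result of calling it, so a fresh value can be produced for each new row. The sketch below shows the same distinction with standard-library dataclasses as an analogy. It does not use SQLAlchemy, and both class names are hypothetical.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4

SHARED_ID = uuid4()  # evaluated once, when the module is loaded


@dataclass
class RowWithSharedDefault:
    # Every instance reuses the single value computed above.
    id: UUID = SHARED_ID


@dataclass
class RowWithCallableDefault:
    # uuid4 is called again for each new instance.
    id: UUID = field(default_factory=uuid4)


if __name__ == "__main__":
    print(RowWithSharedDefault().id == RowWithSharedDefault().id)
    print(RowWithCallableDefault().id == RowWithCallableDefault().id)
```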