Changelog

2.0.0 (currently in development)

  • BigQuery: Removed old csv_file_bq. Users can use csv_load or custom shell scripts instead.
  • Ability to specify BQ job timeout (the existing timeout setting was renamed to http_timeout).
  • New pause and soft quit spec execution features in Web and CLI UI interactive modes.
  • Soft quit on Ctrl+C, SIGINT and SIGTERM, allowing the current jobs to finish.
  • Added interactive table selectors input to Web UI mode in dry-run.
  • Removed the FailNow hook failure option and renamed the default failure option from FailLater to Fail.
  • Multi-package restructure
  • simple-sql-parser library is now used for rendering.
  • Table and Global hooks are now introspected (except for IO global hooks); dump output contains assertions and queries executed during hooks.
  • Added BigQuery authorized views support.
  • BigQuery credentials are now stored in a JSON file in the user’s XDG cache directory.
  • Ability to leave JSON memos in spec and hook programs, which are accessible in global hooks.
  • Run metadata is now stored in files or AWS S3 compatible storage. napkin history command has been removed.
  • --report option for napkin run allows @TS and @RUN substitutions for start timestamp and run UUID respectively.
  • Added the --log-file FILENAME option to redirect logs to a file; it also allows @TS and @RUN substitutions similar to --report.
  • Copy of Napkin logs has been added to HTML reports.
  • Gantt chart has been added to HTML reports.
  • Temporary table names in dump do not contain the current timestamp or a random string.
  • Add --jobs NUM_JOBS or -j NUM_JOBS option to limit task concurrency.
  • BigQuery query stats indicate reported slot time.
  • with_dependency update strategy now works across multiple specs if they use the same metadata location.
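The @TS and @RUN substitutions mentioned for --report and --log-file can be pictured with a small sketch. This is only an illustration, not Napkin's implementation: the function name expand_placeholders and the timestamp format are assumptions, since the changelog does not say how the start timestamp is rendered.

```python
import uuid
from datetime import datetime, timezone


def expand_placeholders(template: str, start: datetime, run_id: uuid.UUID) -> str:
    """Replace @TS and @RUN in a path template.

    Hypothetical helper: the timestamp format below is an assumption,
    the changelog only says @TS is the start timestamp and @RUN the run UUID.
    """
    ts = start.strftime("%Y%m%dT%H%M%SZ")  # assumed compact UTC format
    return template.replace("@TS", ts).replace("@RUN", str(run_id))


if __name__ == "__main__":
    started = datetime(2024, 7, 9, 12, 0, 0, tzinfo=timezone.utc)
    run = uuid.UUID("12345678-1234-5678-1234-567812345678")
    print(expand_placeholders("reports/@TS-@RUN", started, run))
```

Putting both placeholders in one path keeps reports or log files from separate runs from overwriting each other.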

1.0.0 (released 2024-07-09)

  • Added new --enable-with-upstream and --enable-with-downstream table selectors.
  • Added hooks: and create: options to select hooks and create table actions respectively, e.g. spec:hooks:table1, db:create:table1.
  • Fixed bug in UpdateIfErroredLastRun update strategy that caused it to always be applied.
  • Upstream selectors include async hooks.
  • New DAG evaluation mode.
  • Removing lenses for partitioning and clustering in favour of overloaded labels.
  • Adding pretty-printing ability with :pp command in REPL.
  • Spec parsing fails when unknown fields are present in the YAML specification.
  • UpdateStrategy period field can be alternatively specified by an object with human-readable properties, for example: period: { "weeks": 3 }.
  • mergeTable is now supported for BigQuery backend.
  • napkin repl now supports overrides with --arg* arguments.
  • Possibility to use multiple YAML spec files with different merge strategies.
  • dump writes spec.yaml file, which contains all overrides.
  • Napkin produces an error on yaml spec files with duplicate keys.
  • Napkin reports file name of the spec file which caused parsing error.
  • Add Web UI to observe progress as an alternative to CLI UI. Activated with -w for napkin run. It can be used with --dry-run to inspect the execution plan.
  • Add HTML report based on Web UI to napkin run activated with --report DIR that can be used to inspect runs.
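The object form of the UpdateStrategy period field (period: { "weeks": 3 }) maps naturally onto a duration. The sketch below shows one way to interpret such an object. It is not Napkin's parser: the function name period_to_timedelta and the set of accepted unit names other than "weeks" are assumptions made for illustration.

```python
from datetime import timedelta

# Assumed unit names; only "weeks" appears in the changelog entry.
ALLOWED_UNITS = {"weeks", "days", "hours", "minutes", "seconds"}


def period_to_timedelta(period: dict) -> timedelta:
    """Convert a human-readable period object into a timedelta.

    Unknown keys are rejected, mirroring the stricter spec parsing
    described in this release.
    """
    unknown = set(period) - ALLOWED_UNITS
    if unknown:
        raise ValueError(f"unknown period units: {sorted(unknown)}")
    return timedelta(**period)


if __name__ == "__main__":
    print(period_to_timedelta({"weeks": 3}))
```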

0.5.14 (released 2024-03-13)

  • Remove OOB OAuth backup option on failure of loopback IP address.
  • Init command no longer creates a git repository.
  • Verbose flag now prints useful version information.
  • Fixed a problem with table creation in a project that differs from the one Napkin connects to.
  • Updated the base Ubuntu image to the jammy release.
  • Add s key to --show-progress UI to switch between preprocessed and post-processed table names.
  • dump command accepts table selectors to narrow down the scope of tables.
  • Fix: history gantt command is able to display table dependencies.
  • Fix: some BigQuery commands were run in an invalid context.
  • BigQuery: new bigquery_defaults preprocessor to prepend a dataset to all tables.
  • BigQuery: new default implicit validator to check that all queries have fully qualified table names.
  • Fix: query statistics (per-table and total) are now reported in napkin run -p mode.
  • external command supports two modes: command with arguments and a shell script.
  • Ability to use CSV files as table inputs.
  • BigQuery: new ability to import CSV files with csv_file_bq create_program (uses bigquery-tool).
  • Deprecating run_shell and run_process hook definitions in favour of external.
  • Unify how Napkin gets and uses AppName. May require re-authentication for old entries in oauth database.
  • New set of assert*T functions to define hooks without specifying target table explicitly.
  • napkin dump command now prints table’s tags near each node in PDF report.
  • Assert-cardinalities hook can now be asserted against multiple columns from multiple tables.
  • BigQuery partition options now support specifying intervals, expiration and filter requirement flag.
  • Support for table and column description annotations.
  • YAML spec can now source Haskell extensions, dependencies and source dirs from the package.yaml file.
  • YAML spec can now source Haskell extensions from Cabal files.
  • Internal Haskell libraries are now supported in haskell_package.

0.5.13 (released 2022-04-15)

  • Partitioning, clustering and refresh strategies support was added to materialized views in BigQuery.
  • Napkin now has arm64 architecture support: static-binary and docker images are available.
  • Thanks to arm64 support, napkin can be used on Apple M1 processors via docker containers.
  • Add possibility to issue UPDATE queries.
  • Fix several Postgres and Redshift SQL rendering issues.
  • Improve logging and assertion error reporting.
  • Napkin produces an error on duplicate table definitions in a spec.
  • Table schema is now an array (instead of unordered Map) of backend-dependent types.
  • Ability to dump HTML representation of the execution graph (experimental).
  • Dump command does not crash on invalid SQL input in spec files.

0.5.12 (released 2022-01-07)

  • Dump command will report a single execution path on a checkTableExists call if the behavior of the two branches is the same.
  • externalCreate program does not perform mustache interpolation anymore on command and arguments, while YAML external program does.
  • Materialized views are now supported for BigQuery, Postgres, and Redshift.
  • TimescaleDB: added Haskell DSL combinators for Timescale-specific aggregates in Napkin.Untyped.Ops.Timescale.
  • TimescaleDB: continuous aggregates are now supported through materialized view options.
  • YAML Specs: table_options are now part of create_action arguments.
  • Fix: Mustache interpolation allows nesting sections and using variables from the outer context.
  • napkin templates no longer exists; the list of all embedded templates is displayed on the init --help screen.
  • Added the ability to define tables by invoking external commands from the spec.yaml file.

0.5.11 (released 2022-01-20)

  • CLI commands are combined into groups for visual clarity.
  • Revamped CLI options for partial spec runs.
  • Napkin will display original table names (in addition to processed table names) in the execution plan summary.
  • Napkin is now able to conditionally render mustache template sections.
    • A section is not rendered if the condition variable is not defined, is ‘false’, or is an ‘empty list’.
    • A section is rendered multiple times for a ‘non-empty list’, binding each list element as a variable set.
    • A section is rendered once for a non-empty value, binding it as a variable set.
    • Napkin produces an error when a variable mentioned in a section name is not defined.
  • Decoupled SQL dialect from backend:
    • SQL dialect can be selected on per-spec or per-table basis using parser_dialect option.
    • Available parsers are: napkin.bigquery, napkin.postgres, napkin.sqlite, napkin.ansi2011, generic.bigquery, generic.postgres, generic.sqlite, generic.ansi2011, postgres, raw.
    • When using a dialect other than napkin.*, Napkin will have limited capabilities (e.g. query optimization will not be available).
    • Dependency detection and renaming will not affect queries parsed with raw. Napkin will not attempt to parse or modify them.
    • postgres dialect can be used when some Postgres-specific features are used (e.g. JSON operator such as ->>).
  • Fix: Napkin would not detect dependencies in some sub-queries.
  • Napkin is able to perform strict validations (in the run, validate, dump and optimize commands). The supported validations are:
    • A mustache variable mentioned in a section name is not defined.
    • Part of a nested mustache variable path mentioned in a section name is not an object.
    • A complex (object, array) mustache value is rendered directly.
  • Suppress reporting 0 rows affected for SQLite.
  • REPL: Fix: helper macros did not work properly due to regression.
  • REPL: Spec and Hook programs can return values to the REPL environment for debugging purposes.
  • Internal: LocalFile Polysemy language got split into LocalFile and Template with umbrella LoadQuery language.
  • Query statistics are not printed as “unknown” if no information is available (typical for SQLite).
  • Source location (used in error reporting) is now able to print the start of an inline query for better context.
  • Post hooks in YAML will have an implicit argument table with the target table name.
  • BigQuery: Add support for STRUCTs.
  • Fix: renamed target table name was passed to programs instead of raw one.
  • SQLite version has been updated to 3.36.0 and now supports math and JSON functions.
  • Napkin now prints embedded Sqlite version as part of napkin version CLI command.
  • ‘Simple’ logging format now has table name in the context.
  • Added --use-spec-names to dump command.
  • Internal change: query transformers (e.g. table renamers) are not baked into programs – they are applied at runtime.
  • Fix: regression for assert_expression syntax in YAML specs.
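The conditional rendering rules for mustache sections listed in this release can be summarised as a function that decides how many times a section body is rendered and with which binding. This is a sketch of the rules as stated, not Napkin's template engine. The name section_bindings is hypothetical, the strict flag is one reading of how the "undefined variable" error relates to strict validation, and treating None and the empty string as empty values is an assumption.

```python
def section_bindings(name, context, strict=False):
    """Return the values a section body would be rendered with.

    An empty result means the section is skipped. One element means it is
    rendered once, several elements mean it is rendered once per element.
    """
    if name not in context:
        if strict:
            # Assumed: the undefined-variable error applies under strict validation.
            raise KeyError(f"undefined section variable: {name}")
        return []
    value = context[name]
    # False and the empty list come from the changelog;
    # None and "" are assumed to count as empty too.
    if value is False or value is None or value == [] or value == "":
        return []
    if isinstance(value, list):
        return list(value)  # one rendering per list element
    return [value]  # rendered once, bound to the value itself


if __name__ == "__main__":
    ctx = {"tables": [{"name": "a"}, {"name": "b"}], "debug": False, "env": "prod"}
    for key in ("tables", "debug", "env", "missing"):
        print(key, section_bindings(key, ctx))
```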

0.5.10 (released 2021-12-29)

  • <nixpkgs> path in docker now points to the github commit instead of intermediate file.
  • It is now possible to specify an HTTP timeout for Google BigQuery in the backend_options.
  • Napkin will report an error when an SQL file used with the incremental_by_time strategy does not consume the cutoff variable.
  • Fix: SQLite: remove extra parentheses from the SQL INSERT INTO SELECT statement.
  • Change PostgreSQL parser dialect from ANSI to Postgres.
  • Fix: Napkin was using incorrect pipeline name for backends (was always “rs”).
  • Fix: Change CAST operator rendering to be ANSI compliant.
  • Docker image now uses napkin user by default, it has nicer bash and zsh prompts.
    • Docker image working directory changed from /project to /home/napkin/project.
    • Dev-container settings (produced by napkin init) now specify zsh as a default shell.
  • CI builds are now a DAG (overall pipeline time reduced).
  • Meta arguments handling in programs is more consistent:
    • Reader MetaArguments is now Input MetaArguments in SpecProgram and HookProgram.
    • Input SpecMetaArgs is now Input MetaArguments in SpecPreprocessor.
    • Input MetaArguments helpers are now located in Napkin.Run.Effects.MetaArguments.
  • incrementalByTime now accepts incremental_reset as string or bool (--arg incremental_reset=true will work again).
  • napkin auth now uses subcommands to show (napkin auth show) and reset (napkin auth reset).
  • napkin auth show displays output in human-friendly format.
  • Fix: namespaceAllTables will no longer rename CTEs and break queries.
  • Fix: renaming tables will no longer affect aliases.
  • Fix: table aliases were not rendered correctly in JOIN queries (Postgres, Redshift).
  • Extensions to table_namespace and table_prefix preprocessors:
    • added scope parameter that can be either all, managed (default) or unmanaged to control which tables are renamed,
    • added only and except parameters for fine-grained control on which tables are renamed,
    • table_namespace has an extra on_existing parameter that can be overwrite (default) or keep_original, which allows keeping the original namespace if it has been explicitly provided in the spec.
  • Fix: tables can now be moved between schemas (Postgres, Redshift).
  • Fix: checkTableExists does not assume that default schema is public (Postgres, Redshift).
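The scope, only and except parameters added to the table_namespace and table_prefix preprocessors amount to a filter over table names. The sketch below illustrates one plausible way such a filter could combine them. It is not Napkin's code: the function name should_rename is hypothetical, and the order in which only, except and scope are applied is an assumption, since the changelog does not specify it.

```python
def should_rename(table, managed_tables, scope="managed", only=None, except_=None):
    """Decide whether a preprocessor would rename the given table.

    Assumed semantics: `only` and `except` are checked first, then `scope`.
    """
    if only is not None and table not in only:
        return False
    if except_ is not None and table in except_:
        return False
    if scope == "all":
        return True
    is_managed = table in managed_tables
    if scope == "managed":
        return is_managed
    if scope == "unmanaged":
        return not is_managed
    raise ValueError(f"unknown scope: {scope}")


if __name__ == "__main__":
    managed = {"orders_clean", "orders_daily"}
    for name in ("orders_clean", "raw.orders"):
        print(name, should_rename(name, managed))
```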

0.5.9 (released 2021-12-14)

  • Napkin now supports different log formats through the --log-format CLI option.
  • Added Napkin static binary (does not fully support haskell interpretation) for easier installation.
  • Renamed live validation option from -l (--live) to -i (--interactive) to avoid collision with --log-level.
  • Fix: Interactive validation now doesn’t complain about absent folders.
  • Added S3 bucket monitoring page.
  • Consistent log-level setting via command line options: one can use either -v or --log-level (-l). Options can be applied to all commands now.
  • Error reporting is more consistent.
  • Improved CLI UI (napkin run -p) look and feel.
  • Fix: CLI UI did not display query statistics properly when spec execution has been terminated by the user.
  • Fix: CLI UI did not terminate spec execution.
  • Napkin will print information on the execution plan (managed tables to be updated, unmanaged tables used as an input, managed tables used as an input but not scheduled for update) as well as ETA.
  • Skipped tables will no longer block dependent tables execution.
  • Fix: Haskell spec has now access to meta arguments.
  • Number of concurrent DB operations can be set with backend_options in YAML specs. Use concurrent_queries for BigQuery or connection_pool for Postgres and Redshift. The setting defaults to 100.
  • #253 Support SQLite Builtin functions.

0.5.8 (released 2021-12-08)

  • Improved napkin init.
  • Docker tags are now consistent.
  • create_action syntax has changed and it is now consistent for built-in and custom programs.
    • added sql_query and long_to_wide built-in spec programs.
    • incremental combinators are now built-in programs.
    • update_strategy now explicitly defaults to always; an empty list will not fall back to always.
    • it’s possible to call custom programs from YAML without arguments by providing a string (symbol name) instead of an object.
    • deps and hidden_deps are now attributes of the table (previously part of create_action).