Google BigQueryGoogle BigQuery

BigQuery is a cloud-based data warehouse and relational database from Google. Ultorg has dedicated support for BigQuery and its associated SQL dialect (GoogleSQL).

Connecting

To connect to BigQuery from Ultorg:

  1. Make sure you are using Ultorg 2.1.2 or later (download page here).
  2. Download the BigQuery JDBC driver. See the first link under “Current JDBC drivers”.
  3. Unzip the downloaded ZIP file to a folder somewhere.
  4. In Ultorg, click Add Data Source in the panel to the left, and then Connect to Database.
    The Add Data Source button, with the Connect to Database action highlighted.
  5. In the Driver dropdown, select Google BigQuery.
  6. Click Add, then select all of the 66 or so JAR files that you unzipped in step 2. You can click in the file list and press Ctrl+A (⌘A on Mac) to select all files. Click Open/OK and Next.
  7. In the Instance Name field, enter your BigQuery Project ID.
    Database connection settings for Google BigQuery.
  8. Click Test to test your connection. A browser window will open to request you grant access through your Google account.

    Alternatively, you can modify the JDBC URL here to end with OAuthType=3; to use Application Default Credentials, see below. The default JDBC URL uses browser-based authentication (OAuthType=1).

  9. If all is well, Ultorg will manage to connect at this point. If not, examine the error message and consult the BigQuery JDBC driver documentation. You may need to adjust the JDBC URL, or log in with a different Google account.
  10. Press Finish. Your tables should become visible under a new database icon (Data Source icon) in the Folders sidebar.

Application Default Credentials

To avoid having to log into your Google Account every time you connect to BigQuery from Ultorg, you can set up so-called Application Default Credentials, as follows:

  1. First install Google's gcloud command-line tool, using the instructions here.

    For MacOS users that use the Homebrew package manager, the following may be easier:

    brew install --cask google-cloud-sdk
    
  2. From the terminal (command prompt), run the following two commands:
    gcloud init
    gcloud auth application-default login
    

    A web browser window should open to prompt for your credentials.

    See this page for more details. Some authentication schemes may require additional steps.

  3. In Ultorg, follow the instructions from the previous section to connect to BigQuery. But in step 7, modify the JDBC URL to end with OAuthType=3;. The connection should then succeed without the browser window popping up to confirm first.

    If you have an existing BigQuery connection already set up, you can modify it by right-clicking its icon (Data Source icon) in the Folders sidebar and clicking Edit Connection Details.

Primary Keys

BigQuery supports the definition of primary key columns in a table, but does not actually enforce the uniqueness of values within them. This is not usually a problem, as long as the data was loaded from an external source that did take care of such enforcement.

When generating SQL queries, Ultorg will assume uniqueness of primary key values.

For tables with no primary key, the number of rows retrieved for a given table may change as you hide or show columns. This is because Ultorg retrieves distinct rows only.

Cost Optimization

BigQuery is heavily optimized for column-oriented workloads, i.e. queries that involve only a few table columns at a time, but potentially with a very large number of rows.

For trivial “SELECT * FROM table”-style queries, Ultorg may include a TABLESAMPLE clause in the generated SQL code. This helps reduce costs for the common case where an Ultorg user double-clicks a table to see its contents. This works for tables (Data Table icon) but not for views (Data Table SQL VIEW icon).

For large tables, you can save costs by hiding unneeded columns before proceding with other query building actions. Data in unused columns can often be skipped completely by BigQuery.

Filters on Partitioned Tables

Some tables may require a filter to be set on a specific column, known as the partition column, before any query can be executed. This is due to BigQuery's require_partition_filter setting. When a filter on the partition column is required, Ultorg will show an informational message:

A Google BigQuery table that requires a filter to be set on the partition column, with Ultorg's informational message to that effect.

When you open the Filter on the partition column, Ultorg will show a list of available table partitions, as ranges to filter by. Alternatively, you can type a number, year, or month in the Find box to define a wider range.

Available BigQuery partitions listed in Ultorg's Filter interface.

Partition filters work with any of the partition types supported by BigQuery: time-unit or ingestion time with daily, hourly, monthly, or yearly partitioning, or integer ranges.

Once a filter is set on the partition column, you can interact with the table as normal.

Data Editing

Google BigQuery is typically used for read-only datasets. Still, Ultorg's data editing functionality may be useful on some occasions. For example, you might have smaller tables of editable data that are used as part of bigger join queries.

To enable data editing on a BigQuery connection in Ultorg, add the EnableSession=1 and Location properties at the end of the JDBC URL in the Connect to Database dialog. The Location property must be set to the BigQuery location of datasets you want to access.

For example, the following JDBC URL (on a single line) might be used to connect to BigQuery with support for data editing in the us-east4 location:

jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;
ProjectId=someProjectId;OAuthType=1;EnableSession=1;Location=us-east4