Update BigQuery samples #244

Merged · 1 commit · Oct 26, 2016

85 changes: 43 additions & 42 deletions bigquery/README.md
@@ -35,22 +35,21 @@ __Usage:__ `node datasets --help`

```
Commands:
create <datasetId> Create a new dataset with the specified ID.
delete <datasetId> Delete the dataset with the specified ID.
list List datasets in the specified project.
size <datasetId> Calculate the size of the specified dataset.
create <datasetId> Creates a new dataset.
delete <datasetId> Deletes a dataset.
list [projectId] Lists all datasets in the specified project or the current project.
size <datasetId> [projectId] Calculates the size of a dataset.

Options:
--projectId, -p Optionally specify the project ID to use. [string] [default: "nodejs-docs-samples"]
--help Show help [boolean]
--help Show help [boolean]

Examples:
node datasets create my_dataset Create a new dataset with the ID "my_dataset".
node datasets delete my_dataset Delete a dataset identified as "my_dataset".
node datasets list List datasets.
node datasets list -p bigquery-public-data List datasets in the "bigquery-public-data" project.
node datasets size my_dataset Calculate the size of "my_dataset".
node datasets size hacker_news -p bigquery-public-data Calculate the size of "bigquery-public-data:hacker_news".
node datasets create my_dataset Creates a new dataset named "my_dataset".
node datasets delete my_dataset Deletes a dataset named "my_dataset".
node datasets list Lists all datasets in the current project.
node datasets list bigquery-public-data Lists all datasets in the "bigquery-public-data" project.
node datasets size my_dataset Calculates the size of "my_dataset" in the current project.
node datasets size hacker_news bigquery-public-data Calculates the size of "bigquery-public-data:hacker_news".

For more information, see https://cloud.google.com/bigquery/docs
```
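
The new `list [projectId]` positional replaces the old `--projectId/-p` flag. As a minimal sketch of what `node datasets list bigquery-public-data` does under the hood, using the promise-based client this PR migrates to (the project ID here is just the one from the example above):

```js
const BigQuery = require('@google-cloud/bigquery');

// Point the client at another project; omitting projectId falls back to
// the GCLOUD_PROJECT environment variable, just as the CLI does.
const bigquery = BigQuery({ projectId: 'bigquery-public-data' });

bigquery.getDatasets()
  .then((results) => {
    const datasets = results[0];
    datasets.forEach((dataset) => console.log(dataset.id));
  })
  .catch(console.error);
```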
@@ -68,17 +68,17 @@ __Usage:__ `node queries --help`
Commands:
sync <sqlQuery> Run the specified synchronous query.
async <sqlQuery> Start the specified asynchronous query.
wait <jobId> Wait for the specified job to complete and retrieve its results.
shakespeare Queries a public Shakespeare dataset.

Options:
--help Show help [boolean]

Examples:
node queries sync "SELECT * FROM
`publicdata.samples.natality` LIMIT 5;"
node queries async "SELECT * FROM
`publicdata.samples.natality` LIMIT 5;"
node queries wait job_VwckYXnR8yz54GBDMykIGnrc2
node queries sync "SELECT * FROM publicdata.samples.natality Synchronously queries the natality dataset.
LIMIT 5;"
node queries async "SELECT * FROM Queries the natality dataset as a job.
publicdata.samples.natality LIMIT 5;"
node queries shakespeare Queries a public Shakespeare dataset.

For more information, see https://cloud.google.com/bigquery/docs
```
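
The `wait <jobId>` command is dropped in favor of a `shakespeare` demo query. As a rough sketch of the `sync` path, assuming the same promise-returning client version (the query string is copied from the help text above, and `bigquery.query` resolving with rows at index 0 is an assumption based on the library's promisified pagination methods):

```js
const BigQuery = require('@google-cloud/bigquery');

const bigquery = BigQuery();

// Runs the query and resolves once all result rows are available.
bigquery.query('SELECT * FROM publicdata.samples.natality LIMIT 5;')
  .then((results) => {
    const rows = results[0];
    rows.forEach((row) => console.log(JSON.stringify(row)));
  })
  .catch(console.error);
```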
@@ -94,39 +94,41 @@ __Usage:__ `node tables --help`

```
Commands:
create <datasetId> <tableId> Create a new table with the specified ID in the
specified dataset.
list <datasetId> List tables in the specified dataset.
delete <datasetId> <tableId> Delete the specified table from the specified dataset.
copy <srcDatasetId> <srcTableId> <destDatasetId> Make a copy of an existing table.
<destTableId>
browse <datasetId> <tableId> List the rows from the specified table.
import <datasetId> <tableId> <fileName> Import data from a local file or a Google Cloud Storage
file into the specified table.
create <datasetId> <tableId> <schema> [projectId] Creates a new table.
list <datasetId> [projectId] Lists all tables in a dataset.
delete <datasetId> <tableId> [projectId] Deletes a table.
copy <srcDatasetId> <srcTableId> <destDatasetId> Makes a copy of a table.
<destTableId> [projectId]
browse <datasetId> <tableId> [projectId] Lists rows in a table.
import <datasetId> <tableId> <fileName> [projectId] Imports data from a local file into a table.
import-gcs <datasetId> <tableId> <bucketName> <fileName> Imports data from a Google Cloud Storage file into a
[projectId] table.
export <datasetId> <tableId> <bucketName> <fileName> Export a table from BigQuery to Google Cloud Storage.
insert <datasetId> <tableId> <json_or_file> Insert a JSON array (as a string or newline-delimited
[projectId]
insert <datasetId> <tableId> <json_or_file> [projectId] Inserts a JSON array (as a string or newline-delimited
file) into a BigQuery table.

Options:
--help Show help [boolean]

Examples:
node tables create my_dataset my_table Create table "my_table" in "my_dataset".
node tables list my_dataset List tables in "my_dataset".
node tables browse my_dataset my_table Display rows from "my_table" in "my_dataset".
node tables delete my_dataset my_table Delete "my_table" from "my_dataset".
node tables import my_dataset my_table ./data.csv Import a local file into a table.
node tables import my_dataset my_table data.csv --bucket Import a GCS file into a table.
my-bucket
node tables export my_dataset my_table my-bucket my-file Export my_dataset:my_table to gcs://my-bucket/my-file as
raw CSV.
node tables export my_dataset my_table my-bucket my-file -f Export my_dataset:my_table to gcs://my-bucket/my-file as
JSON --gzip gzipped JSON.
node tables insert my_dataset my_table json_string Insert the JSON array represented by json_string into
node tables create my_dataset my_table "Name:string, Createss a new table named "my_table" in "my_dataset".
Age:integer, Weight:float, IsMagic:boolean"
node tables list my_dataset Lists tables in "my_dataset".
node tables browse my_dataset my_table Displays rows from "my_table" in "my_dataset".
node tables delete my_dataset my_table Deletes "my_table" from "my_dataset".
node tables import my_dataset my_table ./data.csv Imports a local file into a table.
node tables import-gcs my_dataset my_table my-bucket Imports a GCS file into a table.
data.csv
node tables export my_dataset my_table my-bucket my-file Exports my_dataset:my_table to gcs://my-bucket/my-file
as raw CSV.
node tables export my_dataset my_table my-bucket my-file -f Exports my_dataset:my_table to gcs://my-bucket/my-file
JSON --gzip as gzipped JSON.
node tables insert my_dataset my_table json_string Inserts the JSON array represented by json_string into
my_dataset:my_table.
node tables insert my_dataset my_table json_file Insert the JSON objects contained in json_file (one per
node tables insert my_dataset my_table json_file Inserts the JSON objects contained in json_file (one per
line) into my_dataset:my_table.
node tables copy src_dataset src_table dest_dataset Copy src_dataset:src_table to dest_dataset:dest_table.
node tables copy src_dataset src_table dest_dataset Copies src_dataset:src_table to dest_dataset:dest_table.
dest_table

For more information, see https://cloud.google.com/bigquery/docs
```
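
The `create` command now takes a schema argument. A sketch of the underlying call, assuming `dataset.createTable` accepts the comma-separated schema shorthand shown in the example above:

```js
const BigQuery = require('@google-cloud/bigquery');

const bigquery = BigQuery();

bigquery.dataset('my_dataset')
  .createTable('my_table', {
    // Shorthand schema string from the help text: name:type pairs.
    schema: 'Name:string, Age:integer, Weight:float, IsMagic:boolean'
  })
  .then((results) => {
    const table = results[0];
    console.log(`Table ${table.id} created.`);
  })
  .catch(console.error);
```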
215 changes: 95 additions & 120 deletions bigquery/datasets.js
@@ -1,161 +1,136 @@
// Copyright 2016, Google, Inc.
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
/**
* Copyright 2016, Google, Inc.
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

'use strict';

// [START setup]
// By default, the client will authenticate using the service account file
// specified by the GOOGLE_APPLICATION_CREDENTIALS environment variable and use
// the project specified by the GCLOUD_PROJECT environment variable. See
// https://googlecloudplatform.github.io/google-cloud-node/#/docs/google-cloud/latest/guides/authentication
var BigQuery = require('@google-cloud/bigquery');
// [END setup]

function createDataset (datasetId, callback) {
var bigquery = BigQuery();
var dataset = bigquery.dataset(datasetId);

// See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/dataset?method=create
dataset.create(function (err, dataset, apiResponse) {
if (err) {
return callback(err);
}

console.log('Created dataset: %s', datasetId);
return callback(null, dataset, apiResponse);
});
const BigQuery = require('@google-cloud/bigquery');

// [START bigquery_create_dataset]
function createDataset (datasetId) {
// Instantiates a client
const bigquery = BigQuery();

// Creates a new dataset, e.g. "my_new_dataset"
return bigquery.createDataset(datasetId)
.then((results) => {
const dataset = results[0];
console.log(`Dataset ${dataset.id} created.`);
return dataset;
});
}
// [END bigquery_create_dataset]
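
Because the refactor swaps callbacks for returned promises, a caller now handles errors with `.catch()` instead of a callback's `err` argument; a hypothetical caller (the dataset name is illustrative):

```js
createDataset('my_new_dataset')
  .then((dataset) => console.log(`Ready: ${dataset.id}`))
  .catch((err) => console.error('ERROR:', err));
```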

function deleteDataset (datasetId, callback) {
var bigquery = BigQuery();
var dataset = bigquery.dataset(datasetId);
// [START bigquery_delete_dataset]
function deleteDataset (datasetId) {
// Instantiates a client
const bigquery = BigQuery();

// See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/dataset?method=delete
dataset.delete(function (err) {
if (err) {
return callback(err);
}
// References an existing dataset, e.g. "my_dataset"
const dataset = bigquery.dataset(datasetId);

console.log('Deleted dataset: %s', datasetId);
return callback(null);
});
// Deletes the dataset
return dataset.delete()
.then(() => {
console.log(`Dataset ${dataset.id} deleted.`);
});
}
// [END bigquery_delete_dataset]

function listDatasets (projectId, callback) {
var bigquery = BigQuery({
// [START bigquery_list_datasets]
function listDatasets (projectId) {
// Instantiates a client
const bigquery = BigQuery({
projectId: projectId
});

// See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery?method=getDatasets
bigquery.getDatasets(function (err, datasets) {
if (err) {
return callback(err);
}

console.log('Found %d dataset(s)!', datasets.length);
return callback(null, datasets);
});
// Lists all datasets in the specified project
return bigquery.getDatasets()
.then((results) => {
const datasets = results[0];
console.log('Datasets:');
datasets.forEach((dataset) => console.log(dataset.id));
return datasets;
});
}
// [END bigquery_list_datasets]
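
The recurring `results[0]` indexing reflects how the promisified client resolves: with an array whose first element is the payload. Given a `bigquery` client instantiated as above, an equivalent sketch using parameter destructuring:

```js
bigquery.getDatasets()
  .then(([datasets]) => {
    // Same data as results[0] above, just destructured in place.
    datasets.forEach((dataset) => console.log(dataset.id));
  })
  .catch(console.error);
```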

// [START get_dataset_size]
// Control-flow helper library
var async = require('async');

function getDatasetSize (datasetId, projectId, callback) {
// Instantiate a bigquery client
var bigquery = BigQuery({
// [START bigquery_get_dataset_size]
function getDatasetSize (datasetId, projectId) {
// Instantiates a client
const bigquery = BigQuery({
projectId: projectId
});
var dataset = bigquery.dataset(datasetId);

// See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/dataset?method=getTables
dataset.getTables(function (err, tables) {
if (err) {
return callback(err);
}

return async.map(tables, function (table, cb) {
// Fetch more detailed info for each table
// See https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/latest/bigquery/table?method=get
table.get(function (err, tableInfo) {
if (err) {
return cb(err);
}
// Return numBytes converted to Megabytes
var numBytes = tableInfo.metadata.numBytes;
return cb(null, (parseInt(numBytes, 10) / 1000) / 1000);
});
}, function (err, sizes) {
if (err) {
return callback(err);
}
var sum = sizes.reduce(function (cur, prev) {
return cur + prev;
}, 0);

console.log('Size of %s: %d MB', datasetId, sum);
return callback(null, sum);

// References an existing dataset, e.g. "my_dataset"
const dataset = bigquery.dataset(datasetId);

// Lists all tables in the dataset
return dataset.getTables()
.then((results) => results[0])
// Retrieve the metadata for each table
.then((tables) => Promise.all(tables.map((table) => table.get())))
.then((results) => results.map((result) => result[0]))
// Select the size of each table
.then((tables) => tables.map((table) => (parseInt(table.metadata.numBytes, 10) / 1000) / 1000))
// Sum up the sizes
.then((sizes) => sizes.reduce((cur, prev) => cur + prev, 0))
// Print and return the size
.then((sum) => {
console.log(`Size of ${dataset.id}: ${sum} MB`);
return sum;
});
});
}
// [END get_dataset_size]
// [END bigquery_get_dataset_size]
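
For reference, `numBytes` arrives as a string in the table metadata (the BigQuery API serializes int64 fields as strings), which is why the code parses it before the two divide-by-1000 steps; a quick worked example with a hypothetical value:

```js
const numBytes = '12345678'; // hypothetical metadata.numBytes value
const megabytes = (parseInt(numBytes, 10) / 1000) / 1000;
console.log(megabytes);      // 12.345678
```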

// The command-line program
var cli = require('yargs');
var makeHandler = require('../utils').makeHandler;
const cli = require(`yargs`);

var program = module.exports = {
const program = module.exports = {
createDataset: createDataset,
deleteDataset: deleteDataset,
listDatasets: listDatasets,
getDatasetSize: getDatasetSize,
main: function (args) {
main: (args) => {
// Run the command-line program
cli.help().strict().parse(args).argv;
}
};

cli
.demand(1)
.command('create <datasetId>', 'Create a new dataset with the specified ID.', {}, function (options) {
program.createDataset(options.datasetId, makeHandler());
})
.command('delete <datasetId>', 'Delete the dataset with the specified ID.', {}, function (options) {
program.deleteDataset(options.datasetId, makeHandler());
.command(`create <datasetId>`, `Creates a new dataset.`, {}, (opts) => {
program.createDataset(opts.datasetId);
})
.command('list', 'List datasets in the specified project.', {}, function (options) {
program.listDatasets(options.projectId, makeHandler(true, 'id'));
.command(`delete <datasetId>`, `Deletes a dataset.`, {}, (opts) => {
program.deleteDataset(opts.datasetId);
})
.command('size <datasetId>', 'Calculate the size of the specified dataset.', {}, function (options) {
program.getDatasetSize(options.datasetId, options.projectId, makeHandler());
.command(`list [projectId]`, `Lists all datasets in the specified project or the current project.`, {}, (opts) => {
program.listDatasets(opts.projectId || process.env.GCLOUD_PROJECT);
})
.option('projectId', {
alias: 'p',
requiresArg: true,
type: 'string',
default: process.env.GCLOUD_PROJECT,
description: 'Optionally specify the project ID to use.',
global: true
.command(`size <datasetId> [projectId]`, `Calculates the size of a dataset.`, {}, (opts) => {
program.getDatasetSize(opts.datasetId, opts.projectId || process.env.GCLOUD_PROJECT);
})
.example('node $0 create my_dataset', 'Create a new dataset with the ID "my_dataset".')
.example('node $0 delete my_dataset', 'Delete a dataset identified as "my_dataset".')
.example('node $0 list', 'List datasets.')
.example('node $0 list -p bigquery-public-data', 'List datasets in the "bigquery-public-data" project.')
.example('node $0 size my_dataset', 'Calculate the size of "my_dataset".')
.example('node $0 size hacker_news -p bigquery-public-data', 'Calculate the size of "bigquery-public-data:hacker_news".')
.example(`node $0 create my_dataset`, `Creates a new dataset named "my_dataset".`)
.example(`node $0 delete my_dataset`, `Deletes a dataset named "my_dataset".`)
.example(`node $0 list`, `Lists all datasets in the current project.`)
.example(`node $0 list bigquery-public-data`, `Lists all datasets in the "bigquery-public-data" project.`)
.example(`node $0 size my_dataset`, `Calculates the size of "my_dataset" in the current project.`)
.example(`node $0 size hacker_news bigquery-public-data`, `Calculates the size of "bigquery-public-data:hacker_news".`)
.wrap(120)
.recommendCommands()
.epilogue('For more information, see https://cloud.google.com/bigquery/docs');
.epilogue(`For more information, see https://cloud.google.com/bigquery/docs`);

if (module === require.main) {
program.main(process.argv.slice(2));
}