Entry point for reading data from Cloud BigQuery.
tfio.bigquery.BigQueryReadSession(
    parent, project_id, table_id, dataset_id, selected_fields,
    selected_fields_repeated, output_types, default_values, row_restriction,
    requested_streams, data_format, streams, schema, client_resource
)
Methods
get_streams
get_streams()
Returns Tensor with stream names for reading data from BigQuery.
Returns:
  A `Tensor` of stream names.
parallel_read_rows
parallel_read_rows(
cycle_length=None, sloppy=False, block_length=1, num_parallel_calls=None
)
Retrieves rows from the BigQuery service in parallel streams.
bq_client = BigQueryClient()
bq_read_session = bq_client.read_session(...)
ds1 = bq_read_session.parallel_read_rows(...)
Args:
  cycle_length: Number of streams to process in parallel. Defaults to the
    number of streams in the read session.
  sloppy: If False, elements are produced in deterministic order. If True,
    the implementation is allowed, for the sake of expediency, to produce
    elements in a non-deterministic order. When reading from multiple
    BigQuery streams, setting sloppy=True usually yields better performance.
  block_length: The number of consecutive elements to pull from a session
    stream before advancing to the next one.
  num_parallel_calls: Number of threads to use for processing input streams.
    If tf.data.experimental.AUTOTUNE is used, the number of parallel calls
    is set dynamically based on available CPU. Defaults to the number of
    streams in the read session.
Raises:
  ValueError: If the provided arguments are invalid.
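The interaction of cycle_length and block_length can be illustrated without a BigQuery connection. The sketch below mimics, in plain Python, the deterministic (sloppy=False) interleave order: block_length elements are pulled from each stream in round-robin fashion. The stream contents are made-up placeholders.

```python
def interleave(streams, block_length=1):
    """Round-robin over the streams, pulling block_length elements at a
    time, mirroring the deterministic order used when sloppy=False."""
    iters = [iter(s) for s in streams]
    done = [False] * len(iters)
    out = []
    while not all(done):
        for i, it in enumerate(iters):
            for _ in range(block_length):
                try:
                    out.append(next(it))
                except StopIteration:
                    done[i] = True
                    break
    return out

# Two streams of three rows each, block_length=2:
print(interleave([["a1", "a2", "a3"], ["b1", "b2", "b3"]], block_length=2))
# → ['a1', 'a2', 'b1', 'b2', 'a3', 'b3']
```

With sloppy=True the per-stream blocks could arrive in any order, which is why it is usually faster when streams progress at different rates.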
read_rows
read_rows(
stream, offset=0
)
Retrieves rows (including values) from the BigQuery service.
Args:
  stream: Name of the stream to read from.
  offset: Position in the stream.
Raises:
  ValueError: If the provided arguments are invalid.
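A typical pattern is to enumerate the session's streams with get_streams and build one dataset per stream with read_rows. The sketch below assumes a BigQueryReadSession created elsewhere; it is a hedged illustration, not a verbatim recipe, and tensorflow plus tensorflow_io must be installed for it to actually run.

```python
def read_streams_sequentially(bq_read_session):
    """Hedged sketch: build one dataset per stream via read_rows.

    `bq_read_session` is assumed to be a BigQueryReadSession obtained
    from BigQueryClient().read_session(...).
    """
    datasets = []
    for stream in bq_read_session.get_streams():
        # offset=0 starts reading each stream from the beginning.
        datasets.append(bq_read_session.read_rows(stream, offset=0))
    return datasets
```

Compared to parallel_read_rows, this reads each stream as its own dataset, which preserves per-stream ordering at the cost of parallelism.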