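In this notebook-based tutorial, we will create and run a TFX pipeline that builds a simple classification model and analyzes its performance across multiple runs. This notebook is based on the TFX pipeline built in the Simple TFX Pipeline Tutorial. If you have not read that tutorial yet, read it before proceeding with this notebook.
As you tweak your model or train it on a new dataset, you need to check whether the model has improved or become worse. Checking only top-level metrics such as accuracy may not be enough. Every trained model should be evaluated before it is pushed to production.
We will add an Evaluator component to the pipeline created in the previous tutorial. The Evaluator component performs a detailed analysis of your model and compares the new model against a baseline to determine whether it is "good enough". It is implemented using the TensorFlow Model Analysis library.
See Understanding TFX Pipelines to learn more about the various concepts in TFX.
Set Up
The setup process is the same as in the previous tutorial.
We first need to install the TFX Python package and download the dataset that we will use for our model.
Upgrade Pip
To avoid upgrading Pip on a system when running locally, check that we are running in Colab. Local systems can of course be upgraded separately.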
try:
  import colab
  !pip install --upgrade pip
except ImportError:
  # Not running in Colab: leave the local pip installation untouched.
  pass
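The cell above relies on `import colab` failing outside Colab. As a minimal sketch of a more explicit check (not part of the original tutorial), one could test whether the `google.colab` module is already loaded. The helper name `running_in_colab` is hypothetical, and the sketch assumes that Colab runtimes preload `google.colab`, which is a commonly used convention rather than a documented guarantee.

```python
import sys


def running_in_colab() -> bool:
    """Return True when the google.colab module is already loaded."""
    # Assumption: Colab runtimes preload `google.colab`; elsewhere it is absent.
    return "google.colab" in sys.modules


if running_in_colab():
    print("Running in Colab: upgrading pip only affects this runtime.")
else:
    print("Not running in Colab: leaving the local pip untouched.")
```

Keeping the check in a small function makes it easy to reuse in later cells that should behave differently on a local machine.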
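Install TFX
pip install -U tfx
Did you restart the runtime?
If you are using Google Colab, the first time you run the cell above you must restart the runtime by clicking the "RESTART RUNTIME" button or by using the "Runtime > Restart runtime ..." menu. This is because of the way Colab loads packages.
Check the TensorFlow and TFX versions.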
import tensorflow as tf
print('TensorFlow version: {}'.format(tf.__version__))
from tfx import v1 as tfx
print('TFX version: {}'.format(tfx.__version__))
TensorFlow version: 2.6.2
TFX version: 1.4.0
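If later cells depend on a particular release, the printed version strings can be compared programmatically. The following is a minimal sketch in plain Python; the helper names `version_tuple` and `at_least` are hypothetical, and the minimum versions passed in are illustrative values, not requirements stated by the tutorial.

```python
import re


def version_tuple(version: str) -> tuple:
    """Extract up to three leading numeric components, e.g. '2.6.2' -> (2, 6, 2)."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def at_least(installed: str, minimum: str) -> bool:
    """Return True when `installed` is the same as or newer than `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


# The installed versions are the ones printed above; the minimums are examples.
print(at_least("2.6.2", "2.6.0"))
print(at_least("1.4.0", "1.10.0"))
```

Comparing tuples of integers avoids the pitfall of string comparison, where "1.10.0" would sort before "1.4.0".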
Set up variables
Several variables are used to define the pipeline. You can customize them as needed. By default, all output from the pipeline is generated under the current directory.
import os
PIPELINE_NAME = "penguin-tfma"
# Output directory to store artifacts generated from the pipeline.
PIPELINE_ROOT = os.path.join('pipelines', PIPELINE_NAME)
# Path to a SQLite DB file to use as an MLMD storage.
METADATA_PATH = os.path.join('metadata', PIPELINE_NAME, 'metadata.db')
# Output directory where created models from the pipeline will be exported.
SERVING_MODEL_DIR = os.path.join('serving_model', PIPELINE_NAME)
from absl import logging
logging.set_verbosity(logging.INFO) # Set default logging level.
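Prepare example data
We will use the same Palmer Penguins dataset.
This dataset has four numeric features that were already normalized to the range [0,1]. We will build a classification model that predicts the species of penguins.
Because the TFX ExampleGen reads its input from a directory, we need to create a directory and copy the dataset into it.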
import urllib.request
import tempfile
DATA_ROOT = tempfile.mkdtemp(prefix='tfx-data') # Create a temporary directory.
_data_url = 'https://raw.githubusercontent.com/tensorflow/tfx/master/tfx/examples/penguin/data/labelled/penguins_processed.csv'
_data_filepath = os.path.join(DATA_ROOT, "data.csv")
urllib.request.urlretrieve(_data_url, _data_filepath)
('/tmp/tfx-datal5lxy_yw/data.csv', <http.client.HTTPMessage at 0x7fa18a9da150>)
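Before handing the directory to ExampleGen, it can be useful to confirm that the CSV header contains the columns the trainer expects. The sketch below is an illustration, not part of the original notebook: the helper `missing_columns` is hypothetical, the expected names are taken from the feature and label keys used in the training code of this tutorial, and the demonstration runs on a tiny stand-in file with made-up values rather than the real download.

```python
import csv
import os
import tempfile

# Column names taken from the feature and label keys used by the trainer module.
EXPECTED_COLUMNS = [
    "species", "culmen_length_mm", "culmen_depth_mm",
    "flipper_length_mm", "body_mass_g",
]


def missing_columns(csv_path, expected):
    """Return the expected column names that are absent from the CSV header."""
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))
    return [name for name in expected if name not in header]


# Stand-in file with made-up values, used only to demonstrate the helper.
demo_dir = tempfile.mkdtemp(prefix="csv-check")
demo_path = os.path.join(demo_dir, "data.csv")
with open(demo_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(EXPECTED_COLUMNS)
    writer.writerow([0, 0.25, 0.67, 0.15, 0.29])

print(missing_columns(demo_path, EXPECTED_COLUMNS))
```

In the notebook, the same helper could be called with `_data_filepath` in place of `demo_path`.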
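Create a pipeline
We will add an Evaluator component to the pipeline we created in the Simple TFX Pipeline Tutorial.
An Evaluator component requires input data from an ExampleGen component, a model from a Trainer component, and a tfma.EvalConfig object. Optionally, we can supply a baseline model whose metrics are compared with those of the newly trained model.
An Evaluator creates two kinds of output artifacts, ModelEvaluation and ModelBlessing. ModelEvaluation contains the detailed evaluation result, which can be investigated and visualized further with the TFMA library. ModelBlessing contains a boolean result indicating whether the model passed the given criteria, and it can be used as a signal by later components such as a Pusher.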
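To make the idea of a boolean blessing signal concrete, here is a minimal sketch of how a downstream step might read it from disk. It rests on an assumption that is not stated in this tutorial: that the blessing artifact directory contains an empty marker file named `BLESSED` when the model passed (and `NOT_BLESSED` otherwise). The helper name `is_blessed` is hypothetical, and the directory used here is a temporary stand-in rather than a real artifact.

```python
import os
import tempfile


def is_blessed(blessing_dir: str) -> bool:
    """Return True when the directory holds a BLESSED marker file."""
    # Assumption: the blessing result is recorded as a marker file named
    # BLESSED (passed) or NOT_BLESSED (failed) inside the artifact directory.
    return os.path.exists(os.path.join(blessing_dir, "BLESSED"))


# Stand-in directory used only for demonstration.
demo = tempfile.mkdtemp(prefix="blessing")
print(is_blessed(demo))   # No marker yet.

open(os.path.join(demo, "BLESSED"), "w").close()
print(is_blessed(demo))   # Marker present.
```

Inside a pipeline there is no need to do this by hand, since the Pusher consumes the blessing artifact directly.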
Write model training code
We will use the same model code as in the Simple TFX Pipeline Tutorial.
_trainer_module_file = 'penguin_trainer.py'
%%writefile {_trainer_module_file}
# Copied from https://www.tensorflow.org/tfx/tutorials/tfx/penguin_simple
from typing import List
from absl import logging
import tensorflow as tf
from tensorflow import keras
from tensorflow_transform.tf_metadata import schema_utils
from tfx.components.trainer.executor import TrainerFnArgs
from tfx.components.trainer.fn_args_utils import DataAccessor
from tfx_bsl.tfxio import dataset_options
from tensorflow_metadata.proto.v0 import schema_pb2
_FEATURE_KEYS = [
    'culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g'
]
_LABEL_KEY = 'species'
_TRAIN_BATCH_SIZE = 20
_EVAL_BATCH_SIZE = 10
# Since we're not generating or creating a schema, we will instead create
# a feature spec. Since there are a fairly small number of features this is
# manageable for this dataset.
_FEATURE_SPEC = {
    **{
        feature: tf.io.FixedLenFeature(shape=[1], dtype=tf.float32)
        for feature in _FEATURE_KEYS
    },
    _LABEL_KEY: tf.io.FixedLenFeature(shape=[1], dtype=tf.int64)
}
def _input_fn(file_pattern: List[str],
              data_accessor: DataAccessor,
              schema: schema_pb2.Schema,
              batch_size: int = 200) -> tf.data.Dataset:
  """Generates features and label for training.

  Args:
    file_pattern: List of paths or patterns of input tfrecord files.
    data_accessor: DataAccessor for converting input to RecordBatch.
    schema: schema of the input data.
    batch_size: representing the number of consecutive elements of returned
      dataset to combine in a single batch

  Returns:
    A dataset that contains (features, indices) tuple where features is a
      dictionary of Tensors, and indices is a single Tensor of label indices.
  """
  return data_accessor.tf_dataset_factory(
      file_pattern,
      dataset_options.TensorFlowDatasetOptions(
          batch_size=batch_size, label_key=_LABEL_KEY),
      schema=schema).repeat()
def _build_keras_model() -> tf.keras.Model:
  """Creates a DNN Keras model for classifying penguin data.

  Returns:
    A Keras Model.
  """
  # The model below is built with Functional API, please refer to
  # https://www.tensorflow.org/guide/keras/overview for all API options.
  inputs = [keras.layers.Input(shape=(1,), name=f) for f in _FEATURE_KEYS]
  d = keras.layers.concatenate(inputs)
  for _ in range(2):
    d = keras.layers.Dense(8, activation='relu')(d)
  outputs = keras.layers.Dense(3)(d)
  model = keras.Model(inputs=inputs, outputs=outputs)
  model.compile(
      optimizer=keras.optimizers.Adam(1e-2),
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=[keras.metrics.SparseCategoricalAccuracy()])
  model.summary(print_fn=logging.info)
  return model
# TFX Trainer will call this function.
def run_fn(fn_args: TrainerFnArgs):
  """Train the model based on given args.

  Args:
    fn_args: Holds args used to train the model as name/value pairs.
  """
  # This schema is usually either an output of SchemaGen or a manually-curated
  # version provided by the pipeline author. A schema can also be derived from
  # the TFT graph if a Transform component is used. When both are missing,
  # `schema_from_feature_spec` can be used to generate a schema from a very
  # simple feature_spec, but the schema returned would be very primitive.
  schema = schema_utils.schema_from_feature_spec(_FEATURE_SPEC)
  train_dataset = _input_fn(
      fn_args.train_files,
      fn_args.data_accessor,
      schema,
      batch_size=_TRAIN_BATCH_SIZE)
  eval_dataset = _input_fn(
      fn_args.eval_files,
      fn_args.data_accessor,
      schema,
      batch_size=_EVAL_BATCH_SIZE)
  model = _build_keras_model()
  model.fit(
      train_dataset,
      steps_per_epoch=fn_args.train_steps,
      validation_data=eval_dataset,
      validation_steps=fn_args.eval_steps)
  # The result of the training should be saved in `fn_args.serving_model_dir`
  # directory.
  model.save(fn_args.serving_model_dir, save_format='tf')
Writing penguin_trainer.py
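The final `Dense(3)` layer in `_build_keras_model` has no activation and the loss is configured with `from_logits=True`, so the model emits raw logits rather than probabilities. As a small illustration in plain Python (not part of the trainer module), the sketch below turns a logit vector into class probabilities with a softmax and picks the predicted class. The function names and the example logits are made up for this illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtracting the maximum keeps the exponentials numerically stable.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(logits):
    """Return the index of the largest logit, i.e. the predicted species id."""
    return max(range(len(logits)), key=lambda i: logits[i])


example_logits = [2.0, 0.5, -1.0]  # Made-up values for three species.
probabilities = softmax(example_logits)
print(probabilities)
print(predict_class(example_logits))
```

Because softmax preserves ordering, the predicted class can be read from the logits directly; the probabilities are only needed when a confidence value is wanted.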
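Write a pipeline definition
We will define a function that creates a TFX pipeline. In addition to the Evaluator component mentioned above, we will add one more node called Resolver. To check that a new model is better than the previous one, we need to compare it against a previously published model, called the baseline. ML Metadata (MLMD) tracks all previous artifacts of the pipeline, and Resolver can find the latest blessed model (a model that successfully passed the Evaluator) in MLMD by using a strategy class called LatestBlessedModelStrategy.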
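The selection rule described above can be sketched in plain Python. This is a conceptual illustration only and not how TFX queries MLMD: the `ModelRecord` class, its fields, and the example history are hypothetical stand-ins for the model and blessing artifacts that MLMD tracks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelRecord:
    model_id: int   # Hypothetical id that increases with every pipeline run.
    uri: str        # Hypothetical location of the exported model.
    blessed: bool   # Whether the Evaluator blessed this model.


def latest_blessed(records: List[ModelRecord]) -> Optional[ModelRecord]:
    """Return the most recent blessed model, or None if there is none."""
    blessed = [r for r in records if r.blessed]
    return max(blessed, key=lambda r: r.model_id) if blessed else None


history = [
    ModelRecord(1, "models/1", True),
    ModelRecord(2, "models/2", False),
    ModelRecord(3, "models/3", True),
    ModelRecord(4, "models/4", False),
]
print(latest_blessed(history))  # The record with model_id 3.
print(latest_blessed([]))       # None, as on the very first run.
```

The `None` case corresponds to the first pipeline run, when no baseline exists yet and the Evaluator has nothing to compare against.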
import tensorflow_model_analysis as tfma
def _create_pipeline(pipeline_name: str, pipeline_root: str, data_root: str,
                     module_file: str, serving_model_dir: str,
                     metadata_path: str) -> tfx.dsl.Pipeline:
  """Creates a penguin pipeline with TFX that evaluates the trained model."""
  # Brings data into the pipeline.
  example_gen = tfx.components.CsvExampleGen(input_base=data_root)
  # Uses user-provided Python function that trains a model.
  trainer = tfx.components.Trainer(
      module_file=module_file,
      examples=example_gen.outputs['examples'],
      train_args=tfx.proto.TrainArgs(num_steps=100),
      eval_args=tfx.proto.EvalArgs(num_steps=5))
  # NEW: Get the latest blessed model for Evaluator.
  model_resolver = tfx.dsl.Resolver(
      strategy_class=tfx.dsl.experimental.LatestBlessedModelStrategy,
      model=tfx.dsl.Channel(type=tfx.types.standard_artifacts.Model),
      model_blessing=tfx.dsl.Channel(
          type=tfx.types.standard_artifacts.ModelBlessing)).with_id(
              'latest_blessed_model_resolver')
  # NEW: Uses TFMA to compute evaluation statistics over features of a model and
  #   perform quality validation of a candidate model (compared to a baseline).
  eval_config = tfma.EvalConfig(
      model_specs=[tfma.ModelSpec(label_key='species')],
      slicing_specs=[
          # An empty slice spec means the overall slice, i.e. the whole dataset.
          tfma.SlicingSpec(),
          # Calculate metrics for each penguin species.
          tfma.SlicingSpec(feature_keys=['species']),
      ],
      metrics_specs=[
          tfma.MetricsSpec(per_slice_thresholds={
              'sparse_categorical_accuracy':
                  tfma.PerSliceMetricThresholds(thresholds=[
                      tfma.PerSliceMetricThreshold(
                          slicing_specs=[tfma.SlicingSpec()],
                          threshold=tfma.MetricThreshold(
                              value_threshold=tfma.GenericValueThreshold(
                                  lower_bound={'value': 0.6}),
                              # Change threshold will be ignored if there is no
                              # baseline model resolved from MLMD (first run).
                              change_threshold=tfma.GenericChangeThreshold(
                                  direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                                  absolute={'value': -1e-10}))
                      )]),
          })],
      )
  evaluator = tfx.components.Evaluator(
      examples=example_gen.outputs['examples'],
      model=trainer.outputs['model'],
      baseline_model=model_resolver.outputs['model'],
      eval_config=eval_config)
  # Checks whether the model passed the validation steps and pushes the model
  # to a file destination if check passed.
  pusher = tfx.components.Pusher(
      model=trainer.outputs['model'],
      model_blessing=evaluator.outputs['blessing'],  # Pass an evaluation result.
      push_destination=tfx.proto.PushDestination(
          filesystem=tfx.proto.PushDestination.Filesystem(
              base_directory=serving_model_dir)))
  components = [
      example_gen,
      trainer,
      # Following two components were added to the pipeline.
      model_resolver,
      evaluator,
      pusher,
  ]
  return tfx.dsl.Pipeline(
      pipeline_name=pipeline_name,
      pipeline_root=pipeline_root,
      metadata_connection_config=tfx.orchestration.metadata
      .sqlite_metadata_connection_config(metadata_path),
      components=components)
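We need to supply the following information to the Evaluator via eval_config:
- Additional metrics to configure (if you want more metrics than those defined in the model).
- Slices to configure
- Model validation thresholds to verify, if validation is to be included
Because SparseCategoricalAccuracy was already included in the model.compile() call, it is included in the analysis automatically, so we do not add any additional metrics here. SparseCategoricalAccuracy is also used to decide whether the model is good enough.
We compute the metrics for the whole dataset and for each penguin species. SlicingSpec specifies how the declared metrics are aggregated.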
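As a rough illustration of what slicing by `species` means, the following plain-Python sketch computes accuracy over the whole dataset and again for each label value. It is not TFMA's implementation; the function name `accuracy_by_slice` and the label and prediction lists are made up for this example.

```python
from collections import defaultdict


def accuracy_by_slice(labels, predictions):
    """Return overall accuracy and accuracy for each distinct label value."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, pred in zip(labels, predictions):
        total[label] += 1
        correct[label] += int(label == pred)
    overall = sum(correct.values()) / sum(total.values())
    per_slice = {key: correct[key] / total[key] for key in sorted(total)}
    return overall, per_slice


# Made-up labels and predictions for three species (0, 1, 2).
labels = [0, 0, 1, 1, 2, 2, 2, 2]
predictions = [0, 0, 1, 0, 2, 2, 2, 1]

overall, per_slice = accuracy_by_slice(labels, predictions)
print(overall)     # 0.75
print(per_slice)   # {0: 1.0, 1: 0.5, 2: 0.75}
```

The example shows why slices matter: an overall accuracy of 0.75 hides the fact that one class is only classified correctly half of the time.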
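There are two thresholds that a new model must pass: one is an absolute threshold of 0.6, and the other is a relative threshold against the baseline model. With the configured change threshold of -1e-10 and HIGHER_IS_BETTER, the new model's metric must effectively not be lower than the baseline's. When you run the pipeline for the first time, change_threshold is ignored and only value_threshold is checked. If you run the pipeline more than once, Resolver finds a model from a previous run and uses it as the baseline model for the comparison.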
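The two checks can be written out as a small plain-Python sketch. The constants are the values used in `eval_config` in this tutorial; the function name `passes_thresholds` is hypothetical, and the sketch assumes that both comparisons are inclusive, which is a simplification of whatever TFMA does internally.

```python
from typing import Optional

LOWER_BOUND = 0.6     # value_threshold lower bound used in eval_config.
MIN_CHANGE = -1e-10   # change_threshold absolute value used in eval_config.


def passes_thresholds(candidate: float,
                      baseline: Optional[float] = None) -> bool:
    """Return True when the candidate accuracy satisfies both thresholds."""
    # Assumption: both comparisons are inclusive.
    if candidate < LOWER_BOUND:
        return False
    if baseline is None:
        # First run: no baseline was resolved, so the change check is skipped.
        return True
    # HIGHER_IS_BETTER: the change (candidate - baseline) must not fall
    # below the allowed minimum.
    return (candidate - baseline) >= MIN_CHANGE


print(passes_thresholds(0.96))         # First run, above 0.6.
print(passes_thresholds(0.50))         # Below the absolute threshold.
print(passes_thresholds(0.90, 0.95))   # Worse than the baseline.
print(passes_thresholds(0.95, 0.95))   # Equal to the baseline.
```

With a tiny negative allowance such as -1e-10, a model that exactly matches the baseline still passes, while any real regression fails.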
See the Evaluator component guide for more information.
Run the pipeline
We will use LocalDagRunner as in the previous tutorial.
tfx.orchestration.LocalDagRunner().run(
    _create_pipeline(
        pipeline_name=PIPELINE_NAME,
        pipeline_root=PIPELINE_ROOT,
        data_root=DATA_ROOT,
        module_file=_trainer_module_file,
        serving_model_dir=SERVING_MODEL_DIR,
        metadata_path=METADATA_PATH))
INFO:absl:Generating ephemeral wheel package for '/tmpfs/src/temp/docs/tutorials/tfx/penguin_trainer.py' (including modules: ['penguin_trainer']). INFO:absl:User module package has hash fingerprint version 1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703. INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '/tmp/tmpr3anh67s/_tfx_generated_setup.py', 'bdist_wheel', '--bdist-dir', '/tmp/tmp6s2sw4dj', '--dist-dir', '/tmp/tmp6jr76e54'] /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools. setuptools.SetuptoolsDeprecationWarning, listing git files failed - pretending there aren't any INFO:absl:Successfully built user code wheel distribution at 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'; target user module is 'penguin_trainer'. INFO:absl:Full user module path is 'penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl' INFO:absl:Using deployment config: executor_specs { key: "CsvExampleGen" value { beam_executable_spec { python_executor_spec { class_path: "tfx.components.example_gen.csv_example_gen.executor.Executor" } } } } executor_specs { key: "Evaluator" value { beam_executable_spec { python_executor_spec { class_path: "tfx.components.evaluator.executor.Executor" } } } } executor_specs { key: "Pusher" value { python_class_executable_spec { class_path: "tfx.components.pusher.executor.Executor" } } } executor_specs { key: "Trainer" value { python_class_executable_spec { class_path: "tfx.components.trainer.executor.GenericExecutor" } } } custom_driver_specs { key: "CsvExampleGen" value { python_class_executable_spec { class_path: "tfx.components.example_gen.driver.FileBasedDriver" } } } metadata_connection_config 
{ sqlite { filename_uri: "metadata/penguin-tfma/metadata.db" connection_mode: READWRITE_OPENCREATE } } INFO:absl:Using connection config: sqlite { filename_uri: "metadata/penguin-tfma/metadata.db" connection_mode: READWRITE_OPENCREATE } INFO:absl:Component CsvExampleGen is running. INFO:absl:Running launcher for node_info { type { name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen" } id: "CsvExampleGen" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } } outputs { outputs { key: "examples" value { artifact_spec { type { name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } } } } } } parameters { parameters { key: "input_base" value { field_value { string_value: "/tmp/tfx-datal5lxy_yw" } } } parameters { key: "input_config" value { field_value { string_value: "{\n \"splits\": [\n {\n \"name\": \"single_split\",\n \"pattern\": \"*\"\n }\n ]\n}" } } } parameters { key: "output_config" value { field_value { string_value: "{\n \"split_config\": {\n \"splits\": [\n {\n \"hash_buckets\": 2,\n \"name\": \"train\"\n },\n {\n \"hash_buckets\": 1,\n \"name\": \"eval\"\n }\n ]\n }\n}" } } } parameters { key: "output_data_format" value { field_value { int_value: 6 } } } parameters { key: "output_file_format" value { field_value { int_value: 5 } } } } downstream_nodes: "Evaluator" downstream_nodes: "Trainer" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized running bdist_wheel running build running build_py creating build creating build/lib copying penguin_trainer.py -> build/lib installing to /tmp/tmp6s2sw4dj running install running install_lib copying 
build/lib/penguin_trainer.py -> /tmp/tmp6s2sw4dj running install_egg_info running egg_info creating tfx_user_code_Trainer.egg-info writing tfx_user_code_Trainer.egg-info/PKG-INFO writing dependency_links to tfx_user_code_Trainer.egg-info/dependency_links.txt writing top-level names to tfx_user_code_Trainer.egg-info/top_level.txt writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt' reading manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt' writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt' Copying tfx_user_code_Trainer.egg-info to /tmp/tmp6s2sw4dj/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3.7.egg-info running install_scripts creating /tmp/tmp6s2sw4dj/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/WHEEL creating '/tmp/tmp6jr76e54/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl' and adding '/tmp/tmp6s2sw4dj' to it adding 'penguin_trainer.py' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/METADATA' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/WHEEL' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/top_level.txt' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/RECORD' removing /tmp/tmp6s2sw4dj WARNING: Logging before InitGoogleLogging() is written to STDERR I1205 10:34:23.723806 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:23.730262 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:23.736788 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:23.744907 28099 rdbms_metadata_access_object.cc:686] No 
property is defined for the Type INFO:absl:select span and version = (0, None) INFO:absl:latest span and version = (0, None) INFO:absl:MetadataStore with DB connection initialized I1205 10:34:23.758380 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Going to run a new execution 1 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=1, input_dict={}, output_dict=defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } custom_properties { key: "span" value { int_value: 0 } } , artifact_type: name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )]}), exec_properties={'output_file_format': 5, 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_data_format': 6, 'input_base': '/tmp/tfx-datal5lxy_yw', 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463'}, execution_output_uri='pipelines/penguin-tfma/CsvExampleGen/.system/executor_execution/1/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/CsvExampleGen/.system/stateful_working_dir/2021-12-05T10:34:23.517028', tmp_dir='pipelines/penguin-tfma/CsvExampleGen/.system/executor_execution/1/.temp/', pipeline_node=node_info { type { name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen" } id: 
"CsvExampleGen" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } } outputs { outputs { key: "examples" value { artifact_spec { type { name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } } } } } } parameters { parameters { key: "input_base" value { field_value { string_value: "/tmp/tfx-datal5lxy_yw" } } } parameters { key: "input_config" value { field_value { string_value: "{\n \"splits\": [\n {\n \"name\": \"single_split\",\n \"pattern\": \"*\"\n }\n ]\n}" } } } parameters { key: "output_config" value { field_value { string_value: "{\n \"split_config\": {\n \"splits\": [\n {\n \"hash_buckets\": 2,\n \"name\": \"train\"\n },\n {\n \"hash_buckets\": 1,\n \"name\": \"eval\"\n }\n ]\n }\n}" } } } parameters { key: "output_data_format" value { field_value { int_value: 6 } } } parameters { key: "output_file_format" value { field_value { int_value: 5 } } } } downstream_nodes: "Evaluator" downstream_nodes: "Trainer" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:Generating examples. WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features. INFO:absl:Processing input csv data /tmp/tfx-datal5lxy_yw/* to TFExample. WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter. 
WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be. INFO:absl:Examples generated. INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 1 succeeded. INFO:absl:Cleaning up stateful execution info. INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } custom_properties { key: "span" value { int_value: 0 } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )]}) for execution 1 INFO:absl:MetadataStore with DB connection initialized INFO:absl:Component CsvExampleGen is finished. INFO:absl:Component latest_blessed_model_resolver is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.dsl.components.common.resolver.Resolver" } id: "latest_blessed_model_resolver" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.latest_blessed_model_resolver" } } } } inputs { inputs { key: "model" value { channels { context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } artifact_query { type { name: "Model" } } } } } inputs { key: "model_blessing" value { channels { context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } artifact_query { type { name: "ModelBlessing" } } } } } resolver_config { resolver_steps { class_path: "tfx.dsl.input_resolution.strategies.latest_blessed_model_strategy.LatestBlessedModelStrategy" config_json: "{}" input_keys: "model" input_keys: "model_blessing" } } } downstream_nodes: "Evaluator" execution_options { caching_options { } } INFO:absl:Running as an resolver node. INFO:absl:MetadataStore with DB connection initialized WARNING:absl:Artifact type Model is not found in MLMD. WARNING:absl:Artifact type ModelBlessing is not found in MLMD. I1205 10:34:24.899447 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component latest_blessed_model_resolver is finished. INFO:absl:Component Trainer is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.components.trainer.component.Trainer" } id: "Trainer" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } } inputs { inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } } outputs { outputs { key: "model" value { artifact_spec { type { name: "Model" } } } } outputs { key: "model_run" value { artifact_spec { type { name: "ModelRun" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "eval_args" value { field_value { string_value: "{\n \"num_steps\": 5\n}" } } } parameters { key: "module_path" value { field_value { string_value: "penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl" } } } parameters { key: "train_args" value { field_value { string_value: "{\n \"num_steps\": 100\n}" } } } } upstream_nodes: "CsvExampleGen" downstream_nodes: "Evaluator" downstream_nodes: "Pusher" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized INFO:absl:MetadataStore with DB connection initialized I1205 10:34:24.924589 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type 
INFO:absl:Going to run a new execution 3 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=3, input_dict={'examples': [Artifact(artifact: id: 1 type_id: 15 uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" properties { key: "split_names" value { string_value: "[\"train\", \"eval\"]" } } custom_properties { key: "file_format" value { string_value: "tfrecords_gzip" } } custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } custom_properties { key: "payload_format" value { string_value: "FORMAT_TF_EXAMPLE" } } custom_properties { key: "span" value { int_value: 0 } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700464882 last_update_time_since_epoch: 1638700464882 , artifact_type: id: 15 name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )]}, output_dict=defaultdict(<class 'list'>, {'model_run': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model_run/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model_run:0" } } , artifact_type: name: "ModelRun" )], 'model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } , artifact_type: name: "Model" )]}), exec_properties={'train_args': '{\n "num_steps": 100\n}', 'custom_config': 'null', 'eval_args': '{\n "num_steps": 5\n}', 'module_path': 'penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'}, 
execution_output_uri='pipelines/penguin-tfma/Trainer/.system/executor_execution/3/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/Trainer/.system/stateful_working_dir/2021-12-05T10:34:23.517028', tmp_dir='pipelines/penguin-tfma/Trainer/.system/executor_execution/3/.temp/', pipeline_node=node_info { type { name: "tfx.components.trainer.component.Trainer" } id: "Trainer" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } } inputs { inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } } outputs { outputs { key: "model" value { artifact_spec { type { name: "Model" } } } } outputs { key: "model_run" value { artifact_spec { type { name: "ModelRun" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "eval_args" value { field_value { string_value: "{\n \"num_steps\": 5\n}" } } } parameters { key: "module_path" value { field_value { string_value: "penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl" } } } parameters { key: "train_args" value { field_value { string_value: "{\n \"num_steps\": 100\n}" } } } } upstream_nodes: "CsvExampleGen" downstream_nodes: "Evaluator" downstream_nodes: 
"Pusher" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:Train on the 'train' split when train_args.splits is not set. INFO:absl:Evaluate on the 'eval' split when eval_args.splits is not set. INFO:absl:udf_utils.get_fn {'train_args': '{\n "num_steps": 100\n}', 'custom_config': 'null', 'eval_args': '{\n "num_steps": 5\n}', 'module_path': 'penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'} 'run_fn' INFO:absl:Installing 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl' to a temporary directory. INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmpc97ini82', 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'] Processing ./pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl INFO:absl:Successfully installed 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'. INFO:absl:Training model. INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. 
Installing collected packages: tfx-user-code-Trainer Successfully installed tfx-user-code-Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703 INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. 
INFO:absl:Model: "model" INFO:absl:__________________________________________________________________________________________________ INFO:absl:Layer (type) Output Shape Param # Connected to INFO:absl:================================================================================================== INFO:absl:culmen_length_mm (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:culmen_depth_mm (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:flipper_length_mm (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:body_mass_g (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:concatenate (Concatenate) (None, 4) 0 culmen_length_mm[0][0] INFO:absl: culmen_depth_mm[0][0] INFO:absl: flipper_length_mm[0][0] INFO:absl: body_mass_g[0][0] INFO:absl:__________________________________________________________________________________________________ INFO:absl:dense (Dense) (None, 8) 40 concatenate[0][0] INFO:absl:__________________________________________________________________________________________________ INFO:absl:dense_1 (Dense) (None, 8) 72 dense[0][0] INFO:absl:__________________________________________________________________________________________________ INFO:absl:dense_2 (Dense) (None, 3) 27 dense_1[0][0] INFO:absl:================================================================================================== INFO:absl:Total params: 139 INFO:absl:Trainable params: 139 INFO:absl:Non-trainable params: 0 INFO:absl:__________________________________________________________________________________________________ 100/100 [==============================] - 1s 3ms/step - loss: 0.5273 - 
sparse_categorical_accuracy: 0.8175 - val_loss: 0.2412 - val_sparse_categorical_accuracy: 0.9600 2021-12-05 10:34:29.879208: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them. INFO:tensorflow:Assets written to: pipelines/penguin-tfma/Trainer/model/3/Format-Serving/assets INFO:tensorflow:Assets written to: pipelines/penguin-tfma/Trainer/model/3/Format-Serving/assets INFO:absl:Training complete. Model written to pipelines/penguin-tfma/Trainer/model/3/Format-Serving. ModelRun written to pipelines/penguin-tfma/Trainer/model_run/3 INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 3 succeeded. INFO:absl:Cleaning up stateful execution info. INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'model_run': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model_run/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model_run:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "ModelRun" )], 'model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "Model" )]}) for execution 3 INFO:absl:MetadataStore with DB connection initialized I1205 10:34:30.399760 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:30.404250 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component Trainer is finished. INFO:absl:Component Evaluator is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.components.evaluator.component.Evaluator" } id: "Evaluator" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } } inputs { inputs { key: "baseline_model" value { channels { producer_node_query { id: "latest_blessed_model_resolver" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.latest_blessed_model_resolver" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } } outputs { outputs { key: "blessing" value { 
artifact_spec { type { name: "ModelBlessing" } } } } outputs { key: "evaluation" value { artifact_spec { type { name: "ModelEvaluation" } } } } } parameters { parameters { key: "eval_config" value { field_value { string_value: "{\n \"metrics_specs\": [\n {\n \"per_slice_thresholds\": {\n \"sparse_categorical_accuracy\": {\n \"thresholds\": [\n {\n \"slicing_specs\": [\n {}\n ],\n \"threshold\": {\n \"change_threshold\": {\n \"absolute\": -1e-10,\n \"direction\": \"HIGHER_IS_BETTER\"\n },\n \"value_threshold\": {\n \"lower_bound\": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n \"model_specs\": [\n {\n \"label_key\": \"species\"\n }\n ],\n \"slicing_specs\": [\n {},\n {\n \"feature_keys\": [\n \"species\"\n ]\n }\n ]\n}" } } } parameters { key: "example_splits" value { field_value { string_value: "null" } } } parameters { key: "fairness_indicator_thresholds" value { field_value { string_value: "null" } } } } upstream_nodes: "CsvExampleGen" upstream_nodes: "Trainer" upstream_nodes: "latest_blessed_model_resolver" downstream_nodes: "Pusher" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized I1205 10:34:30.428037 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:MetadataStore with DB connection initialized INFO:absl:Going to run a new execution 4 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=4, input_dict={'examples': [Artifact(artifact: id: 1 type_id: 15 uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" properties { key: "split_names" value { string_value: "[\"train\", \"eval\"]" } } custom_properties { key: "file_format" value { string_value: "tfrecords_gzip" } } custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } 
custom_properties { key: "payload_format" value { string_value: "FORMAT_TF_EXAMPLE" } } custom_properties { key: "span" value { int_value: 0 } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700464882 last_update_time_since_epoch: 1638700464882 , artifact_type: id: 15 name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )], 'model': [Artifact(artifact: id: 3 type_id: 19 uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700470409 last_update_time_since_epoch: 1638700470409 , artifact_type: id: 19 name: "Model" )], 'baseline_model': []}, output_dict=defaultdict(<class 'list'>, {'blessing': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/blessing/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:blessing:0" } } , artifact_type: name: "ModelBlessing" )], 'evaluation': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/evaluation/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:evaluation:0" } } , artifact_type: name: "ModelEvaluation" )]}), exec_properties={'example_splits': 'null', 'eval_config': '{\n "metrics_specs": [\n {\n "per_slice_thresholds": {\n "sparse_categorical_accuracy": {\n "thresholds": [\n {\n "slicing_specs": [\n {}\n ],\n "threshold": {\n "change_threshold": {\n "absolute": -1e-10,\n "direction": "HIGHER_IS_BETTER"\n },\n "value_threshold": {\n "lower_bound": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n "model_specs": [\n {\n "label_key": "species"\n }\n ],\n "slicing_specs": [\n {},\n {\n "feature_keys": [\n "species"\n ]\n }\n ]\n}', 
'fairness_indicator_thresholds': 'null'}, execution_output_uri='pipelines/penguin-tfma/Evaluator/.system/executor_execution/4/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/Evaluator/.system/stateful_working_dir/2021-12-05T10:34:23.517028', tmp_dir='pipelines/penguin-tfma/Evaluator/.system/executor_execution/4/.temp/', pipeline_node=node_info { type { name: "tfx.components.evaluator.component.Evaluator" } id: "Evaluator" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } } inputs { inputs { key: "baseline_model" value { channels { producer_node_query { id: "latest_blessed_model_resolver" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.latest_blessed_model_resolver" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { 
name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } } outputs { outputs { key: "blessing" value { artifact_spec { type { name: "ModelBlessing" } } } } outputs { key: "evaluation" value { artifact_spec { type { name: "ModelEvaluation" } } } } } parameters { parameters { key: "eval_config" value { field_value { string_value: "{\n \"metrics_specs\": [\n {\n \"per_slice_thresholds\": {\n \"sparse_categorical_accuracy\": {\n \"thresholds\": [\n {\n \"slicing_specs\": [\n {}\n ],\n \"threshold\": {\n \"change_threshold\": {\n \"absolute\": -1e-10,\n \"direction\": \"HIGHER_IS_BETTER\"\n },\n \"value_threshold\": {\n \"lower_bound\": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n \"model_specs\": [\n {\n \"label_key\": \"species\"\n }\n ],\n \"slicing_specs\": [\n {},\n {\n \"feature_keys\": [\n \"species\"\n ]\n }\n ]\n}" } } } parameters { key: "example_splits" value { field_value { string_value: "null" } } } parameters { key: "fairness_indicator_thresholds" value { field_value { string_value: "null" } } } } upstream_nodes: "CsvExampleGen" upstream_nodes: "Trainer" upstream_nodes: "latest_blessed_model_resolver" downstream_nodes: "Pusher" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:udf_utils.get_fn {'example_splits': 'null', 'eval_config': '{\n "metrics_specs": [\n {\n "per_slice_thresholds": {\n "sparse_categorical_accuracy": {\n "thresholds": [\n {\n "slicing_specs": [\n {}\n ],\n "threshold": {\n "change_threshold": {\n "absolute": -1e-10,\n "direction": "HIGHER_IS_BETTER"\n },\n "value_threshold": {\n "lower_bound": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n "model_specs": [\n {\n "label_key": "species"\n }\n ],\n "slicing_specs": [\n {},\n {\n "feature_keys": [\n "species"\n ]\n 
}\n ]\n}', 'fairness_indicator_thresholds': 'null'} 'custom_eval_shared_model' INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } INFO:absl:Using pipelines/penguin-tfma/Trainer/model/3/Format-Serving as model. INFO:absl:The 'example_splits' parameter is not set, using 'eval' split. INFO:absl:Evaluating model. INFO:absl:udf_utils.get_fn {'example_splits': 'null', 'eval_config': '{\n "metrics_specs": [\n {\n "per_slice_thresholds": {\n "sparse_categorical_accuracy": {\n "thresholds": [\n {\n "slicing_specs": [\n {}\n ],\n "threshold": {\n "change_threshold": {\n "absolute": -1e-10,\n "direction": "HIGHER_IS_BETTER"\n },\n "value_threshold": {\n "lower_bound": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n "model_specs": [\n {\n "label_key": "species"\n }\n ],\n "slicing_specs": [\n {},\n {\n "feature_keys": [\n "species"\n ]\n }\n ]\n}', 'fairness_indicator_thresholds': 'null'} 'custom_extractors' INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { model_names: "" per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. 
This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { model_names: "" per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { model_names: "" per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter. INFO:absl:Evaluation complete. Results written to pipelines/penguin-tfma/Evaluator/evaluation/4. INFO:absl:Checking validation results. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_model_analysis/writers/metrics_plots_and_validations_writer.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version. Instructions for updating: Use eager execution and: `tf.data.TFRecordDataset(path)` WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_model_analysis/writers/metrics_plots_and_validations_writer.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version. Instructions for updating: Use eager execution and: `tf.data.TFRecordDataset(path)` INFO:absl:Blessing result True written to pipelines/penguin-tfma/Evaluator/blessing/4. INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 4 succeeded. INFO:absl:Cleaning up stateful execution info. 
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'blessing': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/blessing/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:blessing:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "ModelBlessing" )], 'evaluation': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/evaluation/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:evaluation:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "ModelEvaluation" )]}) for execution 4 INFO:absl:MetadataStore with DB connection initialized I1205 10:34:35.040588 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:35.045548 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component Evaluator is finished. INFO:absl:Component Pusher is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.components.pusher.component.Pusher" } id: "Pusher" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Pusher" } } } } inputs { inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "model_blessing" value { channels { producer_node_query { id: "Evaluator" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } artifact_query { type { name: "ModelBlessing" } } output_key: "blessing" } } } } outputs { outputs { key: "pushed_model" value { artifact_spec { type { name: "PushedModel" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "push_destination" value { field_value { string_value: "{\n \"filesystem\": {\n \"base_directory\": \"serving_model/penguin-tfma\"\n }\n}" } } } } upstream_nodes: "Evaluator" upstream_nodes: "Trainer" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized I1205 10:34:35.068168 28099 rdbms_metadata_access_object.cc:686] No property is 
defined for the Type INFO:absl:MetadataStore with DB connection initialized INFO:absl:Going to run a new execution 5 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=5, input_dict={'model': [Artifact(artifact: id: 3 type_id: 19 uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700470409 last_update_time_since_epoch: 1638700470409 , artifact_type: id: 19 name: "Model" )], 'model_blessing': [Artifact(artifact: id: 4 type_id: 21 uri: "pipelines/penguin-tfma/Evaluator/blessing/4" custom_properties { key: "blessed" value { int_value: 1 } } custom_properties { key: "current_model" value { string_value: "pipelines/penguin-tfma/Trainer/model/3" } } custom_properties { key: "current_model_id" value { int_value: 3 } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:blessing:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700475049 last_update_time_since_epoch: 1638700475049 , artifact_type: id: 21 name: "ModelBlessing" )]}, output_dict=defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Pusher/pushed_model/5" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Pusher:pushed_model:0" } } , artifact_type: name: "PushedModel" )]}), exec_properties={'custom_config': 'null', 'push_destination': '{\n "filesystem": {\n "base_directory": "serving_model/penguin-tfma"\n }\n}'}, execution_output_uri='pipelines/penguin-tfma/Pusher/.system/executor_execution/5/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/Pusher/.system/stateful_working_dir/2021-12-05T10:34:23.517028', 
tmp_dir='pipelines/penguin-tfma/Pusher/.system/executor_execution/5/.temp/', pipeline_node=node_info { type { name: "tfx.components.pusher.component.Pusher" } id: "Pusher" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Pusher" } } } } inputs { inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "model_blessing" value { channels { producer_node_query { id: "Evaluator" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } artifact_query { type { name: "ModelBlessing" } } output_key: "blessing" } } } } outputs { outputs { key: "pushed_model" value { artifact_spec { type { name: "PushedModel" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "push_destination" value { field_value { string_value: "{\n \"filesystem\": {\n \"base_directory\": \"serving_model/penguin-tfma\"\n }\n}" } } } } upstream_nodes: "Evaluator" upstream_nodes: "Trainer" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , 
pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:Model version: 1638700475 INFO:absl:Model written to serving path serving_model/penguin-tfma/1638700475. INFO:absl:Model pushed to pipelines/penguin-tfma/Pusher/pushed_model/5. INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 5 succeeded. INFO:absl:Cleaning up stateful execution info. INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Pusher/pushed_model/5" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Pusher:pushed_model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "PushedModel" )]}) for execution 5 INFO:absl:MetadataStore with DB connection initialized I1205 10:34:35.098553 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component Pusher is finished.
When the pipeline completes, you should see a log line like the following:
INFO:absl:Blessing result True written to pipelines/penguin-tfma/Evaluator/blessing/4.
Alternatively, you can manually check the output directory where the generated artifacts are stored. If you open pipelines/penguin-tfma/Evaluator/blessing/ in a file browser, you will find a file named BLESSED or NOT_BLESSED, depending on the evaluation result.
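The manual directory check can also be scripted. The following is a minimal sketch, not part of TFX: the helper name latest_blessing_status is hypothetical, and it assumes the layout seen in the log above, where each execution writes to a numbered sub-directory (for example blessing/4) that contains a marker file named BLESSED or NOT_BLESSED.

```python
import os


def latest_blessing_status(blessing_root):
    """Return 'BLESSED', 'NOT_BLESSED', or None for the newest execution.

    Hypothetical helper: assumes sub-directories are named after numeric
    execution IDs, as in pipelines/penguin-tfma/Evaluator/blessing/4.
    """
    if not os.path.isdir(blessing_root):
        return None
    run_dirs = [
        d for d in os.listdir(blessing_root)
        if d.isdigit() and os.path.isdir(os.path.join(blessing_root, d))
    ]
    if not run_dirs:
        return None
    # Compare IDs numerically so that "10" is treated as newer than "4".
    latest = max(run_dirs, key=int)
    for marker in ('BLESSED', 'NOT_BLESSED'):
        if os.path.exists(os.path.join(blessing_root, latest, marker)):
            return marker
    return None


blessing_root = os.path.join('pipelines', 'penguin-tfma', 'Evaluator', 'blessing')
print(latest_blessing_status(blessing_root))
```

The function returns None when the directory does not exist yet, so it can be run safely before the first pipeline run.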
If the blessing result is False, Pusher refuses to push the model to serving_model_dir, because the model is not good enough to be used in production.
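You can run the pipeline again, possibly with a different evaluation config. Even if you run the pipeline with exactly the same config and dataset, the trained model might be slightly different because of the inherent randomness of model training, and this can lead to a NOT_BLESSED model.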
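To see why a rerun can flip the outcome, it helps to spell out the decision encoded in the eval_config shown in the log: a value threshold (lower_bound 0.6) and a change threshold (absolute -1e-10, HIGHER_IS_BETTER) on sparse_categorical_accuracy. The sketch below is a simplified illustration in plain Python, not TFMA's implementation. The function name would_bless is hypothetical, and the exact handling of values that sit precisely on a boundary is an assumption.

```python
def would_bless(accuracy, lower_bound=0.6, baseline_accuracy=None,
                min_change=-1e-10):
    """Simplified blessing decision for one metric on the overall slice.

    Illustration only: boundary handling (strict vs. inclusive comparison)
    is an assumption and may differ from TFMA.
    """
    # Value threshold: the metric must not fall below the absolute floor.
    if accuracy < lower_bound:
        return False
    # Without a baseline model (for example on the first run), the change
    # threshold is skipped, as the Evaluator log message notes.
    if baseline_accuracy is None:
        return True
    # HIGHER_IS_BETTER: the change relative to the baseline must exceed the
    # allowed minimum, so a model that got worse is rejected.
    return (accuracy - baseline_accuracy) > min_change


print(would_bless(0.96))                          # first run, above the floor
print(would_bless(0.90, baseline_accuracy=0.96))  # worse than the baseline
```

Under these assumptions, a second run that scores slightly below the previously blessed model is not blessed even though it clears the 0.6 floor.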
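Examine outputs of the pipeline
You can use TFMA to investigate and visualize the evaluation result in the ModelEvaluation artifact.
Getting analysis results from output artifacts
You can use MLMD APIs to locate these outputs programmatically. First, we define a utility function to search for the output artifacts that were just produced.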
from ml_metadata.proto import metadata_store_pb2
# Non-public APIs, just for showcase.
from tfx.orchestration.portable.mlmd import execution_lib
# TODO(b/171447278): Move these functions into the TFX library.
def get_latest_artifacts(metadata, pipeline_name, component_id):
  """Output artifacts of the latest run of the component."""
  context = metadata.store.get_context_by_type_and_name(
      'node', f'{pipeline_name}.{component_id}')
  executions = metadata.store.get_executions_by_context(context.id)
  latest_execution = max(executions,
                         key=lambda e:e.last_update_time_since_epoch)
  return execution_lib.get_artifacts_dict(metadata, latest_execution.id,
                                          [metadata_store_pb2.Event.OUTPUT])
We can find the latest execution of the Evaluator component and get its output artifacts.
# Non-public APIs, just for showcase.
from tfx.orchestration.metadata import Metadata
from tfx.types import standard_component_specs
metadata_connection_config = tfx.orchestration.metadata.sqlite_metadata_connection_config(
    METADATA_PATH)
with Metadata(metadata_connection_config) as metadata_handler:
  # Find output artifacts from MLMD.
  evaluator_output = get_latest_artifacts(metadata_handler, PIPELINE_NAME,
                                          'Evaluator')
  eval_artifact = evaluator_output[standard_component_specs.EVALUATION_KEY][0]
INFO:absl:MetadataStore with DB connection initialized
Evaluator always returns one evaluation artifact, and we can visualize it using the TensorFlow Model Analysis library. For example, the following code renders the accuracy metric for each penguin species.
import tensorflow_model_analysis as tfma
eval_result = tfma.load_eval_result(eval_artifact.uri)
tfma.view.render_slicing_metrics(eval_result, slicing_column='species')
SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'species:0', 'metrics…
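If you choose 'sparse_categorical_accuracy' in the Show drop-down list, you can see the accuracy value for each species. You might want to add more slices and check whether your model performs well across all distributions and whether there is any possible bias.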
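The same per-species values can be read without the interactive widget. The sketch below assumes that eval_result.slicing_metrics is a list of (slice_key, metrics) pairs in which the metrics are nested as output name, then sub key, then metric name, ending in a dict such as {'doubleValue': ...}. The helper name accuracy_by_slice is hypothetical, and the sample data uses made-up placeholder values, not results from this run.

```python
def accuracy_by_slice(slicing_metrics,
                      metric_name='sparse_categorical_accuracy'):
    """Flatten TFMA-style slicing metrics into {slice_label: value}.

    Hypothetical helper; assumes the nesting
    output_name -> sub_key -> metric_name -> {'doubleValue': value}.
    """
    result = {}
    for slice_key, metrics_by_output in slicing_metrics:
        # An empty slice key denotes the overall (unsliced) dataset.
        label = ', '.join(f'{k}:{v}' for k, v in slice_key) or 'Overall'
        for metrics_by_sub_key in metrics_by_output.values():
            for metrics in metrics_by_sub_key.values():
                if metric_name in metrics:
                    result[label] = metrics[metric_name].get('doubleValue')
    return result


# Placeholder data in the assumed layout (values are illustrative only).
sample = [
    ((), {'': {'': {'sparse_categorical_accuracy': {'doubleValue': 0.9}}}}),
    ((('species', 0),),
     {'': {'': {'sparse_categorical_accuracy': {'doubleValue': 1.0}}}}),
]
for label, value in accuracy_by_slice(sample).items():
    print(label, value)
```

With a real run, you would pass eval_result.slicing_metrics from the cell above in place of sample.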
Next steps
Learn more about model analysis in the TensorFlow Model Analysis library tutorial.
You can find more resources at https://www.tensorflow.org/tfx/tutorials
Please see Understanding TFX Pipelines to learn more about the various concepts in TFX.