In this notebook-based tutorial, we will create and run a TFX pipeline that builds a simple classification model and analyzes its performance across multiple runs. This notebook is based on the TFX pipeline we built in the Simple TFX Pipeline Tutorial. If you have not read that tutorial yet, you should read it before proceeding with this notebook.
As you tweak your model or train it with a new dataset, you need to check whether the model has improved or become worse. Checking only top-level metrics such as accuracy may not be enough. Every trained model should be evaluated before it is pushed to production.
We will add an Evaluator component to the pipeline created in the previous tutorial. The Evaluator component performs a deep analysis of your models and compares each new model against a baseline to determine whether it is "good enough". It is implemented using the TensorFlow Model Analysis library.
Please see Understanding TFX Pipelines to learn more about the various concepts in TFX.
Setup
The setup steps are the same as in the previous tutorial.
We first need to install the TFX Python package and download the dataset that we will use for our model.
Upgrade Pip
To avoid upgrading Pip on a local system, we first check that we are running in Colab. Local systems can of course be upgraded separately.
try:
  import colab
  !pip install --upgrade pip
except:
  pass
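The idea behind the check above is to detect the Colab runtime before touching Pip. As a rough alternative sketch in plain Python (outside notebook magics), one could look for the `google.colab` module without importing it. The function name `running_in_colab` is hypothetical, and treating the presence of `google.colab` as the Colab signal is an assumption, not something the tutorial states.

```python
import importlib.util


def running_in_colab() -> bool:
  """Return True if the `google.colab` module can be located.

  Assumption: the presence of `google.colab` is used as the signal for a
  Colab runtime. The notebook cell in the tutorial uses its own check.
  """
  try:
    return importlib.util.find_spec("google.colab") is not None
  except (ImportError, ValueError):
    # Raised when the parent package `google` is missing or malformed.
    return False


print("Running in Colab:", running_in_colab())
```

On a local machine this is expected to print `False`, in which case the Pip upgrade would be skipped.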
Install TFX
pip install -U tfx
Did you restart the runtime?
If you are using Google Colab, the first time that you run the cell above you must restart the runtime, either by clicking the "RESTART RUNTIME" button above or by using the "Runtime > Restart runtime ..." menu. This is because of the way that Colab loads packages.
Check the TensorFlow and TFX versions.
import tensorflow as tf
print('TensorFlow version: {}'.format(tf.__version__))
from tfx import v1 as tfx
print('TFX version: {}'.format(tfx.__version__))
TensorFlow version: 2.6.2
TFX version: 1.4.0
Set up variables
Several variables are used to define a pipeline. You can customize these variables as you like. By default, all output from the pipeline is generated under the current directory.
import os
PIPELINE_NAME = "penguin-tfma"
# Output directory to store artifacts generated from the pipeline.
PIPELINE_ROOT = os.path.join('pipelines', PIPELINE_NAME)
# Path to a SQLite DB file to use as an MLMD storage.
METADATA_PATH = os.path.join('metadata', PIPELINE_NAME, 'metadata.db')
# Output directory where created models from the pipeline will be exported.
SERVING_MODEL_DIR = os.path.join('serving_model', PIPELINE_NAME)
from absl import logging
logging.set_verbosity(logging.INFO) # Set default logging level.
Prepare example data
We will use the same Palmer Penguins dataset.
There are four numeric features in this dataset, which were already normalized to the range [0,1]. We will build a classification model that predicts the species of penguins.
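To make "normalized to the range [0,1]" concrete, here is a small sketch of min-max scaling. The tutorial does not say which method was used to prepare the dataset, so min-max scaling is an assumption, and the function name and sample values below are made up for illustration.

```python
from typing import List


def min_max_scale(values: List[float]) -> List[float]:
  """Linearly rescale values so the smallest maps to 0.0 and the largest to 1.0."""
  lo, hi = min(values), max(values)
  if hi == lo:
    # All values identical: avoid division by zero and map everything to 0.0.
    return [0.0 for _ in values]
  return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw body masses in grams, for illustration only.
raw_body_mass_g = [3000.0, 4500.0, 6000.0]
scaled = min_max_scale(raw_body_mass_g)
print(scaled)  # [0.0, 0.5, 1.0]
```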
Because the TFX ExampleGen component reads its inputs from a directory, we need to create a directory and copy the dataset into it.
import urllib.request
import tempfile
DATA_ROOT = tempfile.mkdtemp(prefix='tfx-data') # Create a temporary directory.
_data_url = 'https://raw.githubusercontent.com/tensorflow/tfx/master/tfx/examples/penguin/data/labelled/penguins_processed.csv'
_data_filepath = os.path.join(DATA_ROOT, "data.csv")
urllib.request.urlretrieve(_data_url, _data_filepath)
('/tmp/tfx-datal5lxy_yw/data.csv', <http.client.HTTPMessage at 0x7fa18a9da150>)
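Since ExampleGen reads everything it finds in the input directory, it can be useful to confirm what that directory contains before running the pipeline. The following is a minimal sketch using only the standard library. The helper `list_csv_files` and the demo directory are hypothetical and stand in for `DATA_ROOT`.

```python
import os
import tempfile
from typing import List


def list_csv_files(data_root: str) -> List[str]:
  """Return the sorted names of the .csv files directly inside data_root."""
  return sorted(
      name for name in os.listdir(data_root)
      if name.endswith(".csv") and os.path.isfile(os.path.join(data_root, name)))


# Illustration with a throwaway directory and a made-up two-line CSV file.
demo_root = tempfile.mkdtemp(prefix="demo-data")
with open(os.path.join(demo_root, "data.csv"), "w") as f:
  f.write("species,body_mass_g\n0,0.5\n")

demo_files = list_csv_files(demo_root)
print(demo_files)  # ['data.csv']
```

In the notebook, calling the same helper on `DATA_ROOT` should list only `data.csv`.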
Create a pipeline
We will add an Evaluator component to the pipeline we created in the Simple TFX Pipeline Tutorial.
An Evaluator component requires input data from an ExampleGen component, a model from a Trainer component, and a tfma.EvalConfig object. We can optionally supply a baseline model, which can be used to compare metrics with the newly trained model.
An Evaluator creates two kinds of output artifacts, ModelEvaluation and ModelBlessing. ModelEvaluation contains the detailed evaluation result, which can be investigated and visualized further with the TFMA library. ModelBlessing contains a boolean result indicating whether the model passed the given criteria, and it can be used as a signal in later components such as a Pusher.
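The way a blessing can act as a signal for a later component can be sketched in plain Python. The sketch assumes that the blessing result is recorded as an empty marker file named `BLESSED` or `NOT_BLESSED` in the artifact directory. Treat these file names as an assumption to verify against your TFX version. The helpers `is_blessed` and `maybe_push` are hypothetical and are not part of TFX.

```python
import os
import tempfile

# Assumed marker file names in the blessing artifact directory; verify them
# against the TFX version you use.
BLESSED_FILE_NAME = "BLESSED"
NOT_BLESSED_FILE_NAME = "NOT_BLESSED"


def is_blessed(blessing_dir: str) -> bool:
  """Return True if the blessing directory contains the BLESSED marker."""
  return os.path.exists(os.path.join(blessing_dir, BLESSED_FILE_NAME))


def maybe_push(blessing_dir: str) -> str:
  """Mimic the gating decision of a pusher; returns the action taken."""
  return "pushed" if is_blessed(blessing_dir) else "skipped"


# Throwaway directories standing in for two blessing artifacts.
blessed_dir = tempfile.mkdtemp(prefix="blessing-ok")
open(os.path.join(blessed_dir, BLESSED_FILE_NAME), "w").close()
rejected_dir = tempfile.mkdtemp(prefix="blessing-no")
open(os.path.join(rejected_dir, NOT_BLESSED_FILE_NAME), "w").close()

print(maybe_push(blessed_dir))   # pushed
print(maybe_push(rejected_dir))  # skipped
```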
Write model training code
We will use the same model code as in the Simple TFX Pipeline Tutorial.
_trainer_module_file = 'penguin_trainer.py'
%%writefile {_trainer_module_file}
# Copied from https://www.tensorflow.org/tfx/tutorials/tfx/penguin_simple
from typing import List
from absl import logging
import tensorflow as tf
from tensorflow import keras
from tensorflow_transform.tf_metadata import schema_utils
from tfx.components.trainer.executor import TrainerFnArgs
from tfx.components.trainer.fn_args_utils import DataAccessor
from tfx_bsl.tfxio import dataset_options
from tensorflow_metadata.proto.v0 import schema_pb2
_FEATURE_KEYS = [
'culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g'
]
_LABEL_KEY = 'species'
_TRAIN_BATCH_SIZE = 20
_EVAL_BATCH_SIZE = 10
# Since we're not generating or creating a schema, we will instead create
# a feature spec. Since there are a fairly small number of features this is
# manageable for this dataset.
_FEATURE_SPEC = {
    **{
        feature: tf.io.FixedLenFeature(shape=[1], dtype=tf.float32)
        for feature in _FEATURE_KEYS
    },
    _LABEL_KEY: tf.io.FixedLenFeature(shape=[1], dtype=tf.int64)
}
def _input_fn(file_pattern: List[str],
              data_accessor: DataAccessor,
              schema: schema_pb2.Schema,
              batch_size: int = 200) -> tf.data.Dataset:
  """Generates features and label for training.

  Args:
    file_pattern: List of paths or patterns of input tfrecord files.
    data_accessor: DataAccessor for converting input to RecordBatch.
    schema: schema of the input data.
    batch_size: representing the number of consecutive elements of returned
      dataset to combine in a single batch

  Returns:
    A dataset that contains (features, indices) tuple where features is a
      dictionary of Tensors, and indices is a single Tensor of label indices.
  """
  return data_accessor.tf_dataset_factory(
      file_pattern,
      dataset_options.TensorFlowDatasetOptions(
          batch_size=batch_size, label_key=_LABEL_KEY),
      schema=schema).repeat()
def _build_keras_model() -> tf.keras.Model:
  """Creates a DNN Keras model for classifying penguin data.

  Returns:
    A Keras Model.
  """
  # The model below is built with Functional API, please refer to
  # https://www.tensorflow.org/guide/keras/overview for all API options.
  inputs = [keras.layers.Input(shape=(1,), name=f) for f in _FEATURE_KEYS]
  d = keras.layers.concatenate(inputs)
  for _ in range(2):
    d = keras.layers.Dense(8, activation='relu')(d)
  outputs = keras.layers.Dense(3)(d)

  model = keras.Model(inputs=inputs, outputs=outputs)
  model.compile(
      optimizer=keras.optimizers.Adam(1e-2),
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=[keras.metrics.SparseCategoricalAccuracy()])

  model.summary(print_fn=logging.info)
  return model
# TFX Trainer will call this function.
def run_fn(fn_args: TrainerFnArgs):
  """Train the model based on given args.

  Args:
    fn_args: Holds args used to train the model as name/value pairs.
  """
  # This schema is usually either an output of SchemaGen or a manually-curated
  # version provided by pipeline author. A schema can also be derived from TFT
  # graph if a Transform component is used. In the case when either is missing,
  # `schema_from_feature_spec` could be used to generate schema from very simple
  # feature_spec, but the schema returned would be very primitive.
  schema = schema_utils.schema_from_feature_spec(_FEATURE_SPEC)

  train_dataset = _input_fn(
      fn_args.train_files,
      fn_args.data_accessor,
      schema,
      batch_size=_TRAIN_BATCH_SIZE)
  eval_dataset = _input_fn(
      fn_args.eval_files,
      fn_args.data_accessor,
      schema,
      batch_size=_EVAL_BATCH_SIZE)

  model = _build_keras_model()
  model.fit(
      train_dataset,
      steps_per_epoch=fn_args.train_steps,
      validation_data=eval_dataset,
      validation_steps=fn_args.eval_steps)

  # The result of the training should be saved in `fn_args.serving_model_dir`
  # directory.
  model.save(fn_args.serving_model_dir, save_format='tf')
Writing penguin_trainer.py
Write a pipeline definition
We will define a function to create a TFX pipeline. In addition to the Evaluator component mentioned above, we will add one more node called Resolver. To check whether a new model is better than the previous one, we need to compare it against a previously published model, called the baseline. ML Metadata (MLMD) tracks all previous artifacts of the pipeline, and the Resolver can find the latest blessed model (a model that passed the Evaluator successfully) from MLMD using a strategy class called LatestBlessedModelStrategy.
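The selection rule described above, picking the most recent model that was blessed, can be sketched without MLMD. This is only an illustration of the idea. `ModelRecord`, `latest_blessed_model`, and the URIs are hypothetical, and the real Resolver queries MLMD rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelRecord:
  """Hypothetical stand-in for a Model artifact plus its blessing status."""
  model_id: int  # A larger id means the model was produced later.
  uri: str
  blessed: bool


def latest_blessed_model(records: List[ModelRecord]) -> Optional[ModelRecord]:
  """Return the most recent blessed record, or None if nothing was blessed."""
  blessed = [r for r in records if r.blessed]
  if not blessed:
    return None
  return max(blessed, key=lambda r: r.model_id)


# Made-up history: the newest model (id 3) failed validation.
history = [
    ModelRecord(1, "models/1", True),
    ModelRecord(2, "models/2", True),
    ModelRecord(3, "models/3", False),
]

baseline = latest_blessed_model(history)
print(baseline.uri)  # models/2
```

With an empty history the function returns `None`, which corresponds to the first pipeline run, where no baseline exists yet.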
import tensorflow_model_analysis as tfma
def _create_pipeline(pipeline_name: str, pipeline_root: str, data_root: str,
                     module_file: str, serving_model_dir: str,
                     metadata_path: str) -> tfx.dsl.Pipeline:
  """Creates a penguin pipeline with TFX that evaluates each trained model."""
  # Brings data into the pipeline.
  example_gen = tfx.components.CsvExampleGen(input_base=data_root)

  # Uses user-provided Python function that trains a model.
  trainer = tfx.components.Trainer(
      module_file=module_file,
      examples=example_gen.outputs['examples'],
      train_args=tfx.proto.TrainArgs(num_steps=100),
      eval_args=tfx.proto.EvalArgs(num_steps=5))

  # NEW: Get the latest blessed model for Evaluator.
  model_resolver = tfx.dsl.Resolver(
      strategy_class=tfx.dsl.experimental.LatestBlessedModelStrategy,
      model=tfx.dsl.Channel(type=tfx.types.standard_artifacts.Model),
      model_blessing=tfx.dsl.Channel(
          type=tfx.types.standard_artifacts.ModelBlessing)).with_id(
              'latest_blessed_model_resolver')

  # NEW: Uses TFMA to compute evaluation statistics over features of a model and
  #   perform quality validation of a candidate model (compared to a baseline).
  eval_config = tfma.EvalConfig(
      model_specs=[tfma.ModelSpec(label_key='species')],
      slicing_specs=[
          # An empty slice spec means the overall slice, i.e. the whole dataset.
          tfma.SlicingSpec(),
          # Calculate metrics for each penguin species.
          tfma.SlicingSpec(feature_keys=['species']),
      ],
      metrics_specs=[
          tfma.MetricsSpec(per_slice_thresholds={
              'sparse_categorical_accuracy':
                  tfma.PerSliceMetricThresholds(thresholds=[
                      tfma.PerSliceMetricThreshold(
                          slicing_specs=[tfma.SlicingSpec()],
                          threshold=tfma.MetricThreshold(
                              value_threshold=tfma.GenericValueThreshold(
                                  lower_bound={'value': 0.6}),
                              # Change threshold will be ignored if there is no
                              # baseline model resolved from MLMD (first run).
                              change_threshold=tfma.GenericChangeThreshold(
                                  direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                                  absolute={'value': -1e-10}))
                      )]),
          })],
      )

  evaluator = tfx.components.Evaluator(
      examples=example_gen.outputs['examples'],
      model=trainer.outputs['model'],
      baseline_model=model_resolver.outputs['model'],
      eval_config=eval_config)

  # Checks whether the model passed the validation steps and pushes the model
  # to a file destination if check passed.
  pusher = tfx.components.Pusher(
      model=trainer.outputs['model'],
      model_blessing=evaluator.outputs['blessing'],  # Pass an evaluation result.
      push_destination=tfx.proto.PushDestination(
          filesystem=tfx.proto.PushDestination.Filesystem(
              base_directory=serving_model_dir)))

  components = [
      example_gen,
      trainer,

      # Following two components were added to the pipeline.
      model_resolver,
      evaluator,

      pusher,
  ]

  return tfx.dsl.Pipeline(
      pipeline_name=pipeline_name,
      pipeline_root=pipeline_root,
      metadata_connection_config=tfx.orchestration.metadata
      .sqlite_metadata_connection_config(metadata_path),
      components=components)
We need to supply the following information to the Evaluator via eval_config:
- Additional metrics to configure (if you want more metrics than those defined in the model).
- Slices to configure.
- Model validation thresholds to verify, if validation is to be included.
Because SparseCategoricalAccuracy was already included in the model.compile() call, it is included in the analysis automatically, so we do not add any additional metrics here. SparseCategoricalAccuracy is also used to decide whether the model is good enough.
We compute the metrics for the whole dataset and for each penguin species. SlicingSpec defines how the declared metrics are aggregated.
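To illustrate what computing a metric per slice means, the sketch below calculates accuracy for the whole dataset and for each species using plain Python. It does not use TFMA, and the function names, slice keys, and labelled predictions are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each example is (true_species, predicted_species); the values are made up.
Example = Tuple[int, int]


def accuracy(examples: List[Example]) -> float:
  """Fraction of examples whose prediction matches the label."""
  return sum(1 for label, pred in examples if label == pred) / len(examples)


def sliced_accuracy(examples: List[Example]) -> Dict[str, float]:
  """Accuracy for the overall slice and for one slice per species."""
  results = {"overall": accuracy(examples)}
  by_species = defaultdict(list)
  for label, pred in examples:
    by_species[label].append((label, pred))
  for species, group in sorted(by_species.items()):
    results[f"species:{species}"] = accuracy(group)
  return results


examples = [(0, 0), (0, 0), (1, 1), (1, 2), (2, 2), (2, 2), (2, 0), (2, 2)]
metrics_by_slice = sliced_accuracy(examples)
print(metrics_by_slice)
# {'overall': 0.75, 'species:0': 1.0, 'species:1': 0.5, 'species:2': 0.75}
```

The overall figure can hide a weak slice: here species 1 scores only 0.5 while the overall accuracy is 0.75.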
There are two thresholds that a new model must pass. One is an absolute threshold: the accuracy must be at least 0.6. The other is a relative threshold: the accuracy must not fall below that of the baseline model (the change threshold of -1e-10 tolerates only a negligible decrease). When you run the pipeline for the first time, the change_threshold is ignored and only the value_threshold is checked. If you run the pipeline more than once, the Resolver finds a model from a previous run and uses it as the baseline model for the comparison.
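The combined effect of the two thresholds can be sketched as a small function. This is a simplified illustration, not TFMA's implementation. The name `passes_thresholds` is hypothetical, and whether TFMA treats the exact boundary values as passing is an assumption here.

```python
from typing import Optional

VALUE_LOWER_BOUND = 0.6   # value_threshold lower_bound from eval_config.
CHANGE_ABSOLUTE = -1e-10  # change_threshold absolute value from eval_config.


def passes_thresholds(candidate_acc: float,
                      baseline_acc: Optional[float] = None) -> bool:
  """Apply the absolute check, then the change check if a baseline exists."""
  # Absolute check. Assumption: the bound itself counts as passing.
  if candidate_acc < VALUE_LOWER_BOUND:
    return False
  # First run: no baseline was resolved, so the change check is skipped.
  if baseline_acc is None:
    return True
  # HIGHER_IS_BETTER: the change (candidate - baseline) must not drop below
  # the configured absolute value.
  return (candidate_acc - baseline_acc) >= CHANGE_ABSOLUTE


print(passes_thresholds(0.7))       # True: first run, only the absolute check
print(passes_thresholds(0.9, 0.8))  # True: better than the baseline
print(passes_thresholds(0.7, 0.8))  # False: worse than the baseline
print(passes_thresholds(0.5, 0.4))  # False: below the absolute bound
```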
See the Evaluator component guide for more information.
Run the pipeline
We will use LocalDagRunner, as in the previous tutorial.
tfx.orchestration.LocalDagRunner().run(
    _create_pipeline(
        pipeline_name=PIPELINE_NAME,
        pipeline_root=PIPELINE_ROOT,
        data_root=DATA_ROOT,
        module_file=_trainer_module_file,
        serving_model_dir=SERVING_MODEL_DIR,
        metadata_path=METADATA_PATH))
INFO:absl:Generating ephemeral wheel package for '/tmpfs/src/temp/docs/tutorials/tfx/penguin_trainer.py' (including modules: ['penguin_trainer']). INFO:absl:User module package has hash fingerprint version 1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703. INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '/tmp/tmpr3anh67s/_tfx_generated_setup.py', 'bdist_wheel', '--bdist-dir', '/tmp/tmp6s2sw4dj', '--dist-dir', '/tmp/tmp6jr76e54'] /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools. setuptools.SetuptoolsDeprecationWarning, listing git files failed - pretending there aren't any INFO:absl:Successfully built user code wheel distribution at 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'; target user module is 'penguin_trainer'. INFO:absl:Full user module path is 'penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl' INFO:absl:Using deployment config: executor_specs { key: "CsvExampleGen" value { beam_executable_spec { python_executor_spec { class_path: "tfx.components.example_gen.csv_example_gen.executor.Executor" } } } } executor_specs { key: "Evaluator" value { beam_executable_spec { python_executor_spec { class_path: "tfx.components.evaluator.executor.Executor" } } } } executor_specs { key: "Pusher" value { python_class_executable_spec { class_path: "tfx.components.pusher.executor.Executor" } } } executor_specs { key: "Trainer" value { python_class_executable_spec { class_path: "tfx.components.trainer.executor.GenericExecutor" } } } custom_driver_specs { key: "CsvExampleGen" value { python_class_executable_spec { class_path: "tfx.components.example_gen.driver.FileBasedDriver" } } } metadata_connection_config 
{ sqlite { filename_uri: "metadata/penguin-tfma/metadata.db" connection_mode: READWRITE_OPENCREATE } } INFO:absl:Using connection config: sqlite { filename_uri: "metadata/penguin-tfma/metadata.db" connection_mode: READWRITE_OPENCREATE } INFO:absl:Component CsvExampleGen is running. INFO:absl:Running launcher for node_info { type { name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen" } id: "CsvExampleGen" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } } outputs { outputs { key: "examples" value { artifact_spec { type { name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } } } } } } parameters { parameters { key: "input_base" value { field_value { string_value: "/tmp/tfx-datal5lxy_yw" } } } parameters { key: "input_config" value { field_value { string_value: "{\n \"splits\": [\n {\n \"name\": \"single_split\",\n \"pattern\": \"*\"\n }\n ]\n}" } } } parameters { key: "output_config" value { field_value { string_value: "{\n \"split_config\": {\n \"splits\": [\n {\n \"hash_buckets\": 2,\n \"name\": \"train\"\n },\n {\n \"hash_buckets\": 1,\n \"name\": \"eval\"\n }\n ]\n }\n}" } } } parameters { key: "output_data_format" value { field_value { int_value: 6 } } } parameters { key: "output_file_format" value { field_value { int_value: 5 } } } } downstream_nodes: "Evaluator" downstream_nodes: "Trainer" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized running bdist_wheel running build running build_py creating build creating build/lib copying penguin_trainer.py -> build/lib installing to /tmp/tmp6s2sw4dj running install running install_lib copying 
build/lib/penguin_trainer.py -> /tmp/tmp6s2sw4dj running install_egg_info running egg_info creating tfx_user_code_Trainer.egg-info writing tfx_user_code_Trainer.egg-info/PKG-INFO writing dependency_links to tfx_user_code_Trainer.egg-info/dependency_links.txt writing top-level names to tfx_user_code_Trainer.egg-info/top_level.txt writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt' reading manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt' writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt' Copying tfx_user_code_Trainer.egg-info to /tmp/tmp6s2sw4dj/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3.7.egg-info running install_scripts creating /tmp/tmp6s2sw4dj/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/WHEEL creating '/tmp/tmp6jr76e54/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl' and adding '/tmp/tmp6s2sw4dj' to it adding 'penguin_trainer.py' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/METADATA' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/WHEEL' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/top_level.txt' adding 'tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703.dist-info/RECORD' removing /tmp/tmp6s2sw4dj WARNING: Logging before InitGoogleLogging() is written to STDERR I1205 10:34:23.723806 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:23.730262 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:23.736788 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:23.744907 28099 rdbms_metadata_access_object.cc:686] No 
property is defined for the Type INFO:absl:select span and version = (0, None) INFO:absl:latest span and version = (0, None) INFO:absl:MetadataStore with DB connection initialized I1205 10:34:23.758380 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Going to run a new execution 1 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=1, input_dict={}, output_dict=defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } custom_properties { key: "span" value { int_value: 0 } } , artifact_type: name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )]}), exec_properties={'output_file_format': 5, 'output_config': '{\n "split_config": {\n "splits": [\n {\n "hash_buckets": 2,\n "name": "train"\n },\n {\n "hash_buckets": 1,\n "name": "eval"\n }\n ]\n }\n}', 'input_config': '{\n "splits": [\n {\n "name": "single_split",\n "pattern": "*"\n }\n ]\n}', 'output_data_format': 6, 'input_base': '/tmp/tfx-datal5lxy_yw', 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463'}, execution_output_uri='pipelines/penguin-tfma/CsvExampleGen/.system/executor_execution/1/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/CsvExampleGen/.system/stateful_working_dir/2021-12-05T10:34:23.517028', tmp_dir='pipelines/penguin-tfma/CsvExampleGen/.system/executor_execution/1/.temp/', pipeline_node=node_info { type { name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen" } id: 
"CsvExampleGen" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } } outputs { outputs { key: "examples" value { artifact_spec { type { name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } } } } } } parameters { parameters { key: "input_base" value { field_value { string_value: "/tmp/tfx-datal5lxy_yw" } } } parameters { key: "input_config" value { field_value { string_value: "{\n \"splits\": [\n {\n \"name\": \"single_split\",\n \"pattern\": \"*\"\n }\n ]\n}" } } } parameters { key: "output_config" value { field_value { string_value: "{\n \"split_config\": {\n \"splits\": [\n {\n \"hash_buckets\": 2,\n \"name\": \"train\"\n },\n {\n \"hash_buckets\": 1,\n \"name\": \"eval\"\n }\n ]\n }\n}" } } } parameters { key: "output_data_format" value { field_value { int_value: 6 } } } parameters { key: "output_file_format" value { field_value { int_value: 5 } } } } downstream_nodes: "Evaluator" downstream_nodes: "Trainer" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:Generating examples. WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features. INFO:absl:Processing input csv data /tmp/tfx-datal5lxy_yw/* to TFExample. WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter. 
WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be. INFO:absl:Examples generated. INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 1 succeeded. INFO:absl:Cleaning up stateful execution info. INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } custom_properties { key: "span" value { int_value: 0 } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )]}) for execution 1 INFO:absl:MetadataStore with DB connection initialized INFO:absl:Component CsvExampleGen is finished. INFO:absl:Component latest_blessed_model_resolver is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.dsl.components.common.resolver.Resolver" } id: "latest_blessed_model_resolver" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.latest_blessed_model_resolver" } } } } inputs { inputs { key: "model" value { channels { context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } artifact_query { type { name: "Model" } } } } } inputs { key: "model_blessing" value { channels { context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } artifact_query { type { name: "ModelBlessing" } } } } } resolver_config { resolver_steps { class_path: "tfx.dsl.input_resolution.strategies.latest_blessed_model_strategy.LatestBlessedModelStrategy" config_json: "{}" input_keys: "model" input_keys: "model_blessing" } } } downstream_nodes: "Evaluator" execution_options { caching_options { } } INFO:absl:Running as an resolver node. INFO:absl:MetadataStore with DB connection initialized WARNING:absl:Artifact type Model is not found in MLMD. WARNING:absl:Artifact type ModelBlessing is not found in MLMD. I1205 10:34:24.899447 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component latest_blessed_model_resolver is finished. INFO:absl:Component Trainer is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.components.trainer.component.Trainer" } id: "Trainer" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } } inputs { inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } } outputs { outputs { key: "model" value { artifact_spec { type { name: "Model" } } } } outputs { key: "model_run" value { artifact_spec { type { name: "ModelRun" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "eval_args" value { field_value { string_value: "{\n \"num_steps\": 5\n}" } } } parameters { key: "module_path" value { field_value { string_value: "penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl" } } } parameters { key: "train_args" value { field_value { string_value: "{\n \"num_steps\": 100\n}" } } } } upstream_nodes: "CsvExampleGen" downstream_nodes: "Evaluator" downstream_nodes: "Pusher" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized INFO:absl:MetadataStore with DB connection initialized I1205 10:34:24.924589 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type 
INFO:absl:Going to run a new execution 3 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=3, input_dict={'examples': [Artifact(artifact: id: 1 type_id: 15 uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" properties { key: "split_names" value { string_value: "[\"train\", \"eval\"]" } } custom_properties { key: "file_format" value { string_value: "tfrecords_gzip" } } custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } custom_properties { key: "payload_format" value { string_value: "FORMAT_TF_EXAMPLE" } } custom_properties { key: "span" value { int_value: 0 } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700464882 last_update_time_since_epoch: 1638700464882 , artifact_type: id: 15 name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )]}, output_dict=defaultdict(<class 'list'>, {'model_run': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model_run/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model_run:0" } } , artifact_type: name: "ModelRun" )], 'model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } , artifact_type: name: "Model" )]}), exec_properties={'train_args': '{\n "num_steps": 100\n}', 'custom_config': 'null', 'eval_args': '{\n "num_steps": 5\n}', 'module_path': 'penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'}, 
execution_output_uri='pipelines/penguin-tfma/Trainer/.system/executor_execution/3/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/Trainer/.system/stateful_working_dir/2021-12-05T10:34:23.517028', tmp_dir='pipelines/penguin-tfma/Trainer/.system/executor_execution/3/.temp/', pipeline_node=node_info { type { name: "tfx.components.trainer.component.Trainer" } id: "Trainer" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } } inputs { inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } } outputs { outputs { key: "model" value { artifact_spec { type { name: "Model" } } } } outputs { key: "model_run" value { artifact_spec { type { name: "ModelRun" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "eval_args" value { field_value { string_value: "{\n \"num_steps\": 5\n}" } } } parameters { key: "module_path" value { field_value { string_value: "penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl" } } } parameters { key: "train_args" value { field_value { string_value: "{\n \"num_steps\": 100\n}" } } } } upstream_nodes: "CsvExampleGen" downstream_nodes: "Evaluator" downstream_nodes: 
"Pusher" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:Train on the 'train' split when train_args.splits is not set. INFO:absl:Evaluate on the 'eval' split when eval_args.splits is not set. INFO:absl:udf_utils.get_fn {'train_args': '{\n "num_steps": 100\n}', 'custom_config': 'null', 'eval_args': '{\n "num_steps": 5\n}', 'module_path': 'penguin_trainer@pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'} 'run_fn' INFO:absl:Installing 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl' to a temporary directory. INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmpc97ini82', 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'] Processing ./pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl INFO:absl:Successfully installed 'pipelines/penguin-tfma/_wheels/tfx_user_code_Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703-py3-none-any.whl'. INFO:absl:Training model. INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. 
Installing collected packages: tfx-user-code-Trainer Successfully installed tfx-user-code-Trainer-0.0+1e19049dced0ccb21e0af60dae1c6e0ef09b63d1ff0e370d7f699920c2735703 INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature body_mass_g has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_depth_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature culmen_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature flipper_length_mm has a shape dim { size: 1 } . Setting to DenseTensor. INFO:absl:Feature species has a shape dim { size: 1 } . Setting to DenseTensor. 
INFO:absl:Model: "model" INFO:absl:__________________________________________________________________________________________________ INFO:absl:Layer (type) Output Shape Param # Connected to INFO:absl:================================================================================================== INFO:absl:culmen_length_mm (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:culmen_depth_mm (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:flipper_length_mm (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:body_mass_g (InputLayer) [(None, 1)] 0 INFO:absl:__________________________________________________________________________________________________ INFO:absl:concatenate (Concatenate) (None, 4) 0 culmen_length_mm[0][0] INFO:absl: culmen_depth_mm[0][0] INFO:absl: flipper_length_mm[0][0] INFO:absl: body_mass_g[0][0] INFO:absl:__________________________________________________________________________________________________ INFO:absl:dense (Dense) (None, 8) 40 concatenate[0][0] INFO:absl:__________________________________________________________________________________________________ INFO:absl:dense_1 (Dense) (None, 8) 72 dense[0][0] INFO:absl:__________________________________________________________________________________________________ INFO:absl:dense_2 (Dense) (None, 3) 27 dense_1[0][0] INFO:absl:================================================================================================== INFO:absl:Total params: 139 INFO:absl:Trainable params: 139 INFO:absl:Non-trainable params: 0 INFO:absl:__________________________________________________________________________________________________ 100/100 [==============================] - 1s 3ms/step - loss: 0.5273 - 
sparse_categorical_accuracy: 0.8175 - val_loss: 0.2412 - val_sparse_categorical_accuracy: 0.9600 2021-12-05 10:34:29.879208: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them. INFO:tensorflow:Assets written to: pipelines/penguin-tfma/Trainer/model/3/Format-Serving/assets INFO:tensorflow:Assets written to: pipelines/penguin-tfma/Trainer/model/3/Format-Serving/assets INFO:absl:Training complete. Model written to pipelines/penguin-tfma/Trainer/model/3/Format-Serving. ModelRun written to pipelines/penguin-tfma/Trainer/model_run/3 INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 3 succeeded. INFO:absl:Cleaning up stateful execution info. INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'model_run': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model_run/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model_run:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "ModelRun" )], 'model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "Model" )]}) for execution 3 INFO:absl:MetadataStore with DB connection initialized I1205 10:34:30.399760 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:30.404250 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component Trainer is finished. INFO:absl:Component Evaluator is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.components.evaluator.component.Evaluator" } id: "Evaluator" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } } inputs { inputs { key: "baseline_model" value { channels { producer_node_query { id: "latest_blessed_model_resolver" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.latest_blessed_model_resolver" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } } outputs { outputs { key: "blessing" value { 
artifact_spec { type { name: "ModelBlessing" } } } } outputs { key: "evaluation" value { artifact_spec { type { name: "ModelEvaluation" } } } } } parameters { parameters { key: "eval_config" value { field_value { string_value: "{\n \"metrics_specs\": [\n {\n \"per_slice_thresholds\": {\n \"sparse_categorical_accuracy\": {\n \"thresholds\": [\n {\n \"slicing_specs\": [\n {}\n ],\n \"threshold\": {\n \"change_threshold\": {\n \"absolute\": -1e-10,\n \"direction\": \"HIGHER_IS_BETTER\"\n },\n \"value_threshold\": {\n \"lower_bound\": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n \"model_specs\": [\n {\n \"label_key\": \"species\"\n }\n ],\n \"slicing_specs\": [\n {},\n {\n \"feature_keys\": [\n \"species\"\n ]\n }\n ]\n}" } } } parameters { key: "example_splits" value { field_value { string_value: "null" } } } parameters { key: "fairness_indicator_thresholds" value { field_value { string_value: "null" } } } } upstream_nodes: "CsvExampleGen" upstream_nodes: "Trainer" upstream_nodes: "latest_blessed_model_resolver" downstream_nodes: "Pusher" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized I1205 10:34:30.428037 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:MetadataStore with DB connection initialized INFO:absl:Going to run a new execution 4 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=4, input_dict={'examples': [Artifact(artifact: id: 1 type_id: 15 uri: "pipelines/penguin-tfma/CsvExampleGen/examples/1" properties { key: "split_names" value { string_value: "[\"train\", \"eval\"]" } } custom_properties { key: "file_format" value { string_value: "tfrecords_gzip" } } custom_properties { key: "input_fingerprint" value { string_value: "split:single_split,num_files:1,total_bytes:25648,xor_checksum:1638700463,sum_checksum:1638700463" } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:CsvExampleGen:examples:0" } } 
custom_properties { key: "payload_format" value { string_value: "FORMAT_TF_EXAMPLE" } } custom_properties { key: "span" value { int_value: 0 } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700464882 last_update_time_since_epoch: 1638700464882 , artifact_type: id: 15 name: "Examples" properties { key: "span" value: INT } properties { key: "split_names" value: STRING } properties { key: "version" value: INT } )], 'model': [Artifact(artifact: id: 3 type_id: 19 uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700470409 last_update_time_since_epoch: 1638700470409 , artifact_type: id: 19 name: "Model" )], 'baseline_model': []}, output_dict=defaultdict(<class 'list'>, {'blessing': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/blessing/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:blessing:0" } } , artifact_type: name: "ModelBlessing" )], 'evaluation': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/evaluation/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:evaluation:0" } } , artifact_type: name: "ModelEvaluation" )]}), exec_properties={'example_splits': 'null', 'eval_config': '{\n "metrics_specs": [\n {\n "per_slice_thresholds": {\n "sparse_categorical_accuracy": {\n "thresholds": [\n {\n "slicing_specs": [\n {}\n ],\n "threshold": {\n "change_threshold": {\n "absolute": -1e-10,\n "direction": "HIGHER_IS_BETTER"\n },\n "value_threshold": {\n "lower_bound": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n "model_specs": [\n {\n "label_key": "species"\n }\n ],\n "slicing_specs": [\n {},\n {\n "feature_keys": [\n "species"\n ]\n }\n ]\n}', 
'fairness_indicator_thresholds': 'null'}, execution_output_uri='pipelines/penguin-tfma/Evaluator/.system/executor_execution/4/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/Evaluator/.system/stateful_working_dir/2021-12-05T10:34:23.517028', tmp_dir='pipelines/penguin-tfma/Evaluator/.system/executor_execution/4/.temp/', pipeline_node=node_info { type { name: "tfx.components.evaluator.component.Evaluator" } id: "Evaluator" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } } inputs { inputs { key: "baseline_model" value { channels { producer_node_query { id: "latest_blessed_model_resolver" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.latest_blessed_model_resolver" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "examples" value { channels { producer_node_query { id: "CsvExampleGen" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.CsvExampleGen" } } } artifact_query { type { name: "Examples" } } output_key: "examples" } min_count: 1 } } inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { 
name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } } outputs { outputs { key: "blessing" value { artifact_spec { type { name: "ModelBlessing" } } } } outputs { key: "evaluation" value { artifact_spec { type { name: "ModelEvaluation" } } } } } parameters { parameters { key: "eval_config" value { field_value { string_value: "{\n \"metrics_specs\": [\n {\n \"per_slice_thresholds\": {\n \"sparse_categorical_accuracy\": {\n \"thresholds\": [\n {\n \"slicing_specs\": [\n {}\n ],\n \"threshold\": {\n \"change_threshold\": {\n \"absolute\": -1e-10,\n \"direction\": \"HIGHER_IS_BETTER\"\n },\n \"value_threshold\": {\n \"lower_bound\": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n \"model_specs\": [\n {\n \"label_key\": \"species\"\n }\n ],\n \"slicing_specs\": [\n {},\n {\n \"feature_keys\": [\n \"species\"\n ]\n }\n ]\n}" } } } parameters { key: "example_splits" value { field_value { string_value: "null" } } } parameters { key: "fairness_indicator_thresholds" value { field_value { string_value: "null" } } } } upstream_nodes: "CsvExampleGen" upstream_nodes: "Trainer" upstream_nodes: "latest_blessed_model_resolver" downstream_nodes: "Pusher" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:udf_utils.get_fn {'example_splits': 'null', 'eval_config': '{\n "metrics_specs": [\n {\n "per_slice_thresholds": {\n "sparse_categorical_accuracy": {\n "thresholds": [\n {\n "slicing_specs": [\n {}\n ],\n "threshold": {\n "change_threshold": {\n "absolute": -1e-10,\n "direction": "HIGHER_IS_BETTER"\n },\n "value_threshold": {\n "lower_bound": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n "model_specs": [\n {\n "label_key": "species"\n }\n ],\n "slicing_specs": [\n {},\n {\n "feature_keys": [\n "species"\n ]\n 
}\n ]\n}', 'fairness_indicator_thresholds': 'null'} 'custom_eval_shared_model' INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } INFO:absl:Using pipelines/penguin-tfma/Trainer/model/3/Format-Serving as model. INFO:absl:The 'example_splits' parameter is not set, using 'eval' split. INFO:absl:Evaluating model. INFO:absl:udf_utils.get_fn {'example_splits': 'null', 'eval_config': '{\n "metrics_specs": [\n {\n "per_slice_thresholds": {\n "sparse_categorical_accuracy": {\n "thresholds": [\n {\n "slicing_specs": [\n {}\n ],\n "threshold": {\n "change_threshold": {\n "absolute": -1e-10,\n "direction": "HIGHER_IS_BETTER"\n },\n "value_threshold": {\n "lower_bound": 0.6\n }\n }\n }\n ]\n }\n }\n }\n ],\n "model_specs": [\n {\n "label_key": "species"\n }\n ],\n "slicing_specs": [\n {},\n {\n "feature_keys": [\n "species"\n ]\n }\n ]\n}', 'fairness_indicator_thresholds': 'null'} 'custom_extractors' INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { model_names: "" per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. 
This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { model_names: "" per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config= model_specs { label_key: "species" } slicing_specs { } slicing_specs { feature_keys: "species" } metrics_specs { model_names: "" per_slice_thresholds { key: "sparse_categorical_accuracy" value { thresholds { slicing_specs { } threshold { value_threshold { lower_bound { value: 0.6 } } } } } } } WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter. INFO:absl:Evaluation complete. Results written to pipelines/penguin-tfma/Evaluator/evaluation/4. INFO:absl:Checking validation results. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_model_analysis/writers/metrics_plots_and_validations_writer.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version. Instructions for updating: Use eager execution and: `tf.data.TFRecordDataset(path)` WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_model_analysis/writers/metrics_plots_and_validations_writer.py:114: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version. Instructions for updating: Use eager execution and: `tf.data.TFRecordDataset(path)` INFO:absl:Blessing result True written to pipelines/penguin-tfma/Evaluator/blessing/4. INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 4 succeeded. INFO:absl:Cleaning up stateful execution info. 
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'blessing': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/blessing/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:blessing:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "ModelBlessing" )], 'evaluation': [Artifact(artifact: uri: "pipelines/penguin-tfma/Evaluator/evaluation/4" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:evaluation:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "ModelEvaluation" )]}) for execution 4 INFO:absl:MetadataStore with DB connection initialized I1205 10:34:35.040588 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type I1205 10:34:35.045548 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component Evaluator is finished. INFO:absl:Component Pusher is running. 
INFO:absl:Running launcher for node_info { type { name: "tfx.components.pusher.component.Pusher" } id: "Pusher" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Pusher" } } } } inputs { inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "model_blessing" value { channels { producer_node_query { id: "Evaluator" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } artifact_query { type { name: "ModelBlessing" } } output_key: "blessing" } } } } outputs { outputs { key: "pushed_model" value { artifact_spec { type { name: "PushedModel" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "push_destination" value { field_value { string_value: "{\n \"filesystem\": {\n \"base_directory\": \"serving_model/penguin-tfma\"\n }\n}" } } } } upstream_nodes: "Evaluator" upstream_nodes: "Trainer" execution_options { caching_options { } } INFO:absl:MetadataStore with DB connection initialized I1205 10:34:35.068168 28099 rdbms_metadata_access_object.cc:686] No property is 
defined for the Type INFO:absl:MetadataStore with DB connection initialized INFO:absl:Going to run a new execution 5 INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=5, input_dict={'model': [Artifact(artifact: id: 3 type_id: 19 uri: "pipelines/penguin-tfma/Trainer/model/3" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Trainer:model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700470409 last_update_time_since_epoch: 1638700470409 , artifact_type: id: 19 name: "Model" )], 'model_blessing': [Artifact(artifact: id: 4 type_id: 21 uri: "pipelines/penguin-tfma/Evaluator/blessing/4" custom_properties { key: "blessed" value { int_value: 1 } } custom_properties { key: "current_model" value { string_value: "pipelines/penguin-tfma/Trainer/model/3" } } custom_properties { key: "current_model_id" value { int_value: 3 } } custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Evaluator:blessing:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } state: LIVE create_time_since_epoch: 1638700475049 last_update_time_since_epoch: 1638700475049 , artifact_type: id: 21 name: "ModelBlessing" )]}, output_dict=defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Pusher/pushed_model/5" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Pusher:pushed_model:0" } } , artifact_type: name: "PushedModel" )]}), exec_properties={'custom_config': 'null', 'push_destination': '{\n "filesystem": {\n "base_directory": "serving_model/penguin-tfma"\n }\n}'}, execution_output_uri='pipelines/penguin-tfma/Pusher/.system/executor_execution/5/executor_output.pb', stateful_working_dir='pipelines/penguin-tfma/Pusher/.system/stateful_working_dir/2021-12-05T10:34:23.517028', 
tmp_dir='pipelines/penguin-tfma/Pusher/.system/executor_execution/5/.temp/', pipeline_node=node_info { type { name: "tfx.components.pusher.component.Pusher" } id: "Pusher" } contexts { contexts { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } contexts { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } contexts { type { name: "node" } name { field_value { string_value: "penguin-tfma.Pusher" } } } } inputs { inputs { key: "model" value { channels { producer_node_query { id: "Trainer" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Trainer" } } } artifact_query { type { name: "Model" } } output_key: "model" } } } inputs { key: "model_blessing" value { channels { producer_node_query { id: "Evaluator" } context_queries { type { name: "pipeline" } name { field_value { string_value: "penguin-tfma" } } } context_queries { type { name: "pipeline_run" } name { field_value { string_value: "2021-12-05T10:34:23.517028" } } } context_queries { type { name: "node" } name { field_value { string_value: "penguin-tfma.Evaluator" } } } artifact_query { type { name: "ModelBlessing" } } output_key: "blessing" } } } } outputs { outputs { key: "pushed_model" value { artifact_spec { type { name: "PushedModel" } } } } } parameters { parameters { key: "custom_config" value { field_value { string_value: "null" } } } parameters { key: "push_destination" value { field_value { string_value: "{\n \"filesystem\": {\n \"base_directory\": \"serving_model/penguin-tfma\"\n }\n}" } } } } upstream_nodes: "Evaluator" upstream_nodes: "Trainer" execution_options { caching_options { } } , pipeline_info=id: "penguin-tfma" , 
pipeline_run_id='2021-12-05T10:34:23.517028') INFO:absl:Model version: 1638700475 INFO:absl:Model written to serving path serving_model/penguin-tfma/1638700475. INFO:absl:Model pushed to pipelines/penguin-tfma/Pusher/pushed_model/5. INFO:absl:Cleaning up stateless execution info. INFO:absl:Execution 5 succeeded. INFO:absl:Cleaning up stateful execution info. INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-tfma/Pusher/pushed_model/5" custom_properties { key: "name" value { string_value: "penguin-tfma:2021-12-05T10:34:23.517028:Pusher:pushed_model:0" } } custom_properties { key: "tfx_version" value { string_value: "1.4.0" } } , artifact_type: name: "PushedModel" )]}) for execution 5 INFO:absl:MetadataStore with DB connection initialized I1205 10:34:35.098553 28099 rdbms_metadata_access_object.cc:686] No property is defined for the Type INFO:absl:Component Pusher is finished.
When the pipeline completes, you should see a log line like the following:
INFO:absl:Blessing result True written to pipelines/penguin-tfma/Evaluator/blessing/4.
Alternatively, you can manually check the output directory where the generated artifacts are stored. If you open pipelines/penguin-tfma/Evaluator/blessing/ in a file browser, you will see a file named BLESSED or NOT_BLESSED, depending on the evaluation result.
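The same manual check can be scripted. The following is a minimal sketch, not part of TFX: the helper name `blessing_status` is hypothetical, and it assumes that each execution writes its marker file into its own numbered subdirectory (as the logged path `pipelines/penguin-tfma/Evaluator/blessing/4` suggests).

```python
import os


def blessing_status(blessing_root):
    """Map each execution subdirectory to 'BLESSED', 'NOT_BLESSED', or None.

    Hypothetical helper: assumes the layout <blessing_root>/<execution_id>/<marker>.
    """
    status = {}
    for execution_id in sorted(os.listdir(blessing_root)):
        execution_dir = os.path.join(blessing_root, execution_id)
        if not os.path.isdir(execution_dir):
            continue  # Skip stray files at the top level.
        files = set(os.listdir(execution_dir))
        if 'BLESSED' in files:
            status[execution_id] = 'BLESSED'
        elif 'NOT_BLESSED' in files:
            status[execution_id] = 'NOT_BLESSED'
        else:
            status[execution_id] = None  # No marker file found.
    return status


if __name__ == '__main__':
    root = 'pipelines/penguin-tfma/Evaluator/blessing'
    if os.path.isdir(root):
        print(blessing_status(root))
```

If the directory layout differs in your TFX version, only the path handling needs to change; the marker file names come from the tutorial text.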
If the blessing result is False, Pusher will refuse to push the model to serving_model_dir, because the model is not good enough to be used in production.
You can run the pipeline again, possibly with different evaluation configs. Even if you run the pipeline with exactly the same config and dataset, the trained model might differ slightly because of the inherent randomness of model training, which can lead to a NOT_BLESSED model.
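To see why a rerun can end up NOT_BLESSED, it may help to model the two thresholds that appear in the logged eval_config: a value threshold (lower_bound of 0.6) and a change threshold against the baseline (absolute of -1e-10, HIGHER_IS_BETTER). The sketch below is a simplified illustration and not TFMA's implementation; the function name `passes_thresholds` is hypothetical. It mirrors the logged behavior that the change threshold is ignored when no baseline model is available.

```python
def passes_thresholds(candidate, baseline=None,
                      lower_bound=0.6, min_change=-1e-10):
    """Simplified blessing rule for a HIGHER_IS_BETTER metric.

    Hypothetical helper; default values are copied from the logged eval_config.
    """
    # Value threshold: the metric itself must reach the lower bound.
    if candidate < lower_bound:
        return False
    # Without a baseline (for example on the first run), the change
    # threshold is skipped, as the Evaluator log message indicates.
    if baseline is None:
        return True
    # Change threshold: the candidate must not be worse than the baseline
    # by more than the allowed absolute difference.
    return (candidate - baseline) >= min_change


# First run: no blessed baseline yet, so only the value threshold applies.
print(passes_thresholds(0.96))
# Later run: slightly worse than the blessed baseline, so it is rejected.
print(passes_thresholds(0.90, baseline=0.96))
```

Under this reading, a second run with the same config can fail simply because training randomness produced a model marginally below the previously blessed one.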
Examine outputs of the pipeline
You can use TFMA to investigate and visualize the evaluation result in the ModelEvaluation artifact.
Get analysis result from output artifacts
You can use MLMD APIs to locate these outputs programmatically. First, we define some utility functions to search for the output artifacts that were just produced.
from ml_metadata.proto import metadata_store_pb2
# Non-public APIs, just for showcase.
from tfx.orchestration.portable.mlmd import execution_lib

# TODO(b/171447278): Move these functions into the TFX library.

def get_latest_artifacts(metadata, pipeline_name, component_id):
  """Output artifacts of the latest run of the component."""
  context = metadata.store.get_context_by_type_and_name(
      'node', f'{pipeline_name}.{component_id}')
  executions = metadata.store.get_executions_by_context(context.id)
  latest_execution = max(executions,
                         key=lambda e:e.last_update_time_since_epoch)
  return execution_lib.get_artifacts_dict(metadata, latest_execution.id,
                                          [metadata_store_pb2.Event.OUTPUT])
We can find the latest execution of the Evaluator component and get its output artifacts.
# Non-public APIs, just for showcase.
from tfx.orchestration.metadata import Metadata
from tfx.types import standard_component_specs

metadata_connection_config = tfx.orchestration.metadata.sqlite_metadata_connection_config(
    METADATA_PATH)

with Metadata(metadata_connection_config) as metadata_handler:
  # Find output artifacts from MLMD.
  evaluator_output = get_latest_artifacts(metadata_handler, PIPELINE_NAME,
                                          'Evaluator')
  eval_artifact = evaluator_output[standard_component_specs.EVALUATION_KEY][0]
INFO:absl:MetadataStore with DB connection initialized
Evaluator always returns one evaluation artifact, and we can visualize it using the TensorFlow Model Analysis library. For example, the following code renders the accuracy metrics for each penguin species.
import tensorflow_model_analysis as tfma
eval_result = tfma.load_eval_result(eval_artifact.uri)
tfma.view.render_slicing_metrics(eval_result, slicing_column='species')
SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'species:0', 'metrics…
If you choose 'sparse_categorical_accuracy' in the Show drop-down list, you can see the accuracy values for each species. You might want to add more slices and check whether your model performs well across all distributions and whether there is any possible bias.
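Outside the interactive viewer, the same per-slice values can be read from the loaded result. The sketch below is an assumption-laden illustration: the helper name `accuracy_by_slice` is hypothetical, and it assumes that each entry is a (slice key, metrics) pair whose metrics are nested as output name, then sub key, then metric name, with the number stored under 'doubleValue'. The sample data uses placeholder numbers, not results from this pipeline.

```python
def accuracy_by_slice(slicing_metrics,
                      metric_name='sparse_categorical_accuracy'):
    """Return {slice label: metric value} from TFMA-style slicing metrics.

    Hypothetical helper; the nested layout is an assumption, see the lead-in.
    """
    result = {}
    for slice_key, metrics in slicing_metrics:
        # An empty slice key stands for the overall (unsliced) dataset.
        label = ', '.join(f'{k}:{v}' for k, v in slice_key) or 'Overall'
        # Assumed nesting: output name -> sub key -> metric name -> value.
        result[label] = metrics[''][''][metric_name]['doubleValue']
    return result


# Placeholder data in the assumed layout, for illustration only.
example = [
    ((), {'': {'': {'sparse_categorical_accuracy': {'doubleValue': 0.5}}}}),
    ((('species', 0),),
     {'': {'': {'sparse_categorical_accuracy': {'doubleValue': 1.0}}}}),
]
print(accuracy_by_slice(example))
```

With a real result you would presumably call `accuracy_by_slice(eval_result.slicing_metrics)`; if the layout differs in your TFMA version, inspect one entry first and adjust the indexing.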
Next steps
Learn more about model analysis in the TensorFlow Model Analysis library tutorial.
You can find more resources at https://www.tensorflow.org/tfx/tutorials
Please see Understanding TFX Pipelines to learn more about various concepts in TFX.