6646 – AMX&AI tensorflow container image: testing resnet50 at bf16 precision with intel_extension_for_tensorflow Auto Mixed Precision mode enabled fails with No registered '_ITEXCast' OpKernel for 'CPU' devices compatible with node {{node resnet50/Cast}}

Status:	NEW

Alias:	None

Product:	Anolis OS 23
Classification:	Anolis OS
Component:	Images&Installations
Sub Component:
Version:	23.0
Hardware:	x86_64 Linux

Importance:	P3-Medium S2-major
Target Milestone:	---
Assignee:	Jacob

Reported:	2023-09-20 11:38 UTC by feitian200603
Modified:	2023-09-20 12:01 UTC
CC List:	0 users

Comment 1 feitian200603 alibaba_cloud_group

2023-09-20 11:52:14 UTC

(In reply to feitian200603 from comment #0)
> With Auto Mixed Precision mode enabled in intel_extension_for_tensorflow, the
> way the ITEX_AUTO_MIXED_PRECISION_DATA_TYPE environment variable is set does
> not match the official documentation.
> Official documentation:
> https://github.com/intel/intel-extension-for-tensorflow/blob/main/docs/guide/aamp_tune.md#tuning-performance-example-by-advanced-amp-configure-list-manually
> It says to enable the Auto Mixed Precision optimization in one of two ways:
> 1. Python API, Basic (Default configuration):
> import intel_extension_for_tensorflow as itex
> 
> auto_mixed_precision_options = itex.AutoMixedPrecisionOptions()
> auto_mixed_precision_options.data_type = itex.BFLOAT16 #itex.FLOAT16
> 
> graph_options = itex.GraphOptions()
> graph_options.auto_mixed_precision_options=auto_mixed_precision_options
> graph_options.auto_mixed_precision = itex.ON
> 
> config = itex.ConfigProto(graph_options=graph_options)
> itex.set_config(config)
> 2. Environment Variable:
> export ITEX_AUTO_MIXED_PRECISION=1
> export ITEX_AUTO_MIXED_PRECISION_DATA_TYPE="BFLOAT16" #"FLOAT16"
> 
> Actual test result:
> Node: 'resnet50/Cast'
> No registered '_ITEXCast' OpKernel for 'CPU' devices compatible with node
> {{node resnet50/Cast}}
> 	 (OpKernel was found, but attributes didn't match) Requested Attributes:
> DstT=DT_BFLOAT16, SrcT=DT_BFLOAT16, T=DT_BFLOAT16, Truncate=false,
> _XlaHasReferenceVars=false,
> _device="/job:localhost/replica:0/task:0/device:CPU:0"
> 	.  Registered:  device='CPU'; SrcT in [DT_HALF]; DstT in [DT_BFLOAT16]
>   device='CPU'; SrcT in [DT_HALF]; DstT in [DT_FLOAT]
>   device='CPU'; SrcT in [DT_BFLOAT16]; DstT in [DT_HALF]
>   device='CPU'; SrcT in [DT_BFLOAT16]; DstT in [DT_FLOAT]
>   device='CPU'; SrcT in [DT_FLOAT]; DstT in [DT_HALF]
>   device='CPU'; SrcT in [DT_FLOAT]; DstT in [DT_BFLOAT16]
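
The error above can be read mechanically: the requested Cast has SrcT and DstT both set to DT_BFLOAT16, while every registered '_ITEXCast' CPU kernel converts between two different types. The following is a minimal sketch of that comparison in plain Python. The pair list is copied from the "Registered:" section of the error; the helper name `is_registered_cast` is hypothetical and is not part of TensorFlow or intel_extension_for_tensorflow.

```python
# (SrcT, DstT) pairs copied from the "Registered:" list in the error message.
REGISTERED_ITEX_CAST_PAIRS = {
    ("DT_HALF", "DT_BFLOAT16"),
    ("DT_HALF", "DT_FLOAT"),
    ("DT_BFLOAT16", "DT_HALF"),
    ("DT_BFLOAT16", "DT_FLOAT"),
    ("DT_FLOAT", "DT_HALF"),
    ("DT_FLOAT", "DT_BFLOAT16"),
}


def is_registered_cast(src: str, dst: str) -> bool:
    """Return True if a CPU kernel for this (SrcT, DstT) pair appears in the list."""
    return (src, dst) in REGISTERED_ITEX_CAST_PAIRS


# Attributes requested by node resnet50/Cast, as reported in the error.
requested = ("DT_BFLOAT16", "DT_BFLOAT16")
print(is_registered_cast(*requested))  # False
```

One possible reading, which this report does not confirm, is that the node received an input that was already bfloat16, so the rewritten Cast became a same-type conversion for which no kernel is listed.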

Comment 2 feitian200603 alibaba_cloud_group

2023-09-20 12:00:11 UTC

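Steps to reproduce: run the model at bf16 precision.
# Imports implied by the names used below.
import time
import tensorflow as tf
from tensorflow.keras import mixed_precision

# data_iterator, model, strategy, batch_size, iterations, cur and times are
# defined earlier in the test script (not shown in this report).
for inputs, labels in data_iterator:
    cur = cur + 1
    if cur * batch_size > iterations:
        break
    print("inference dataset batch_size cur", cur)
    # Set the precision policy
    policy = mixed_precision.Policy('mixed_bfloat16')
    mixed_precision.set_global_policy(policy)

    # Cast the input data to the target precision
    inputs = tf.dtypes.cast(inputs, tf.bfloat16)

    tick = time.time()
    #inputs = inputs.to(input_device)
    #labels = labels.to(input_device)
    # Run inference
    with strategy.scope():
        outputs = model.predict(inputs)
    tock = time.time()
    times.append(tock - tick)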
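
The timing part of the loop above can be separated from TensorFlow so that the stop condition and the measured region are easy to check on their own. This is a minimal sketch, not the reporter's script: `time_batches`, `predict_fn` and the stand-in data are hypothetical, and `time.perf_counter` is used here in place of `time.time` because it is a monotonic clock intended for measuring intervals.

```python
import time
from typing import Callable, Iterable, List, Sequence, Tuple


def time_batches(
    predict_fn: Callable[[Sequence[float]], object],
    batches: Iterable[Tuple[Sequence[float], Sequence[int]]],
    batch_size: int,
    max_samples: int,
) -> List[float]:
    """Return per-batch latencies in seconds, stopping once max_samples is exceeded."""
    latencies: List[float] = []
    for batch_index, (inputs, _labels) in enumerate(batches, start=1):
        # Same stop condition as the loop above: cur * batch_size > iterations.
        if batch_index * batch_size > max_samples:
            break
        start = time.perf_counter()
        predict_fn(inputs)  # only the inference call is timed
        latencies.append(time.perf_counter() - start)
    return latencies


# Stand-in data and model (hypothetical): four batches of two samples each.
fake_batches = [([0.0, 1.0], [0, 1])] * 4
latencies = time_batches(
    lambda x: [v * 2 for v in x], fake_batches, batch_size=2, max_samples=6
)
print(len(latencies))  # 3
```

In a real run, `predict_fn` would wrap `model.predict`, and any precision policy or input cast would be applied before calling the helper so that it stays outside the timed region.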