Bug 6646 - AMX&AI TensorFlow container image: running resnet50 with bf16 precision and intel_extension_for_tensorflow Auto Mixed Precision enabled fails with: No registered '_ITEXCast' OpKernel for 'CPU' devices compatible with node {{node resnet50/Cast}}
Summary: AMX&AI TensorFlow container image: running resnet50 with bf16 precision and intel_extension_for_tensorflow ...
Status: NEW
Alias: None
Product: Anolis OS 23
Classification: Anolis OS
Component: Images&Installations
Version: 23.0
Hardware: x86_64 Linux
Importance: P3-Medium S2-major
Target Milestone: ---
Assignee: Jacob
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-20 11:38 UTC by feitian200603
Modified: 2023-09-20 12:01 UTC (History)
0 users

See Also:


Attachments

Comment 1 feitian200603 alibaba_cloud_group 2023-09-20 11:52:14 UTC
(In reply to feitian200603 from comment #0)
> With intel_extension_for_tensorflow Auto Mixed Precision enabled, the behavior of
> the environment variable ITEX_AUTO_MIXED_PRECISION_DATA_TYPE does not match the
> official documentation at
> https://github.com/intel/intel-extension-for-tensorflow/blob/main/docs/
> guide/aamp_tune.md#tuning-performance-example-by-advanced-amp-configure-list-
> manually, which describes two ways to enable Auto Mixed Precision optimization:
> 1. Python API (Basic / default configuration):
> import intel_extension_for_tensorflow as itex
> 
> auto_mixed_precision_options = itex.AutoMixedPrecisionOptions()
> auto_mixed_precision_options.data_type = itex.BFLOAT16 #itex.FLOAT16
> 
> graph_options = itex.GraphOptions()
> graph_options.auto_mixed_precision_options=auto_mixed_precision_options
> graph_options.auto_mixed_precision = itex.ON
> 
> config = itex.ConfigProto(graph_options=graph_options)
> itex.set_config(config)
> 2. Environment Variable:
> export ITEX_AUTO_MIXED_PRECISION=1
> export ITEX_AUTO_MIXED_PRECISION_DATA_TYPE="BFLOAT16" #"FLOAT16"
> 
> Actual test result:
> Node: 'resnet50/Cast'
> No registered '_ITEXCast' OpKernel for 'CPU' devices compatible with node
> {{node resnet50/Cast}}
> 	 (OpKernel was found, but attributes didn't match) Requested Attributes:
> DstT=DT_BFLOAT16, SrcT=DT_BFLOAT16, T=DT_BFLOAT16, Truncate=false,
> _XlaHasReferenceVars=false,
> _device="/job:localhost/replica:0/task:0/device:CPU:0"
> 	.  Registered:  device='CPU'; SrcT in [DT_HALF]; DstT in [DT_BFLOAT16]
>   device='CPU'; SrcT in [DT_HALF]; DstT in [DT_FLOAT]
>   device='CPU'; SrcT in [DT_BFLOAT16]; DstT in [DT_HALF]
>   device='CPU'; SrcT in [DT_BFLOAT16]; DstT in [DT_FLOAT]
>   device='CPU'; SrcT in [DT_FLOAT]; DstT in [DT_HALF]
>   device='CPU'; SrcT in [DT_FLOAT]; DstT in [DT_BFLOAT16]
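Judging from the kernel registration list above, every registered `_ITEXCast` kernel has SrcT different from DstT; there is no bfloat16-to-bfloat16 variant. The failing node requests SrcT=DT_BFLOAT16 and DstT=DT_BFLOAT16, i.e. an identity cast, which happens when a tensor that AMP has already converted to bfloat16 is cast to bfloat16 again. A minimal sketch of a dtype guard that avoids emitting such an identity Cast node (plain TensorFlow, with a hypothetical helper name):

```python
import tensorflow as tf

def cast_if_needed(x, dtype=tf.bfloat16):
    """Cast only when the dtypes differ, so no identity Cast node is created."""
    return x if x.dtype == dtype else tf.cast(x, dtype)

x = tf.constant([1.0, 2.0], dtype=tf.float32)
y = cast_if_needed(x)   # float32 -> bfloat16: a real cast happens
z = cast_if_needed(y)   # already bfloat16: the tensor is returned unchanged
print(y.dtype == tf.bfloat16, z is y)
```

Whether such a guard (or dropping the explicit cast entirely) resolves the ITEX error would need to be confirmed against the container image.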
Comment 2 feitian200603 alibaba_cloud_group 2023-09-20 12:00:11 UTC
Steps to reproduce: run the model inference with bf16 precision
for inputs, labels in data_iterator:
    cur = cur + 1
    if cur * batch_size > iterations:
        break
    print("inference dataset batch_size cur", cur)
    # set the precision policy
    policy = mixed_precision.Policy('mixed_bfloat16')
    mixed_precision.set_global_policy(policy)

    # cast the input data to the target precision
    inputs = tf.dtypes.cast(inputs, tf.bfloat16)

    tick = time.time()
    #inputs = inputs.to(input_device)
    #labels = labels.to(input_device)
    # run inference
    with strategy.scope():
        outputs = model.predict(inputs)
    tock = time.time()
    times.append(tock - tick)
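For comparison, a minimal self-contained sketch of the same inference loop with the mixed-precision policy set once before the model is built, rather than on every iteration (the model, shapes, and iteration count here are placeholders standing in for the original resnet50 setup, not the reporter's actual code):

```python
import time
import numpy as np
import tensorflow as tf
from tensorflow.keras import mixed_precision

# Set the mixed-precision policy once, before the model is constructed,
# instead of re-setting it inside the inference loop.
mixed_precision.set_global_policy('mixed_bfloat16')

# Tiny placeholder model standing in for resnet50.
model = tf.keras.Sequential([tf.keras.layers.Dense(4)])

times = []
for _ in range(3):
    inputs = tf.constant(np.random.rand(2, 8), dtype=tf.float32)
    # Under a mixed_bfloat16 policy the layers cast their own inputs,
    # so no explicit tf.cast(inputs, tf.bfloat16) is needed here.
    tick = time.time()
    outputs = model.predict(inputs, verbose=0)
    times.append(time.time() - tick)

print(len(times), outputs.shape)
```

This avoids both the per-iteration `set_global_policy` call and the explicit bfloat16 cast that produced the identity `_ITEXCast` node in the error above.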