TensorFlow handwritten digit recognition: the batch size changes on the last iteration of the test loop and raises an error. How can I fix it?

I'm teaching myself to build a neural network by hand; the full code is at the end of this post.

Running it fails with this error:

ValueError: cannot reshape array of size 784 into shape (32,1,newaxis)

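I also printed the shapes in the console where I suspected the problem. On the last iteration of the test loop the batch size is halved, and I can't work out why. x_train and x_test go through the same convolution function, and the input data has the same format.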

x test (32, 28, 28)
x test (32, 28, 28)
x test (32, 28, 28)
x test (16, 28, 28)
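
The shapes printed above are consistent with simple arithmetic. The sketch below assumes the standard MNIST split of 60,000 training and 10,000 test images, and takes the 49 features per image from the first dimension of `w1` in the code; the helper name `last_batch_size` is made up for illustration.

```python
# Assumed sizes of the standard MNIST split returned by mnist.load_data().
n_train, n_test, batch_size = 60_000, 10_000, 32

def last_batch_size(n_samples, batch_size):
    """Size of the final batch when the remainder is kept (hypothetical helper)."""
    remainder = n_samples % batch_size
    return remainder if remainder else batch_size

train_last = last_batch_size(n_train, batch_size)
test_last = last_batch_size(n_test, batch_size)
print(train_last, test_last)

# Each image becomes 49 features (first dimension of w1 in the code).
features_per_image = 49
elements_in_last_test_batch = test_last * features_per_image
print(elements_in_last_test_batch)

# A reshape to (32, 1, -1) needs the element count to be a multiple of 32.
leftover = elements_in_last_test_batch % batch_size
print(leftover)
```

Under these assumptions the training set divides evenly, so only the test loop ever sees a short batch, and its element count matches the 784 in the error message.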

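Finally, two smaller questions. First, the training error so far seems very large, probably because the model is too simple with only one layer. Is there a way to make training more effective? Would adding more layers help?
Second, to match the input format of tf.nn.conv2d(), I used tf.squeeze() to adjust the tensor dimensions. Is that approach correct, and could it affect the input data?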
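
On the second question, a small check may help. The sketch below uses NumPy's `np.squeeze` as a stand-in for `tf.squeeze`, on the assumption that both simply drop the named size-1 axis; the array `conv_out` is made-up data, not the actual convolution output.

```python
import numpy as np

# Stand-in for a single-channel conv2d output: (batch, height, width, 1).
conv_out = np.arange(2 * 7 * 7, dtype=np.float64).reshape(2, 7, 7, 1)

# Removing the size-1 channel axis changes the shape but not the values.
squeezed = np.squeeze(conv_out, axis=3)
print(conv_out.shape, "->", squeezed.shape)

# Every value is still at the same (batch, row, column) position.
values_unchanged = np.array_equal(squeezed, conv_out[..., 0])
print(values_unchanged)

# Naming the axis matters: for a batch of one image, a squeeze without
# an axis argument would also drop the batch axis.
single = conv_out[:1]
shape_with_axis = np.squeeze(single, axis=3).shape
shape_without_axis = np.squeeze(single).shape
print(shape_with_axis, shape_without_axis)
```

Since the code in the question passes the axis explicitly (`tf.squeeze(output, 3)`), only the channel axis should be removed and the data itself should be left untouched.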

def output(input, get1, get2, batch):
    x = tf.expand_dims(input, 3)
    output = tf.nn.conv2d(x, get1, strides=[1, 2, 2, 1], padding='SAME')
    output = tf.nn.conv2d(output, get2, strides=[1, 2, 2, 1], padding='SAME')
    output = tf.squeeze(output, 3)
    output = np.reshape(output, (batch, 1, -1))  # reshape to (batch, 1, -1): one row of n features per sample
    output = tf.cast(output, tf.float64)
    # print(output)
    return output
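
For reference, the spatial sizes inside this function can be traced by hand. The sketch below relies on the rule that a convolution with `padding='SAME'` produces an output of size ceil(input / stride); the helper name is hypothetical.

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size of a convolution with padding='SAME' (hypothetical helper)."""
    return math.ceil(input_size / stride)

# MNIST images are 28x28 and both convolutions use stride 2.
size_after_conv1 = same_padding_output_size(28, 2)
size_after_conv2 = same_padding_output_size(size_after_conv1, 2)
features_per_image = size_after_conv2 * size_after_conv2
print(size_after_conv1, size_after_conv2, features_per_image)
```

This agrees with the `[49, 10]` shape chosen for `w1` in the full program.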

Full program:

import os
from sklearn import datasets
from matplotlib import pyplot as plt

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from keras import models
import numpy as np

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

x_train = tf.cast(x_train, tf.float64)
y_train = tf.cast(y_train, tf.int32)

x_test = tf.cast(x_test, tf.float64)
y_test = tf.cast(y_test, tf.int32)

train_db = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32)
test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

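# Convolution kernels used as fixed feature extractors
get_1 = tf.constant(value=np.ones((3, 3), dtype=np.float64), shape=(3, 3, 1, 1))
get_2 = tf.constant(value=np.eye(3, dtype=np.float64), shape=(3, 3, 1, 1))
# Number of epochs
epoch = 50
# Learning rate
lr = 0.1
# Running sum of the per-step losses within one epoch
loss_all = 0
# Loss of each epoch, kept for plotting
train_loss_results = []
# Test accuracy of each epoch
test_acc = []
# First-layer weights and bias, created as trainable variables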
w1 = tf.Variable(tf.random.truncated_normal([49, 10], stddev=0.1, seed=1, dtype=np.float64))
b1 = tf.Variable(tf.random.truncated_normal([10], stddev=0.1, seed=1, dtype=np.float64))


# Feature extraction function
def output(input, get1, get2, batch):
    x = tf.expand_dims(input, 3)
    output = tf.nn.conv2d(x, get1, strides=[1, 2, 2, 1], padding='SAME')
    output = tf.nn.conv2d(output, get2, strides=[1, 2, 2, 1], padding='SAME')
    output = tf.squeeze(output, 3)
    output = np.reshape(output, (batch, 1, -1))  # reshape to (batch, 1, -1): one row of n features per sample
    output = tf.cast(output, tf.float64)
    # print(output)
    return output


print(x_train.shape)
print(x_test.shape)
# Training phase
print('Training started')
for epoch in range(epoch):
    for step1, (x_train, y_train) in enumerate(train_db):
        print('x train', x_train.shape)
        # print(step, x_train.shape, y_train.shape)
        # print('.......separator........')
        # print(output(x_train, get_1, get_2, 32).shape)
        # print('.......separator........')
        # print((tf.matmul(output(x_train, get_1, get_2, 32), w1) + b1).shape)
        with tf.GradientTape() as tape:
            x = output(x_train, get_1, get_2, 32)
            y = tf.matmul(x, w1) + b1
            y = tf.nn.softmax(y)
            y_ = tf.one_hot(y_train, depth=10)
            y_ = tf.cast(y_, tf.float64)
            # Compute the network's loss (mean squared error)
            loss = tf.reduce_mean(tf.square(y_ - y))
            loss_all += loss.numpy()
        grads = tape.gradient(loss, [w1, b1])  # gradients of loss with respect to [w1, b1]
        # Gradient-descent update: w1 = w1 - lr * w1_grad
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
    num_steps = step1 + 1  # number of steps actually run in this epoch
    print("Epoch {}, loss: {}".format(epoch, loss_all / num_steps))  # mean loss over the epoch
    train_loss_results.append(loss_all / num_steps)  # keep for plotting
    loss_all = 0
    total_correct, total_number = 0, 0
    for x_test, y_test in test_db:  # evaluation
        print('x test', x_test.shape)
        x = output(x_test, get_1, get_2, 32)
        y = tf.matmul(x, w1) + b1
        y = tf.nn.softmax(y)  # convert to predicted probabilities
        # print(y.shape)
        y = tf.squeeze(y, 1)
        # print(y.shape)
        pred = tf.argmax(y, axis=1)
        # print(pred.shape)
        pred = tf.cast(pred, dtype=y_test.dtype)
        correct = tf.cast(tf.equal(pred, y_test), dtype=tf.int32)
        correct = tf.reduce_sum(correct)
        total_correct += int(correct)
        total_number += x_test.shape[0]
    acc = total_correct / total_number
    test_acc.append(acc)
    print("Test_acc", acc)
    print("............................")

# Plot the loss curve
plt.title('Loss Curve')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.plot(train_loss_results, label='$Loss$')
plt.legend()
plt.show()
# Plot the accuracy curve
plt.title('Acc Curve')
plt.xlabel('Epoch')
plt.ylabel('Acc')
plt.plot(test_acc, label='$Accuracy$')
plt.legend()
plt.show()

奋斗的番茄 2022-05-19 19:30
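batch is the number of samples taken from the dataset at each step. Your last batch has only 16 samples left. This happens in the test set, where 10,000 = 312 × 32 + 16; the 60,000 training samples divide evenly by 32, which is why the training loop never hits the problem.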

This answer was accepted by the asker as the best answer.
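
One possible fix, sketched here rather than taken from the thread, is to read the batch size from the data instead of hard-coding 32 in the reshape. The function name `flatten_features` is hypothetical, and the zero arrays stand in for the 7x7 feature maps that the two convolutions produce.

```python
import numpy as np

def flatten_features(feature_maps):
    """Reshape (batch, h, w) feature maps to (batch, 1, h*w).

    The batch size comes from the array itself, so a short final
    batch is handled the same way as a full one.
    """
    batch = feature_maps.shape[0]
    return np.reshape(feature_maps, (batch, 1, -1))

# Stand-ins for the feature maps of a full batch and of the last test batch.
full_batch = np.zeros((32, 7, 7))
last_batch = np.zeros((16, 7, 7))

full_shape = flatten_features(full_batch).shape
last_shape = flatten_features(last_batch).shape
print(full_shape, last_shape)
```

In the original `output` function this would amount to dropping the `batch` argument and using `output.shape[0]` in the `np.reshape` call. An alternative is `tf.data.Dataset.from_tensor_slices(...).batch(32, drop_remainder=True)`, which discards the short batch entirely, at the cost of never evaluating the last 16 test images.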


Question events

Closed by the system: May 27
Answer accepted: May 19
Bounty of 20 yuan sponsored: May 19
Question created: May 19
