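For the demo dataset only, we set the batch size to 1. For actual training and testing, use the full dataset from the Kaggle competition and set the batch size to a larger integer, such as 128. We hold out 10% of the training examples as a validation set for tuning hyperparameters.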
batch_size = 1 if demo else 128
valid_ratio = 0.1
reorg_cifar10_data(data_dir, valid_ratio)
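The body of reorg_cifar10_data is not shown in this section. As a rough illustration of what holding out a fixed fraction per class can look like, the sketch below splits a list of (filename, label) pairs so that every class contributes the same number of validation examples. The helper name split_train_valid and the toy file names are hypothetical, and a real split would normally shuffle the files first.

```python
import collections
import math


def split_train_valid(labeled_files, valid_ratio):
    """Hypothetical helper: hold out the same number of files per class.

    labeled_files is a list of (filename, label) pairs. The number held
    out per class is derived from the size of the smallest class.
    """
    counts = collections.Counter(label for _, label in labeled_files)
    smallest = min(counts.values())
    n_valid_per_label = max(1, math.floor(smallest * valid_ratio))

    taken = collections.Counter()  # validation files taken so far, per class
    train, valid = [], []
    for name, label in labeled_files:
        if taken[label] < n_valid_per_label:
            valid.append((name, label))
            taken[label] += 1
        else:
            train.append((name, label))
    return train, valid


# Toy data (made up for illustration): 20 "cat" files and 30 "dog" files.
files = [(f"cat_{i}.png", "cat") for i in range(20)]
files += [(f"dog_{i}.png", "dog") for i in range(30)]
train, valid = split_train_valid(files, valid_ratio=0.1)
print(len(train), len(valid))
```

Basing the hold-out count on the smallest class keeps the validation set balanced even when the classes are not the same size.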
2. Image Augmentation
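To mitigate overfitting, we use image augmentation. For example, adding transforms.RandomFlipLeftRight() flips images at random. We can also normalize each image channel with transforms.Normalize(). Below we list some of these operations; you can choose to use or modify them as needed.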
transform_train = gluon.data.vision.transforms.Compose([
    # Magnify the image to a square of 40 pixels in both height and width
    gluon.data.vision.transforms.Resize(40),
    # Randomly crop a square image of 40 pixels in both height and width to
    # produce a small square of 0.64 to 1 times the area of the original
    # image, and then shrink it to a square of 32 pixels in both height and
    # width
    gluon.data.vision.transforms.RandomResizedCrop(32, scale=(0.64, 1.0),
                                                   ratio=(1.0, 1.0)),
    gluon.data.vision.transforms.RandomFlipLeftRight(),
    gluon.data.vision.transforms.ToTensor(),
    # Normalize each channel of the image
    gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
                                           [0.2023, 0.1994, 0.2010])])
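To make the crop settings concrete, the short calculation below works out which crop sizes scale=(0.64, 1.0) allows on the 40 x 40 image when the aspect ratio is fixed at 1. It is plain arithmetic on the numbers above rather than a call into Gluon, and the variable names are our own.

```python
import math

resized_side = 40                 # side length after Resize(40)
scale_min, scale_max = 0.64, 1.0  # allowed crop area, as a fraction of the image

area = resized_side ** 2          # 1600 pixels
min_side = math.sqrt(scale_min * area)
max_side = math.sqrt(scale_max * area)

# With ratio=(1.0, 1.0) the crop is square, so its side is sqrt(area).
print(round(min_side), round(max_side))
```

The sampled square therefore has a side of roughly 32 to 40 pixels before it is resized to the final 32 x 32 input.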
To keep the output deterministic during testing, we only normalize the images.
transform_test = gluon.data.vision.transforms.Compose([
    gluon.data.vision.transforms.ToTensor(),
    gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
                                           [0.2023, 0.1994, 0.2010])])
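Normalization here means per-channel standardization: after ToTensor() has scaled pixel values to the range [0, 1], each channel value x becomes (x - mean) / std. The sketch below applies that formula to a single RGB pixel using the same means and standard deviations as above; normalize_pixel is a hypothetical helper written for illustration, not a Gluon function.

```python
mean = [0.4914, 0.4822, 0.4465]  # per-channel means used above
std = [0.2023, 0.1994, 0.2010]   # per-channel standard deviations used above


def normalize_pixel(rgb):
    """Standardize one RGB pixel whose values are already in [0, 1]."""
    return [(x - m) / s for x, m, s in zip(rgb, mean, std)]


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(mean))
# A pure white pixel ends up a few standard deviations above zero.
print(normalize_pixel([1.0, 1.0, 1.0]))
```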
3. Reading the Dataset
Next, we create ImageFolderDataset instances to read the organized dataset of original image files, where each example includes an image and a label.
train_ds, valid_ds, train_valid_ds, test_ds = [
    gluon.data.vision.ImageFolderDataset(
        os.path.join(data_dir, 'train_valid_test', folder))
    for folder in ['train', 'valid', 'train_valid', 'test']]
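We specify the image augmentation operations defined above in DataLoader. During training, the validation set is used only to evaluate the model, so its output must be deterministic. For prediction, we train the model on the combined training and validation sets to make full use of all labeled data.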
train_iter, train_valid_iter = [gluon.data.DataLoader(
    dataset.transform_first(transform_train), batch_size, shuffle=True,
    last_batch='keep') for dataset in (train_ds, train_valid_ds)]
valid_iter, test_iter = [gluon.data.DataLoader(
    dataset.transform_first(transform_test), batch_size, shuffle=False,
    last_batch='keep') for dataset in (valid_ds, test_ds)]
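With last_batch='keep', a final batch smaller than batch_size is still returned instead of being dropped. The sketch below counts batches per epoch under that setting and under 'discard'. The helper num_batches is hypothetical, and the figure of 45,000 training images is an assumption: it is what remains if exactly 10% of a 50,000-image training set is held out.

```python
import math


def num_batches(num_examples, batch_size, last_batch="keep"):
    """Hypothetical helper: count the batches in one pass over a dataset."""
    if last_batch == "keep":
        # The incomplete final batch is returned as well.
        return math.ceil(num_examples / batch_size)
    if last_batch == "discard":
        # The incomplete final batch is dropped.
        return num_examples // batch_size
    raise ValueError("this sketch only models 'keep' and 'discard'")


num_train = 45000  # assumption: 50,000 images minus a 10% validation split
print(num_batches(num_train, 128, "keep"))
print(num_batches(num_train, 128, "discard"))
print(num_train % 128)  # size of the final, smaller batch under 'keep'
```

Keeping the last batch matters most for the validation and test loaders, where every example needs a prediction.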
4. Defining the Model
We build the residual block on the HybridBlock class, which improves execution efficiency.
class Residual(nn.HybridBlock):
    def __init__(self, num_channels, use_1x1conv=False, strides=1, **kwargs):
        super(Residual, self).__init__(**kwargs)
        self.conv1 = nn.Conv2D(num_channels, kernel_size=3, padding=1,
                               strides=strides)
        self.conv2 = nn.Conv2D(num_channels, kernel_size=3, padding=1)
        if use_1x1conv:
            self.conv3 = nn.Conv2D(num_channels, kernel_size=1,
                                   strides=strides)
        else:
            self.conv3 = None
        self.bn1 = nn.BatchNorm()
        self.bn2 = nn.BatchNorm()

    def hybrid_forward(self, F, X):
        Y = F.npx.relu(self.bn1(self.conv1(X)))
        Y = self.bn2(self.conv2(Y))
        if self.conv3:
            X = self.conv3(X)
        return F.npx.relu(Y + X)
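The addition Y + X only works when both branches produce the same spatial size, which is why the 1x1 convolution on the shortcut uses the same stride as conv1. The sketch below checks this with the standard convolution output-size formula, floor((n + 2p - k) / s) + 1; conv_out_size is a hypothetical helper, not part of Gluon.

```python
def conv_out_size(n, kernel_size, padding, stride):
    """Output side length of a convolution applied to an n x n input."""
    return (n + 2 * padding - kernel_size) // stride + 1


n = 32  # input side length

# Main branch with strides=2: conv1 downsamples, conv2 preserves the size.
main = conv_out_size(n, kernel_size=3, padding=1, stride=2)
main = conv_out_size(main, kernel_size=3, padding=1, stride=1)

# Shortcut branch: 1x1 convolution with the same stride and no padding.
shortcut = conv_out_size(n, kernel_size=1, padding=0, stride=2)

print(main, shortcut)

# With strides=1, a 3x3 convolution with padding 1 leaves the size unchanged.
print(conv_out_size(n, kernel_size=3, padding=1, stride=1))
```

The 1x1 convolution also changes the number of channels to num_channels, so both the shape and the channel count of X match Y before they are added.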
Next, we define the ResNet-18 model.
def resnet18(num_classes):
    net = nn.HybridSequential()
    net.add(nn.Conv2D(64, kernel_size=3, strides=1, padding=1),
            nn.BatchNorm(), nn.Activation('relu'))

    def resnet_block(num_channels, num_residuals, first_block=False):
        blk = nn.HybridSequential()
        for i in range(num_residuals):
            if i == 0 and not first_block:
                blk.add(Residual(num_channels, use_1x1conv=True, strides=2))
            else:
                blk.add(Residual(num_channels))
        return blk

    net.add(resnet_block(64, 2, first_block=True),
            resnet_block(128, 2),
            resnet_block(256, 2),
            resnet_block(512, 2))
    net.add(nn.GlobalAvgPool2D(), nn.Dense(num_classes))
    return net
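As a sanity check on the architecture, the sketch below traces the feature-map size through the four stages for a 32 x 32 input and counts the weighted layers that give ResNet-18 its name. It mirrors the arguments passed to resnet_block above but is plain arithmetic, with variable names of our own choosing; it does not build the network.

```python
# (num_channels, num_residuals, first_block) for each stage, as in resnet18
stages = [(64, 2, True), (128, 2, False), (256, 2, False), (512, 2, False)]

side = 32  # the 3x3 stem convolution with stride 1 and padding 1 keeps 32 x 32
trace = []
for channels, num_residuals, first_block in stages:
    if not first_block:
        # The first residual block of every later stage uses strides=2.
        side = (side + 2 * 1 - 3) // 2 + 1
    trace.append((channels, side))
print(trace)

# Stem conv + two 3x3 convs per residual block + final dense layer.
# The 1x1 shortcut convolutions are conventionally not counted.
weighted_layers = 1 + sum(2 * n for _, n, _ in stages) + 1
print(weighted_layers)
```

The last stage leaves a 4 x 4 map with 512 channels, which global average pooling reduces to a 512-dimensional vector for the dense layer.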
The CIFAR-10 image classification challenge uses 10 categories. Before training begins, we perform Xavier random initialization on the model.
def get_net(ctx):
    num_classes = 10
    net = resnet18(num_classes)
    net.initialize(ctx=ctx, init=init.Xavier())
    return net

loss = gluon.loss.SoftmaxCrossEntropyLoss()
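With its default settings, SoftmaxCrossEntropyLoss takes the raw scores from the final dense layer together with integer class labels. The sketch below computes the same quantity for a single example in plain Python, using the log-sum-exp trick for numerical stability. It is a minimal illustration of the formula, not Gluon's implementation, and softmax_cross_entropy is a hypothetical helper.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the maximum before exponentiating
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# A model that scores all 10 classes equally has a loss of log(10).
uniform_loss = softmax_cross_entropy([0.0] * 10, label=3)
print(uniform_loss, math.log(10))

# A confident, correct prediction gives a loss close to zero.
confident = [0.0] * 10
confident[3] = 20.0
print(softmax_cross_entropy(confident, label=3))
```

A loss near log(10), about 2.3, at the start of training is therefore what we would expect from a 10-class model that has not learned anything yet.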
5. Defining the Training Functions