基于Kaggle的图像分类(CIFAR-10) (2)

只将批量大小设置1为演示数据集。在实际训练和测试过程中,应使用Kaggle竞赛的完整数据集,并将批次大小设置为更大的整数,例如128。使用10%作为调整超参数的验证集。

batch_size = 1 if demo else 128

valid_ratio = 0.1

reorg_cifar10_data(data_dir, valid_ratio)

2. Image Augmentation

为了解决过度拟合的问题,使用图像增强技术。例如,通过添加transforms.RandomFlipLeftRight(),图像可以随机翻转。还可以使用transforms.Normalize()。下面,将列出其中一些操作,可以根据需要选择使用或修改这些操作。

transform_train = gluon.data.vision.transforms.Compose([

# Magnify the image to a square of 40 pixels in both height and width

gluon.data.vision.transforms.Resize(40),

# Randomly crop a square image of 40 pixels in both height and width to

# produce a small square of 0.64 to 1 times the area of the original

# image, and then shrink it to a square of 32 pixels in both height and

# width

gluon.data.vision.transforms.RandomResizedCrop(32, scale=(0.64, 1.0),

ratio=(1.0, 1.0)),

gluon.data.vision.transforms.RandomFlipLeftRight(),

gluon.data.vision.transforms.ToTensor(),

# Normalize each channel of the image

gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],

[0.2023, 0.1994, 0.2010])])

为了保证测试过程中输出的确定性,只对图像进行归一化处理。

transform_test = gluon.data.vision.transforms.Compose([

gluon.data.vision.transforms.ToTensor(),

gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],

[0.2023, 0.1994, 0.2010])])

3. Reading the Dataset

接下来,可以创建ImageFolderDataset实例来读取包含原始图像文件的有组织的数据集,其中每个示例都包含图像和标签。

train_ds, valid_ds, train_valid_ds, test_ds = [

gluon.data.vision.ImageFolderDataset(

os.path.join(data_dir, \'train_valid_test\', folder))

for folder in [\'train\', \'valid\', \'train_valid\', \'test\']]

在DataLoader中指定定义的图像增强操作。在训练过程中,只使用验证集来评估模型,所以需要确保输出的确定性。在预测过程中,将在组合训练集和验证集上训练模型,以充分利用所有标记数据。

train_iter, train_valid_iter = [gluon.data.DataLoader(

dataset.transform_first(transform_train), batch_size, shuffle=True,

last_batch=\'keep\') for dataset in (train_ds, train_valid_ds)]

valid_iter, test_iter = [gluon.data.DataLoader(

dataset.transform_first(transform_test), batch_size, shuffle=False,

last_batch=\'keep\') for dataset in (valid_ds, test_ds)]

4. Defining the Model

基于HybridBlock类构建剩余块,这样做是为了提高执行效率。

class Residual(nn.HybridBlock):

def __init__(self, num_channels, use_1x1conv=False, strides=1, **kwargs):

super(Residual, self).__init__(**kwargs)

self.conv1 = nn.Conv2D(num_channels, kernel_size=3, padding=1,

strides=strides)

self.conv2 = nn.Conv2D(num_channels, kernel_size=3, padding=1)

if use_1x1conv:

self.conv3 = nn.Conv2D(num_channels, kernel_size=1,

strides=strides)

else:

self.conv3 = None

self.bn1 = nn.BatchNorm()

self.bn2 = nn.BatchNorm()

def hybrid_forward(self, F, X):

Y = F.npx.relu(self.bn1(self.conv1(X)))

Y = self.bn2(self.conv2(Y))

if self.conv3:

X = self.conv3(X)

return F.npx.relu(Y + X)

定义ResNet-18模型。

def resnet18(num_classes):

net = nn.HybridSequential()

net.add(nn.Conv2D(64, kernel_size=3, strides=1, padding=1),

nn.BatchNorm(), nn.Activation(\'relu\'))

def resnet_block(num_channels, num_residuals, first_block=False):

blk = nn.HybridSequential()

for i in range(num_residuals):

if i == 0 and not first_block:

blk.add(Residual(num_channels, use_1x1conv=True, strides=2))

else:

blk.add(Residual(num_channels))

return blk

net.add(resnet_block(64, 2, first_block=True),

resnet_block(128, 2),

resnet_block(256, 2),

resnet_block(512, 2))

net.add(nn.GlobalAvgPool2D(), nn.Dense(num_classes))

return net

CIFAR-10图像分类挑战赛使用10个类别。在训练开始之前,将对模型执行Xavier随机初始化。

def get_net(ctx):

num_classes = 10

net = resnet18(num_classes)

net.initialize(ctx=ctx, init=init.Xavier())

return net

loss = gluon.loss.SoftmaxCrossEntropyLoss()

5. Defining the Training Functions

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:https://www.heiqu.com/zgzjxs.html