稀疏的梯度会妨碍 GAN 的训练。在深度学习中,稀疏性通常是我们需要的属性,但在GAN 中并非如此。有两件事情可能导致梯度稀疏:最大池化运算和 ReLU 激活。推荐使用步进卷积代替最大池化来进行下采样,还推荐使用 LeakyReLU 层来代替 ReLU 激活。LeakyReLU 和 ReLU 类似,但它允许较小的负数激活值,从而放宽了稀疏性限制
在生成的图像中,经常会见到棋盘状伪影,这是由生成器中像素空间的不均匀覆盖导致的。为了解决这个问题,每当在生成器和判别器中都使用步进的 Conv2DTranpose或 Conv2D 时,使用的内核大小要能够被步幅大小整除
GAN训练循环的大致流程
从潜在空间中抽取随机的点(随机噪声)
利用这个随机噪声用 generator 生成图像
将生成图像与真实图像混合
使用这些混合后的图像以及相应的标签(真实图像为“真”,生成图像为“假”)来训练discriminator
在潜在空间中随机抽取新的点
使用这些随机向量以及全部是“真实图像”的标签来训练 gan。这会更新生成器的权重(只更新生成器的权重,因为判别器在 gan 中被冻结),其更新方向是使得判别器能够将生成图像预测为“真实图像”。这个过程是训练生成器去欺骗判别器
demo
import keras from keras import layers import numpy as np import os from keras.preprocessing import image # 生成器 latent_dim = 32 height = 32 width = 32 channels = 3 generator_input = keras.Input(shape=(latent_dim,)) x = layers.Dense(128 * 16 * 16)(generator_input) x = layers.LeakyReLU()(x) x = layers.Reshape((16, 16, 128))(x) x = layers.Conv2D(256, 5, padding='same')(x) x = layers.LeakyReLU()(x) x = layers.Conv2DTranspose(256, 4, strides=2, padding='same')(x) x = layers.LeakyReLU()(x) x = layers.Conv2D(256, 5, padding='same')(x) x = layers.LeakyReLU()(x) x = layers.Conv2D(256, 5, padding='same')(x) x = layers.LeakyReLU()(x) x = layers.Conv2D(channels, 7, activation='tanh', padding='same')(x) generator = keras.models.Model(generator_input, x) generator.summary() # 判别器 discriminator_input = layers.Input(shape=(height, width, channels)) x = layers.Conv2D(128, 3)(discriminator_input) x = layers.LeakyReLU()(x) x = layers.Conv2D(128, 4, strides=2)(x) x = layers.LeakyReLU()(x) x = layers.Conv2D(128, 4, strides=2)(x) x = layers.LeakyReLU()(x) x = layers.Conv2D(128, 4, strides=2)(x) x = layers.LeakyReLU()(x) x = layers.Flatten()(x) x = layers.Dropout(0.4)(x) x = layers.Dense(1, activation='sigmoid')(x) discriminator = keras.models.Model(discriminator_input, x) discriminator.summary() discriminator_optimizer = keras.optimizers.RMSprop(lr=0.0008, clipvalue=1.0, decay=1e-8) discriminator.compile(optimizer=discriminator_optimizer, loss='binary_crossentropy') # 对抗网络,将生成器和判别器连接在一起 discriminator.trainable = False gan_input = keras.Input(shape=(latent_dim,)) gan_output = discriminator(generator(gan_input)) gan = keras.models.Model(gan_input, gan_output) gan_optimizer = keras.optimizers.RMSprop(lr=0.0004, clipvalue=1.0, decay=1e-8) gan.compile(optimizer=gan_optimizer, loss='binary_crossentropy') # 实现 GAN 的训练 (x_train, y_train), (_, _) = keras.datasets.cifar10.load_data() x_train = x_train[y_train.flatten() == 6] x_train = x_train.reshape((x_train.shape[0],) + (height, width, channels)).astype('float32') / 255 iterations = 10000 batch_size = 20 save_dir = 'your_dir' start = 0 for step in range(iterations): random_latent_vectors = np.random.normal(size=(batch_size, latent_dim)) generated_images = generator.predict(random_latent_vectors) stop = start + batch_size real_images = x_train[start: stop] combined_images = np.concatenate([generated_images, real_images]) labels = np.concatenate([np.ones((batch_size, 1)), np.zeros((batch_size, 1))]) labels += 0.05 * np.random.random(labels.shape) d_loss = discriminator.train_on_batch(combined_images, labels) random_latent_vectors = np.random.normal(size=(batch_size, latent_dim)) misleading_targets = np.zeros((batch_size, 1)) a_loss = gan.train_on_batch(random_latent_vectors, misleading_targets) start += batch_size if start > len(x_train) - batch_size: start = 0 if step % 100 == 0: gan.save_weights('gan.h5') print('discriminator loss:', d_loss) print('adversarial loss:', a_loss) img = image.array_to_img(generated_images[0] * 255., scale=False) img.save(os.path.join(save_dir, 'generated_frog' + str(step) + '.png')) img = image.array_to_img(real_images[0] * 255., scale=False) img.save(os.path.join(save_dir, 'real_frog' + str(step) + '.png'))GAN 由一个生成器网络和一个判别器网络组成。判别器的训练目的是能够区分生成器的输出与来自训练集的真实图像,生成器的训练目的是欺骗判别器。值得注意的是,生成器从未直接见过训练集中的图像,它所知道的关于数据的信息都来自于判别器
注:
在 Keras 中,任何对象都应该是一个层,所以如果代码不是内置层的一部分,我们应该将其包装到一个 Lambda 层(或自定义层)中python Deep learning 学习笔记(9)