Text Generation with Deep Learning (4)

Convolutional neural networks are mostly used on images, largely because of their distinctive ability to extract local features. But it turns out that one-dimensional convolutional networks (Conv1D) are also well suited to sequence data: they can extract local patterns from long sequences, which is useful in certain NLP tasks (such as machine translation and question answering). It is also worth noting that, compared with using an RNN on sequence data, Conv1D is much faster to train.
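As a quick illustration of how a Conv1D layer consumes a sequence (a minimal sketch of my own, not from this article; the vocabulary size and layer widths are arbitrary):

from keras import layers, models

# Conv1D slides a kernel along the time axis, so each output step
# summarizes one local window of the input sequence.
demo = models.Sequential([
    layers.Embedding(1000, 16, input_length=30),  # -> (batch, 30, 16)
    layers.Conv1D(32, 5, activation='relu'),      # -> (batch, 26, 32): one output per 5-char window
    layers.GlobalMaxPooling1D(),                  # -> (batch, 32): strongest response per filter
])
demo.summary()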

Inspired by the bidirectional model in the previous example, here I again use information from both the forward and the reverse sequence. The final model looks roughly like this:

[Figure: three-branch model architecture]



This time the training corpus is 《西游记》 (Journey to the West).

import numpy as np

whole = open('西游记.txt', encoding='utf-8').read()

maxlen = 30  # forward sequence length
revlen = 20  # reverse sequence length

sentences = []          # forward contexts
reverse_sentences = []  # reversed "future" contexts
next_chars = []         # target characters

for i in range(maxlen, len(whole) - revlen):
    sentences.append(whole[i - maxlen : i])
    reverse_sentences.append(whole[i + 1 : i + revlen + 1][::-1])
    next_chars.append(whole[i])
print('Total forward sentences extracted:', len(sentences))
print('Total reverse sentences extracted:', len(reverse_sentences))

chars = sorted(list(set(whole)))
char_indices = dict((char, chars.index(char)) for char in chars)

x = np.zeros((len(sentences), maxlen), dtype='float32')
reverse_x = np.zeros((len(reverse_sentences), revlen), dtype='float32')
y = np.zeros((len(sentences),), dtype='float32')

for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t] = char_indices[char]
    y[i] = char_indices[next_chars[i]]

for i, reverse_sentence in enumerate(reverse_sentences):
    for t, char in enumerate(reverse_sentence):
        reverse_x[i, t] = char_indices[char]
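To make the windowing concrete, here is what one (forward, target, reverse) triple looks like on a toy string (a hypothetical mini-example of mine, using window sizes 3 and 2 instead of 30 and 20):

toy = '悟空保唐僧西天取经'
m, r = 3, 2   # toy forward / reverse window sizes
i = 3         # first valid target position
print(toy[i - m : i])                # forward context:  '悟空保'
print(toy[i])                        # target char:      '唐'
print(toy[i + 1 : i + r + 1][::-1])  # reversed future:  '西僧'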



Build the network model:

import keras
from keras import layers

# Branch 1: forward sequence through two stacked GRUs
normal_input = layers.Input(shape=(maxlen,), dtype='float32', name='normal')
model_1 = layers.Embedding(len(chars), 128, input_length=maxlen)(normal_input)
model_1 = layers.GRU(256, return_sequences=True)(model_1)
model_1 = layers.GRU(128)(model_1)

# Branch 2: reverse sequence through a Conv1D stack
reverse_input = layers.Input(shape=(revlen,), dtype='float32', name='reverse')
model_2 = layers.Embedding(len(chars), 128, input_length=revlen)(reverse_input)
model_2 = layers.Conv1D(64, 5, activation='relu')(model_2)
model_2 = layers.MaxPooling1D(2)(model_2)
model_2 = layers.Conv1D(32, 3, activation='relu')(model_2)
model_2 = layers.GlobalMaxPooling1D()(model_2)

# Branch 3: forward sequence again, through a second Conv1D stack with wider kernels
normal_input_2 = layers.Input(shape=(maxlen,), dtype='float32', name='normal_2')
model_3 = layers.Embedding(len(chars), 128, input_length=maxlen)(normal_input_2)
model_3 = layers.Conv1D(64, 7, activation='relu')(model_3)
model_3 = layers.MaxPooling1D(2)(model_3)
model_3 = layers.Conv1D(32, 5, activation='relu')(model_3)
model_3 = layers.GlobalMaxPooling1D()(model_3)

# Merge the three branches and predict the next character
combine = layers.concatenate([model_1, model_2, model_3], axis=-1)
output = layers.Dense(len(chars), activation='softmax')(combine)

model = keras.models.Model([normal_input, reverse_input, normal_input_2], output)
optimizer = keras.optimizers.RMSprop(lr=1e-3)
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer)
model.fit({'normal': x, 'reverse': reverse_x, 'normal_2': x},
          y, epochs=200, batch_size=1024, verbose=2)
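Before committing to the long training run, it can be worth a quick sanity check that the three branches are wired up correctly (my own habit, not part of the original post):

model.summary()  # the three input branches should merge into one Dense(len(chars)) head

# Feed one sample through each named input; the output shape should be (1, len(chars))
dummy = {'normal': x[:1], 'reverse': reverse_x[:1], 'normal_2': x[:1]}
print(model.predict(dummy, verbose=0).shape)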


During generation we repeatedly delete an element from one end of the sequence and insert one at the other, so instead of a plain list I use a deque from the collections module, which supports efficient operations at both ends:
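The point of deque is that both ends are O(1), whereas list.insert(0, ...) has to shift every element. A tiny illustration (not from the article):

from collections import deque

d = deque(['a', 'b', 'c'])
d.pop()            # removes 'c' from the right end, O(1)
d.appendleft('z')  # inserts at the left end, O(1)
print(d)           # deque(['z', 'a', 'b'])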

import sys
from collections import deque

def write_3(model, temperature, word_num, begin_sentence):
    gg = begin_sentence[:30]                         # forward context window
    reverse_gg = deque(begin_sentence[31:51][::-1])  # reversed future-context window
    print(gg, end='/// ')
    for _ in range(word_num):
        # Encode both windows as index vectors
        sampled = np.zeros((1, maxlen))
        reverse_sampled = np.zeros((1, revlen))
        for t, char in enumerate(gg):
            sampled[0, t] = char_indices[char]
        for t, reverse_char in enumerate(reverse_gg):
            reverse_sampled[0, t] = char_indices[reverse_char]

        preds = model.predict({'normal': sampled,
                               'reverse': reverse_sampled,
                               'normal_2': sampled}, verbose=0)[0]
        if temperature is None:
            next_word = chars[np.argmax(preds)]      # greedy decoding
        else:
            next_index = sample(preds, temperature)  # temperature sampling
            next_word = chars[next_index]

        # Slide both windows one character forward
        reverse_gg.pop()
        reverse_gg.appendleft(gg[0])
        gg += next_word
        gg = gg[1:]

        sys.stdout.write(next_word)
        sys.stdout.flush()
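write_3 calls the sample() helper defined in the earlier parts of this series. For completeness, the standard temperature-reweighting version looks like this (a sketch; consult the earlier parts for the exact code used):

import numpy as np

def sample(preds, temperature=1.0):
    # Rescale the predicted distribution: low temperature sharpens it,
    # high temperature flattens it, then draw one index from the result.
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds + 1e-10) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)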


The seed text is:

begin_sentence = whole[70000: 70100]
print(begin_sentence[:30] + " //" + begin_sentence[30] + "// " + begin_sentence[31:51])
# ,命掌生死簿判官:“急取簿子来,看陛下阳寿天禄该有几何?”崔 //判// 官急转司房,将天下万国国王天禄总簿,先逐


Generate without temperature:

write_3(model, None, 500, begin_sentence)
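For comparison, the same seed can be decoded with temperature sampling instead of the greedy argmax, e.g. (a hypothetical call; the temperature value is my own choice, mirroring the earlier parts of the series):

write_3(model, 0.5, 500, begin_sentence)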
