The Titanic dataset was downloaded from Kaggle. Download page: https://www.kaggle.com/c/titanic/data
The data consists of three files: train.csv, test.csv, and gender_submission.csv
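As a rough sketch, the three files could be read with pandas as shown here. The helper name `load_titanic` and its `data_dir` parameter are made up for illustration, and the code assumes the CSVs have already been saved to a local directory.

```python
from pathlib import Path

import pandas as pd


def load_titanic(data_dir="."):
    """Read the three Kaggle CSV files from data_dir (hypothetical helper)."""
    data_dir = Path(data_dir)
    train = pd.read_csv(data_dir / "train.csv")  # labelled training data
    test = pd.read_csv(data_dir / "test.csv")    # data to predict on
    # Example submission file provided by Kaggle
    sample_submission = pd.read_csv(data_dir / "gender_submission.csv")
    return train, test, sample_submission
```

Calling `train, test, sample_submission = load_titanic("path/to/data")` would then give three DataFrames to work with.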
First, import the libraries we need:
import os
import datetime
import operator
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
# Note: Imputer was removed in scikit-learn 0.22; newer versions provide sklearn.impute.SimpleImputer instead
from sklearn.preprocessing import Imputer, scale
import matplotlib.pyplot as plt
np.random.seed(19260817)  # Set the random seed; let's see whether anyone on cnblogs gets the reference
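To illustrate what the seed does, here is a small sketch: reseeding NumPy's global generator with the same value reproduces the same random draws. The variable names are only for illustration. Note that this seeds NumPy's global generator only; libraries that take their own seed parameter are not necessarily affected by it.

```python
import numpy as np

np.random.seed(19260817)
first = np.random.rand(3)   # three draws after seeding

np.random.seed(19260817)    # reseed with the same value
second = np.random.rand(3)  # the same three draws again

same = np.array_equal(first, second)
print(same)  # True
```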