Kaggle Competition Write-up: NFL Big Data Bowl (Part 1)

Competition overview

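The NFL Big Data Bowl is held every year. This year's task is to take a static snapshot of the players on both teams and predict how many yards the offense gains on the play, with the prediction expressed as a probability distribution.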

Competition link

https://www.kaggle.com/c/nfl-big-data-bowl-2020

Project link. The notebook is public, so you can copy it and run it directly

https://www.kaggle.com/holoong9291/nfl-big-data-bowl

GitHub repository link. More of my notes and the problems I ran into along the way are in the repository

https://github.com/NemoHoHaloAi/Competition/tree/master/kaggle/Top61%25-0.01404-zzz-NFL-Big-Data-Bowl

Some basic concepts

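American football: the offense tries to move the ball by running and passing until it reaches the opponent's end zone, which is a touchdown. The defense does the opposite: it tries to stop the advance and, where possible, take the ball away.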

The field is 120 yards long (109.728 m) and 53 1/3 yards wide (48.768 m), so its perimeter is 316.992 m.
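The field dimensions can be checked with a few lines of arithmetic. This is only a sanity check and assumes the standard field: 120 yards long including both end zones, and 160 feet (53 1/3 yards) wide. The variable names are illustrative.

```python
YARD_IN_METERS = 0.9144  # exact by definition of the international yard

length_yd = 120          # 100 yards of playing field plus two 10-yard end zones
width_yd = 160 / 3       # 160 feet expressed in yards (53 1/3)

length_m = length_yd * YARD_IN_METERS
width_m = width_yd * YARD_IN_METERS
perimeter_m = 2 * (length_m + width_m)

print(round(length_m, 3), round(width_m, 3), round(perimeter_m, 3))
# prints: 109.728 48.768 316.992
```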

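Players: 22 players are on the field at a time, 11 on offense and 11 on defense. The offense has the ball.

Downs: the offense gets four downs (attempts) to advance at least ten yards.

Offense: its job is to use those four downs to advance at least 10 yards or score a touchdown. Gaining the 10 yards earns a fresh set of four downs; otherwise possession goes to the other team.

Defense: its job is the opposite, to stop the advance. Taking the ball away is even better, because possession changes immediately.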

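handoff: the ball is handed directly from one player to another, typically from the QB to the RB. It is not a thrown pass.

snap: the center passes the ball backward, usually to the QB, to start the play.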

Basic football points as I understand them:

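QB: quarterback, usually the player who receives the snap and the center of the pocket formation. There are also dual-threat QBs who both run and pass, such as Lamar Jackson. The archetypal classic pocket QB at the time of writing is Tom Brady of the New England Patriots (NE).

RB: running back, who usually sprints and evades tacklers after the snap, taking the ball from his own QB and then running as far as possible.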

[Figure: field yard-line diagram]

[Figure: a typical pre-snap alignment]

Data fields and plot analysis

Field information (each row describes one player on one play):

GameId - a unique game identifier

PlayId - a unique play identifier

Team - home or away

X - player position along the long axis of the field. See figure below.

Y - player position along the short axis of the field. See figure below.

S - speed in yards/second

A - acceleration in yards/second^2

Dis - distance traveled from prior time point, in yards

Orientation - orientation of player (deg), i.e. the direction the player is facing

Dir - angle of player motion (deg), i.e. the direction the player is moving

NflId - a unique identifier of the player

DisplayName - player's name

JerseyNumber - jersey number

Season - year of the season

YardLine - the yard line of the line of scrimmage

Quarter - game quarter (1-5, 5 == overtime)

GameClock - time on the game clock

PossessionTeam - team with possession

Down - the down (1-4), i.e. which of the four attempts this play is

Distance - yards needed for a first down

FieldPosition - which side of the field the play is happening on

HomeScoreBeforePlay - home team score before play started

VisitorScoreBeforePlay - visitor team score before play started

NflIdRusher - the NflId of the rushing player

OffenseFormation - offense formation

OffensePersonnel - offensive team positional grouping

DefendersInTheBox - number of defenders lined up near the line of scrimmage, spanning the width of the offensive line

DefensePersonnel - defensive team positional grouping

PlayDirection - direction the play is headed

TimeHandoff - UTC time of the handoff

TimeSnap - UTC time of the snap

Yards - the yardage gained on the play (you are predicting this); this is the target

PlayerHeight - player height (ft-in)

PlayerWeight - player weight (lbs)

PlayerBirthDate - birth date (mm/dd/yyyy); the player's age can be derived from it

PlayerCollegeName - where the player attended college

Position - the player's position (the specific role on the field that they typically play)

HomeTeamAbbr - home team abbreviation

VisitorTeamAbbr - visitor team abbreviation

Week - week into the season

Stadium - stadium where the game is being played

Location - city where the game is being played

StadiumType - description of the stadium environment

Turf - description of the field surface

GameWeather - description of the game weather

Temperature - temperature (deg F)

Humidity - humidity

WindSpeed - wind speed in miles/hour

WindDirection - wind direction
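Several of these fields arrive as strings and need parsing before they are useful to a model. The sketch below is not taken from the original notebook. It only illustrates how the documented formats (ft-in heights, mm/dd/yyyy birth dates, UTC timestamps) might be converted. The function names, the sample row, and the exact timestamp layout are assumptions.

```python
from datetime import date, datetime

def height_to_inches(height: str) -> int:
    """Convert a 'ft-in' string such as '6-2' to total inches."""
    feet, inches = height.split("-")
    return int(feet) * 12 + int(inches)

def age_on(birth_date: str, on_day: date) -> int:
    """Whole years between a 'mm/dd/yyyy' birth date and on_day."""
    born = datetime.strptime(birth_date, "%m/%d/%Y").date()
    had_birthday = (on_day.month, on_day.day) >= (born.month, born.day)
    return on_day.year - born.year - (0 if had_birthday else 1)

def snap_to_handoff_seconds(time_snap: str, time_handoff: str) -> float:
    """Seconds elapsed between the snap and the handoff."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"  # assumed timestamp layout, adjust if the data differs
    delta = datetime.strptime(time_handoff, fmt) - datetime.strptime(time_snap, fmt)
    return delta.total_seconds()

# Made-up row for illustration only, not real competition data
row = {
    "PlayerHeight": "6-2",
    "PlayerBirthDate": "03/15/1990",
    "TimeSnap": "2019-09-08T17:05:10.000Z",
    "TimeHandoff": "2019-09-08T17:05:11.000Z",
}

height_in = height_to_inches(row["PlayerHeight"])
age = age_on(row["PlayerBirthDate"], date(2019, 9, 8))
delay = snap_to_handoff_seconds(row["TimeSnap"], row["TimeHandoff"])
```

With the sample row above, `height_in` is 74, `age` is 29, and `delay` is 1.0.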

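Defining the problem

This is a regression problem whose target is the number of yards gained, but the final result must be converted into a cumulative probability distribution: for each play, the probability that the yards gained are at most n, for every integer n from -99 to 99.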

Evaluation Function

Continuous Ranked Probability Score (CRPS); lower is better.
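For this competition the CRPS is, to the best of my understanding, the mean squared difference between the predicted cumulative distribution and a step function at the true yardage, averaged over the 199 integer yardages from -99 to 99 and over all plays. The sketch below follows that definition in plain Python. The function name and the two sample predictions are illustrative only.

```python
def crps(pred_cdfs, true_yards):
    """Mean squared gap between predicted CDFs and step-function targets.

    pred_cdfs: one list of 199 values per play, P(Yards <= n) for n = -99..99
    true_yards: the observed integer yardage for each play
    """
    grid = range(-99, 100)
    total = 0.0
    for cdf, y in zip(pred_cdfs, true_yards):
        for n, p in zip(grid, cdf):
            step = 1.0 if n >= y else 0.0  # Heaviside step H(n - y)
            total += (p - step) ** 2
    return total / (199 * len(true_yards))

# Two hard-step predictions for a play that actually gained 3 yards
perfect = [1.0 if n >= 3 else 0.0 for n in range(-99, 100)]
off_by_two = [1.0 if n >= 5 else 0.0 for n in range(-99, 100)]

score_perfect = crps([perfect], [3])
score_off = crps([off_by_two], [3])
```

Here `score_perfect` is 0.0, while `score_off` is 2/199 because the two curves disagree at exactly two grid points (n = 3 and n = 4). Overconfident step predictions are penalized for every yard they miss by, which is why a smoother distribution usually scores better.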

Project workflow

Defining a class that converts model output into a probability distribution
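The text breaks off before the conversion class itself is shown, so the following is not the author's implementation. It is a minimal sketch of one possible approach: treat the regression output as the mean of a normal distribution and evaluate its CDF on the 199-point yardage grid. The class name `YardsToCdf`, the Gaussian assumption, and the default `sigma` are all hypothetical.

```python
import math

class YardsToCdf:
    """Hypothetical helper: turn a point estimate of yards into a 199-step CDF."""

    def __init__(self, sigma: float = 6.0, lo: int = -99, hi: int = 99):
        self.sigma = sigma  # assumed spread, would be tuned on validation data
        self.grid = list(range(lo, hi + 1))

    def transform(self, predicted_yards: float) -> list:
        cdf = []
        for n in self.grid:
            # Normal CDF evaluated at n + 0.5 so that integer n is fully included
            z = (n + 0.5 - predicted_yards) / (self.sigma * math.sqrt(2))
            cdf.append(0.5 * (1.0 + math.erf(z)))
        cdf[-1] = 1.0  # the last grid point must hold all the probability mass
        return cdf

converter = YardsToCdf(sigma=6.0)
cdf = converter.transform(4.0)
```

The result has 199 non-decreasing values between 0 and 1, which is the shape the evaluation expects. In practice the spread could be estimated from validation residuals, or the Gaussian could be replaced by the empirical distribution of training errors.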
