TY - JOUR
T1 - Box office forecasting using machine learning algorithms based on SNS data
AU - Kim, Taegu
AU - Hong, Jungsik
AU - Kang, Pilsung
N1 - Funding Information:
This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT, and Future Planning (NRF-2014R1A1A1004648).
Publisher Copyright:
© 2014 International Institute of Forecasters.
PY - 2015/4/1
Y1 - 2015/4/1
N2 - We propose a novel approach to the box office forecasting of motion pictures using social network service (SNS) data and machine learning-based algorithms. We begin by providing a comprehensive survey of the forecasting algorithms and explanatory variables used in the motion picture domain. Because of the importance of forecasting in early periods, we develop three sequential forecasting models for predicting the non-cumulative and cumulative box office earnings: (1) prior to, (2) a week after, and (3) two weeks after release. The numbers of SNS mentions and their weekly trends are used as input variables in addition to the screening-related information. A genetic algorithm is adopted for determining significant input variables, whereas three machine learning-based nonlinear regression algorithms and their combinations are employed for building forecasting models. Experimental results show that the utilization of SNS data, machine learning-based algorithms and their combination made noticeable improvements to the forecasting accuracies of all the three models.
KW - Box office earning forecast
KW - Forecast combination
KW - Genetic algorithm
KW - Machine learning
KW - Social network service
UR - http://www.scopus.com/inward/record.url?scp=84940007133&partnerID=8YFLogxK
U2 - 10.1016/j.ijforecast.2014.05.006
DO - 10.1016/j.ijforecast.2014.05.006
M3 - Article
AN - SCOPUS:84940007133
SN - 0169-2070
VL - 31
SP - 364
EP - 390
JO - International Journal of Forecasting
JF - International Journal of Forecasting
IS - 2
ER -