Social big data analysis on demands for the Korean sport industry

Abstract

The purpose of this study was to explore public opinions and perceptions of the Korean sport industry through a big data analysis of social media content. Social big data were collected using ‘TextoM’, a big data analysis solution, and ‘Naver’, an Internet portal service provider. A total of 29 keywords were used for data collection, including sport industry, sport facilities, sport services, sport marketing, professional baseball, and sport manufacturing. Social analytics was used to analyze the social media content about the Korean sport industry. A total of 6,002,666 cases related to the Korean sport industry, including documents, web pages, blogs, and cafes (i.e., online communities), were analyzed. Frequency analysis, keyword analysis through text mining, sentiment analysis, semantic network analysis, and CONCOR network analysis were conducted. First, frequency analysis showed that the volume of searches related to the Korean sport industry remained high and increased. Second, keyword analysis through text mining was conducted to examine frequently used keywords in the sport industry. The analysis identified four categories of keywords: sport games, sport goods, health, and sport policy. Third, sentiment analysis showed that positive sentiment toward the sport industry outweighed negative sentiment. Fourth, semantic network analysis showed that the Korean sport industry was connected with tourism, media, information, game, international, health, and culture. Last, CONCOR analysis showed that the sport industry was clearly divided into four segments: sport goods and equipment, sport facility, sport service, and professional sport. The findings are discussed and directions for future research are suggested.

Keywords
social big data, sport industry, demand, goods, facility, service

Introduction

The sport industry is strongly connected with various industries such as manufacturing, information and communication, media, entertainment, and distribution, and nations compete fiercely to develop high-tech products and services that fuse sport with multidisciplinary science and technology, including new materials, nanotechnology, and biotechnology. The sport industry, an industrial culture shared around the world, is a huge market that continues to grow through the convergence of sport content. According to Plunkett Research (2018), the size of the global sport industry is 1.3 trillion. Because the sport industry is closely related to IT, entertainment, and healthcare and is expected to generate substantial added value, it is one of the fields most likely to grow in connection with the rapidly growing big data industry. The Korean government has implemented various policies for the growth of the sport industry. However, most of these policies have been planned from the supplier side. Thus, it is necessary to prepare effective government policy that takes the consumer side into account. In recent years, the importance of using social big data in policy analysis that considers the consumer side has been highlighted. Big data can be classified in various ways according to creator and type. In particular, with the increase in smartphone users and the growth of social media, unstructured social data such as text documents, images, and videos are proliferating. Messages in social media content have become sources for understanding the political, social, economic, and cultural spirit and sensibility of the age (Song, 2012). According to the Korea Internet & Security Agency (2015), 64.9% of Internet users use social network services (SNS). Among age groups, 89.0% of people in their 20s, 80.6% of those in their 30s, 67.4% of those in their 40s, and 49.4% of those in their 50s have used social media.
This means that SNS is no longer the exclusive property of the young generation. Therefore, it is possible to derive policy implications and look for solutions to social issues from social media content. Big data can be analyzed with a variety of methodologies such as path analysis, social network analysis, and keyword analysis. Big data analysis also covers various fields such as epidemics, financial crises, weather forecasting, and elections. A growing number of studies emphasize the consumer side in policy analysis, and academic efforts are being made to overcome the limitations of existing research, which methodologically did not reflect the exact voice of customers. As a result, the use of social big data in policy analysis that considers the consumer side is becoming more important.

Until recently, there was little research using sport-themed social network analysis in the field of sport management. However, social big data research has recently become active in Korean sport management, covering topics such as water leisure sport tourism (Oh, Han, & Kim, 2019), leisure trend analysis (Kim, Lee, Han, & Han, 2019), a trend analysis on leisure sport tourism (Kim, Oh, & Han, 2019), perceptions of the Korean national baseball team for the 2018 Asian Games (Han, Oh, & Kim, 2018), the 2018 Muju World Taekwondo Championship (Park, 2018), keyword analysis of leisure activity (Kim & Jeon, 2018), analysis of 2018 PyeongChang Olympic keywords (Lee, Lee, & Jang, 2017), perceptions of swimsuits (Lee, Lee, Kim, & Kim, 2017), analysis of the national Paralympics (Kim & Lee, 2016), and sport participation (Oh, Han, & Kim, 2020). Jang and Yoon (2016) collected social texts related to 'camping', identified actual consumer-oriented needs, and analyzed government policies and changes in public perception of camping through social media big data analysis. Song (2014) tried to overcome the limitation that needs for health and welfare could not be grasped through cross-sectional or longitudinal surveys. According to Lee, Lee, and Kim (2014), when public perceptions of and demands for a specific policy are measured with stated preference data, the survey result may be biased downward (or upward) by respondents' desire to appear moral. Because consumers' perceptions and practices may be inconsistent, policy analysis using big data is required to analyze the consumer-oriented voice of customers (VOC). Although social big data analysis can be used in government policy to evaluate existing policies, gauge public opinion and emotion, identify areas of interest, and find new policies, research using social big data on the sport industry has not yet been conducted, despite recognition of the importance of big data in the sport industry.
Thus, the purposes of this study were to explore public opinions and perceptions of the Korean sport industry and to examine the demand for new policy in the Korean sport industry, based on a big data analysis of social media content.

Method

Data Collection Procedure

To explore public opinions and perceptions of the Korean sport industry, research themes were set up for the three areas of the sport industry defined by the sport industry classification, and a taxonomy of the sport industry was organized through meetings of the authors and experts in order to select keywords in each area. Through this process, a total of 29 keywords were selected, including sport industry, sport facilities, sport equipment, sport services, outdoor, sport sales, sport marketing, sport tourism, sport education, sport media, sport agent, professional baseball, professional basketball, and sport manufacturing (see Table 1). Social big data were collected using ‘TextoM’, a big data analysis solution, and ‘Naver’, the Internet portal service provider with the most users in Korea. The social big data included news, blogs, cafés (i.e., online communities), web pages, and other services on ‘Naver’. The temporal range of the data was one year, from July 1, 2015 to June 30, 2016, because data collection was carried out in July 2016. A total of 6,002,666 cases related to the Korean sport industry, including documents, web pages, blogs, cafés, and Q&A posts, were collected.

Table 1. Keywords in the sport industry for data collection
Classification | Keywords
Sport industry (1) | sport industry
Sport Goods Sector (6) | sport goods, sport manufacturing, sport sales, sport distribution, outdoor, sport equipment
Sport Facility Sector (5) | sport facilities, sport operation, sport construction, stadium, fitness center
Sport Service Sector (13) | sport marketing, sport publication, sport media, sport education, sport tourism, sport toto, sport betting, sport agent, horse racing, professional baseball, professional soccer, professional basketball, professional volleyball
Others (4) | sport job, sport research, sport support, sport development

Figure 1. Changes in the frequency of positive and negative keywords about the sport industry

Data Analysis

Social analytics, including text mining, sentiment analysis, semantic network analysis, and CONCOR analysis, was used to analyze big data from social media content about the Korean sport industry. Text mining, also referred to as text data mining, is a method of extracting useful words from unstructured text using natural language processing and morphological analysis techniques, analyzing their frequency, and finding meaning at the context level. Sentiment analysis identifies public opinion on a subject through the frequency with which positive or negative emotions are expressed in the documents containing the keywords of interest. It is an analytical technique suited to the characteristics of social media content: people behave differently in formal and informal situations and tend to express more candid opinions in private media such as Facebook and blogs (Song & Song, 2015). Semantic network analysis is a social network analysis method that analyzes the systematic structure formed by the meanings shared between words. It is used to grasp the connections between the keywords of the collected data and the communities they form. CONCOR, an abbreviation of convergent correlations, is the most commonly used method for structural equivalence analysis. It repeatedly computes Pearson correlations on the co-occurrence matrix between words and identifies the relationships between blocks of nodes (Kim & Jun, 2014).
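The CONCOR procedure described above can be illustrated with a minimal sketch. It is not the implementation used in the study (the study relied on a commercial analysis solution); the function names, the toy co-occurrence matrix, and the choice to place each word's own frequency on the diagonal are all assumptions made for illustration. The sketch performs a single split into two blocks; applying the same split again within each block would yield finer segments.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def correlation_matrix(matrix):
    """Correlate every pair of columns of a square matrix."""
    n = len(matrix)
    cols = [[matrix[r][c] for r in range(n)] for c in range(n)]
    return [[pearson(cols[i], cols[j]) for j in range(n)] for i in range(n)]

def concor_split(cooccurrence, max_iter=100, tol=1e-9):
    """Iterate correlations of correlations until entries approach +1 or -1,
    then split the nodes by the sign of their correlation with node 0."""
    m = correlation_matrix(cooccurrence)
    for _ in range(max_iter):
        if all(abs(abs(v) - 1.0) < tol for row in m for v in row):
            break
        m = correlation_matrix(m)
    block_a = [i for i, v in enumerate(m[0]) if v > 0]
    block_b = [i for i, v in enumerate(m[0]) if v <= 0]
    return block_a, block_b

# Hypothetical toy data: four words and their co-occurrence counts
# (diagonal = assumed frequency of the word itself).
words = ["stadium", "baseball", "outdoor", "brand"]
toy_cooccurrence = [
    [10, 9, 1, 0],
    [9, 10, 0, 1],
    [1, 0, 10, 8],
    [0, 1, 8, 10],
]

blocks = concor_split(toy_cooccurrence)
print([[words[i] for i in block] for block in blocks])
```

In this toy matrix, "stadium" and "baseball" share a similar pattern of co-occurrence and fall into one block, while "outdoor" and "brand" fall into the other.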

Results

Analysis of perceived value of the Korean sport industry

First, keyword analysis through text mining was conducted to examine frequently used keywords in the Korean sport industry (see Table 2). The analysis identified four categories of sport industry keywords: sport games, sport goods, health, and sport policy. Sport game related keywords include professional baseball, stadium, star players, professional soccer, professional basketball, and broadcasting. Sport goods related keywords include outdoor, sport equipment, sport brand, and information. Health related keywords consist of exercise, diet, physical education, and fitness. Sport industry policy related keywords are sport economics, sport industry forecast, companies in the sport industry, and sport distribution and sales.

Table 2. Keywords in the sport industry (Top 50 keywords)
Rank | Keyword | Frequency | Rank | Keyword | Frequency
1 | Broadcasting | 892,951 | 26 | Season | 213,788
2 | Sales | 740,901 | 27 | Professional | 211,553
3 | Event | 622,894 | 28 | Facility | 205,657
4 | Seoul | 575,943 | 29 | Employment | 194,042
5 | Professional baseball | 556,645 | 30 | Brand | 193,179
6 | Games | 556,563 | 31 | Culture | 193,136
7 | Travel | 511,715 | 32 | World Cup Stadium | 192,495
8 | Stadium | 492,455 | 33 | Exercise | 189,274
9 | Information | 486,872 | 34 | Health | 188,406
10 | Operation | 472,188 | 35 | Diet | 187,915
11 | Outdoor | 450,476 | 36 | Management | 185,252
12 | Market | 419,839 | 37 | Program | 183,999
13 | Star | 364,026 | 38 | USA | 182,824
14 | Player | 348,704 | 39 | Professional basketball | 182,678
15 | Baseball | 319,234 | 40 | Domestic | 179,269
16 | Price | 291,443 | 41 | Development | 176,488
17 | Soccer | 287,044 | 42 | Coach | 176,388
18 | Forecast | 271,201 | 43 | Japan | 171,152
19 | Research | 241,577 | 44 | Region | 170,694
20 | Equipment | 239,118 | 45 | Time | 166,539
21 | Construction | 235,837 | 46 | Children | 166,323
22 | Representative | 339,640 | 47 | Theme | 163,540
23 | Investment | 226,383 | 48 | World | 161,496
24 | Goods | 217,433 | 49 | Woman | 161,356
25 | TV | 216,824 | 50 | Golf | 159,526
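
A keyword frequency table of this kind can be produced by tokenizing each document and counting the tokens. The sketch below is only an illustration under stated assumptions: the study analyzed Korean text with morphological analysis, whereas this example uses a simple regular-expression tokenizer on hypothetical English sentences, and the function name and stopword list are invented for the example.

```python
import re
from collections import Counter

def keyword_frequencies(documents, stopwords=frozenset()):
    """Count how often each lowercase word appears across all documents."""
    counts = Counter()
    for doc in documents:
        # Assumption: plain alphabetic tokens stand in for morphological analysis.
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts

# Hypothetical documents, not taken from the collected data.
toy_documents = [
    "Professional baseball broadcasting schedule for the stadium",
    "Outdoor equipment sales and brand information",
    "Baseball stadium event and broadcasting information",
]
toy_stopwords = {"for", "the", "and"}

frequencies = keyword_frequencies(toy_documents, toy_stopwords)
for keyword, count in frequencies.most_common(4):
    print(keyword, count)
```

Ranking the resulting counts in descending order gives a list in the same form as Table 2.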

Second, sentiment analysis was conducted to examine public perceptions of the sport industry, based on the frequency of positive and negative sentiment in documents containing sport industry related keywords. Overall, positive sentiment outweighed negative sentiment toward the sport facility subsector. Negative sentiment increased slightly in November, 2015 and June, 2016, and positive sentiment fluctuated considerably from month to month. The ratio of positive to negative sentiment showed a sharp increase in October, 2015 and November, 2016. In November 2015, negative sentiment is believed to have increased because of the controversy over the construction of a sport stadium in Asan, the venue of the 2016 National Games. In June 2016, both positive and negative sentiment appear to have increased because of the varied opinions on the long-term lease of professional baseball stadiums following the revision of the Sport Industry Promotion Act.

‘Win’, ‘best’, ‘success’, and ‘interests’ were typical positive sentiment words, and ‘illegal’, ‘problem’, and ‘terror’ were typical negative sentiment words toward the sport facility subsector. The positive sentiment appears to be mainly concerned with the use of sport facilities and with interest in investment and operations. In particular, Daegu Samsung Lions Park, newly opened in March 2016, seems to have attracted strong interest, and various posts about visits to and experiences of the Lions Park baseball stadium appeared on blogs. Because the stadium provides various convenient facilities and services to fans and the community, keywords such as ‘best’ and ‘success’ appear, and because it is a professional baseball stadium, ‘win’ is associated with positive sentiment. Negative sentiment toward the sport facility subsector was related to illegal parking and parking environment problems, or to the illegal installation of CCTV by management for surveillance of Lotte Giants baseball players. The keyword ‘terror’ was found to refer to information about terrorist incidents at overseas games.
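
The frequency-based sentiment comparison described above can be sketched as a simple lexicon count per month. This is an illustrative assumption rather than the study's procedure: the two word lists reuse only the English glosses reported above, the records are hypothetical, and the function name `monthly_sentiment` is invented for the example.

```python
from collections import defaultdict

# Sentiment words reported for the sport facility subsector (English glosses).
POSITIVE = {"win", "best", "success", "interests"}
NEGATIVE = {"illegal", "problem", "terror"}

def monthly_sentiment(records):
    """Count positive and negative words per month.

    records: iterable of (month, text) pairs.
    Returns {month: (positive_count, negative_count)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for month, text in records:
        for token in text.lower().split():
            word = token.strip(".,!?'\"")  # drop surrounding punctuation
            if word in POSITIVE:
                totals[month][0] += 1
            elif word in NEGATIVE:
                totals[month][1] += 1
    return {month: tuple(counts) for month, counts in totals.items()}

# Hypothetical posts, not taken from the collected data.
toy_records = [
    ("2015-11", "Illegal parking is a problem near the stadium"),
    ("2015-11", "Best seats for the win"),
    ("2016-06", "The new park is a success, the best stadium"),
]

sentiment_by_month = monthly_sentiment(toy_records)
for month, (pos, neg) in sorted(sentiment_by_month.items()):
    print(month, "positive:", pos, "negative:", neg)
```

Plotting the two counts for each month would give a chart of the same type as Figure 1.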

Third, the results of the semantic network analysis showed that the sport facility subsector was connected with facility operation, facility information, and utilization of facilities. First, regarding operation, there is great demand for information on how sport facilities are managed and operated. Second, there is demand for information on sport facilities, which can be divided into information about facilities for professional sport and facilities for sport for all. Third, regarding region, there is great demand for information on sport facilities in the Seoul metropolitan area. Fourth, keywords related to sporting events and concerts held at sport facilities appear. In addition, keywords such as the World Cup stadium, Olympic Park, and neighborhoods appeared, suggesting that interest in sport facilities that are easily accessible to the general public is high. As such, discussion of the sport facility subsector is focused on the management of facilities. Also, marketing related keywords such as revenue, publicity, and events have expanded greatly, indicating that programs for sport facilities are being actively promoted.
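
The semantic network underlying these results is built from keyword co-occurrence: two keywords are linked whenever they appear in the same document, and the link weight is the number of such documents. The following is a minimal sketch under assumptions; the keyword lists are hypothetical and the function names are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(doc_keywords):
    """Count, for each unordered keyword pair, the documents containing both."""
    pairs = Counter()
    for keywords in doc_keywords:
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs

def weighted_degree(pairs):
    """Sum the link weights attached to each keyword."""
    strength = Counter()
    for (a, b), weight in pairs.items():
        strength[a] += weight
        strength[b] += weight
    return strength

# Hypothetical keyword sets extracted from three documents.
toy_doc_keywords = [
    ["facility", "operation", "seoul"],
    ["facility", "information", "seoul"],
    ["facility", "operation", "event"],
]

pairs = cooccurrence_counts(toy_doc_keywords)
strength = weighted_degree(pairs)
print(pairs[("facility", "operation")])
print(strength.most_common(2))
```

Keywords with a high weighted degree sit at the center of the network, which is how hub terms such as facility operation or facility information would be identified.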

Figure 2. Word cloud based on sentiment analyses of the sport industry

Fourth, the results of the CONCOR analysis showed that the sport facility subsector was clearly divided into four segments: the sport facility segment, the sport and cultural activity segment, job information in the sport facility subsector, and facility information about sport for all. First, keywords for sports such as baseball and football were strongly connected with keywords such as stadiums and baseball fields. Second, in the sport and cultural activity segment, keywords such as family, concert, and exhibition were connected, which is interpreted as reflecting strong interest among families in cultural activities held around sport events. Third, job information was linked to keywords related to occupations, including sales, design, planning, education, management, and marketing, and to keywords related to facility types, including resorts, sport centers, and golf. Fourth, keywords such as indoor and outdoor swimming pool, safety, and transportation were strongly connected with each other, so demand for the use of sport facilities appears to be high.

Discussion

The purpose of this study was to grasp the general perceptions and characteristics of the Korean sport industry and its subsectors, namely the sport facility sector, the sport goods sector, and the sport service sector, using social big data. As individuals' activities and communication have expanded into cyberspace, messages posted by users of social media such as Twitter and blogs have become a source for understanding the spirit and sensibility of the times. Through social media, individuals have become producers and consumers of information at the same time. Because the interests generated by mass media are interpreted and propagated through social media content, personal messages on social media can play an important role in the strategic planning process of the sport industry. Therefore, this study analyzed social perceptions and the potential value of the Korean sport industry by analyzing social big data on the Korean sport industry.

Figure 3. Semantic network analysis

Figure 4. CONCOR analysis

Because the major keywords identified in the social big data analysis are highly associated with professional sports, demand related to professional sports appears to be high. Professional sports in Korea are a matter of public interest, so keywords related to professional sports are often mentioned on social media (Kim, Oh, Lim, & Han, 2017).

Because keywords associated with sports such as baseball, soccer, basketball, and volleyball appeared frequently in the text, demand for and interest in these sports appear to be high in terms of both spectating and participation. In the sport facility sector, there is high demand for information about the various ways of using sport facilities, and in the sport goods sector, outdoor wear is becoming a popular product among the general public. In the sport service sector, demand for professional sports and sport broadcasting is the highest, and demand for health, diet, and sport education is also high. Based on these findings, it is believed that ongoing policies in the Korean sport industry reflect the needs of consumers and the demands of the sport market. These policies include providing sport industry market trends and related information, promoting convergence in the sport industry, fostering professionals and supporting job creation, supporting research and development in the sport industry, and testing and certifying sport goods and equipment.

According to the results of the sentiment analyses, the image of the sport industry rests on the premise that the government recognizes the sport industry as a core industry of the creative economy and actively fosters it. Therefore, positive images such as 'expansion', 'best', ‘growth’, and 'future' appear, but negative images such as 'illegal', 'controversial', and 'manipulation' also appear because of professional sport players' illegal sport betting and match-fixing. As for the trends of positive and negative sentiment by subsector, positive sentiment was significantly higher than negative sentiment in all subsectors of the sport industry, including the sport service sector, the sporting goods sector, and the sport facility sector. Positive sentiment toward the sport industry in social media content provides circumstantial evidence of low or no interest in the sport industry, because negative rumors spread more powerfully and faster than positive rumors (DiFonzo, Robinson, Suls, & Rini, 2012). As a result, negative opinions generally appear more often than positive opinions in social media content. However, the results of this study confirm that many of the documents were public relations material and public announcements released by the government through the media rather than the voice of customers in the sport industry. Although sport industry policies appear to reflect people's values and demands for the sport market, these policies are not effectively communicated to the public, so it is critical to spread information on them accurately and quickly through active marketing (Kim, Oh, Lim, & Han, 2017).

According to the results of the semantic network analyses using the co-occurrence information of the 500 major keywords related to the sport industry, the sport industry has the highest correlation with the media industry, such as broadcasting. This is because the media have a close relationship with professional sports, and various derivative and related businesses such as broadcasting rights, image rights, advertising, and sponsorship are growing. Because demand for sporting goods such as outdoor wear and rash guards and demand for sport services such as culture and tourism are high, the market is expected to expand through convergence with other fields. In detail, the sport facility sector should implement policies that actively provide information on the use of public sport facilities, so that the general public can use sport facilities more conveniently and easily access information on the use and operation of arenas and sport facilities. In the sporting goods sector, consumers consider the brand first when purchasing sporting goods. Because the domestic sporting goods market is dominated by global brands such as Nike, Adidas, and Under Armour and the trade deficit in sporting goods has continued in recent years (Ministry of Culture, Sport and Tourism, 2018), it is necessary to implement policies to foster domestic sport goods that can compete in the global market. The results show that the sport service sector relates to tourism, games, and culture, which indicates that the sport industry is gradually expanding through the convergence of the sport service sector with these fields. Therefore, it is necessary to establish policies that promote inbound sport tourism for foreigners, using professional sports such as professional baseball and professional soccer as well as participation sports such as skiing and golf.
From the point of view of ongoing policies, there is not yet enough policy to meet the demand for a specific management system that would increase the use of sport facilities. On the other hand, it can be confirmed that policies such as supporting small and medium sport enterprises, developing regional sport enterprises and industry, and supporting the globalization of sporting goods companies foster domestic sporting goods and meet the need for convergence with other fields. This suggests that a selection and concentration approach, which provides more funding to small and medium sport enterprises with a high probability of success, is necessary (Ko, Ma, & Kim, 2019).

According to the CONCOR analyses, the entire sport industry was classified into a sporting goods sector, a sport facility sector, and a sport service sector, as well as a general sport industry category. In addition, sport participation was further classified as a sub-category of the sport service sector. Professional sport was also classified as a separate sector of the sport industry because of the popularity of professional sport leagues, even though professional sport is a sub-sector of the sport service sector. The sporting goods sector is classified into four keyword groups related to sporting goods, sporting goods service activities, outdoor goods, and jobs in sporting goods. This shows that the outdoor industry, including camping, is growing rapidly. The sport facility sector has keywords related to specific stadiums, cultural activities, facility operation, and subsidiary facilities. There was strong interest in specific stadiums in relation to professional sport. Also, because sport facilities are used as venues for cultural activities such as concerts, keywords related to cultural activities appear. In addition, there is considerable interest in the sport facility operation system and in auxiliary facilities. Therefore, to make it easier for citizens to obtain information on sport facilities, public data on the use of sport facilities should be opened to enhance convenience for the public. For example, Madison Square Garden (MSG) in New York, USA, home of the New York Knicks of the National Basketball Association (NBA), is shared with the New York Rangers of the National Hockey League (NHL). The basketball court in MSG turns into an ice hockey rink, and in the off-season it becomes a venue for concerts and other cultural events. Therefore, when constructing a new sport facility, it is necessary to consider building a multipurpose facility rather than a stadium for a specific sport only.
The sport service sector has keywords related to professional sports, sport marketing, and media. The largest group was professional sports, which shows public interest in professional sports. In addition, because media and sport marketing are closely related to professional sports, the public appears to be interested in them as well. Professional sports are expected to be greatly invigorated through convergence with VR, AR, IoT, and AI, so there is a need for policy development in this area (Ko, Ma, & Kim, 2019).

Limitations and Future Research

There are several limitations in the current study. The social big data used in this study are vast in scope and contain a very large amount of information. However, in spite of these quantitative advantages, there is a limit to their application to the sport industry. Social big data can be very useful in fields related to daily life because the analysis is based on unstructured data such as documents and messages generated by the general public. Data on topics closely tied to everyday life, such as sport for all or professional sports, attract relatively high interest and are abundant. However, the public shows little interest in detailed areas of the sport industry such as sport industry policy. This limits the collection of social big data on sport industry issues as a whole, and to offset this limitation it is necessary to use the terms that the general public actually uses. Because this study did not collect data from channels with more personal opinions such as Twitter and Facebook, these characteristics of the social big data should be considered when interpreting the overall results.

Traditionally, surveys or analyses of public statistical data conducted by universities or research institutes have been used to understand the public's perception of policy issues in various fields; however, these methods have disadvantages because of limitations of time and cost. Objective statistical indicators are excellent in terms of credibility, but they are not suitable for identifying individual perceptions of particular issues. In comparison, the analysis of social big data has several advantages over traditional methods in terms of the amount of data and the honesty that comes from anonymity. Because social big data can be collected and analyzed in vast amounts and individuals tend to be honest on social media thanks to anonymity, it is easy to understand individuals' perceptions of and interest in a particular issue. However, the data contain a lot of unnecessary material, including abbreviations, and the huge amount of data to be collected and processed must be refined. Even so, the analysis of the perceived value of the Korean sport industry using social big data in this study is meaningful as basic data for grasping public perception of the sport industry. If collected data are effectively refined and analyzed, they can be used to complement the development and configuration of policies in the Korean sport industry.

The purpose of this study was to suggest policy that aligns the perceptions of sport industry policymakers and consumers. Unlike private companies, the government is limited in its ability to understand the needs of individual consumers and to pursue policies customized for them. On the supplier side, the government aims to provide an environment in which the majority of sport industry related companies can grow and contribute to the national economy in the public interest, but there may be a gap between this aim and the actual demand perceived by the public. Because most of the social big data related to the sport industry were news, press releases, and newspaper articles reposted in blogs and cafes, the analysis of the social big data reflects the needs of suppliers rather than consumers. To solve this problem, it is necessary to have a platform for sharing and communicating information through active marketing of the sport industry. Through such a platform, it would be possible to collect documents about the sport industry that contain information on individuals' interests and feelings.

For future research, it is suggested to analyze changes in the public perception of the sport industry after the 2018 PyeongChang Winter Olympic Games or another particular event. Because COVID-19 has had a significant impact on the global sport industry and has changed many things, future research should also compare the perceptions and behaviors of sport consumers before and after COVID-19. This study is limited because it collected social big data only from July 1, 2015 to June 30, 2016. Thus, future research should collect data over a wider range for a better understanding of public perceptions of the sport industry. Because the Sports Industry Promotion Act (SIPA) was revised and enforced in August, 2016, future research should examine public perceptions of the sport industry before and after the SIPA enforcement. To identify differences in perceptions, frequency analysis, analysis of amounts of information, keyword analysis through text mining, sentiment analysis, semantic network analysis, and CONCOR network analysis can be used.

This work was funded by the Korea Institute of Sport Science (KISS).

This work was supported by the Kyung Hee University under the 2018 sabbatical year research grant.
