Identifying precisely how the public understands cancer-related information and investigating promptly how people perceive and cope with problems will require the establishment of a public awareness panel research system focusing on carcinogenic causes, cancer prevention, and people’s behavior. Such a system could provide follow-up after changes in public awareness, knowledge and behavior, implementing immediate corrective actions whenever misconceptions arose or inadequate information was distributed.
The tasks required to establish a sustainable risk communication platform for carcinogenic factors are presented below.
Establishing and managing integrated big data
National Cancer Registry data
The National Cancer Registry data will support the systemic management of cancer and cancer research. As the geographical distribution of cancer patients is one of the most important elements in an epidemiological analysis of cancer, a geographical analysis of cancer patients using the National Cancer Registry data will provide important data on carcinogenic factors.
Demographic data
Demographic analysis is based on humanistic-social phenomena and takes into account population, population density, population movements, gender ratio, and age. Established demographic data can be applied in various fields and used to identify the attributes of space. The socio-economic characteristics of a particular area (demographic characteristics expressed in space) should be examined along with demographic data. In addition, spatial analysis should also make use of socio-economic information such as occupation, income, education, and taxes. The Ministry of Transportation provides various land- and transportation-related statistical data, including a spatial information service, national spatial integration, an electronic land service, and a national geographic information distribution service and network that provides maps via the geographic information system (GIS) service. In particular, information about apartments, standard official land prices, and real estate data can be used to estimate the regional income quintile. It will also be necessary to collect data and statistical screening information from the National Statistics Office, the Ministry of Transportation, and the Ministry of Education, as well as from reports issued by population analysis institutions.
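As a minimal sketch of how a regional income quintile might be estimated from land prices, the example below ranks regions by a single price figure and splits the ranking into five equal groups. The region labels, the price values, and the function name `income_quintiles` are hypothetical, and treating land price alone as an income proxy is a simplifying assumption rather than the procedure used with the actual ministry data.

```python
def income_quintiles(land_price_by_region):
    """Rank regions by land price and assign quintile 1 (lowest) to 5 (highest)."""
    # Sort region names by their (hypothetical) standard official land price.
    ranked = sorted(land_price_by_region, key=land_price_by_region.get)
    n = len(ranked)
    # Map each rank position onto one of five equal-sized groups.
    return {region: (rank * 5) // n + 1 for rank, region in enumerate(ranked)}


# Illustrative values only; not real land prices.
prices = {"A": 120, "B": 340, "C": 90, "D": 560, "E": 210,
          "F": 75, "G": 430, "H": 150, "I": 610, "J": 280}
quintiles = income_quintiles(prices)
```

In practice, apartment and real estate data would presumably be combined with land prices before ranking; the rank-based split is kept here because it does not depend on the price distribution.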
Natural environment data
Natural environment data comprises all physical factors affecting human life, including behavior and lifestyle. The terrain, with its altitude, slope, undulation, soil and land cover conditions, and climatic factors (average annual temperature, temperature ranges, precipitation, wind speed, and humidity) are categorized as environment data. As individual elements of the natural environment act in combination, a comprehensive understanding is more important than a detailed analysis of individual phenomena. The National Geographic Information Institute manages and sells digital maps detailing land characteristics and the status of land use, as well as aerial photographs and videos. The Ministry of the Environment provides data on land cover conditions, surface water quality, air pollution, and waste generation through its affiliated institution, the Environment and Geographic Information Service. In particular, users of public institutions have access to a spatial information source data download service that collects natural environment data from the Ministry of the Environment. The National Weather Service offers an analysis of monthly climate statistics, including average annual temperature, the annual temperature range, precipitation, wind speed and direction, humidity, evaporation, frost days, and foggy days, along with maps detailing weather and climate. Statistical data on climate, weather factors, and meteorological location points are provided by the National Weather Service, making it possible to conduct secondary processing using collected data. The conversion of these natural environmental data to a format that allows for comprehensive analysis makes it possible to conduct spatial analysis research on cancer, using demographic and other spatial elements.
Data on environmental risk factors related to cancer
If environmental carcinogens are naturally present or if there is a facility where carcinogens are thought to have been intentionally released, researchers must build location point data using geographical information, such as addresses. In South Korea, there are no standard definitions or lists of environmental carcinogens. It is possible to extract some mismatched information about carcinogens and environmental carcinogens under the Occupational Safety and Health Act of the Ministry of Labor and the Hazardous Chemicals Management Act of the Ministry of the Environment. If it is deemed necessary, it is possible to use lists of carcinogens developed by foreign institutions such as the WHO IARC. Because the motivation for recording location information on environmental carcinogens as point data and establishing a database generally reflects suspicions about carcinogen-related facilities, follow-up is required to improve the reliability of the big data.
Information on types of industrial facilities, carcinogenic emissions, and impacts should be entered in the resulting point data, which can be used in carcinogen-related analyses. Because carcinogenic environmental risk factors change over time, continuous updating is necessary, through the use of designated data collection periods. After data on environmental carcinogenic risk factors have been collected, the impact of the related environment or facility on the surrounding population can be evaluated through an analysis of the diffusion rate, influence, and location point where the carcinogen has been discharged.
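The point data described above could be represented roughly as follows. This is a sketch under stated assumptions: the record fields, the class name `FacilityPoint`, the coordinates, and the population figures are all hypothetical, and a fixed-radius distance buffer is used as a simple stand-in for a real diffusion analysis.

```python
import math
from dataclasses import dataclass


@dataclass
class FacilityPoint:
    """One hypothetical point record for a carcinogen-related facility."""
    name: str
    facility_type: str
    carcinogen: str
    lat: float
    lon: float
    collection_period: str  # label of the data collection period, e.g. "2015-H1"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def population_within(facility, districts, radius_km):
    """Sum the population of district centroids lying within radius_km of the facility."""
    return sum(pop for lat, lon, pop in districts
               if haversine_km(facility.lat, facility.lon, lat, lon) <= radius_km)


# Illustrative values only.
plant = FacilityPoint("Plant X", "chemical", "benzene", 37.50, 127.00, "2015-H1")
districts = [(37.50, 127.00, 1000), (37.51, 127.00, 2000), (38.00, 127.00, 5000)]
exposed = population_within(plant, districts, radius_km=5)
```

Keeping the collection period on each record makes it possible to compare successive updates of the same facility, which is the reason the field is included in this sketch.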
News monitoring of carcinogenic factors
It is very important to keep track of the way in which media reports about carcinogenic factors are made, because the public’s awareness of carcinogenic factors is influenced by media reports to a significant extent. The results of news monitoring are integrated into other data (for example, SNS social discourse data), and used to trace and analyze the path and spread of social discourse on carcinogenic factors. Monitoring the news may also provide important data for building an integrated risk management system in which the media itself is an important subject of study.
The media outlets analyzed are mainly typical news media (newspapers and TV and radio stations) in Seoul and other large cities, the top ten major daily newspapers, and prescreened and selected broadcasters in Seoul and other cities (in addition to KBS, MBC, and SBS). Data collection is carried out using the Korea Press Foundation’s media publicity database; data screening involves an exhaustive search for articles that include the keywords “carcinogen,” “carcinogen report,” or “carcinogen information.” Key items for analysis include media coverage volumes, reporting frames, reporting attitudes, and the orientation and quality of reporting.
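The keyword screening and coverage-volume count can be sketched as follows, assuming articles have already been exported as simple records. The record fields (`outlet`, `title`, `body`), the sample articles, and the function names are hypothetical; the sketch does not use any interface of the Korea Press Foundation database, and the English keywords stand in for the search terms actually used.

```python
from collections import Counter

# Screening keywords quoted in the text; the bare term also covers the two phrases.
KEYWORDS = ("carcinogen", "carcinogen report", "carcinogen information")


def matches(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def screen(articles):
    """Keep only the articles whose title or body mentions a screening keyword."""
    return [a for a in articles if matches(a["title"] + " " + a["body"])]


def coverage_by_outlet(articles):
    """Count screened articles per outlet as a simple measure of coverage volume."""
    return Counter(a["outlet"] for a in articles)


# Illustrative records only.
sample = [
    {"outlet": "Daily A", "title": "New carcinogen report released", "body": "Details follow."},
    {"outlet": "Daily B", "title": "Weather", "body": "Rain is expected."},
    {"outlet": "Daily A", "title": "Food safety", "body": "Officials shared carcinogen information."},
]
selected = screen(sample)
volume = coverage_by_outlet(selected)
```

Substring matching also picks up derived words such as "carcinogenic"; reporting frames, attitudes, and quality would still need to be coded separately.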
Analysis of discourses on social network services relating to carcinogenic factors
To collect and organize a large volume of data (various social discourses) related to carcinogenic factors and users on SNS, it is necessary to determine the methods, time period, and scope for optimal social network data mining. After inputting the mass of collected SNS data into the database for fast processing and analysis, words related to carcinogenic factors are selected and revised appropriately. Evidence-based carcinogenic factors and related SNS search terms are also selected. Only carcinogen-related terms and user information are extracted from the SNS data, and characteristics of the users’ social networks are analyzed using a variety of quantitative analysis methods. Comparisons of quantitative indicators such as density, centrality, clustering coefficients, and structural holes in the social network can reveal changes in user references to carcinogenic factors and changes in the way that social networks of users spread false information about carcinogenic factors, depending on the characteristics of each factor. We also explore how the social network-related quantitative indicators of users who mention cancer on SNS change depending on the type of cancer. After completing the quantitative analysis, the emotional quality of carcinogenic-related social discourse on SNS is analyzed using the natural language analysis method. From this research, the following qualitative research topics may be extracted:
- How do people emotionally react to news or information about various carcinogens and carcinogenic factors?
- How will these feelings change if the information turns out to be inaccurate—or is verified through scientific experiment?
- When new information on a carcinogenic food product is presented, how quickly will people’s feelings change into a particular emotional state?
Classical methods (reading, understanding, and analyzing SNS contents directly) can complement computer-aided natural language analysis in overcoming accuracy problems. In particular, a variety of qualitative manual analyses can be performed at low cost using Amazon’s Mechanical Turk service, which has been actively used in many qualitative studies in recent years, in parallel with manual analysis. In South Korea, there is a similar small-scale, survey-based service; wiki-based topic analysis is also well suited to South Korea, given its high Internet penetration and participation rates.
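Three of the quantitative network indicators named in the SNS analysis (density, centrality, and clustering coefficients) can be computed as in the following sketch. The edge list is hypothetical, the graph is treated as undirected with no self-loops, degree centrality is used as one possible centrality measure, and structural holes are omitted for brevity.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (user, user) interaction pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def density(graph):
    """Share of possible ties that are present: 2m / (n(n-1))."""
    n = len(graph)
    if n < 2:
        return 0.0
    m = sum(len(neighbours) for neighbours in graph.values()) / 2
    return 2 * m / (n * (n - 1))


def degree_centrality(graph):
    """Degree of each user divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def clustering(graph, node):
    """Local clustering coefficient: share of a user's neighbour pairs that are tied."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


# Illustrative mention/retweet ties between four hypothetical users.
g = build_graph([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])
centrality = degree_centrality(g)
```

Computing the same indicators for the sub-networks around each carcinogenic factor would allow the comparisons described above.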
Big data-based spatial analysis utilizing a geographic information system
Spatial data on people’s social environments take account of land use, traffic routes, subways, buses, parks, hospitals, and educational institutions. Natural environmental data include streams, mountains, and the green belt. Basic statistical information covers administrative districts, businesses, employees, demographic characteristics, transfer information, and local taxes. When necessary, a survey of a particular region is conducted. A survey area and target group that reflect the gender and age balance of the local population are assigned, and a survey application is installed on respondents’ smartphones. After the survey, the correlation between regional differences and the survey results, as well as spatial distribution and the socio-demographic environment, are analyzed. The smartphone application identifies the spatio-temporal patterns of movement of survey respondents, transmitting their location information to a server at predetermined time intervals, thus enabling the server to record the location of individuals over time. Classified by gender, age, and occupation, a user’s activity area (defined by hours of work, leisure time, and household residence time) is assigned to a cluster, so that the contents of risk-related communications can be compared. The individual’s exposure to environmental factors can be analyzed by studying his or her movement path. Such analyses, using GPS/GIS technology, enable a precise evaluation of environmental, human, and social factors.
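A rough sketch of how fixed-interval location records could be turned into time spent per activity area is given below. The zone names, the bounding-box coordinates, the 15-minute interval, and the function names are hypothetical; rectangular zones are a simplification of the GIS layers a real analysis would use.

```python
from collections import defaultdict

# Hypothetical activity areas as (min_lat, max_lat, min_lon, max_lon) bounding boxes.
ZONES = {
    "home": (37.500, 37.510, 127.000, 127.010),
    "work": (37.560, 37.570, 126.970, 126.980),
}


def zone_of(lat, lon, zones=ZONES):
    """Return the first zone whose bounding box contains the point, else 'other'."""
    for name, (min_lat, max_lat, min_lon, max_lon) in zones.items():
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            return name
    return "other"


def dwell_minutes(pings, interval_min, zones=ZONES):
    """Credit each location record with one reporting interval in its zone."""
    totals = defaultdict(int)
    for lat, lon in pings:
        totals[zone_of(lat, lon, zones)] += interval_min
    return dict(totals)


# Illustrative (lat, lon) records sent by the survey application every 15 minutes.
pings = [(37.505, 127.005), (37.505, 127.005), (37.530, 126.990), (37.565, 126.975)]
summary = dwell_minutes(pings, interval_min=15)
```

Replacing the activity zones with areas around known emission points would give a comparable estimate of time spent near an environmental risk factor.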
Panel survey on public awareness of carcinogens
It will be important to establish a public awareness panel survey system to correctly assess the public’s understanding of cancer-related information and explore how people perceive risk and cope when a problem arises. Such a system would reveal changes in public awareness, knowledge, and behavior related to carcinogenic factors over time, making it possible to take immediate corrective actions to counteract misconceptions and the distribution of misleading information.
The survey will evaluate the public’s understanding of carcinogenic factors, and assess attitudes and beliefs, motivations for seeking information, behavior patterns, information acquisition sources, and information sharing patterns. It will reveal the following information: the degree of public knowledge of carcinogenic factors; which personal, social networking, and regional factors make a difference in the awareness of carcinogens; who is most susceptible to information about carcinogenic factors; how quickly rumors spread; the relationship between understanding carcinogenic factors and taking action to prevent cancer; and the relationship between misperceiving carcinogenic factors and taking action to prevent cancer. This will become an important resource for those seeking to protect vulnerable people from false information and rumors related to carcinogenic factors.
Monitoring and conducting in-depth interviews with participants in social network services discourses related to carcinogenic factors
We monitor and conduct in-depth interviews with users who participate in online SNS discourses related to carcinogenic factors and carcinogens. Once user IDs have been obtained through SNS data collection, their SNS usage and responses to other user issues are examined over a period of time, and compared to previous carcinogen-related reactions. In addition, people who have posted or spread incorrect information about carcinogenic factors and carcinogens are monitored, and their general SNS usage patterns observed and compared. After analyzing issues from various angles (by groups and time periods), we interview a sample group of users before finally interpreting the results. This provides data that cannot be found through SNS content analysis alone, such as the offline sources used for information, users’ motivation for spreading incorrect information, and the medical expertise of users who participate in discourses and information spreading.
Data visualization and infographic implementation for effective risk communication
Recently, recognition of the value of infographics has been gradually spreading through the media, business, and public institutions (government, organizations, and schools). Public institutions are upgrading their public services and turning press releases into a form of infographic as part of information delivery. Data visualization, as a method of presenting big data analysis results, is essential; visualization and infographic implementation can be a persuasive tool for effectively communicating with the public about socially controversial risk factors and issues.