SOCIAL SENSING
Introduction to Social Sensing
Online social media, such as Twitter and Instagram, have democratized information broadcast, allowing anyone to share information about themselves and their surroundings at an unprecedented scale. The large volume of information posted on these media offers a new lens into the physical world through the eyes of the social network. The use of this lens to inspect aspects of the state of the world has recently been termed social sensing.
The power to manipulate reality through the use (or intentional misuse) of social media has raised concerns ranging from radicalization by terror propaganda to potential manipulation of elections in mature democracies. Many important challenges and open research questions arise in this emerging field, which aims to better understand how information can be extracted from the medium and what properties characterize the extracted information and the world it represents. Addressing these challenges requires multidisciplinary research at the intersection of computer science and the social sciences, combining cyber-physical computing, sociology, sensor networks, social networks, cognition, data mining, estimation theory, data fusion, information theory, linguistics, machine learning, behavioral economics, and possibly others.
Research Directions and Applications of Social Sensing
Trust and Credibility Analysis
Online social media platforms (e.g., Twitter, Flickr, Facebook, Foursquare) are designed as open data-sharing platforms for ordinary people. This creates an ideal scenario for unreliable content from a large number of unvetted human sources. Given the massive number of Twitter users (e.g., 284 million monthly active users) and the tweets they post (e.g., half a billion tweets per day), it is not simple to determine the trustworthiness of sources and the credibility of their tweets. It would therefore be interesting and important to develop new trust and credibility analysis tools that obtain accurate and credible information from noisy and unfiltered social sensing data.
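One simple way to approach this problem is to alternate between scoring claims by the trust of the sources that report them and scoring sources by the claims they report. The sketch below illustrates that idea under stated assumptions. It is not a method taken from the works cited here; the function name estimate_credibility, the (source, claim) input format, and the normalization step are all hypothetical choices.

```python
from collections import defaultdict


def estimate_credibility(reports, iterations=10):
    """Jointly estimate source trust and claim credibility.

    reports: iterable of (source, claim) pairs meaning "source asserted claim".
    Returns (trust, score): two dicts with values in (0, 1].
    Illustrative assumption: a claim is more credible when more trusted
    sources assert it, and a source is more trusted when its claims are credible.
    """
    claims_by_source = defaultdict(set)
    sources_by_claim = defaultdict(set)
    for source, claim in reports:
        claims_by_source[source].add(claim)
        sources_by_claim[claim].add(source)

    if not sources_by_claim:
        return {}, {}

    # Start with no prior knowledge: every source is equally trusted.
    trust = {s: 0.5 for s in claims_by_source}
    score = {}
    for _ in range(iterations):
        # Claim score: total trust of its supporters, scaled so the best claim is 1.
        raw = {c: sum(trust[s] for s in srcs) for c, srcs in sources_by_claim.items()}
        top = max(raw.values())
        score = {c: v / top for c, v in raw.items()}
        # Source trust: average score of the claims the source asserted.
        trust = {
            s: sum(score[c] for c in cs) / len(cs)
            for s, cs in claims_by_source.items()
        }
    return trust, score


reports = [
    ("a", "bridge closed"),
    ("b", "bridge closed"),
    ("c", "bridge closed"),
    ("d", "aliens landed"),
]
trust, score = estimate_credibility(reports)
```

In this toy input the claim backed by three sources ends up with the top score, and the lone source of the other claim loses trust on every iteration. A scheme this simple favors whatever the majority repeats, so it would be misled by coordinated or heavily retweeted false reports.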
Disaster Reporting and Event Tracking
Due to the popularity and penetration of online social media, people now use them to report the status of disasters and emergency events. For example, in the Boston Marathon bombing in April 2013, the first "report" of the bombing actually came from a tweet posted by a witness at the scene. The timestamp of that tweet is the exact moment the first explosion happened. The rich set of social sensing data in disaster scenarios offers great opportunities to develop real-time situation awareness tools that detect and track the status of disasters in a reliable and timely fashion. Such tools could greatly assist the government in dispatching rescue teams, allocating important resources, and getting useful feedback from ordinary citizens in the aftermath of a disaster.
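A basic building block for such a tool is detecting when the volume of posts about a topic suddenly jumps. The following is a minimal sketch of one way to do that, assuming posts have already been counted per time bucket (for example, per minute). The function name detect_bursts, the window size, and the threshold rule are illustrative assumptions, not part of any system described here.

```python
from statistics import mean, pstdev


def detect_bursts(counts, window=5, k=3.0):
    """Return indices of time buckets whose post count is unusually high.

    counts: number of posts in each consecutive time bucket.
    A bucket is flagged when it exceeds the mean of the previous `window`
    buckets by more than k standard deviations. The deviation is floored
    at 1.0 so that a perfectly flat history does not flag tiny changes.
    """
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        spread = max(pstdev(history), 1.0)
        if counts[i] > baseline + k * spread:
            bursts.append(i)
    return bursts


posts_per_minute = [4, 5, 6, 5, 4, 5, 60, 55, 6]
print(detect_bursts(posts_per_minute))  # prints [6]
```

Only the onset of the spike is flagged here, because the spike itself raises the baseline for the buckets that follow. A tracking tool would need additional logic to follow an event after it has been detected.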
Social Media Command Center for Business Intelligence
Large companies (e.g., Dell, Cisco, Wells Fargo) and airlines (e.g., Delta, Southwest) have recently started to build dedicated business intelligence teams called social media command centers (SMCCs). In an SMCC, the company's social media team monitors online social media and engages in the social conversation around its brand and market. An SMCC allows real-time monitoring of trends in marketing efficiency, customer service and feedback, and risk management, making it easy for passing executives to gauge the social health of the brand at a glance. It would therefore be an interesting task to build your own version of a social media command center for your favorite brand or company using freely available online social media data.
A New Personalized Information Subscription Service
Much like Google News aggregates headlines from relatively reliable news sources (e.g., popular news websites) to provide readers with a personalized news subscription service, it would be very interesting to develop a new information subscription service that leverages the rich set of real-time information embedded in online social media and explores the collective wisdom of ordinary individuals. One major challenge in providing this service is how to efficiently distill and organize information contributed by diverse and unreliable sources, and how to summarize it to a degree that each subscriber finds comfortable to read and trust.
Real-time Data Analytics
Making sense, in a timely manner, of huge volumes of social sensing data streams coming from a complex and highly dynamic environment is a big challenge. It would be very interesting to build a new data analysis engine that efficiently organizes a firehose of heterogeneous streaming data feeds and delivers reliable information with real-time guarantees. Several important problems need to be addressed in order to develop such an engine. For example, how can we distribute data streams over clusters and compute results in a way that optimizes estimation accuracy while minimizing analysis time? How can we develop an efficient distributed data analysis algorithm that outputs almost the same results as the centralized version, but much faster?
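For some analyses the distributed and centralized answers can be made identical, because the partial results computed on each shard can simply be merged. The sketch below shows this for counting hashtags. It is an illustration under assumptions: the function names are hypothetical, and a local thread pool stands in for the nodes of a cluster, so it demonstrates the structure of the computation rather than any real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def shard(items, n_shards):
    """Split a list into n_shards lists in round-robin order."""
    return [items[i::n_shards] for i in range(n_shards)]


def distributed_count(items, n_shards=4):
    """Count items by computing one partial Counter per shard, then merging.

    Each worker only sees its own shard. Counts are additive, so merging
    the partial results gives exactly the centralized answer.
    """
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        partials = list(pool.map(Counter, shard(items, n_shards)))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


stream = ["#quake", "#traffic", "#quake", "#rain", "#quake", "#traffic"]
assert distributed_count(stream, n_shards=3) == Counter(stream)
```

Counting is an easy case because its partial results combine exactly. Estimators that depend on the whole data set at once do not split this cleanly, which is where the trade-off between accuracy and analysis time raised above becomes a real design question.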
Multi-genre Network Analysis
A comprehensive understanding of multi-genre networks (e.g., social networks, information networks, and physical networks) will play a critical role in future social sensing applications. For example, a recent heavy traffic jam on a major southern California freeway, detected by the deployed sensor network (i.e., the physical network), co-occurred with unusual bursts of traffic on Twitter (i.e., the social network) around the same location. The contents of the tweets offered the first clear explanation of the traffic jam: a local tax-related protest demonstration. It would be interesting to develop new techniques that automatically unearth new information by exploring data correlations across multi-genre networks and provide more effective solutions for decision makers.
Big Data Processing and Storage
In just one minute, more than 350,000 new tweets are posted on Twitter, 700,000 status updates are made on Facebook, more than 3,500 images are added to Flickr, and 100 hours of video are uploaded to YouTube. Online social media are creating a deluge of information that greatly exceeds the capacity of humans to consume it. This deluge creates an urgent need for big data techniques that process and store data from online social media efficiently and effectively. It would be interesting to develop novel algorithms and schemes that leverage state-of-the-art distributed systems and cloud computing paradigms (e.g., Hadoop, Amazon EC2) to tackle the big data challenge in social sensing.
Detecting and Reducing Redundant Information
Given the large amount of data generated in social sensing applications, the amount of duplicate content, and with it the demand for redundancy reduction, is increasing tremendously. For example, Twitter users can easily repeat information from others by using the simple "Retweet" function. Alternatively, some users may rephrase what they have read or learned and post a "new" tweet in a slightly different form. Such redundant information puts a heavy burden on users of micro-blogging services when they search for new content. It would be interesting to develop duplicate detection and redundancy reduction schemes for social sensing applications that dramatically reduce the various kinds of duplicates and diversify search results.
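Both kinds of duplicate described above, exact retweets and light rephrasings, can be caught by comparing the sets of words in two posts. The following is a minimal sketch using Jaccard similarity. The function names, the retweet-prefix pattern, and the 0.6 threshold are assumptions chosen for illustration, not values taken from any deployed system.

```python
import re


def normalize(text):
    """Strip a leading 'RT @user:' prefix and return the set of lowercase tokens."""
    text = re.sub(r"^RT @\w+:\s*", "", text.strip())
    return set(re.findall(r"[a-z0-9#@']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(posts, threshold=0.6):
    """Keep a post only if it is not too similar to any post already kept."""
    kept = []  # list of (original text, token set)
    for post in posts:
        tokens = normalize(post)
        if all(jaccard(tokens, seen) < threshold for _, seen in kept):
            kept.append((post, tokens))
    return [text for text, _ in kept]


posts = [
    "Bridge on Main St is closed due to flooding",
    "RT @alice: Bridge on Main St is closed due to flooding",
    "Main St bridge closed due to flooding",
    "Power is out downtown",
]
print(deduplicate(posts))
# prints ['Bridge on Main St is closed due to flooding', 'Power is out downtown']
```

Comparing every post against every kept post is quadratic in the worst case, so this approach only suits small batches. Word overlap also misses paraphrases that share few words with the original.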
Geo-location and Spatial Distribution Problem
Understanding the spatial-temporal distribution of social sensing data is very important in many real-world applications (e.g., disaster tracking, geotagging, crowdsensing). However, many participants choose to disable the geo-location features of their social sensing apps because location data are sensitive, especially when coupled with temporal information. For example, normally fewer than 1% of tweets have accurate geo-location information (i.e., GPS coordinates) embedded. Approximately 25% of users list a user location as granular as a city name, and these entries contain a non-trivial number of errors and ambiguities (e.g., the same city name occurring in different states). It would therefore be very interesting to develop location inference systems that accurately estimate the likely locations of social sensing data by combining deeper content analysis (e.g., text mining) with available background knowledge (e.g., a mapping from specific words to given locations).
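The word-to-location mapping mentioned above can be sketched as a simple voting scheme: each location-specific word in a post votes for a place, and the self-reported profile location is used only as a fallback. This is a minimal illustration under assumptions. The LOCATION_HINTS table is a small hypothetical example, and a real system would need to learn such a mapping from data and handle ambiguous words.

```python
import re
from collections import Counter

# Hypothetical word-to-location hints, for illustration only.
LOCATION_HINTS = {
    "fenway": "Boston, MA",
    "mbta": "Boston, MA",
    "bart": "San Francisco, CA",
    "embarcadero": "San Francisco, CA",
}


def infer_location(text, profile_location=None, hints=LOCATION_HINTS):
    """Guess a location for a post that has no GPS coordinates.

    Each word found in `hints` casts one vote for its location. If no word
    matches, fall back to the user's self-reported profile location, which
    may itself be missing (None) or ambiguous.
    """
    votes = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in hints:
            votes[hints[word]] += 1
    if votes:
        return votes.most_common(1)[0][0]
    return profile_location


print(infer_location("Stuck on the MBTA near Fenway again"))  # prints Boston, MA
```

Preferring content over the profile field is a design choice made here because the profile location is described above as error-prone. A more careful system would weigh both sources rather than use one as a fallback.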
Assembling Information from Structured and Unstructured Data
Data generated in social sensing can be heterogeneous in modality (i.e., both structured and unstructured). For example, structured data could be numerical readings from the sensors on participants' smartphones. Unstructured data could be a piece of free text or an image that a user uploads to Twitter or Flickr to describe the current situation in his or her surroundings. Different tools and techniques have been developed to process and analyze structured and unstructured data separately. However, it remains a big challenge to explore correlations across data types and to fuse useful information from both. It would be interesting to develop new data processing and inference systems that are capable of assembling information from both structured and unstructured data for social sensing applications.
Citations:
Retrieved from: https://www3.nd.edu/~dwang5/courses/spring17/
Zhang, Daniel, et al. "A real-time and non-cooperative task allocation framework for social sensing applications in edge computing systems." 2018 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS). IEEE, 2018.
Aggarwal, Charu C., and Tarek Abdelzaher. "Social sensing." Managing and mining sensor data. Springer, Boston, MA, 2013. 237-297.