Data on how taxis travel through communities and on how people label points of interest on social media could help analysts and criminologists better understand neighborhood crime rates in a city, say researchers.
Analysis of points of interest in Chicago (including restaurants, shops, nightclubs, and transit stations) designated by members of FourSquare, a social media site, combined with the city’s taxi flow information, produced significantly more accurate estimates of crime rates than traditional methods.
Crime analysts currently rely mainly on demographic and geographic data to study crime and predict trends.
Big data projects could improve understanding of crime, help planners make better decisions, and allow communities and police to use their resources to fight crime more efficiently, says Jessie Li, assistant professor of information sciences and technology at Penn State.
Taxi routes are like hyperlinks, connecting different communities with each other, adds Li.
“We had this idea that taxis serve as hyperlinks because people are not only influenced by the nearby location, but they are also frequently influenced by the places they go to,” says Li. “For example, your home may be a half hour drive from your work; they are not spatially close. But you spend a lot of time there and you end up being influenced by people, such as your colleagues, there.”
Points-of-interest information may improve the analysis of crime statistics because it shows how certain areas are used and why people want to be there, according to the researchers, who presented their findings August 15 at the conference on Knowledge Discovery and Data Mining in San Francisco, California.
“According to the data, areas with nightclubs tend to be low crime areas, at least in Chicago, which may be a surprise to many,” says Li. “However, it may reflect the people’s choices to be there—they want to go to a nightclub that is safe, not one that’s dangerous.”
Li says that this study also points to how the field of big data is providing both new sources of data and new ways to explore the implications of that data.
Big data can often show correlations between the sources of data and certain effects, such as crime, which is helpful for making predictions. However, Li points out that the sources of data are not necessarily causing the effect.
“What we see here is a correlation between the taxi and points-of-interest data and crime rates,” says Li. “The data show us the correlation, but, scientifically, as far as a cause, we don’t know.”
The researchers used data on taxi trip records in Chicago, which included pickup and drop-off times and locations, operation time, and total fare amount, from October to December 2013. They also gathered 112,000 points of interest from FourSquare for the study.
Statistics on crimes in Chicago came from the city’s data portal, and demographic details included information on population, poverty, disadvantage index, and ethnic diversity.
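To make the "hyperlink" idea concrete, here is a minimal sketch of how trip records could be turned into a feature for each community. It is an illustration only, not the researchers' actual model: the function names, the three-area toy data, and the choice to weight each origin by its share of inbound trips are all assumptions made for this example.

```python
from collections import defaultdict

def flow_weights(trips):
    """Turn (pickup_area, dropoff_area) pairs into inbound-flow shares.

    Returns {destination: {origin: share of that destination's inbound trips}}.
    Trips that start and end in the same area are ignored (an assumption).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for origin, dest in trips:
        if origin != dest:
            counts[dest][origin] += 1

    weights = {}
    for dest, inbound in counts.items():
        total = sum(inbound.values())
        weights[dest] = {o: n / total for o, n in inbound.items()}
    return weights

def flow_weighted_crime(weights, crime_rate):
    """For each area, average the crime rates of the areas that send it taxis,
    weighted by each origin's share of inbound trips."""
    return {
        dest: sum(share * crime_rate[origin] for origin, share in inbound.items())
        for dest, inbound in weights.items()
    }

# Hypothetical toy data: three areas and five trips (not real Chicago records).
trips = [("A", "B"), ("A", "B"), ("C", "B"), ("B", "A"), ("C", "C")]
crime_rate = {"A": 10.0, "B": 20.0, "C": 40.0}

weights = flow_weights(trips)
lag = flow_weighted_crime(weights, crime_rate)
print(lag)
```

A feature like this could sit alongside demographic and points-of-interest counts in a regression on crime rates. As Li notes, any relationship it reveals is a correlation, not evidence of cause.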
The National Science Foundation supported this work.
Source: Penn State