Ghost Cities of China
2017
Project Team:
Sarah Williams, Wenfei Xu, Michael Foster, Shin Bin Tan, Changping Chen
After years of economic growth in the 1990s and 2000s, housing vacancy has become a serious concern for many second- and third-tier Chinese cities experiencing an economic slowdown, which some say started as early as 2008 during the global economic crisis. These vacant areas are often referred to as “Ghost Cities” and expose underlying weaknesses in the Chinese real estate market, yet identifying their locations is difficult because it is nearly impossible to obtain data about these sites from the Chinese government.
Given the difficulty of measuring Chinese real estate risk with government data, this research project set out to test whether easily obtainable data scraped from social media, together with points-of-interest data provided on Chinese websites, can identify these vacant and undeveloped areas in second- and third-tier cities. Data was scraped from Dianping (Chinese Yelp), Amap (Chinese MapQuest), Fang (Chinese Zillow), and Baidu (Chinese Google Maps) using open-access APIs. It was then used in a model developed to identify vacancy by measuring a residential point of interest’s access to basic amenities such as grocery stores, restaurants, schools, malls, and banks.
In this project, we collected data for over 20 cities to create a granular model for residential vacancy discovery. After we created our ghost cities model, we went to China to ground-truth our information. We also consider this an important part of the model-building process, as we were able to validate some of the results from our model firsthand and to develop a deeper understanding of the issue from key stakeholders.
Cities are separated into an 'urban' and a 'suburban' group based on their population distribution, and each group's amenities scores are calculated and clustered based on spatial autocorrelation. The scores come from a gravity model built on nearby amenities that are essential to a lively community: each residential area in the city is given an amenities score, and the scores are then filtered for ghost city candidates. Ultimately, the research results show that openly accessible data available through social media can help locate and estimate risk in the Chinese real estate market. Perhaps more importantly, identifying where these areas are concentrated can help city planners, developers, and local citizens make better investment decisions and address the risk created by these under-utilized developments. The results represent a data set, built outside the government, of the transitional, underperforming, or vacant housing stock in the Chinese market, and they show how open-source and social media data can be used to understand the urban condition.
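To make the idea of a gravity-based amenities score concrete, here is a minimal sketch in Python. It is not the project's actual model. The distance-decay exponent `beta`, the distance limits, the amenity weights, the sample coordinates, and the cutoff are all assumptions chosen for illustration, and the function names are hypothetical. The sketch only shows the general shape of the technique: each residential point accumulates a score from nearby amenities, with closer amenities counting for more, and low-scoring points are flagged as candidates.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def amenity_score(residence, amenities, beta=2.0, min_km=0.1, max_km=3.0):
    """Gravity-style score: sum of weight / distance**beta over nearby amenities.

    beta, min_km and max_km are illustrative assumptions, not project values.
    """
    score = 0.0
    for lat, lon, weight in amenities:
        d = haversine_km(residence[0], residence[1], lat, lon)
        if d > max_km:
            continue  # ignore amenities too far away to serve the residence
        score += weight / max(d, min_km) ** beta  # clamp to avoid division by ~0
    return score

def ghost_candidates(scores, cutoff):
    """Return the names of residential points whose score falls below the cutoff."""
    return [name for name, s in scores.items() if s < cutoff]

# Hypothetical sample data: (lat, lon, weight) for a grocery store, school and bank.
amenities = [
    (30.000, 120.000, 1.0),
    (30.002, 120.001, 1.0),
    (30.001, 120.003, 0.5),
]
residences = {
    "lively": (30.001, 120.001),    # a few hundred metres from all amenities
    "isolated": (30.100, 120.100),  # more than 10 km from any amenity
}

scores = {name: amenity_score(pt, amenities) for name, pt in residences.items()}
candidates = ghost_candidates(scores, cutoff=1.0)  # cutoff is an arbitrary example
```

In this toy data the isolated point has no amenity within `max_km`, so it scores zero and is the only candidate returned. A real pipeline would also need the urban and suburban grouping and the spatial clustering step described above, which this sketch leaves out.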