Greyfox reviews ASPI report on Cultural Erasure in Xinjiang, explaining how flawed methodology leads to dubious conclusions.
(Based on https://threadreaderapp.com/thread/1469700378342334467.html)
This is a thread to examine the ASPI report on Cultural Erasure in Xinjiang ("Cultural erasure: Tracing the destruction of Uyghur and Islamic spaces in Xinjiang", Nathan Ruser, with Dr James Leibold, Kelsey Munro and Tilla Hoja, International Cyber Policy Centre, Policy Brief Report No. 38/2020).
I will focus on the problems in the methodology and statistics rather than on the conclusions and examples, but it follows that those are unreliable too, given the bad methodology.
Before diving right in, note that even ASPI recognizes on p5 that for many mosques, the dome and minarets are very recent additions, which automatically destroys the narrative that "traditional" culture is being eliminated because domes and minarets are being removed now.
Mosques across Xinjiang were rebuilt following the Cultural Revolution, and some were significantly renovated between 2012 and 2016, including by the construction of Arab- and Islamic-style domes and minarets. However, immediately after, beginning in 2016, government authorities embarked on a systematic campaign to rectify and in many cases outright demolish mosques.
ASPI also admits it could only use satellite imagery to study the mosques. This is problematic because, as ASPI itself notes, many mosques are unidentifiable from satellite images, so on-the-ground verification is key to getting useful data. No such verification was done.
The Chinese Government 2004 Economic Census identified more than 72,000 officially registered religious sites across China, including more than 24,000 mosques in Xinjiang. Given the lack of access to Xinjiang and the sheer number of sites, we used satellite imagery to build a new dataset of pre-2017 mosques and sacred sites.
The data from the 2004 Economic Census provided addresses for each of the nearly 24,000 mosque sites in Xinjiang.  However, in many cases, the addresses are imprecise and could not be clearly associated with physical buildings visible in the satellite imagery. Therefore, we used a combination of two different methods.
ASPI used 2 methods to find mosques: 2 Points Of Interest (POI) databases that identified a combined 340 mosques (273 + 67), and a manual visual approach that identified 192. That makes a total of 532 sampled, not 533 as stated.
The database was queried for the word (mosque). That yielded 1,733 mosques nationwide, including 289 in Xinjiang. Of those, 16 were excluded due to their current status or location being unclear or due to being duplicate results, leaving 273.
Finally, a second POI database from AutoNavi was queried for mosques. That found an additional 73 mosques, of which 67 were unique and not duplicates of previously examined mosques.
That resulted in 307 search areas. Mosques were found in approximately 70% of the areas; the 94 remaining search areas generally had inadequate satellite imagery to ascertain the location of the mosque, or had no clearly discernible mosque in the search area.  Finally, duplicates were removed. Later, we removed mosques for which recent satellite imagery was unavailable and the current status of which could not be ascertained.  That left a total 192 mosques found through this method.
Together, using these two methodologies and three datasets, we found a total 533 unique mosques, representing 2.25% of the official total in the region. A map of all mosques in our pre-2017 dataset is included in Figure 31.
Curiously, this is a different sampled number compared to the number in Table 1 (524). Even more interestingly, if you add up the Total Sampled column in Table 1, the total is 543. This could be a simple typo, but is typical of the sloppiness of the calculations.
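The counts quoted above can be tallied directly. The following is a minimal Python sketch that uses only figures quoted from the report; the variable names are my own, and the official total is rounded to 24,000 because the report only says "nearly 24,000".

```python
# Figures quoted from the report's methodology section.
poi_first_db = 289 - 16       # first POI database: Xinjiang hits minus exclusions
poi_second_db = 67            # unique additions from the AutoNavi database
visual_search = 192           # mosques found through manual visual search

poi_total = poi_first_db + poi_second_db
sample_total = poi_total + visual_search

stated_total = 533            # total claimed in the report
official_total = 24_000       # approximation of "nearly 24,000" (assumption)

search_areas = 307
areas_without_mosque = 94
hit_rate = (search_areas - areas_without_mosque) / search_areas

print(f"POI databases: {poi_total}")
print(f"All methods:   {sample_total} (report states {stated_total})")
print(f"Share sampled: {sample_total / official_total:.2%}")
print(f"Search areas with a mosque found: {hit_rate:.1%}")
```

The tally gives 340 for the POI databases and 532 overall, one short of the stated 533. Since Table 1 is not reproduced here, its 524 and 543 figures are not checked in this sketch.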
Consider the 2 POI databases.
It is unclear how representative these databases are. Mosques in Urumqi were over-represented in ASPI's sample, so these databases were probably biased towards populated areas, which may skew results. For example, mosques in rapidly urbanizing areas are more likely to be under renovation or to have been moved to a new location. ASPI searched these databases using "mosque", which may not return every mosque, since many mosques do not have "mosque" in their name, as the listing in link 2 shows.
It is unclear whether every mosque with "mosque" in its Chinese name would be a hit in the ASPI search. It is also possible that these POI databases are mislabelled and inaccurate, so on-the-ground confirmation would be essential.
This sampling method is therefore very unreliable.
The sampling method using manual visual searches for mosques is also subject to many errors, as ASPI itself notes in the passage below. The 5-minute limit on each search would increase the error rate, particularly given the problems of identifying mosques via satellite.
Each search area was searched for approximately 5 minutes unless there were only a small number of structures in the search area. The fact that we did not find a mosque does not mean that there was not one present. We often found structures that we suspected to be mosques, but we could not determine for sure based on available satellite imagery.
The assessment of whether the mosques were destroyed or damaged was done via visual inspection using satellite images, which makes the results unreliable when there is no confirmation on the ground.
Once we compiled the pre-2017 dataset of mosque locations, each one was then visually compared to recent satellite imagery (generally mid-2019 to 2020). We recorded its current status, changes since 2017 and, where available, date ranges for the demolition or removal of Islamic architecture.  For undamaged sites, we recorded the date of the last available satellite imagery so that follow-up studies can be prioritised to look at the oldest sites. We generally accessed satellite images via Google Earth; where Google Earth did not have sufficient satellite imagery, we used other commercial sources with 30-50-centimetre resolution.
This is also just a snapshot in time, and the situation may change over time. The status of a mosque is also determined without consideration of whether it has been relocated or is currently undergoing renovations. This will skew the results towards more mosques being classified as "destroyed" or "damaged", even though they are not. The classification of "slightly" or "significantly" damaged is also subjective. Therefore, this classification system is too subjective and unreliable.
In some cases, we based the distinction between "slightly damaged" and "significantly damaged" on an assessment of how important the removed features were to the mosque's structure and aesthetics. For example, a mosque with only a small dome that had been removed would be coded as slightly damaged, despite the fact that all Islamic architecture on the structure had been removed, as the dome wasn't a significant element in the building's earlier aesthetics.
It is also worth noting that many important mosques do not have a dome or minarets, again showing that these features alone do not indicate anything. (See the thread from Kyle, link 3.) Some buildings in China also have domes but are not mosques, and it's unclear if ASPI adjusts for these.
Their claim to be 95% confident that the destroyed and damaged mosque numbers fall within a 4% margin is only valid if the mosques were sampled completely at random.
Extrapolating those figures on a prefectural level from official statistics allowed us to estimate the full number of destroyed and damaged mosques in Xinjiang. We found that across the XUAR approximately 16,000 mosques have been damaged or destroyed and 8,450 have been entirely demolished. The 95% confidence range of our regional findings is 4% for the estimates of demolished, destroyed and undamaged mosque numbers. The full prefectural breakdown is shown in Table 1 and Figure 3.
However, we can see that the distribution of sampled mosques is not proportional to the number of mosques in each area. I tried to calculate the probability of such a sample distribution arising by chance, and for many areas the probability that the sample was random is effectively 0%.
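One way to run this kind of check: if the 533 mosques were a simple random draw from all mosques in Xinjiang, the number landing in any one prefecture would follow a hypergeometric distribution. The sketch below computes the chance of seeing at least the observed count. The hypergeometric model is an assumption chosen for illustration, and the prefecture figures in the example are hypothetical placeholders, not values from Table 1.

```python
from math import comb

def upper_tail_probability(population, in_prefecture, sample_size, observed):
    """P(X >= observed) when sample_size mosques are drawn at random,
    without replacement, from `population` mosques of which
    `in_prefecture` lie in the prefecture of interest."""
    total_ways = comb(population, sample_size)
    upper = min(in_prefecture, sample_size)
    favourable = sum(
        comb(in_prefecture, k) * comb(population - in_prefecture, sample_size - k)
        for k in range(observed, upper + 1)
    )
    # Exact integer arithmetic until the final division.
    return favourable / total_ways

# Hypothetical prefecture: 400 of 24,000 mosques, but 150 of the 533 sampled.
expected = 533 * 400 / 24_000
p = upper_tail_probability(24_000, 400, 533, 150)
print(f"expected in sample: {expected:.1f}, P(at least 150) = {p:.3g}")
```

With numbers like these, a random draw would put about 9 mosques from the prefecture in the sample, so an observed 150 has a vanishingly small probability. Substituting the real Table 1 counts would give the actual figure for each prefecture.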
Therefore, the confidence interval for their results is wrong. If we look at individual areas, we see that they fall far short of the required sample number for their stated confidence level in each area. The implied error margin is also very high in many areas.
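To show the sample sizes involved, here is a sketch using the standard normal-approximation formulas for a proportion under simple random sampling, with a finite population correction. The worst-case proportion of 0.5 and the prefecture in the example (2,000 mosques, 20 sampled) are assumptions for illustration, not figures from the report.

```python
from math import ceil, sqrt

Z_95 = 1.96  # z-score for a 95% confidence level

def margin_of_error(sample_size, population, proportion=0.5, z=Z_95):
    """Half-width of the confidence interval for a proportion,
    assuming simple random sampling without replacement."""
    fpc = sqrt((population - sample_size) / (population - 1))
    return z * sqrt(proportion * (1 - proportion) / sample_size) * fpc

def required_sample_size(margin, population, proportion=0.5, z=Z_95):
    """Smallest sample that achieves the given margin of error."""
    n0 = z**2 * proportion * (1 - proportion) / margin**2
    return ceil(n0 / (1 + (n0 - 1) / population))

# Region-wide: 533 sampled out of roughly 24,000 mosques.
print(f"regional margin: {margin_of_error(533, 24_000):.1%}")

# Hypothetical prefecture: 2,000 mosques, 20 of them sampled.
print(f"prefecture margin: {margin_of_error(20, 2_000):.1%}")
print(f"needed for 4%: {required_sample_size(0.04, 2_000)}")
```

A margin of about 4% is what 533 randomly sampled mosques would give for the region as a whole, and only if the sample really is random. For a single prefecture with a few dozen sampled mosques, the margin is several times larger.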
The extrapolation method is very poor.
Firstly, the extrapolation should consider the margin of error for each area and give a suitable range of possible answers. This was not done. Secondly, there were areas where they referred to the overall average in their extrapolation. This is an invalid extrapolation method, because it will just reproduce the average they already have in mind. It is not an independent way of producing an answer.
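The first point can be sketched as follows: rather than multiplying a single sampled rate by the number of mosques, compute an interval for the rate and extrapolate both ends. This example uses the Wilson score interval, which is my choice and not something the report uses, and it ignores the finite population correction. The prefecture in the example (2,000 mosques, 8 of 20 sampled mosques destroyed) is hypothetical.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def extrapolated_range(destroyed_in_sample, sample_size, mosques_in_prefecture):
    """Low and high estimates of destroyed mosques in a prefecture."""
    low, high = wilson_interval(destroyed_in_sample, sample_size)
    return round(low * mosques_in_prefecture), round(high * mosques_in_prefecture)

# Hypothetical prefecture: 8 of 20 sampled mosques destroyed, 2,000 mosques in total.
point_estimate = round(8 / 20 * 2_000)
low_count, high_count = extrapolated_range(8, 20, 2_000)
print(f"point estimate: {point_estimate}, plausible range: {low_count} to {high_count}")
```

With a sample this small, the range spans several hundred mosques either side of the point estimate, which is the uncertainty a single extrapolated number hides.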
Therefore, we can see that at each step, ASPI used biased or erroneous methods to do their analysis on mosques, so their results are unreliable. ASPI also did some calculations for cultural sites, but they did not share the data, so it is not possible to verify the reliability of those figures.
Extrapolation for the total number of destroyed and damaged mosques across Xinjiang was done at the prefectural level, which accounted for the majority of variation within the sampled data. For the 11 prefectures that were represented by over 2.5% of their total mosques, we directly extrapolated using the sampled data; for example, in Urumqi, where 38% of mosques were sampled, 17% were destroyed, so we extrapolated that 17% of all mosques had been destroyed.
For the remaining prefectures with under 2.5% of all mosques sampled, the extrapolation was guided equally, using both the prefectural rates of destruction and the Xinjiang-wide rates (excluding Urumqi, an outlier in our sample and dramatically overrepresented). For example, if a prefecture with fewer than 2.5% of mosques sampled had 40% of all sampled mosques destroyed, but the Xinjiang-wide rate was only it would be extrapolated that 35% of all mosques had been destroyed in the prefecture.
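Read literally, "guided equally" suggests a simple average of the prefectural rate and the Xinjiang-wide rate. The sketch below assumes exactly that. The quoted passage gives no explicit formula, so the function, its name and its handling of the 2.5% threshold are my interpretation.

```python
def extrapolated_rate(prefecture_rate, regional_rate, share_sampled, threshold=0.025):
    """Destruction rate applied to a prefecture under the quoted rule,
    ASSUMING 'guided equally' means an unweighted average."""
    if share_sampled > threshold:
        # Well-sampled prefecture: the sampled rate is used directly.
        return prefecture_rate
    # Under-sampled prefecture: the sampled rate is pulled halfway
    # toward the region-wide rate. (The quoted text does not say what
    # happens at exactly 2.5%; this sketch treats it as under-sampled.)
    return (prefecture_rate + regional_rate) / 2

# Urumqi, as quoted: 38% of mosques sampled, 17% of those destroyed.
# The regional rate is irrelevant here; 0.30 is a placeholder.
print(extrapolated_rate(0.17, 0.30, share_sampled=0.38))

# Under-sampled prefecture with a 40% sampled rate and an assumed 30% regional rate.
print(extrapolated_rate(0.40, 0.30, share_sampled=0.01))
```

Under this reading, the report's own example (40% sampled rate, 35% extrapolated) would correspond to a Xinjiang-wide rate of 30%. It also shows the problem: every under-sampled prefecture is dragged toward the regional figure, so the extrapolation partly reproduces the average it started from.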
The report itself presents evidence that some mosques are preserved despite development around them. This blows up their narrative and shows again that a simple satellite analysis is insufficient to understand the situation.
(Image captions: no minaret, no "mosque" in name; minaret, no "mosque" in name; no minaret, "mosque" in name.)