Forum: Consider costs of being overprotective of data


I appreciate the Smart Nation and Digital Government Office's (SNDGO) reply to my letter (Concerted effort made to expand data available to public, Aug 3). In response, I offer four points.

First, releasing more data does not mean releasing useful data. While there may be a plethora of data sets available to the public, this data is typically not useful for robust analyses that will lead to policy co-creation.

Senior Minister Tharman Shanmugaratnam said in 2018 that we should be careful of "single-factor correlations". Unfortunately, one can do little more than that with the data on sites like data.gov.sg.

Second, re-identification is a real risk, but one which can be reasonably addressed. For example, one could release survey data on individuals using age bands (51 to 60, 61 to 70 and so on) instead of releasing their exact ages. Employing such methods would make it harder to re-identify individuals.
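The age-band idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not a method used by any agency: the function names, the field names and the sample respondents are all invented here, and the bands are assumed to be 10 years wide and to start at ages ending in 1, as in the example above.

```python
def age_band(age: int, width: int = 10) -> str:
    """Map an exact age to a band such as '51 to 60' (assumed banding scheme)."""
    if age < 1:
        raise ValueError("age must be at least 1")
    # Bands start at 1, 11, 21, ... so that 51 to 60 falls in a single band.
    lower = ((age - 1) // width) * width + 1
    upper = lower + width - 1
    return f"{lower} to {upper}"


def band_records(records: list[dict]) -> list[dict]:
    """Return copies of the records with the exact 'age' replaced by 'age_band'."""
    banded = []
    for record in records:
        # Drop the exact age so it never appears in the released copy.
        released = {k: v for k, v in record.items() if k != "age"}
        released["age_band"] = age_band(record["age"])
        banded.append(released)
    return banded


# Hypothetical respondents, invented for illustration only.
respondents = [
    {"age": 53, "wellbeing": 4},
    {"age": 60, "wellbeing": 3},
    {"age": 61, "wellbeing": 5},
]
released = band_records(respondents)
```

Here the first two respondents end up in the same band, so their released records no longer distinguish them by age. Banding alone does not rule out re-identification, since other fields in a record may still single a person out.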

Secure data labs and similar arrangements could then be reserved for researchers who wish to access more sensitive data, such as geographical location. Such practices are not uncommon elsewhere. Data repositories around the world manage to store sensitive data sets securely, while making as much of the data publicly available as possible. It is not an all-or-nothing situation.

Third, most survey data held by government agencies is unlikely to threaten Singapore's national interests. For instance, the National Youth Survey and the Marriage and Parenthood Survey are studies that look at societal trends in youth participation and marriage. They contain relatively harmless measures on areas such as social activity and general well-being that are of interest to many Singaporeans.

Finally, the Government should seriously consider the costs of being overprotective of data. Researchers can apply for various grants to collect survey data, but many of these studies overlap substantially with existing data already held by government agencies. This is a waste of taxpayers' money, given how expensive the data is to collect.

A "careful and calibrated approach" sounds good in the abstract, but over-protectiveness compromises the timeliness of research findings.

In my personal experience, government agencies are often slow and opaque in the process of collaborating with independent researchers.

Over-protectiveness can also lead to groupthink. SNDGO refers to "commissioned researchers", but if data is shared only with people whom the Government deems "safe", doesn't that defeat the purpose of policy co-creation with the public?

Shannon Ang (Dr)


A version of this article appeared in the print edition of The Straits Times on August 05, 2020, with the headline Forum: Consider costs of being overprotective of data.