"AI's Tipping Point: A Reminder on the Importance of Privacy and Ethics" -- Privacy Commissioner's article contribution at Hong Kong Lawyer (June 2023)

The topic of Artificial Intelligence (AI) has recently been dominating the headlines, particularly with the emergence of Generative AI-powered chatbots such as OpenAI’s ChatGPT, Google’s Bard, Microsoft’s Bing Chat, Baidu’s ERNIE Bot and Alibaba’s Tongyi Qianwen. These powerful language tools can generate human-like responses and revolutionise the way we communicate and interact with technology. That said, in March 2023, thousands of AI experts, academics and business leaders signed an open letter to “call on all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4”, pending the development and implementation of a set of shared safety protocols for advanced AI design and development. It is therefore high time to revisit the implications of the use of AI for privacy and ethical values, and to set out the relevant considerations for ensuring that AI is developed and used in a responsible manner.

Generative AI: A Game Changer

According to McKinsey, “Generative AI” is generally defined as “algorithms that can be used to create new content, including audio, code, images, text, simulations, and videos”. Unlike earlier forms of AI, which focused on automation or on decision-making through the analysis of big data and were less visible to the public, Generative AI has quickly become the talk of the town thanks to its “magical” capability to respond to almost any request and create new, convincingly human content from prompts, as well as its accessibility in the form of chatbots, search engines and image-generating online platforms.

Generative AI has the revolutionary potential to transform different industries by increasing efficiency and uncovering novel insights. Tech giants have reportedly been exploring Generative AI models and applying them to their productivity software, which could benefit countless businesses downstream. General-knowledge AI chatbots based on Large Language Models (LLMs), such as ChatGPT, can increase efficiency by assisting with drafting documents, creating personalised content and business ideas, providing insights in response to enquiries, and more. The legal industry is not immune to this transformation: some law firms have started to use Generative AI to automate and enhance various aspects of legal work, such as contract analysis, due diligence, litigation and regulatory compliance.

With Growth Comes Risks

Examining Generative AI without rose-coloured spectacles, however, reveals that it also presents a myriad of privacy and ethical challenges.

Privacy Risks

AI chatbots based on LLMs differ from less advanced forms of AI built on supervised machine learning: they leverage deep learning technology to analyse and learn from massive amounts of unstructured data without supervision. The training data often comes from public text extracted from the Internet, which may include sensitive personal data or even trivial postings made online. For instance, the developer of ChatGPT reportedly scraped as many as 300 billion words from the Internet to train the model. As many AI developers keep their datasets proprietary and disclose few details about how the data was collected, there is a risk that data protection laws, which typically require personal data to be collected in a fair manner and on an informed basis (such as Data Protection Principles (DPP) 1 and 5 of the Personal Data (Privacy) Ordinance (PDPO)), may be circumvented.

The inputs to and outputs of AI chatbots may also give rise to privacy problems. User conversations may become new training data for the AI models: users might inadvertently feed sensitive information into the AI systems, which is then susceptible to misuse beyond the original purpose of collection, contravening the use limitation principle (DPP3 of the PDPO). An AI chatbot may also produce an output containing personal data that has been taken out of its original context and/or misinterpreted.

In addition, Generative AI developers may run into challenges concerning the rights of data subjects to access and correct their personal data (DPP6 of the PDPO) and the retention of personal data (DPP2 and section 26 of the PDPO). If outdated and/or inaccurate personal data formed part of the AI’s training data and has become embedded in the LLM, acceding to requests for access to, correction of, or deletion of such data could be difficult, if not impossible.

Furthermore, the data security risks of storing large volumes of conversations in an AI chatbot’s model and database should not be overlooked. Even without malicious external threats, accidental leakage alone could be damaging. As recently as March 2023, ChatGPT suffered a major data breach that exposed the conversation history titles, names, email addresses and last four digits of the credit card numbers of some of its users.

Needless to say, it is crucial to ensure that personal data is protected against unauthorised or accidental access, processing, erasure, loss or use (DPP4 of the PDPO).

Wider Ethical Risks

The “garbage in, garbage out” problem has always been an issue for AI models. In my view, it is particularly worrying in AI chatbots, which may confidently provide incorrect yet seemingly plausible information, a phenomenon experts refer to as “hallucination”. On one occasion, I confronted a chatbot by pointing out that the answer it had given in response to my question was incorrect, and all I received was an instant reply: “Sorry, I made a mistake.” Inaccurate information, such as erroneous medical advice, can lead to serious unintended consequences for human users.

To further complicate the picture, the ethical risks of discriminatory or biased output arising from the use of Generative AI cannot be overlooked. As a reflection of the real world, the training data for AI models may embed elements of bias and prejudice (such as those relating to racial, gender and age discrimination). Such data would be “baked into” the AI models, which in turn generate discriminatory or biased content.

Lastly, an unavoidable conundrum of developing a general-purpose AI model is the risk of exploitation by bad actors. A case in point is the “deepfake”, in which fake audio, images or videos are synthesised and potentially used to spread fake news or harmful propaganda. AI chatbots could also be asked to generate code for malware.

All of these risks highlight the need for concrete efforts to address the potential misuse of AI and to develop effective safeguards against exploitation.

Regulatory Landscape of AI

On the regulatory front, in the Mainland, the “Provisions on the Administration of Deep Synthesis of Internet-based Information Services”, which regulate deep synthesis service providers, operators and users, came into force in January 2023. In April 2023, the Cyberspace Administration of China (CAC) also issued the draft “Measures for the Management of Generative AI Services” for public consultation, which, among other things, stipulate the harmful content prohibited from being generated and require providers of Generative AI products and services to submit security assessments to the CAC before launching their services publicly. The providers are also expressly required to comply with the Mainland’s Personal Information Protection Law.

Elsewhere, the EU is planning to regulate AI, including Generative AI, through the proposed “Artificial Intelligence Act”, which adopts a prescriptive, risk-based approach to regulating all AI systems and bans certain AI systems that pose unacceptable risks. Canada is considering a similar law, the “Artificial Intelligence and Data Act”, which is undergoing public consultation. Most recently, in March 2023, the UK Government published a white paper on regulating AI through a principles-based framework, which is considered more pro-innovation and flexible than the EU’s approach. Despite the differences in approach, these regimes share a common theme: the importance of data protection and ethical considerations. All of the proposed legislation would mandate that AI systems be developed and used in a way that respects individual privacy and data protection rights and upholds ethical values such as the prevention of bias and prejudice, fairness and transparency.

Governments and regulators have also been issuing guidelines on AI and recommending that organisations deploying Generative AI in their operations pay heed to such AI governance and ethics frameworks. My Office issued the “Guidance on the Ethical Development and Use of Artificial Intelligence” in August 2021 to help organisations develop and use AI systems in a privacy-friendly and ethical manner. The Guidance recommends internationally recognised ethical AI principles covering accountability, human oversight, transparency and interpretability, fairness, data privacy, beneficial AI, and reliability, robustness and security. In September 2021, the Mainland Government also issued the “Guidance on the Ethics of the New Generation AI” (《新一代人工智能倫理規範》), which adopts similar principles, such as enhancing human well-being (增進人類福祉), promoting fairness and justice (促進公平公正), and protecting privacy and safety (保護隱私安全). Elsewhere in the Asia-Pacific region, Singapore, Japan and South Korea have all published guidelines on the ethical governance of AI.

While no global consensus has yet formed as to whether AI should be regulated through legislation or other means, or on the extent of such regulation, one thing is certain: although the development and use of AI present an exciting yet complicated landscape with opportunities waiting to unfold, the possible harms to data privacy and our ethical values must be assessed and controlled. All stakeholders, including tech companies and AI developers, should join hands in co-creating a safe and healthy ecosystem to ensure that this transformative technology is used for human good.