A potentially significant limitation of AI is the bias that can be embedded in the products it generates. Trained on immense amounts of data and text available on the internet, these large language model systems learn simply to predict the most likely sequence of words in response to a given prompt, and will therefore reflect and perpetuate the biases present in their training data. An additional source of bias is that some generative AI tools are refined using reinforcement learning from human feedback (RLHF), and the human testers who provide that feedback are themselves not neutral. Accordingly, generative AI tools such as ChatGPT have been documented producing output that is socio-politically biased, occasionally even sexist, racist, or otherwise offensive.
Generative AI tools make things up. As probabilistic models, they are designed to generate the most likely response to any given prompt. Because these tools do not 'know' anything and are, in most instances, limited in their ability to fact-check, the responses they generate can include factual errors and invented citations or references. This phenomenon is known as 'hallucination,' and it is one persuasive reason to evaluate and fact-check every response a generative AI tool produces.
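To see why, consider the minimal Python sketch below. It is our own illustration, with invented tokens and probabilities rather than the workings of any particular tool, but it shows how a system that favours the statistically likely continuation can confidently produce a plausible falsehood.

```python
import random

# Toy next-token model: a lookup from context to candidate continuations with
# probabilities. Real LLMs learn such associations from vast training text,
# but the mechanism is the same: score candidates, favour the likely ones.
next_token_probs = {
    "The capital of Australia is": {
        "Sydney": 0.55,    # frequent in training text, but factually wrong
        "Canberra": 0.40,  # correct, yet written less often
        "Melbourne": 0.05,
    }
}

def generate(context):
    """Sample a continuation weighted by probability, not by truth."""
    candidates = next_token_probs[context]
    tokens, weights = zip(*candidates.items())
    return random.choices(tokens, weights=weights, k=1)[0]

print(generate("The capital of Australia is"))
# More often than not this prints "Sydney": the most likely answer in the
# (invented) training signal, not the most accurate one.
```

Frequency in the training data, not truth, drives the output, which is why a fluent, confident-sounding response can still be wrong.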
There is some speculation that generative AI tools will soon include a 'confidence indicator' that lets users know how confident the tool is that a generated response is accurate. Likewise, some reporting suggests that generative AI tools will begin to fact-check their responses against internet sources or other AI models. At the time of writing, these capabilities are not in wide circulation.
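Although no such feature is widely deployed, one can sketch how a crude confidence indicator might work using the per-token probabilities that some systems already expose. The Python sketch below is a hypothetical illustration with invented numbers, not a description of any announced product.

```python
import math

# Hypothetical 'confidence indicator': the geometric mean of the per-token
# probabilities in a generated response. The values below are invented.
token_probs = [0.92, 0.88, 0.35, 0.91]  # one low-probability ('unsure') token

def confidence(probs):
    """Geometric mean of token probabilities as a rough overall score."""
    return math.exp(sum(math.log(p) for p in probs) / len(probs))

print(f"confidence = {confidence(token_probs):.2f}")  # prints 0.71
```

Note that such a score would measure how statistically typical the wording is, not whether the content is true, which is one reason the skepticism described below remains necessary.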
We need to practice healthy skepticism about the reliability of responses produced by generative AI and maintain a consistent practice of checking outputs against verified sources.
While generative AI tools can help users with tasks such as brainstorming new ideas, organizing existing information, mapping out scholarly discussions, or summarizing sources, they are also notorious for not relying fully on factual information or rigorous research strategies. In fact, they are known for producing "hallucinations," a term used in AI research to describe plausible-sounding but false information generated by the system. Oftentimes, these "hallucinations" are presented in a very confident manner and consist of partially or fully fabricated citations or facts.
Certain AI tools have even been used intentionally to produce false images or audiovisual recordings in order to spread misinformation and mislead audiences. Referred to as "deepfakes," these materials can be used to subvert democratic processes and are therefore particularly dangerous.
Additionally, the information presented by generative AI tools may lack currency, as some systems do not have access to the latest information. Rather, they may have been trained on datasets with a fixed cutoff date, generating dated representations of current events and the related information landscape.
There are currently also multiple privacy concerns associated with the use of generative AI tools. The most prominent issues revolve around the possibility of a breach of personal or sensitive data and the risk of re-identification. More specifically, most AI-powered language models, including ChatGPT, rely on large amounts of user-submitted data to be trained further and to generate new information products effectively. This means that personal or sensitive information entered by users can become an integral part of the material used to further train the AI, without the user's explicit consent. Moreover, certain generative AI policies even permit developers to profit from this personal or sensitive information by selling it to third parties. Even when a user does not enter clearly identifying personal information, using the system carries a risk of re-identification, as the submitted data may contain patterns that allow the generated information to be linked back to an individual or entity.
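To illustrate the re-identification risk, the short Python sketch below uses invented records to show how a combination of ordinary, non-identifying attributes (so-called quasi-identifiers) can nonetheless single out an individual.

```python
from collections import Counter

# Toy illustration of re-identification (invented records): even with names
# removed, combinations of ordinary attributes can uniquely identify someone.
records = [
    {"postcode": "M5V", "birth_year": 1980, "role": "student"},
    {"postcode": "M5V", "birth_year": 1980, "role": "student"},
    {"postcode": "M5V", "birth_year": 1980, "role": "instructor"},  # unique
    {"postcode": "L4C", "birth_year": 1992, "role": "student"},     # unique
]

combos = Counter(
    (r["postcode"], r["birth_year"], r["role"]) for r in records
)
unique = [c for c, n in combos.items() if n == 1]
print(f"{len(unique)} of {len(records)} records can be singled out")
# prints: 2 of 4 records can be singled out
```

In the same way, details scattered across prompts (an institution, a course code, a date) can be enough to link supposedly anonymous input back to a person.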
Given these issues, we recommend that users carefully review user agreements and understand how a generative AI tool may collect and use their data before consenting to use it. Users should turn off data collection where possible and, where that is not possible, carefully consider any information they share with the tool.