Introduction to Generative AI

Explore Generative AI (GenAI), how it works and its strengths and weaknesses. Find guidance to help ensure that use of GenAI across UCL is effective, ethical and transparent.

What is GenAI (such as ChatGPT)?

GenAI is an Artificial Intelligence (AI) technology that automatically generates content in response to written prompts. The generated content includes texts, software code, images, videos, and music.   

GenAI is trained using data from webpages, social media conversations and other online content. It generates its outputs by statistically analysing the distribution of words, pixels or other elements in the data that it has ingested, then identifying and repeating common patterns (for example, which words typically follow which words).
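The idea of identifying which words typically follow which words can be illustrated with a minimal sketch. The tiny corpus and the function names below are invented for illustration only. Real GenAI models learn far richer patterns from vastly more data, using neural networks rather than simple counts.

```python
from collections import Counter, defaultdict

# A tiny invented corpus, standing in for web-scale training data.
corpus = "happy birthday to you . happy birthday dear friend . happy new year to you"

def count_followers(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers

def most_likely_next(followers, word):
    """Return the word most often seen after `word`, or None if unseen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

followers = count_followers(corpus)
print(most_likely_next(followers, "happy"))  # prints "birthday"
```

In this toy corpus "happy" is followed by "birthday" twice and by "new" once, so "birthday" is treated as the most likely continuation.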

There are many other types of AI application that do not involve GenAI and that are having an impact on teaching and learning. These other types of AI (known as ‘teaching and learning with AI’ or ‘AIED’) will be addressed in more detail as these web pages develop.

Remember:  

  • GenAI looks accurate… but it isn’t
  • GenAI looks intelligent… but it isn’t
  • GenAI looks as if it understands… but it doesn’t.  

Microsoft Copilot, also known as Bing Chat Enterprise

UCL staff and students can access Microsoft Copilot, which can be used for both text and image generation. With commercial data protection, this is intended as a more secure alternative to other GenAI services. If you wish to use GenAI, then this is the safest way to do so.

If you're logged into Microsoft Copilot with your UCL credentials, what goes in – and what comes out – is not saved or shared, and your data is not used to train the models. 

Find out more, including how to access Microsoft Copilot, on UCL's Microsoft Office 365 SharePoint site.

Practical guidance on how educators can use Microsoft Copilot is available on the National Centre for AI in Tertiary Education's blog.


Text GenAI  

In response to a human-written prompt, text GenAI generates text that usually appears as if a human has written it.

Yet, just like human-written texts, text GenAI outputs can be superficial, inaccurate, untrustworthy, and full of errors.

“Large language models [which is the technology behind text GenAI] are the ultimate bullshitters because they are designed to be plausible (and therefore convincing) with no regard for the truth.” Associate Professor Carissa Véliz, University of Oxford

Despite appearances, text GenAI does not understand either the prompt written by the human or the text that it generates.  

Every time that we use a text GenAI tool, we need to consider its output from a sceptical perspective.  

Examples of text GenAI tools:

Please note that UCL does not recommend any of the tools in this list. Microsoft Copilot is now available for UCL students and staff. Read more about using Copilot.

Examples of other tools built on top of GenAI tools:
  • ChatPDF (summarises and answers questions about submitted PDF documents)
  • Elicit (aims to automate parts of researchers’ workflows, identifying relevant papers and summarising key information)
  • WebChatGPT (a Google Chrome extension that gives ChatGPT Internet access, to enable more accurate and up-to-date conversations)
  • Microsoft has incorporated ChatGPT into its Bing search engine and is implementing ChatGPT across its Office portfolio.

Please note that UCL does not recommend any of the tools in this list. Microsoft Copilot is now available for UCL students and staff. Read more about using Copilot.


Image/video/music GenAI  

Image, video and music GenAI can generate outputs based on human-written prompts. Some can also respond to visual or musical prompts.  

Again, image, video and music GenAI outputs might appear novel. However, they are usually only complex combinations of the millions of images, videos or pieces of music that the models have ingested during their training.

On the one hand, this is how creativity often works. For example, Rock and Roll music combined ideas from R&B, gospel and country music.  

But importantly, Rock and Roll drew on ideas from the earlier works. Meanwhile, GenAI actually uses the earlier works in its outputs, and without the consent of the original creators.  

Another issue raised by image GenAI is how difficult it can be to write an effective prompt. For example, the breakthrough AI image Théâtre D’opéra Spatial took weeks of prompt writing and the fine-tuning of hundreds of images.

Examples of image/video and music GenAI tools:

Please note that UCL does not recommend any of the tools in this list. Microsoft Copilot is now available for UCL students and staff. Read more about using Copilot.


How does Generative AI work?   

Both text and image GenAI are based on a set of AI techniques that have been available to researchers for several years and have been built one on top of another. 

Text GenAI  

Although the terms listed below are all often used in descriptions of GenAI, it isn’t necessary to understand exactly what they all mean. The most important things to note are the hierarchy of the technologies and their complexity.  

ChatGPT (and other text GenAI) is a type of:

Generative Pre-trained Transformer (GPT: an advanced type of LLM)  

which is a type of Large Language Model (LLM: a massive computer-based representation of examples of natural language)  

which is a type of General-purpose Transformer (an ANN language processor)  

which is a type of Artificial Neural Network (ANN: an ML approach inspired by how the human brain works, its synaptic connections between neurons)  

which is a type of Machine Learning (ML: an approach to AI that uses algorithms to automatically improve its performance from data)  

which is a type of Artificial Intelligence.  
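The chain above can also be written down as a simple ordered list, which makes the nesting easy to check. This is only an illustrative sketch: the list name and the helper function are invented here, and the entries simply repeat the terms used above.

```python
# The chain described above, ordered from most specific to most general.
HIERARCHY = [
    "ChatGPT",
    "Generative Pre-trained Transformer (GPT)",
    "Large Language Model (LLM)",
    "General-purpose Transformer",
    "Artificial Neural Network (ANN)",
    "Machine Learning (ML)",
    "Artificial Intelligence (AI)",
]

def is_a_type_of(specific, general):
    """Return True if `specific` sits below `general` in the chain."""
    return HIERARCHY.index(specific) < HIERARCHY.index(general)

# Print each link in the chain, one per line.
for child, parent in zip(HIERARCHY, HIERARCHY[1:]):
    print(f"{child} is a type of {parent}")
```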


Issues around training a text GPT  

Before a text GenAI can generate text, it first has to be trained. This involves the tool being provided with, and processing, huge amounts of data scraped from the internet and elsewhere. It is reported, but not confirmed by OpenAI, that the training of GPT-4 involved a million gigabytes (one petabyte) of data. Processing this data involves identifying patterns, such as which words typically go together (e.g. “Happy” is often followed by “Birthday”).

Carbon footprint 

Training a GPT requires huge amounts of power and indirectly generates huge amounts of carbon, with important consequences for climate change. For example, it is estimated that the training of GPT-3 (the GPT used by the first version of ChatGPT made available to the public) consumed 1,287 megawatt hours of electricity and generated 552 tons of carbon dioxide, equivalent to the emissions of 123 cars driven for one year.
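To put these estimates in proportion, the short calculation below derives two ratios from them. It uses only the three figures quoted above. The derived ratios are simple arithmetic on those estimates, not independent measurements, and the variable names are chosen here for illustration.

```python
# Estimates quoted in the text for the training of GPT-3.
energy_mwh = 1287        # megawatt hours of electricity consumed
co2_tons = 552           # tons of carbon dioxide generated
equivalent_cars = 123    # cars driven for one year

# Carbon generated per unit of electricity used.
tons_per_mwh = co2_tons / energy_mwh

# Carbon implied per car per year by the "123 cars" comparison.
tons_per_car_year = co2_tons / equivalent_cars

print(f"{tons_per_mwh:.2f} tons of CO2 per MWh")                 # prints 0.43
print(f"{tons_per_car_year:.1f} tons of CO2 per car per year")   # prints 4.5
```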

Feedback loop 

Another concern is that when future GPTs are trained, the data that they ingest are likely to include substantial amounts of text generated by previous versions of GPT. This self-referential loop might contaminate the training data and compromise the capabilities of future GPT models.  

Human costs 

Once the text GenAI model is trained but before it is used, it is often checked and refined in a process known as Reinforcement Learning from Human Feedback (RLHF). In RLHF, text GenAI responses are reviewed and validated by human reviewers. These human reviewers ensure that the GenAI responses are appropriate, accurate, and align with the intended purpose. Sometimes the provider of the GenAI then sets up what are known as ‘guardrails’ to prevent the GenAI generating objectionable materials.  

In the development of ChatGPT, the RLHF reviewers were mostly workers in Global South countries such as Kenya. Workers were paid less than $3 per hour to review the outputs of ChatGPT and identify any objectionable or disturbing materials. This work has had a severe negative impact on many of those who were involved.


How a GPT generates text  

Once the GPT has been trained, generating a text response to a prompt involves the following steps:  

1.    The prompt is broken down into smaller units (called tokens) that are input into the GPT.  

2.    The GPT uses statistical patterns to predict likely words or phrases that might form a coherent response to the prompt.  

  • The GPT identifies patterns of words and phrases that commonly co-occur in its prebuilt large data model (which comprises text scraped from the Internet and elsewhere).  
  • Using these patterns, the GPT estimates the probability of specific words or phrases appearing in a given context.  
  • Beginning with a random prediction, the GPT uses these estimated probabilities to predict the next likely word or phrase in its response.  

3.    The predicted words or phrases are filtered through what are known as ‘guardrails’ to remove any offensive content.  

4.    Steps 2 and 3 are repeated until the response is finished. The response is considered finished when it reaches a maximum token limit or meets predefined stopping criteria.

5.    The response is post-processed to improve readability by applying formatting, punctuation, and other enhancements (such as beginning the response with words that a human might use, such as “Sure,” or “Certainly,” or “I’m sorry”).  
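The five steps above can be sketched as a short toy program. Everything in it is an assumption made for illustration: the probability table, the blocked-word list, the token limit and the function names are all invented, and a real GPT predicts sub-word tokens with a neural network rather than a lookup table.

```python
import random

# Hypothetical next-token probabilities, standing in for a trained model.
NEXT_TOKEN_PROBS = {
    "happy": {"birthday": 0.7, "new": 0.2, "darn": 0.1},
    "birthday": {"to": 0.6, "dear": 0.3, "<end>": 0.1},
    "new": {"year": 1.0},
    "year": {"to": 0.5, "<end>": 0.5},
    "to": {"you": 1.0},
    "dear": {"friend": 1.0},
    "you": {"<end>": 1.0},
    "friend": {"<end>": 1.0},
}
BLOCKED_TOKENS = {"darn"}  # hypothetical guardrail list
MAX_TOKENS = 10            # hypothetical limit on prediction attempts
END = "<end>"              # hypothetical stop marker

def tokenise(prompt):
    """Step 1: break the prompt into tokens (here, simply lowercase words)."""
    return prompt.lower().split()

def predict_next(context, rng):
    """Step 2: sample the next token using the estimated probabilities."""
    last = context[-1] if context else None
    options = NEXT_TOKEN_PROBS.get(last, {END: 1.0})
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]

def passes_guardrails(token):
    """Step 3: reject any token on the blocked list."""
    return token not in BLOCKED_TOKENS

def generate(prompt, seed=0):
    rng = random.Random(seed)
    context = tokenise(prompt)
    response = []
    # Step 4: repeat steps 2 and 3 until a stopping criterion is met.
    for _ in range(MAX_TOKENS):
        token = predict_next(context, rng)
        if not passes_guardrails(token):
            continue  # discard the blocked token and try again
        if token == END:
            break
        response.append(token)
        context.append(token)
    # Step 5: post-process the response for readability.
    return "Sure, " + " ".join(response) + "."

print(generate("Happy"))
```

Because the next token is sampled rather than fixed, different seeds can give different responses to the same prompt, which mirrors the variability seen in real text GenAI tools.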


Image and Music GenAI  

Image GenAI and music GenAI often use a different type of ANN known as Generative Adversarial Networks (GANs), which can also be combined with Variational Autoencoders. Here, we focus on image GANs.

GANs have two parts (two ‘adversaries’), the ‘generator’ and the ‘discriminator’. The generator creates a random image in response to the human-written prompt, and the discriminator tries to distinguish between this generated image and real images. The generator then uses the result of the discriminator to adjust its parameters, in order to create another image.   

This process is repeated, possibly thousands of times, with the generator making more and more realistic images that the discriminator is less and less able to distinguish from real images.
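The alternating contest between generator and discriminator can be sketched with a deliberately tiny example that works on single numbers instead of images and ignores the prompt. All names, constants and the one-parameter generator are assumptions made for illustration, so this is a rough sketch of the general idea rather than how any particular image GenAI tool is built.

```python
import math
import random

REAL_MEAN = 4.0      # the "real data" are numbers clustered around this value
SPREAD = 0.5         # spread of both the real and the generated numbers
BATCH = 32
LEARNING_RATE = 0.1

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def train(steps=40, seed=0):
    """Alternate discriminator and generator updates; return the generator offset over time."""
    rng = random.Random(seed)
    w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c), its belief that x is real
    b = 0.0           # generator: G(z) = b + SPREAD * z, with one learnable offset b
    history = [b]
    for _ in range(steps):
        real = [rng.gauss(REAL_MEAN, SPREAD) for _ in range(BATCH)]
        fake = [b + SPREAD * rng.gauss(0.0, 1.0) for _ in range(BATCH)]

        # Discriminator step: raise D(real) and lower D(fake).
        grad_w = (sum(-(1 - sigmoid(w * x + c)) * x for x in real)
                  + sum(sigmoid(w * x + c) * x for x in fake)) / BATCH
        grad_c = (sum(-(1 - sigmoid(w * x + c)) for x in real)
                  + sum(sigmoid(w * x + c) for x in fake)) / BATCH
        w -= LEARNING_RATE * grad_w
        c -= LEARNING_RATE * grad_c

        # Generator step: use the discriminator's verdict to shift b
        # so that generated numbers are rated as more likely to be real.
        grad_b = sum(-(1 - sigmoid(w * x + c)) * w for x in fake) / BATCH
        b -= LEARNING_RATE * grad_b
        history.append(b)
    return history

history = train()
print(f"generator offset: {history[0]:.2f} -> {history[-1]:.2f} (real mean {REAL_MEAN})")
```

In this short run the generated numbers drift toward the real data. Longer runs of such a simple setup may overshoot and oscillate, which is one reason practical GAN training needs careful tuning.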

For example, a successful GAN trained on a dataset of thousands of landscape photographs might generate new but unreal images of landscapes that are almost indistinguishable from real photographs.   

Meanwhile, a GAN trained on a dataset of popular music (or even music by a single artist) might generate new pieces of music that closely resemble the structure and complexity of the original music while still differing from it.


What are the strengths and weaknesses of GenAI? 

This section outlines some of the benefits of GenAI and explains why we should critically evaluate its outputs.

Strengths  

GenAI can produce diverse and seemingly original outputs, creating content that may not have been seen before based on patterns in the data it was trained on.

GenAI can process and interpret human language, allowing it to generate contextually relevant responses to user prompts.

GenAI can process and generate text in multiple languages.

GenAI can be fine-tuned for various tasks and domains, making it widely applicable (e.g., chatbots, content generation, and language translation).

GenAI can learn patterns and representations from vast amounts of data, enabling it to capture nuances in language and generate outputs based on the patterns it has seen during training.

GenAI models can remember previous interactions, which results in more coherent and relevant conversation experiences for users.

GenAI can generate responses quickly, allowing for rapid interactions and real-time applications.  

Weaknesses

GenAI can generate information that appears factual but is inaccurate.

It’s potentially dangerous that GenAI models appear to understand the content that they use and generate, but in reality they do not understand it. This could lead users to have misplaced trust in the GenAI output.

GenAI output imitates or summarises existing content - mostly without the permission of the Intellectual Property owners - but can give the appearance of creativity.

GenAI can produce content that is morally and ethically troubling, and its use can raise moral and ethical issues. 

Training and running GenAI models can require significant computational and power resources.

The outputs of GenAI need to be moderated to establish ‘guardrails’ that prevent it generating inappropriate or offensive outputs. For ChatGPT, this was undertaken by poorly paid workers in Kenya, many of whom suffered mental health issues because of the disturbing generated output that they had witnessed.

GenAI can be used to automatically generate fake news and deep fakes.

GenAI is contributing to the digital divide. It relies on huge amounts of data and massive computing power, which is mostly only available to the largest international technology companies and a few economies. This means that the possibility to create and control GenAI is out of reach of most people, especially those in the Global South. 

While we understand broadly how GenAI works, because of its complexity it is usually impossible to know why it produces particular outputs.

The output of GenAI is flooding the internet. This poses an interesting recursive risk for future GPT models, which will themselves be trained on online content that earlier GPT models have created (including all its biases and errors).

GenAI tends to output standard answers that replicate the values of the creators of the data used to train the models. This may constrain the development of plural opinions and further marginalise already marginalised voices.


Further information 

Find a broad range of commentaries and resources to inform your own views. Note that including a link on this page does not suggest that UCL supports or endorses the views expressed.

Universities

Arizona State University (March 2023): ChatGPT in the Classroom. A practical source of information for college-level instructors struggling to navigate the potential impacts of ChatGPT (and other Large Language Models) in class. 

Russell Group (July 2023): New principles on use of AI in education 

University of Cambridge (May 2023): ChatGPT (We need to talk) 

University of Leeds (undated): IT Security Considerations on the Use of ChatGPT and AI LLM Engines 

Monash University (updated): Generative artificial intelligence technologies and teaching and learning 

University of Sydney (updated): Artificial intelligence and education at Sydney 

Peter Bryant (University of Sydney Associate Dean Education) (January 2023): ChatGPT: IDGAF (Or: How I Learned to Stop Worrying and Ignore the Bot) 

Deakin University (March 2023): ChatGPT – how should educators respond? 

Imperial College London (March 2023): Generative AI Tools Guidance 

Academic and related organisations

QAA (January 2023): The rise of artificial intelligence software and potential risks for academic integrity: A QAA briefing paper for higher education providers 

National Centre for AI in Tertiary Education (JISC) (January 2023): A short experiment in defeating a ChatGPT detector 

HEPI (May 2023): How are HE leaders responding to generative AI? 

QAA (May 2023): Maintaining quality and standards in the ChatGPT era: QAA advice on the opportunities and challenges posed by Generative Artificial Intelligence 

National Centre for AI (JISC) (May 2023): A Generative AI Primer 

Sensemaking, AI, and Learning (SAIL) (May 2023): SAIL: We're not doing ok. 

Wolfram (February 2023): What Is ChatGPT Doing … and Why Does It Work? 

National Centre for AI in Tertiary Education (JISC) (March 2023): AI writing detectors – concepts and considerations 

Understanding AI in Education 

UNESCO (2023): ChatGPT and artificial intelligence in higher education: quick start guide 

SEDA (March 2023): ChatGPT Seminar Series (recordings) 

QAA (updated): ChatGPT and Artificial Intelligence. Advice, guidance and resources for higher education professionals to adapt their teaching in light of artificial intelligence

Media

Business Insider (January 2023): Microsoft warns employees not to share 'sensitive data' with ChatGPT 

TechCrunch (February 2023): Most sites claiming to catch AI-written text fail spectacularly 

Feedback Fruits (December 2022): Generate individualised feedback on writing in larger student cohorts 

Insider (August 2023): An Asian MIT student asked AI to turn an image of her into a professional headshot. It made her white, with lighter skin and blue eyes. 

New York Post (July 2023): Purdue professor accused of being AI bot for lacking ‘warmth’ in viral email: ‘I’m just Autistic’ 

BBC (July 2023): The A-Z of AI: 30 terms you need to understand artificial intelligence 

Reuters (April 2023): EU proposes new copyright rules for generative AI 

Times Higher Education (July 2023): It is too easy to falsely accuse a student of using AI: a cautionary tale 

Wonkhe (June 2023): The real risk of generative AI is a crisis of knowledge 

Wonkhe (July 2023): Learning how to be more human will prepare universities for an AI-mediated future 

Bounded Regret (June 2023): What will GPT-2030 look like? 

FT (May 2023): The AI revolution already transforming education. Schools and universities are using ChatGPT in the classroom, but will it devalue the fundamentals of learning? 

EDUCAUSE Review (April 2023): EDUCAUSE QuickPoll Results: Adopting and Adapting to Generative AI in Higher Ed Tech

Inside Higher Ed (April 2023): How ChatGPT Bested Me and Worsted My Students 

Wonkhe (April 2023): Towards an inclusive approach to using AI in learning and teaching 

Rachel Arthur Writes (April 2023): AI Tools for Teachers 

Wonkhe (April 2023): Making higher education assessment literate 

MIT Technology Review (April 2023): ChatGPT is going to change education, not destroy it 

Jim Dickinson (Wonkhe): An avalanche really is coming this time

OpenAI (undated): Educator considerations for ChatGPT  

Times Higher Education (February 2023): Inside the post-ChatGPT scramble to create AI essay detectors 

The Chronicle (March 2023): ChatGPT and Other Cutting-Edge Learning Tech (Zoom event video recording) 

#LTHEchat (March 2023): #LTHEchat 259: ChatGPT and academic integrity Led by @profdcotton Dr Peter Cotton and @reubenshipway 

The Conversation (February 2023): ChatGPT and cheating: 5 ways to change how students are graded

ZDNet (February 2023): This professor asked his students to use ChatGPT. The results were surprising
