Generative AI: Recent Developments

Summer semester 2024
Hinrich Schütze, Haotian Ye
Fri 10:15-11:45

Room

Room 057, Oettingenstr 67

Topic

Over the last few years, generative AI has been, and continues to be, an incredibly dynamic research area in terms of scientific progress, technical innovation and real-world impact. In this seminar, we will review and assess the latest developments in generative AI.

Credit for MSc Computational Linguistics

To get credit for this class, you must give a presentation AND write a thesis (Hausarbeit).
Topics of presentation and thesis can be different.
Length of presentation: 30 minutes + 15 minutes Q&A/discussion
Evaluation of presentation and thesis: pass/fail
Examination regulations (Prüfungsordnung) for MSc Computational Linguistics

Schedule

day	topic	resources	details

Apr 19	introduction		Organization, lectures, student topics.


Apr 26	foundations		Brief recap of generative AI foundations.
	assignment of topics	presentation schedule

May 3	basics: instruction tuning	instructGPT	Language models are pretrained on text corpora, but we want to use them for dialog. The text distributions of text corpora and dialog are quite different. Instruction tuning can be defined as the process of modifying an existing pretrained language model to make it more dialogic.
	basics: build gpt

May 10	instruction tuning (2)	Koksal et al.	Reverse instructions: how to leverage existing high-quality user-generated output to create instruction tuning datasets efficiently and synthetically (EACL/MILA talk).


May 17	low-resource multilinguality		LLMs have impressive performance for English and several high-resource languages on many tasks, but they perform poorly for most of the thousands of languages spoken in the world today. This is mostly because large training datasets are a crucial ingredient for successful LLM training today, and such datasets do not exist for most languages. What are the challenges and opportunities in creating language models for low-resource languages?


May 24	MW: continual learning		Continual learning aims to allow machine learning models to continuously learn on new data, by accumulating knowledge without forgetting what was learned in the past. This is generally seen as a capability necessary for advanced AI because many tasks encountered by an AI system are new tasks. Such tasks can only be solved by leveraging known tasks; the system must then retain what it learned on the new task and leverage it in turn for future tasks.


May 31	AM: memory		LLMs use their parameters as memory, but for many tasks an explicit memory is superior to "parametric memory". E.g., LLMs generally do not memorize infrequent facts well, which then leads to hallucinations. Explicit memory is also (in contrast to parametric memory) interpretable, editable, interoperable and scalable -- all of these are properties that are missing from current LLMs and are necessary for many applications for which we would like to use LLMs. We present a new LLM architecture that includes an explicit memory.


June 7	CM: multilinguality		Differing scripts make it difficult for multilingual language models to learn crosslingual knowledge through lexical overlap, e.g., LMs may have difficulty learning that Russian бутерброд and German Butterbrot refer to the same concept. We refer to this problem as the script barrier. To address this problem, we propose Transliteration Contrastive Modeling (TCM) to finetune LMs by contrasting sentences in their training data with the transliterations of those sentences in a unified script. This ensures uniformity in the representation space for different scripts.


June 14	prompting		Generative AI models follow instructions given in natural language. These instructions are referred to as prompts when they are used to "prompt" the language model to solve a task. Good performance on tasks greatly depends on the form of the prompt, which has given rise to prompt engineering. Models can be prompted zero-shot or after finetuning. This lecture will cover prompt-based finetuning and prompt engineering.


June 21	PL: multilinguality		While there has been some progress on masked language models for low-resource languages, training autoregressive LLMs for low-resource languages is more challenging. For good generative capabilities (natural language generation), one generally needs more training data than for natural language understanding. We present MaLA-500, a generative LLM for low-resource languages, and show that it outperforms previous low-resource LLMs on several evaluation datasets.


June 28	SZ: robotics		In robotics, LLMs harness their advanced reasoning and language comprehension capabilities to formulate precise and efficient action plans based on natural language instructions. However, for embodied tasks, where robots interact with complex environments, text-only LLMs often face challenges due to a lack of compatibility with robotic visual perception. There are many challenges when one wants to integrate (multimodal) LLMs into various robotic tasks. Recent work that has addressed these challenges includes determining at which position an object needs to be grasped (e.g., for a potted plant, the pot needs to be grasped, not the plant itself), choosing which natural action is appropriate for addressing a user need (e.g., cleaning up a spill can be done with a sponge) and deciding where to navigate (e.g., when looking for a sponge, the kitchen and the bathroom are the places where it is most likely to be found).


July 5	students


July 12	students


July 19	students
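
Illustration for the June 7 session: the idea of contrasting a sentence with its transliteration can be sketched with a generic contrastive objective. The sketch below uses an InfoNCE-style loss, which is an assumption made for illustration and not necessarily the objective used in TCM. The function names, the temperature value and the toy 2-d "sentence embeddings" are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(originals, transliterations, temperature=0.1):
    """Mean InfoNCE-style loss (assumed objective, for illustration only).

    originals[i] and transliterations[i] are treated as a positive pair;
    every other transliteration in the batch serves as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, t) / temperature for t in transliterations]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        # Negative log-probability of picking the true transliteration.
        total += log_denominator - logits[i]
    return total / len(originals)


# Made-up embeddings: two sentences in their original scripts ...
original_embeddings = [[1.0, 0.0], [0.0, 1.0]]
# ... and their transliterations, either close to the right sentence
aligned = [[0.9, 0.1], [0.1, 0.9]]
# ... or close to the wrong one.
misaligned = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce_loss(original_embeddings, aligned)
loss_misaligned = info_nce_loss(original_embeddings, misaligned)
print(loss_aligned < loss_misaligned)
```

Minimizing a loss of this kind pulls each sentence towards its own transliteration and away from the other sentences in the batch, which is one way to encourage a representation space that is uniform across scripts.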

Generative AI topics for student presentations/theses

The topics and papers listed below are examples. Feel free to propose your own topics and papers for your presentation/thesis (Referat/Hausarbeit).

paper	topic
	all topics covered in the lectures (see above)
arxiv	instruction tuning
arxiv	multilinguality
arxiv	RAG
arxiv	vector databases like pinecone
arxiv	linguistics
arxiv	long context processing
arxiv	mixture of experts
arxiv arxiv	(multi-)agents
arxiv arxiv	counterfactuals
arxiv	out-of-domain tasks
arxiv	reasoning
arxiv	guardrails for genAI
arxiv	explainability
arxiv	in-context learning
arxiv	synthetic data generation
arxiv	reflection
arxiv	robotics

	create your own chatbot
	rabbit