Content Safety for Generative Multimedia: Automated Evaluation and Re-Prompting for Age-Appropriate AI-Generated Content
Description
Generative AI models (e.g., LLMs, diffusion models) are increasingly used to create multimodal content (text + images) for applications like interactive storytelling, educational tools, and digital media. However, ensuring that generated content is safe, unbiased, and age-appropriate remains a critical challenge. Manual moderation is unscalable, and existing automated filters often lack contextual understanding or multimodal reasoning.
This thesis explores the development of automated pipelines to evaluate and refine AI-generated content, with a focus on:
- Real-time safety assessment of text-image pairs.
- Automatic re-prompting to guide models toward compliant outputs.
- Adaptability to diverse use cases (e.g., children’s toys, educational platforms).
Objectives
- Multimodal Safety Evaluation:
  - Investigate state-of-the-art metrics (e.g., CLIP-based similarity, toxicity scores, emotional valence) to detect unsafe or age-inappropriate content in text-image pairs.
  - Develop a lightweight, interpretable scoring system combining:
    - Text analysis (e.g., the Perspective API, custom fine-tuned classifiers).
    - Image analysis (e.g., NSFW detectors, aesthetic/emotional classifiers).
    - Cross-modal alignment (e.g., does the image match the text’s intent and safety constraints?).
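As a starting point, the combination step can be kept deliberately simple so every verdict stays interpretable. The sketch below assumes upstream classifiers have already produced per-modality scores in [0, 1] (text toxicity, image NSFW probability, CLIP text-image similarity); the function name, weights, and threshold are illustrative assumptions, not a fixed design.

```python
def combine_safety_scores(text_toxicity, image_nsfw, clip_alignment,
                          weights=(0.4, 0.4, 0.2), threshold=0.5):
    """Fuse per-modality risk scores (each in [0, 1]) into one verdict.

    clip_alignment is a similarity score; low alignment is treated as
    risk because the image may not match the text's stated intent.
    Returns the component breakdown so a reviewer can see *why* a
    text-image pair was flagged.
    """
    components = {
        "text_toxicity": text_toxicity,
        "image_nsfw": image_nsfw,
        "misalignment": 1.0 - clip_alignment,
    }
    w_text, w_img, w_align = weights
    overall = (w_text * components["text_toxicity"]
               + w_img * components["image_nsfw"]
               + w_align * components["misalignment"])
    return {"overall": overall,
            "safe": overall < threshold,
            "components": components}
```

A linear combination like this is trivially cheap (relevant for the latency objective below) and each weight can later be tuned against the human-annotated benchmark.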
- Re-Prompting Strategies:
  - Design adaptive prompting techniques to iteratively refine outputs (e.g., using reinforcement learning or constrained decoding).
  - Explore few-shot learning to generalize safety rules across domains (e.g., fairy tales vs. scientific explanations).
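In its simplest form, re-prompting is a generate-evaluate-tighten loop. The sketch below assumes caller-supplied `generate` (the generative model) and `evaluate` (the safety scorer) callables; the feedback wording and round budget are placeholders, not a prescribed method.

```python
def refine_with_reprompting(prompt, generate, evaluate,
                            max_rounds=3, threshold=0.5):
    """Regenerate until the safety evaluator accepts the output or the
    round budget runs out.

    generate: callable(prompt) -> output
    evaluate: callable(output) -> risk score in [0, 1], lower is safer
    Returns (output, rounds_used), or (None, max_rounds) on failure.
    """
    current = prompt
    for round_idx in range(max_rounds):
        output = generate(current)
        score = evaluate(output)
        if score < threshold:
            return output, round_idx + 1
        # Tighten constraints: fold the violation back into the prompt.
        current = (f"{prompt}\n\nThe previous answer scored {score:.2f} "
                   "on the safety check. Rewrite it so it is "
                   "age-appropriate for children and avoids unsafe themes.")
    return None, max_rounds  # no compliant output within budget
```

Returning `None` on failure keeps the unsafe output out of the application; a production system would then fall back to a vetted default.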
- Benchmarking and Evaluation:
  - Curate a multimodal dataset of edge cases (e.g., subtle biases, ambiguous contexts).
  - Compare against human annotations and existing tools (e.g., Google’s Perspective API, LAION filters).
  - Optimize for latency and computational efficiency (critical for embedded/edge devices).
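Agreement with human annotations should be chance-corrected rather than raw accuracy, since most curated pairs will be safe. A from-scratch Cohen's kappa for binary safe/unsafe labels might look like this (the function and interface are our own illustration):

```python
def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators, e.g. the
    automated filter vs. a human rater. Labels may be any hashable
    class values; returns kappa in [-1, 1]."""
    assert labels_a and len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label alike.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independent annotators.
    p_e = 0.0
    for cls in set(labels_a) | set(labels_b):
        p_e += (labels_a.count(cls) / n) * (labels_b.count(cls) / n)
    if p_e == 1.0:          # degenerate case: a single class everywhere
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)
```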
Supervisor:
Synthetic data generation for multi-sensor (visual-inertial) rigs for large scale indoor 3D environments
Description
Synthetic data is revolutionizing Computer Vision by enabling the generation of large-scale, annotated datasets for training and evaluating algorithms. Real-world data collection is often limited by cost, privacy, and scalability. By simulating realistic sensor data in virtual 3D environments, we can accelerate research in 3D scene understanding, object detection, SLAM, and multi-modal perception.
Objective: In this thesis, you will design and implement a system to generate synthetic sensor data for multi-sensor rigs (e.g., cameras, LiDARs) in large-scale indoor 3D environments. The system will allow users to:
- Define custom sensor rigs (e.g., combinations of 2D/3D cameras, LiDARs, IMUs).
- Insert these rigs into a virtual world and control their trajectories.
- Render sequential data such as images, panoramas, and 3D point clouds for Computer Vision applications.
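A rig-definition API along these lines could start from plain data classes, independent of any particular simulator backend. The names (`Sensor`, `SensorRig`) and fields below are placeholder assumptions for illustration, not an existing API:

```python
from dataclasses import dataclass, field

@dataclass
class Sensor:
    """One sensor in the rig, posed relative to the rig origin."""
    kind: str                              # e.g. "rgb_camera", "lidar", "imu"
    name: str
    position: tuple = (0.0, 0.0, 0.0)      # metres, in the rig frame
    orientation: tuple = (0.0, 0.0, 0.0)   # roll/pitch/yaw, radians
    rate_hz: float = 30.0

@dataclass
class SensorRig:
    """A named collection of sensors sharing one rigid body."""
    name: str
    sensors: list = field(default_factory=list)

    def add(self, sensor: Sensor):
        self.sensors.append(sensor)
        return self  # allow chaining

# Example: a stereo visual-inertial rig with a 10 cm baseline.
rig = (SensorRig("stereo_vio")
       .add(Sensor("rgb_camera", "cam_left", position=(-0.05, 0.0, 0.0)))
       .add(Sensor("rgb_camera", "cam_right", position=(0.05, 0.0, 0.0)))
       .add(Sensor("imu", "imu0", rate_hz=200.0)))
```

Keeping the rig description declarative like this makes it easy to serialize (e.g., to JSON/YAML) and to map onto whichever simulation backend is chosen in the first task.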
Your Tasks:
- Environment Setup: Choose and set up a suitable 3D simulation environment (e.g., NVIDIA Omniverse, CARLA, or Unity/Unreal Engine).
- Sensor Rig Definition: Develop an API to define and configure multi-sensor rigs with realistic sensor models.
- Trajectory Control: Implement functionality to move the rig along pre-defined or user-controlled trajectories within the virtual world.
- Data Generation: Render and export sequential sensor data (images, depth maps, point clouds, etc.) for use in downstream tasks like SLAM, object detection, or scene understanding.
- Validation: Evaluate the realism and utility of the generated data, potentially by training or testing a model on the synthetic dataset.
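Trajectory control, in its simplest form, reduces to sampling rig poses between timed waypoints at the sensor rate. A minimal linear-interpolation sketch (positions only; orientation would additionally need e.g. quaternion SLERP, and the function name is our own):

```python
def sample_trajectory(waypoints, times, rate_hz):
    """Linearly interpolate rig positions between timed waypoints,
    yielding one pose per sensor tick.

    waypoints: list of (x, y, z) positions
    times:     matching timestamps in seconds, strictly increasing
    rate_hz:   sensor sampling rate
    """
    dt = 1.0 / rate_hz
    poses, t, seg = [], times[0], 0
    while t <= times[-1] + 1e-9:
        # Advance to the waypoint segment containing time t.
        while seg < len(times) - 2 and t > times[seg + 1]:
            seg += 1
        t0, t1 = times[seg], times[seg + 1]
        alpha = (t - t0) / (t1 - t0)
        p0, p1 = waypoints[seg], waypoints[seg + 1]
        poses.append(tuple(a + alpha * (b - a) for a, b in zip(p0, p1)))
        t += dt
    return poses
```

Each sampled pose would then drive one render/export step of the data-generation stage; smoother motion (splines, jerk limits) could replace the linear segments later without changing this interface.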
Prerequisites
- Strong background in 3D Computer Graphics and Computer Vision.
- Proficiency in C++ and Python.
- Experience with 3D simulation tools (e.g., Omniverse, CARLA, Blender) is a plus.
- Self-motivated, creative, and eager to tackle challenging technical problems.
Contact: Interested candidates should send a CV, transcript, and a brief statement of motivation to the thesis supervisor.