Tiu Mo · Less than 1 minute

Awesome Knowledge-Driven Autonomous Driving https://github.com/PJLab-ADG/awesome-knowledge-driven-AD

CVPR 2024 accepted papers list https://cvpr.thecvf.com/Conferences/2024/AcceptedPapers

LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model https://openaccess.thecvf.com/content/CVPR2024/papers/Wang_LocLLM_Exploiting_Generalizable_Human_Keypoint_Localization_via_Large_Language_Model_CVPR_2024_paper.pdf

https://github.com/kennethwdk/LocLLM

PointLLM: Empowering Large Language Models to Understand Point Clouds https://github.com/OpenRobotLab/PointLLM

HandDiffuse: Generative Controllers for Two-Hand Interactions via Diffusion Models (CVPR'24) https://handdiffuse.github.io/

GPT4Point: A Unified Framework for Point-Language Understanding and Generation https://github.com/Pointcept/GPT4Point

Visual In-Context Prompting https://openaccess.thecvf.com/content/CVPR2024/papers/Li_Visual_In-Context_Prompting_CVPR_2024_paper.pdf

Visual Instruction Tuning https://llava-vl.github.io

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs https://arxiv.org/pdf/2401.06209 https://github.com/tsb0601/MMVP

Describing Differences in Image Sets with Natural Language

Project page: https://understanding-visual-datasets.github.io/VisDiff-website/

Code: https://github.com/Understanding-Visual-Datasets/VisDiff

Paper: https://openaccess.thecvf.com/content/CVPR2024/papers/Dunlap_Describing_Differences_in_Image_Sets_with_Natural_Language_CVPR_2024_paper.pdf

Blog: https://voxel51.com/blog/cvpr-2024-survival-guide-five-vision-language-papers-you-dont-want-to-miss/

Low-Resource Vision Challenges for Foundation Models https://xiaobai1217.github.io/Low-Resource-Vision/