19/07/2024 | Abdullah Alchihabi | Stealing Part of a Production Language Model |
15/03/2024 | Hanping Zhang | A Connection between One-Step RL and Critic Regularization in Reinforcement Learning |
01/03/2024 | Hao Yan | Improving Convergence and Generalization Using Parameter Symmetries |
02/02/2024 | Marzi Heidari | InstanT: Semi-supervised Learning with Instance-dependent Thresholds |
19/01/2024 | Abdullah Alchihabi | Scaling Data-Constrained Language Models |
14/12/2023 | Qing En | Segment Everything Everywhere All at Once |
02/11/2023 | Hanping Zhang | Imitating Human Behaviour with Diffusion Models |
19/10/2023 | Hao Yan | Siamese Masked Autoencoders |
05/10/2023 | Yan Yan | Multi-Label Knowledge Distillation |
14/09/2023 | Abdullah Alchihabi | RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback |