Selected Publications

* indicates equal contribution or alphabetical ordering.

For all publications, please see Google Scholar.

  • Act to See: Emergent Active Visual Perception in Video CoT via Tool Use.
    Martin Q. Ma, Yuxiao Qu, Willis Guo, Aditya Agrawal, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency.
    Under Submission, 2025.
    Paper, Code

  • Video Active Perception: Efficient Inference-Time Long-Form Video Understanding with Vision-Language Models.
    Martin Q. Ma, Willis Guo, Aditya Agrawal, Ankit Gupta, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency.
    ICCV Workshop, 2025.
    Paper, Code