Shu Kong
I'm a postdoc with Deva Ramanan in RI | CMU. I got my PhD from ICS | UCI, advised by Charless Fowlkes.
My research is motivated by a desire to create intelligent systems that benefit human life through machine vision and learning. My current focus is on "open-world vision for better visual perception and learning".
Contact
- Email: aimerykong (at) gmail- c0m
- Office: EDSH 101, 5000 Forbes Ave, Pittsburgh, PA, 15213
Links
Recent Update Highlights
-
Our virtual workshop Open-World Vision will be held in conjunction with CVPR21. (12/11/2020)
-
Our work is published on
"Improving the Taxonomy of Fossil Pollen using Convolutional Neural Networks and Superresolution Microscopy". See the highlight in NSF news. (09/14/2020)
-
We have an online page for the project "multimodal object detection for driving". (05/14/2020)
-
Our paper "Celeganser: Automated Analysis of Nematode Morphology and Age" is accepted by CVMI@CVPR 2020. Read more from Linfeng Wang. (04/15/2020)
-
Our paper "Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation" is accepted by CVPR 2020. Read more from Yunhan Zhao. (02/24/2020)
-
Thanks to the Kleist family for the generous support through the Bob & Barbara Kleist Endowed Fellowship. (1/13/2020)
-
Defended my thesis "Pixel-Level Prediction: Models and Applications" (slides). (11/20/2019)
-
Demo videos are released for our project "Video-Sentence Grounding with Referring Attention and Weak Supervision". (4/9/2019)
-
Project page is created for "Multigrid Predictive Filter Flow for Unsupervised Learning on Videos"; see also the teaser videos in the YouTube playlist, the GitHub code and demo, and the arXiv paper. (4/3/2019)
-
Our paper "Modularized Textual Grounding for Counterfactual Resilience" appears at CVPR 2019; check Zhiyuan Fang for details! (2/24/2019)
-
joining
as summer intern (1/22/2019)
-
Project page is created for "Image Reconstruction with Predictive Filter Flow", with released paper, slides and demo script. (11/28/2018)
-
Project page is created for our work "Pixel-wise Attentional Gating for Scene Parsing", which is our Robust Vision Challenge entry for depth estimation and semantic segmentation. (05/06/2018)
-
Project page is created for "Fine-Grained Facial Expression Analysis Using Dimensional Emotion Model" (which will appear in Neurocomputing, 2020), with released demos, code and models. (05/03/2018)
-
Our paper "Recurrent Pixel Embedding for Instance Grouping" is accepted by
(Spotlight). Read more at the Project page for demo, code, models, poster, slides, etc. (02/18/2018)
-
Our paper "Recurrent Scene Parsing with Perspective Understanding in the Loop" is accepted by CVPR 2018. Read more at the Project page for demo/code/models/poster/slides. (02/18/2018)
-
Thanks to the Google Graduate Student Award for the generous support. (9/2/2017)
-
Project page is created for the Google internal project. (9/2/2017)
-
Project page is created for our automated pollen recognition system. (6/2/2017)
-
Our paper "Low-rank Bilinear Pooling for Fine-grained Classification" is accepted by CVPR 2017. See github for demo, model and code. (3/2/2017)
-
joining
as summer intern (1/10/2017)
-
Advanced to candidacy [slides]. (11/30/2016)
-
Project page is created for "deep image aesthetics analysis" of our work, with code, demo and dataset.
-
Project page is created for "fossilized pollen grain identification" of our work, with code, demo and dataset.
Research Projects
-
Sparse Coding, Dictionary Learning, and Applications
Tensor Computation and Applications
Papers
-
Y. Zhao, S. Kong, C. Fowlkes, "When Perspective Comes for Free: Improving Depth Prediction with Camera Pose Encoding", arXiv:2007.03887, 2020
[project page] [arxiv] [github] [slides]
-
Z. Fang, S. Kong, Z. Wang, C. Fowlkes, Y. Yang, "Weakly-Supervised Temporal-Language Association with Referring Attention", arXiv:2006.11747, 2020
[project page] [arxiv]
-
Linfeng Wang, S. Kong, Zachary Pincus, C. Fowlkes, "Celeganser: Automated Analysis of Nematode Morphology and Age", CVMI@CVPR, Seattle, 2020
[project page] [preprint] [slides] [poster] [github]
-
Y. Zhao, S. Kong, D. Shin, C. Fowlkes, "Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation", CVPR, Seattle, 2020
[project page] [arxiv] [slides] [poster] [github]
-
F. Zhou, S. Kong, C. Fowlkes, T. Chen, B. Lei, "Fine-Grained Facial Expression Analysis Using Dimensional Emotion Model", Neurocomputing, 2020.
[project page] [arxiv] [demo] [models] [github]
-
S. Kong, C. Fowlkes, "Multigrid Predictive Filter Flow for Unsupervised Learning on Videos", arXiv:1904.01693, 2019.
[project page] [arxiv] [github] [demo] [slides] [poster]
-
Zhiyuan Fang, S. Kong, C. Fowlkes, Yezhou Yang, "Modularized Textual Grounding for Counterfactual Resilience", CVPR, Long Beach, CA, June 2019.
[paper] [project page] [github]
-
S. Kong, C. Fowlkes, "Image Reconstruction with Predictive Filter Flow", arXiv:1811.11482, 2018.
[project page] [high-res paper (44MB)] [github] [slides] [poster]
-
S. Kong, C. Fowlkes, "Pixel-wise Attentional Gating for Scene Parsing", WACV, Hawaii, 2019.
[project page] [arxiv] [github] [slides] [ROB Entry of Depth Est.] [ROB Entry of Segm.]
-
S. Kong, C. Fowlkes, "Recurrent Scene Parsing with Perspective Understanding in the Loop", CVPR, 2018.
[project page] [technical report] [demo] [model] [poster] [slides]
-
S. Kong, C. Fowlkes, "Low-rank Bilinear Pooling for Fine-Grained Classification", CVPR, 2017.
[project page] [technical report] [abstract] [demo] [model] [poster] [slides]
-
S. Kong, X. Shen, Z. Lin, R. Mech, C. Fowlkes, "Photo Aesthetics Ranking Network with Attributes and Content Adaptation", ECCV, Amsterdam, the Netherlands, (Oct. 2016).
[project page] [paper] [code&demo] [dataset&model] [bibtex] [poster] [AMT instruction] [patent filed]
-
S. Kong, S. Punyasena, C. Fowlkes, "Spatially Aware Dictionary Learning and Coding for Fossil Pollen Identification", CVPR CVMI Workshop, Las Vegas, NV, (July 2016).
[project page with code&demo] [paper] [bibtex] [talk] [poster]
-
Shu Kong, Zhuolin Jiang, Qiang Yang, "Modeling Neuron Selectivity over Simple Mid-Level Features for Image Classification", IEEE Trans. on Image Processing, 2015
[paper]
-
Yuetan Lin, Shu Kong, Donghui Wang, Yueting Zhuang, "Saliency Detection within a Deep Convolutional Architecture", AAAI'14 Workshop on Cognitive Computing for Augmented Human Intelligence, 2014.
[paper]
-
Donghui Wang, Shu Kong, "Learning Class-Specific Dictionaries for Digit Recognition from Spherical Surface of a 3D Ball", Machine Vision and Applications (MVA), 2012.
[paper] [SingleBall_dataset (288MB)] [MultiBall_dataset (121MB)]
Abstract/Workshop
-
Zhiyuan Fang, Shu Kong, Charless Fowlkes, Yezhou Yang, "Modularized Textual Grounding for Counterfactual Resilience", Language and Vision Workshop joint with CVPR, 2019.
-
Surangi W. Punyasena, Shu Kong, Charless C. Fowlkes, "Improving the taxonomic accuracy and precision of fossil pollen identifications", North American Paleontological Convention, Riverside, USA, 2019.
-
Ingrid Romero, Shu Kong, Charless C. Fowlkes, Michael A. Urban, Surangi W. Punyasena, "Automated Neotropical Fossil Pollen Fabaceae Analysis Using Convolutional Neural Networks", GSA Annual Meeting in Indianapolis, Indiana, USA, 2018.
-
Zhiyuan Fang, Shu Kong, Tianshu Yu, Yezhou Yang, "Weakly Supervised Attention Learning for Textual Phrases Grounding", Language and Vision Workshop joint with CVPR, 2018.
-
Shu Kong, Charless C. Fowlkes, "Low-rank Bilinear Pooling for Fine-Grained Classification", the Fourth Workshop on Fine-grained Visual Categorization joint with CVPR, 2017.
-
Shu Kong, Charless C. Fowlkes, "Recurrent Scene Parsing with Perspective Understanding in the Loop", Southern California Machine Learning Symposium, 2017.
-
Ingrid Romero, Shu Kong, Charless C. Fowlkes, Michael A. Urban, Carlos D'Apolito, Carlos Jaramillo, Francisca E. Oboh-Ikuenobe, Surangi W. Punyasena, "Novel Morphological Analysis of a Fossil Fabaceae Pollen Type, Striatopollis catatumbus (Tribe Detariae)", GSA, 2017.
-
Romero, I.C., S. Kong, C.C. Fowlkes, M.A. Urban, C.A. D'Apolito, C. Jaramillo, F. Oboh-Ikuenobe, and S.W. Punyasena, "Cenozoic biogeography of Striatopollis catatumbus (Fabaceae Detariae)", AASP-The Palynological Society, 2017.
-
Derek S. Haselhorst, Shu Kong, Charless C. Fowlkes, J. Enrique Moreno, David K. Tcheng, Surangi W. Punyasena, "Automating tropical pollen counts using convolutional neural nets: from image acquisition to identification", the iDigBio inaugural conference, 2017.
-
Surangi W. Punyasena, Shu Kong, Charless C. Fowlkes, and Stephen P. Jackson, "Reconstructing the extinction dynamics of Picea critchfieldii - the application of computer vision to fossil pollen analysis", the iDigBio inaugural conference, 2017.
-
Shu Kong, Charless C. Fowlkes, "Low-rank Bilinear Pooling for Fine-Grained Classification", Southern California Machine Learning Symposium, 2016.
Patents
- Utilizing deep learning to rate attributes of digital images, US 2018 / 0268535 A1
- Utilizing deep learning for rating aesthetics of digital images, US 20170294010
- Method and Apparatus for Image Content Recognition, CN 201410350987.X
- Method and Apparatus for Image Feature Extraction, CN 201410223300.6
Funding/Support
- CMU Argo AI Center for Autonomous Driving Research, 2020 --
- Bob & Barbara Kleist Endowed Graduate Fellowship, 2019 - 2020
- NIA R01AG057748, 2019
- CVPR PhD Consortium, 2019
- IIS-1253538, 2016-2020
- NSF DBI-1262547, 2015-2020
- WACV PhD Consortium, 2019
- Google Graduate Student Award, 2017
- Hardware donation from NVIDIA, 2016
- Janelia Junior Scientist Workshop Travel Grant, 2016
- Adobe Research Gift, 2015
- Multidisciplinary Design Program Grant, 2014-2015
- ECCV Travel Award, 2012
Presentation/Talk
-
"Pixel-Level Learning and Prediction for Fine-Grained Visual Understanding", GVV @ MPI-Informatik, hosted by Prof. Christian Theobalt, November 4, 2019.
-
"Unsupervised Depth Learning from Monocular Videos: Is It Done Right?", Mobile Vision, Oculus, Facebook Research, August 22, 2019.
-
"Attending to Pixels, Embedding Pixels, Predicting Pixels", CMU VASC Seminar, hosted by Prof. Deva Ramanan and bro Peiyun Hu, Aug. 6, 2019.
-
"Attending Pixels, Embedding Pixels, Predicting Pixels", Mobile Vision, Oculus, Facebook Research, July 18, 2019.
-
"Attending Pixels, Embedding Pixels, Predicting Pixels", CVPR PhD Consortium with Prof. Cordelia Schmid, June 19, 2019.
-
"Attending Pixels, Embedding Pixels, Predicting Pixels", vision@Caltech, hosted by Prof. Pietro Perona and Oisin Mac Aodha, June 6, 2019.
-
"Attention to Pixels, embed pixels, track pixels", UC Berkeley BAIR of Prof. Alyosha Efros and Prof. Hany Farid, May 24, 2019.
-
"Video Mining by Weakly/Un-supervised Learning", CLVR@USC of Prof. Joseph Lim, May 16, 2019.
-
"Video Mining: from Sub-pixel to Causality", Video Computing Group at UC Riverside of Prof. Amit Roy-Chowdhury, April 25, 2019.
-
"Predictive Filter Flow: Diving into (Sub)pixels with Unsupervised, Controllable and Interpretable Learning", hosted by "Academic Uncle" Alyosha Efros@BAIR and Andrew Owens, Feb. 18, 2019.
-
"Fine-Grained Visual Understanding and Learning", WACV PhD Consortium of Prof. Larry S. Davis, Jan. 8, 2019.
-
"Fine-Grained Image Understanding", Traceup, Sep. 14, 2018.
-
"More to Say About ImageNet Models", UCI Computational Vision Group, May 29, 2018.
-
"Pay Attention to the Pixel, Understand the Scene Better", Center for Machine Learning and Intelligent Systems, UCI, May 14, 2018. [talk]
-
"(Dis)entangling Fine-Grained Scene Parsing", UCI Computational Vision Group, May 9, 2018.
-
"Scene Parsing through Per-Pixel Labeling: a better and faster way", ASU Active Perception Group Seminar, hosted by Prof. Yezhou Yang and bro Jacob Fang, ASU, March 23, 2018. [talk]
-
"Towards Human-Object Interaction, and Beyond", UCI Computational Vision Group, February 27, 2018.
-
"Learning to Group Pixels into Boundaries, Objectness, Segments and Instances", UCI Computational Vision Group, October 31, 2017.
-
"Predicting Real-World Distance between 360 Photos using Deep Learning", Geo, Google, September 5, 2017. [talk]
-
"Recurrent Scene Parser with Perspective Estimation in the Loop, and beyond", DBH, UCI, April 19, 2017. [talk]
-
"Semantic Segmentation: Tricks of the Trade", UCI Computational Vision Group, Feb 22, 2017.
-
"Ubiquitous Fine-Grained Computer Vision", UCI Computational Vision Group, Nov 30, 2016. [talk]
-
"Instance Segmentation", UCI Computational Vision Group, Nov 21, 2016. [talk]
-
"Low-rank Bilinear Pooling for Fine-Grained Classification", Southern California Machine Learning Symposium, Caltech, Nov 18, 2016.
-
"Automated Biological Image Analysis using Computer Vision and Machine Learning through Identification, Counting, Detection and Segmentation", Junior Scientist Workshop on Machine Learning and Computer Vision, Janelia Research Campus, Oct 2-7, 2016.
-
"Geographically Aware Knowledge Mining on Mobile Data", UCI Data Hackathon, May 15, 2016. [slides]
-
"Selecting Patches, Matching Species: Fossil Pollen Identification by Spatially Aware Coding", UCI Computational Vision Group, Apr. 06, 2016. [slides]
-
"From Linear to Bilinear, and Beyond", UCI Computational Vision Group, Jan. 20, 2016. [slides]
-
"Deep Understanding Image Aesthetics", UCI Computational Vision Group, Sep. 30, 2015. [slides]
-
"Image Quality and Aesthetics Estimation", Adobe Research, Sep. 18, 2015.
-
"Automated Biological Image Analysis using Computer Vision and Machine Learning", Multi-Disciplinary Project Research Symposium, Calit2 Auditorium, May 30, 2015.
-
"Beyond R-CNN detection: Learning to Merge Contextual Attribute", UCI Computational Vision Group, UCI, Jan. 29, 2015. [slides]
-
"A Story from Saliency to Objectness and Extension by Deep Neural Network with Perspective and Doubt", UCI Computational Vision Group, Nov. 6, 2014. [slides]
Services
Organizer
Reviewer/Program Committee
-
Conference: CVPR, ICCV, ECCV, ICLR, NeurIPS, ICML, UAI, AAAI, BMVC.
-
Journal: IEEE PAMI, IJCV, IEEE TIP, RA-L, IEEE JBHI, IEEE TKDE, PLOS ONE, IEEE THMS, IEEE CYB, JVLC, Palaeo Electronica, PRLetters, IEEE Access, MVAP, DSP, IEEE SPLetters.
Mentorship Program
-
CMU AI Mentoring Program mentor, 2020
-
Undergrad GradSchool Q&A Panel (2017), UROP (2015), MDP (2015), Individual Study CompSci299 (2015~2019)
Department/School/University Service
-
MSCV Admissions Committee RI-CMU, 2021
-
Student Committee of Faculty Hiring CS-ICS-UCI: 2018, 2019
-
Graduate Open House Host: 2018, 2019
-
Panelist@ASUCI Research Mobilization Commission, 2019
Consultant
-
RGG (2020-), Trace (2018-2019), US Cabinets Online (2018), Paralian Tech (2017)
Teaching
-
Big Data Image Processing & Analysis Course Information (2017Fall), Computational Photography and Vision (2017Spring), Big Data Image Processing & Analysis Course Information (2016Fall), Graph Algorithms (2016Spring), Machine Learning and Data Mining (2015Winter), Introduction to Graphic Models (2015Fall), Graph Algorithms (2015Spring), Machine Learning and Data Mining (2014Winter), Introduction to Artificial Intelligence (2013Spring), Computer Vision (2012Fall), Logic and Computer Design Fundamentals (2011Fall).
Misc
- .. -. / .... .. -- / .-- . / .-.. --- ...- .
-
I love mentoring and educating, perhaps because it is in my blood: I am a 76th-generation descendant of Confucius, with my family seniority as Ling (令).
-
བཀྲ་ཤིས་བདེ་ལེགས, I have a Tibetan name, Tenzing Luobu, 单增罗布.
-
I was a co-founder of SEED, a Registered Campus Organization that promotes harmony and love within the campus and brings critical thinking and a loving attitude across cultures into daily life.
-
I'm very slow in responding to messages on all kinds of social media (I'm anti-social-media :-), so email is the best way to reach me.
-
I am actively involved in cross-disciplinary research, e.g., Big Data Image Processing and Analysis (Big DIPA).
-
Joan Agulilar and I designed "almighty search" for the Snake game. The "almighty search" can always achieve the highest score; see the description here and the technical report here.