Web Image Prediction Using Multivariate Point Processes
Motivation of Research
The main research problem of this project is as follows.
Given a query keyword (e.g., world+cup) and any future time point, can we predict what images are likely to appear on the Web?
The term world+cup mainly refers to the soccer event, so soccer images may be a reasonable guess. However, as shown in the figure below, actual user photos are extremely diverse (e.g., skiing, skating, cycling, or horse riding), because usage of the term varies widely with different users' experiences and preferences.
As input, we download all images retrieved from Flickr by the query term world+cup up to a certain time point (e.g., the end of 2008). We also assume that each image is associated with metadata such as a timestamp and an owner ID. We then learn a temporal model for each topic keyword that describes the relations between image occurrence probabilities and the factors, or covariates, that influence them. Finally, the learned model is used to predict likely images for a given topic keyword and a future time point.
We call the prediction of photos for arbitrary individuals collective image prediction. If a particular user is specified at query time, the prediction can focus more on that user's unique view of the topic. We call this task personalized image prediction.
Our unified statistical framework for the image prediction problem is the multivariate point process, a stochastic process that consists of a series of random events occurring at points in time and space. Point processes are powerful statistical models for spatio-temporal events.
Naturally, one occurrence of a particular image at a particular time can be represented by a point in time and image space. Consider an example of a short stream of penguin images. Each image is associated with a timestamp and a visual cluster ID obtained by image clustering. We can then represent this stream of images directly as a discrete-time trivariate point process, as in Fig. 2(b). Finally, we formulate a regularized Poisson regression model to learn the relations between image occurrence and the covariates that influence it in a flexible, scalable, and globally optimal way.
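The representation and the regression step described above can be illustrated with a small sketch. It is not the project's implementation: the function names (to_count_matrix, fit_poisson_ridge), the toy timestamps, the intercept-only covariate, and the choice of an L2 penalty fitted by Newton's method are all assumptions made for illustration. The sketch bins a stream of (timestamp, cluster ID) pairs into a discrete-time count matrix with one row per visual cluster, then fits a penalized Poisson regression to one row.

```python
import numpy as np

def to_count_matrix(timestamps, cluster_ids, n_clusters, n_bins, t_start, t_end):
    """Bin (timestamp, cluster ID) events into an n_clusters x n_bins count matrix.

    Row k is the discrete-time event sequence of visual cluster k.
    """
    counts = np.zeros((n_clusters, n_bins), dtype=int)
    width = (t_end - t_start) / n_bins
    for t, c in zip(timestamps, cluster_ids):
        # Clamp so that an event exactly at t_end falls in the last bin.
        b = min(int((t - t_start) / width), n_bins - 1)
        counts[c, b] += 1
    return counts

def fit_poisson_ridge(X, y, lam=0.0, n_iter=25):
    """Fit Poisson regression with an L2 penalty (an assumed regularizer).

    Minimizes sum(exp(X w) - y * (X w)) + (lam / 2) * ||w||^2 by Newton's
    method. The objective is convex, so a stationary point is a global optimum.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ w)                      # predicted rate per time bin
        grad = X.T @ (mu - y) + lam * w
        hess = X.T @ (X * mu[:, None]) + lam * np.eye(X.shape[1])
        w = w - np.linalg.solve(hess, grad)
    return w

# Hypothetical toy stream: five images, three visual clusters, five unit-length bins.
timestamps = [0.5, 1.2, 1.7, 3.1, 4.9]
cluster_ids = [0, 1, 0, 2, 1]
counts = to_count_matrix(timestamps, cluster_ids, n_clusters=3, n_bins=5,
                         t_start=0.0, t_end=5.0)

# Intercept-only covariate for cluster 0; real covariates (for example,
# seasonal indicators or past counts) would be added as further columns.
X = np.ones((counts.shape[1], 1))
w = fit_poisson_ridge(X, counts[0], lam=0.1)
rate = float(np.exp(w[0]))  # estimated occurrences of cluster 0 per time bin
```

With three clusters the count matrix has three rows, which corresponds to the trivariate case. Predicting for a future time bin would amount to evaluating exp(x · w) on that bin's covariate vector x for each cluster and ranking clusters by the resulting rates.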
Some examples of collective and personalized prediction are as follows.
In this paper, we discuss the Web image prediction problem. This research is important because it is an image-based approach for user behavior prediction, and can be easily extended to time-sensitive image reranking.
From experiments, we observe that Web image collections are extremely diverse, but some topics follow predictable patterns. Specifically, our predictive model works well for polysemous topics that show strong annual or periodic trends. We also conclude that image-based personalization is highly desirable, because images can convey subtle information about user preferences that text descriptions can hardly capture. For example, for the fine+art topic, we can effectively address a question like "What styles of paintings does a user like?"