Thursday, October 31, 2013

11-7 Optional alternative reading


  1. Summary: The authors examined the eye movements of 31 users while they performed Google searches. Study subjects were given 10 questions to answer through their searches: five homepage searches and five informational searches. The study showed that users spent about the same amount of time on the first and second results in the list, and few users looked beyond the sixth result, which was right at the page break. The authors intend to do future studies combining eye tracking and relevance judgments.

    1. In the section on future work, the authors state that they are gathering relevance judgments for a further study. Why is it important to assess the relevance of the documents apart from the eye tracking study? Could eye tracking be used as a substitute for relevance judgments?

    2. The authors used a variety of search types for their subjects. Why is it important to have variety in the types of searches? Would the type of search affect how the user looked at the page, and for how long?

    3. In studies on eye tracking, how do researchers know what eye movements mean, and whether there is a correlation with relevance? Does a fixation mean a user is satisfied with the search, or just that the item has a catchy title?

  2. Can Relevance of Images Be Inferred from Eye Movements? by Klami, Saunders, Campos, and Kaski

    Summary: In this article, the authors suggest a model of IR that would no longer rely on written metadata for searchable images, but would instead use a combination of content-based features, eye movements, more advanced algorithms, and feedback regarding image relevance. The experiment is kept simple: a user goes through 100 pages of images, with four images on each page, identifying the sports-related image on each page. The user gives explicit feedback via the keyboard about which image on the page was the relevant one. The authors describe the equipment used and the ways in which they attempt to create a realistic or typical workstation. Users judged relevance correctly 95% of the time in this experiment, and the authors chart eye movement using nine different criteria in their findings (p. 4). Using this eye movement data and a Fisher score representation retrieval system (p. 5), which uses texture and color as classifiers, more relevant images can be returned to the user based on the kinds of things their eye movements suggest are relevant. They also found that "gaze patterns are relatively universal for this kind of task" (p. 6), which means that this kind of study does not have to be conducted on an individual basis; training data can instead be relied upon.

    1. In the description of their study, the authors state: "However, we also provide a demonstration that the image collection used in the experiments would provide sufficient content-level information for searching similar images based on the feedback" (p. 2). Does this mean that a results page of relevant documents (images) would be highly dynamic, changing periodically to show "similar images" based on the user's eye movements? The description of the Fisher score retrieval system was a bit confusing to me: how can the texture and color classifiers be enough when they don't seem to recognize context (p. 5)?

    2. Users, especially those in the "millennial" generation, are so used to multitasking and dedicating attention to more than one place at a time that I wonder how the authors of this study account for natural distraction. They address this briefly: "The users were asked to perform the task as quickly as possible, to avoid eye movements not related to the task" (p. 3). But is this request enough to account for this natural tendency?

    3. I have been thinking about privacy issues while reading this. How can this be implemented to help users as they search from their home or work computers? Aren't these cameras a bit invasive? The universality of user eye motion discussed near the conclusion seems to answer at least part of this question, though.
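    The gaze criteria the authors chart (fixation counts, durations, transitions between images) can be approximated with simple per-image statistics. Below is a minimal sketch, assuming a hypothetical fixation log of (image_id, duration_ms) tuples in viewing order; it illustrates the kind of features involved, not the authors' actual pipeline.

```python
from collections import defaultdict

def gaze_features(fixations, n_images=4):
    """Per-page gaze statistics of the kind the authors chart.

    `fixations` is a hypothetical log: (image_id, duration_ms) tuples
    in viewing order, one entry per fixation on the page.
    """
    total = defaultdict(float)   # total fixation time per image
    count = defaultdict(int)     # number of fixations per image
    transitions = 0              # jumps from one image to another
    prev = None
    for image_id, duration in fixations:
        total[image_id] += duration
        count[image_id] += 1
        if prev is not None and image_id != prev:
            transitions += 1
        prev = image_id
    features = {
        i: {
            "total_ms": total[i],
            "n_fixations": count[i],
            "mean_ms": total[i] / count[i] if count[i] else 0.0,
        }
        for i in range(n_images)
    }
    return features, transitions

# Example page: the user keeps returning to image 1.
page = [(0, 180), (1, 420), (1, 250), (3, 90), (1, 310)]
feats, hops = gaze_features(page)
# image 1 accumulates by far the most fixation time
```

    A relevance predictor would then be trained on feature vectors like these, much as the authors do with their nine criteria.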

  3. Information Retrieval by Inferring Implicit Queries from Eye Movements

    Summary: The researchers test whether they can use the results from an eye tracking study to formulate implicit queries that help determine document relevance and ranking. They have 10 participants evaluate 500 Wikipedia documents from 25 categories. They use support vector machines (SVMs) to compute ideal weights for each term that appears in the documents (and in the dictionary), and to determine the relevance of the larger set of unseen documents. They then compare two methods: one using only eye tracking results, where no explicit feedback is available, and one combining eye tracking and explicit relevance feedback. Their initial hypothesis is that the combined method will outperform the other method when giving relevance feedback. They go on to establish that this is not true for all topics. However, the overall precision of the combined method is better than that of the implicit-only method.

    1. How much does a user’s knowledge of the subject, or familiarity with the topic, impact the results of an eye-tracking analysis? For example, the largest dot in Figure 2 is on Xenotarsosaurus. If a group of paleontologists were ranking the documents, would the dot have been as large, and thus would the magnitude of the inferred weight have been as high? So when using eye tracking to judge relevance, is it imperative to use a mixed bag of documents, not particular to one topic, and to choose participants accordingly? And do you think the methods described in this experiment reflect the “true interests of the user”?

    2. How does eye tracking account for possible distractions that might sway results? For example, some form of media or a differently styled word might catch the participant’s attention. How are such situations treated when analyzing the results?

    3. I was wondering how (or whether) eye tracking would work for web search or a list of items. Won’t the document rank or the list order influence how that page is viewed?

  4. Eye-Tracking Analysis of User Behavior in WWW Search by Laura Granka, Thorsten Joachims, and Geri Gay

    The authors analyze how users interact with the search results page of a web search engine, based on eye tracking. They claim that this study would help in improving interface design as well as in making more accurate interpretations of implicit feedback for machine learning. Thirty-six undergraduate students were recruited to answer ten questions using Google’s search results, and their behavior was tracked without their being informed. The experimental results show that users generally follow a top-down approach when searching for documents relevant to their queries. The study also depicts user behavior on a temporal basis, concluding that the more time a user spends reading the abstract of a link, the more likely he or she is to choose that link as a relevant read.

    1. The authors mention that previous work lacked robustness and proved less capable of generating patterns of user search and scanning behavior. They claim to address these issues in their study using eye tracking, but the experimental results and statistics do not explicitly show how their system is better or an enhanced approach. How have they handled the issues of robustness and better predictability of user behavior in the study? They fail to quantitatively compare their results with prior work, which could have substantially helped in understanding the study.

    2. Using the eye-tracking method, it is possible to know where users are spending more time and where they are not, which indirectly informs relevance. But gaze alone cannot be tracked and used as a metric for relevance judgments, because of inconsistency in users’ behavior. How can one eliminate the distractions that cause the user to view or perform tasks other than the intended one? Can they be eliminated at all?

    3. Section 4 addresses two questions: first, how does rank influence the amount of attention a link receives, and second, how do users explore the list? The results of the study do not indicate any significant conclusion different from those obtained in earlier studies that used click-through rates and other temporal metrics. So how does eye tracking add value to analyzing user behavior when identifying the notion of relevance from the user’s perspective?

  5. Information Retrieval by Inferring Implicit Queries from Eye Movements by Hardoon et al.

    This paper introduces a method to learn and predict queries from eye movements. The search system adopts the bag-of-words approach with a TF-IDF retrieval function. An eye tracker is used to extract fixation words, i.e., words in focus, to construct query vectors. Ideal term weights are determined by an SVM classifier, and a regression system then maps the term weights derived from the eye tracker to these ideal weights.

    1. Fixation words may be ones that are not relevant or ones that are difficult to determine. If a system weights these words highly the direction of the search system may not be the one intended by the user. How can the user then make corrections? The concern is that once the system starts heading in the wrong direction, it may be difficult to bring it back on track.

    2. The queries constructed may not have any semantic meaning. Might looking for such relationships between fixation words be a useful preprocessing step?

    3. Since documents are often a rich source of information on a given topic and its facets, how can an eye tracker differentiate between multiple intents? Is this reflected as long queries predicted by the system?
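    The fixation-word query idea in this paper can be illustrated with a simplified sketch: build TF-IDF document vectors, weight each fixated term by its total fixation time (a crude stand-in for the paper's learned SVM/regression weights), and rank documents by cosine similarity. All data and names below are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF bag-of-words vectors for a small corpus (dicts term -> weight)."""
    df = Counter()
    for d in docs:
        df.update(set(d.split()))          # document frequency per term
    n = len(docs)
    return [{t: c * math.log(n / df[t]) for t, c in Counter(d.split()).items()}
            for d in docs]

def implicit_query(fixation_words, durations):
    """Weight each fixated term by total fixation time (not the paper's
    learned weights; just an illustrative proxy)."""
    q = Counter()
    for word, ms in zip(fixation_words, durations):
        q[word] += ms
    return dict(q)

def cosine(q, d):
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    nq = math.sqrt(sum(w * w for w in q.values()))
    nd = math.sqrt(sum(w * w for w in d.values()))
    return dot / (nq * nd) if nq and nd else 0.0

docs = ["eye tracking study of search",
        "image retrieval with color features",
        "eye movements predict search relevance"]
vecs = tfidf_vectors(docs)
q = implicit_query(["eye", "search", "relevance"], [300, 200, 450])
ranked = sorted(range(len(docs)), key=lambda i: cosine(q, vecs[i]), reverse=True)
# the document sharing the most fixated terms ranks first
```

    The paper replaces the duration-based weighting here with weights learned by regression against SVM-derived ideal weights; the retrieval step is the same bag-of-words ranking.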

  6. Can Relevance of Images Be Inferred from Eye Movements?
    Summary: The authors are inspired by the work done to try to gather relevance judgments of text documents by tracking eye movement. The authors conducted an experiment in which they tried to predict whether an image was relevant to sports based on the user’s eye movements. The authors made a few important design decisions; most notably, they allowed at most one relevant image to appear on a page. The experiment had 100 pages, of which 70 had relevant images on them and 30 had no relevant images at all. The authors gathered the eye data in real time and later went back to break down the information, trying to answer two questions. First, can they figure out whether the page has a relevant image? Second, which of the four images on the page is relevant? In the end, the authors felt their approach led to promising results and, as of the time of publication, were looking into how to improve their ability to infer relevance.

    1. The authors stated that they elected to have at most one relevant image per page because they wanted to reduce the data load during evaluation. In addition, only four images are displayed on each page of the experiment. All in all, these constraints do not reflect a real-world image retrieval setting. Setting aside the number of images per page, why must they constrain the number of possible relevant images per page? Whether or not an image is relevant, the system still needs to track the user’s eye movements and focal points across all four images. Therefore, I don’t see how the data or computational load can be reduced by having at most one relevant image. In addition, would it really be easy to extend a system designed to look for one or no relevant images to track any number of relevant images?

    2. The experiment is set up so every user looks through 100 pages. Although there are only four images to a page, 100 pages still seems like a long trial. The longer a task takes, the more likely the user is to become tired or lose focus on the task at hand. In an alternate reading I did about gamifying crowdsourced tasks, the authors make a seemingly valid argument that the more people enjoy a task, the more engaged they are and the more they strive to produce correct results. Setting an experiment at 100 pages appears to be in conflict with this idea. Since the documents under test are images, and easier to process than long written documents, did the authors feel this was not an important experimental factor?

    3. For their experiment, the authors established two research questions: first, to determine whether there is a relevant image displayed on the page; and second, to determine which image is relevant. However, in their evaluations, the authors focused on techniques to model user behavior and predict what the user deems relevant. In the end, the authors’ main evaluation compares their user-specialized model to their generic model, which is based on all users from the study. Since the authors only briefly discuss the two research questions they listed, why did they not add a third research question about specialized versus generic modeling of eye movements?

  7. I have substituted “Information Retrieval by Inferring Implicit Queries from Eye Movements” by Hardoon and Shawe-Taylor for “Can eyes reveal interest? Implicit queries from gaze patterns” by Hardoon et al. The authors use eye movements to form an ‘IR query’ that is used to rank unseen documents based on their relevance. The study encompasses one query formulation based on eye tracking alone and another using eye tracking plus other explicit information. Using the bag-of-words representation of a document and the TF-IDF approach, the authors choose the query function to be an SVM-based model to compute weights and predict the relevance of unseen documents. The authors conclude by stating that eye movements are indeed a useful feedback channel, but add that they were not always found to be significantly helpful. My questions for discussion are as follows:

    1. It is not evident from the paper whether the authors take into account that the user can be highly susceptible to distractions. A ‘fixation’ might not correspond to relevance or interest, as the user might also be thinking about something else. How does the study take this into account?

    2. Another thing that is not clear from the paper is how the authors can assume there is a single intent, or even the presence of an intent, on the part of the user. How do we classify intent? It is subjective, and it is not clear how the eye tracker can correlate intent with fixation.

    3. There seems to be a lot of unaccounted-for bias in the study. The study seems biased toward words the user has already seen, and against words that might be synonymous with what the user is looking for but that the user is not familiar with.

    4. Why did the authors choose to study only expert users? Is there any specific reasoning behind this?

  8. Article: Eye-Tracking Analysis of User Behavior in WWW Search

    In this article, the authors look at how users interact with results presented from an online search engine by using eye-tracking. They took 36 participants and gave them 10 tasks to complete using Google. They then tracked how these users interacted with the results.

    1. The authors note that they had trouble calibrating the eye-tracking devices, and as a result only 26 of their 36 subjects’ data could be used. What calibration problems could there have been that led to the data being unusable?

    2. In the results section, the authors state that they analyzed user behavior up to the point where the user clicks on the first link or otherwise exits the page, but no further clicks. Wouldn't further clicks provide interesting and useful information about how users interact with online search?

    3. In the future works section, the authors note that they are collecting relevance judgments for a future study. How would relevance judgments aid in learning how users search online? If fixation and eye movement behaviors are what are being looked at, how does the relevance of the document being looked at help understand the user?

  9. Can eyes reveal interest? Implicit queries from gaze patterns

    Klami et al. explore an alternative method for improving image search: implicit relevance feedback through eye movements. First, they test the idea of using gaze data to infer relevance by having a user look at 100 pages, each with four images, and determine whether any image is relevant to a text query. Then, using the data collected, they train a classifier to predict whether there are relevant images on new pages for a given user. Moreover, using the same data and classifier, they predict relevance for a page given a new user. The results of this experiment show that it is possible to accurately predict relevance from gaze data.

    1) Can you elaborate more on the Gaussian mixture model (GMM) and the bag-of-visual-words (BOV) representation?

    2) During the experiment, there were some (user, page) pairs for which the eye tracker couldn't collect any data. Given that some images are obviously relevant to the query “sports”, so that a user responds immediately, is it possible that most of the discarded pairs arose from such immediate responses?

    3) What do the authors mean by exploration and exploitation strategies? Are these strategies related to diversification of results?

  10. Article: Can Relevance of Images Be Inferred from Eye Movements?
    The authors have devised an image retrieval algorithm based on eye movement tracking and user feedback. They performed an experiment in which images from a specific category (images pertaining to sports) were shown to users alongside non-relevant images, and feedback was gathered from the user about the images shown on each page. Combining this feedback with the tracked eye movement data and other image processing techniques, they devised an algorithm to retrieve relevant images.

    Q. Eye tracking seems to provide good results for image retrieval. But is that because there are not many evaluation measures for image retrieval algorithms, or because eye tracking is more important when it comes to evaluating images?

    Q. In one of the examples given by the authors, they show that the user didn't fixate on one of the images because he saw another image related to sports, so he didn't even need to check the rest of the images on the page. I believe this assumption might be wrong, because the user might not even have needed to fixate on the image to judge whether it was relevant. It is possible that he judged the image in hindsight and realised it was not worth noticing. This conjecture would be proved wrong if the eye tracking device used is very sensitive and able to monitor even slight movements of the eye.

    Q. Can the eye tracking algorithm be used to provide graded relevance in some manner, rather than just binary relevance? Would the testing be more beneficial if the process were extended on the premise that if the user fixates for a longer duration on an image, it might be because the image is more relevant?

  11. Can Relevance of Images be Inferred from Eye Movements? by Klami et al.
    Summary: The authors wanted to establish whether or not tracking eye movements can be an adequate proxy for relevance judgments. They develop a task featuring 400 images, about 70 of which have to do with sports. The category is loosely defined on purpose and meant to leave some images open to user interpretation. A user (subjected to eye tracking technology) is presented with 100 pages of images, four images to a page. Either one of the four images will be related to sports or none of them will. The user either marks the entire page relevant (if it has one relevant image) or not relevant. These aspects of gaze patterns are examined: 1) total length of fixations, 2) number of fixations, 3) average length of fixations, 4) number of transitions from one image to another, 5) number of images with at least one fixation, and 6) number of fixations within an image. The authors ultimately wanted to predict the relevance of a specific image, then use that data to train a predictor system to evaluate new images. They showed this was at least possible, with some work remaining to be done.

    1- The collection of images used appeared homogeneous (from the examples shown): they were all full-color photographs of common objects or people. The formula to predict the future relevance of a new image relied on a histogram of edge orientations and an RGB variance analysis. How do the authors expect their predictor to work on black-and-white, hand-illustrated, or abstract images?

    2- The paper clearly demonstrated a proof of concept and the authors are obviously enthusiastic but I wonder what they have planned for future work. More complex tasks? More varied images or topics? Videos? Larger collections? How do they plan to develop the predictor?

    3- The authors mention a future when eye tracking devices “may become one of the most informative and natural sensor mechanisms for gathering useful user data at low cost. Ubiquitous use of such devices would facilitate personalization and adaptivity of user interfaces…” (Obviously Google took the hint.) Besides implied relevance judgments, how could ubiquitous eye tracking improve IR systems?
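    The predictor features mentioned in question 1 above (a histogram of edge orientations plus RGB variance) are simple enough to sketch. The following is a minimal illustration under my own assumptions, not the authors' implementation; it takes a 2-D intensity grid and a list of RGB pixel tuples as stand-ins for a real image.

```python
import math

def edge_orientation_histogram(gray, n_bins=8):
    """Magnitude-weighted histogram of gradient orientations (a basic
    texture feature). `gray` is a 2-D list of pixel intensities."""
    hist = [0.0] * n_bins
    h, w = len(gray), len(gray[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]   # central differences
            gy = gray[y + 1][x] - gray[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi   # orientation in [0, pi)
            b = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[b] += mag                         # magnitude-weighted vote
    total = sum(hist) or 1.0
    return [v / total for v in hist]               # normalize to sum to 1

def rgb_variance(pixels):
    """Per-channel variance over a list of (r, g, b) tuples."""
    n = len(pixels)
    var = []
    for c in range(3):
        mean = sum(p[c] for p in pixels) / n
        var.append(sum((p[c] - mean) ** 2 for p in pixels) / n)
    return var

# A synthetic image with one vertical edge: the gradient points
# horizontally, so all the edge energy lands in the first bin.
gray = [[0] * 3 + [255] * 3 for _ in range(6)]
hist = edge_orientation_histogram(gray)
```

    Features like these say nothing about drawing style or abstraction, which is exactly why the question about black-and-white or hand-illustrated images seems worth asking.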