Research Tricks

Many roadblocks can slow down your research. These tricks are meant to keep you efficient when doing research. I try to list the ones that are less well known, although I also list important, well-known ones here, mainly as a reminder to myself. I learnt them mostly from the wonderful people around me, and have found them useful in pushing research (or any work) forward.
  • Searching for relevant research in Google Scholar

    Relevant research papers usually have to cite certain classical papers, so you can find those relevant works in the "cited by" list of the classical papers. However, Google Scholar doesn't allow you to search within the "cited by" list from its web GUI, even in advanced search. For example, this query gives over 1000 papers citing the classical RSJ weight paper. It would be a pain to look through all 1000 citers.

    Luckily, you can restrict by adding search terms in the URL! This query allows you to restrict by the terms structured retrieval models.

    There are still two unsatisfactory points:

    1. The terms you specify serve as a very strict filter using Boolean AND, and you cannot use the terms to only rank the results.
      A remedy: make sure all filtering terms you use are really necessary terms.
      To do that, one trick is to include as many synonyms/searchonyms for each term as possible, e.g. (structured OR field OR parsing OR NLP) AND retrieval AND models is used in this query. Searchonyms are words that may be used to refer to the same thing in the collection. For example, field retrieval is a specific form of structured retrieval; thus, field is a searchonym of structured for this query. You can even nest query operators, like "(structured OR field) retrieval model".
      Boolean-style queries of this type are extremely expressive and effective. A large part of my PhD thesis is about CNF queries. Take a look at more examples that I have used to do the literature review for my thesis.
    2. One thing Google doesn't yet have is a way to intersect the citers of several papers.
      In order to find results that cite two or more common papers, you'll need to use search terms, e.g. the titles of the cited papers. You can do phrase queries for this, but depending on how well Google extracts PDF contents, you may lose some relevant results.
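    As a rough illustration of the synonym trick above, here is a minimal Python sketch that assembles a CNF-style query (OR within each synonym group, AND across groups) and appends it to a "cited by" URL. The function names are made up for this example, and the base URL and the parameter names "cites" and "q" are assumptions about how the URL is laid out; check them against the URL your own browser shows.

```python
from urllib.parse import urlencode


def cnf_query(groups):
    """Join synonym groups with AND; join the terms inside each group with OR."""
    clauses = []
    for group in groups:
        # Quote multi-word terms so they are treated as phrases.
        terms = [f'"{t}"' if " " in t else t for t in group]
        clause = " OR ".join(terms)
        # Parenthesize only when the clause really contains alternatives.
        clauses.append(f"({clause})" if len(terms) > 1 else clause)
    return " AND ".join(clauses)


def cited_by_url(cited_id, groups):
    # ASSUMPTION: the base URL and the parameter names "cites" and "q" are
    # illustrative; verify them against a real "cited by" URL before use.
    params = {"cites": cited_id, "q": cnf_query(groups)}
    return "https://scholar.google.com/scholar?" + urlencode(params)


query = cnf_query([["structured", "field", "parsing", "NLP"], ["retrieval"], ["models"]])
print(query)
```

    Keeping each synonym group as its own list makes it easy to add a searchonym later without touching the rest of the query.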
  • Improve your writeup/paper iteratively

    1. Start writing with a sketch or outline of the things you want to put in.
    2. After finishing a draft, put it down for a couple of days, and start reading it fresh and critically.
    3. Start reading the paper from a random section/paragraph, imagining you are a reader who knows only a little about the paper and has jumped to that section. A navigable outline of the paper would be helpful here.
    4. Imagine that your reader has only a couple of seconds to read a section or paragraph in search of some useful information. Could your reader do that efficiently?
    5. The above two points may not be much of a problem for a short document, or if the reader has time, but these conditions are real luxuries for a writer. Luxuries spoil.
    6. Sometimes, jump out of the details and individual sections, and think broadly about the work. E.g.
      • Look only at the outline of the paper, and think whether that makes sense.
      • Think about how your work fits into the large body of work in your field.
      • Try to generalize: what general problem is your work really addressing?
      • What general new perspective have you brought in, and can you apply it elsewhere? Those other places may have existing work that is related to yours.
    7. Sometimes you've worked on and thought about the paper for so long, and become so familiar with it, that you think nothing should be changed: the paper is perfect.
      This would be a good time to talk to other people about your work (officemates, people not in your specific area), and even have people read your draft to give you feedback.
      This would also be a good time to set the paper aside for a couple of days or longer before returning to it.
      Another trick I find effective here is to imagine that you are a reader who wants to get some information out of the paper by jumping randomly into a section, and see whether that section is self-contained enough.
  • When you need motivation (what to do to keep yourself motivated)

    A PhD, or a Master's degree, or even just a year or a semester, is a fairly long time. At some point during that period, you may lack the motivation to work, or not know what to work on. Here are some tricks to help you avoid that experience.
    1. Be SMART: set up Specific, Measurable, Attainable, Realistic and Timely goals.
    2. The trick from Hemingway: Sometimes at the beginning of a morning, you have lost the context of what you were working on and have a hard time picking up yesterday's work. A trick to avoid being stuck like this is to stop each day's work in the middle of something, e.g. for writing, leave a half-finished sentence. This way, it will be easy for your mind to get going the next day.
  • Data analysis

    • Inspecting experiment results

      Many research ideas come from inspecting experimental results, not necessarily end-to-end results, but also analyses of data at various stages. Real world data are usually multi-faceted, and complex. When inspecting results, always first make your display intrinsic, intuitive and simple. This will make it easy to identify general problems and trends.

      For retrieval results, your analysis should first start with a goal, e.g. retrieval effectiveness. Then, identify the relevant aspects, e.g. query quality, collection and retrieval algorithm, and display the central characteristics of these. Including result snippets in the display helps identify retrieval algorithm problems more easily.

      Some people say research is about finding the right way to display data, and the new idea comes nearly free given the displayed data. This means, the design of result display is an integral part of data/result inspection.

      More broadly,

    • Related to above, how to design metrics to make effective observations

      This is especially the case in an industrial setting, where everything needs to be measurable and accountable, and you also have to support/motivate any action with real results.
    • For improving a current technique or system

      Here, the standard thing to do is failure analysis: identify the failed/suboptimal cases for your system, analyze the causes of the suboptimal performance, and estimate the percentage of cases each cause accounts for, so that you can focus on the areas that need the most attention and may be the most impactful.
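      The counting step of a failure analysis can be sketched in a few lines of Python. This is only an illustration: the cause labels below are hypothetical, and in practice they would come from manually inspecting each failed case.

```python
from collections import Counter


def cause_breakdown(failure_causes):
    """Return (cause, count, percentage) tuples, most frequent cause first."""
    counts = Counter(failure_causes)
    total = sum(counts.values())
    return [(cause, n, 100.0 * n / total) for cause, n in counts.most_common()]


# Hypothetical labels: one manually assigned cause per failed query.
labels = ["vocabulary mismatch", "ambiguous query", "vocabulary mismatch",
          "ranking error", "vocabulary mismatch"]
breakdown = cause_breakdown(labels)
for cause, n, pct in breakdown:
    print(f"{cause:<20} {n:>3} {pct:5.1f}%")
```

      Sorting by frequency puts the cause that deserves attention first, which is the whole point of the exercise.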

      You may design different solutions targeting the different causes. However, don't forget that sometimes a simple solution might solve several of your problems at once!

      The scientific way to design a solution is to formulate your goal in the form of e.g. an optimization problem. Make all assumptions explicit, and derive the optimal solution to your problem given the simplifying assumptions.

      Now that you have all your assumptions laid out for investigation, you can see which assumption is causing most of the problems. Weaken that assumption so that it's more realistic.

  • Writing/talking as a way to clarify ideas

    An idea here can be an opinion about a problem or any topic, a sketchy solution to a problem, or anything. When an idea jumps into your head, no matter how clear you think it is, it's usually vague. At that moment, trying to write it down in one or two sentences is helpful. When it starts to show promise, go on to write a paragraph on why you think it will work. And basically, whenever something becomes more complex than you thought, you should write about it.

    For example, when learning about a field, try to lay out the key components of the field and write about them: why they are key, and how they are connected.

    Or, for example, when listening to talks or reading books, try to make as many notes as possible, either by writing sentences down or by drawing something, and also ask lots of questions. Making things verbal or visual helps the mind reason.

    For this method to take maximal effect, when writing, you'll need to be as faithful to the reality/truth as possible, as specific as possible, but at the same time concise and general. And let the writing make its own way through.

    Talking about the idea to yourself, or with your friends is very similar to writing. For important things, make up some presentation slides and imagine you are giving a talk to a large group of people. Talking to a real person is usually faster than talking to yourself or writing it up, and leaves less time for thinking.

  • Keep yourself and all resources busy

    1. To keep yourself busy, do multiple projects at the same time. We have to admit that research projects usually have lows and highs in terms of progress, because of dependencies on other resources or simply because you don't know what to do next. Doing more than one project will naturally interleave them, so that you always produce reasonable output.
      However, a certain cost is associated with switching mindsets among projects. You need to schedule well, and manage yourself to minimize the costs.
      One advantage of working on multiple projects is that you'll have a fresh eye after you switch away for some time and then come back to a problem or a paper that you are working on.
    2. To keep resources such as machines and servers busy, automate your programs as much as possible, so that you set up the run(s), keep the machines running, and after some time (during which you can work on other problems), the evaluation and results will be ready for you. People, such as your advisor, are more valuable and harder-to-get resources, so do manage them well, e.g. make sure you know when they are busy and when they are not.
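    To make the automation point concrete, here is a minimal Python sketch of a batch runner: it launches every configuration, collects the results, and keeps going if one run fails. The names run_all and toy_experiment and the parameter "mu" are hypothetical stand-ins; a real experiment function would typically launch your own program or evaluation script.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(run_experiment, configs, max_workers=4):
    """Run every configuration and return {config name: result or exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_experiment, cfg) for name, cfg in configs.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # Record the failure and keep going, so one bad run
                # does not waste the whole batch.
                results[name] = exc
    return results


# Hypothetical stand-in for a real experiment: it just scores a parameter setting.
def toy_experiment(cfg):
    return round(1.0 - abs(cfg["mu"] - 1500) / 10000, 3)


configs = {"mu=500": {"mu": 500}, "mu=1500": {"mu": 1500}, "mu=2500": {"mu": 2500}}
results = run_all(toy_experiment, configs)
for name, value in results.items():
    print(name, value)
```

    Storing exceptions next to results means that when you come back from your other project, you can see at a glance which runs finished and which need another look.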
  • Attitudes & Vision

    Science is all about truths. But doing science or doing anything is all about attitudes and faith.
    Keep a positive attitude toward any paper and any result. Think not about how weak the result is, but about what its merits are and how it can be used elsewhere. This will keep you focused on useful research, and keep you productive.

    When faced with a difficulty, keeping a clear vision or goal in mind will sometimes help you avoid the difficulty, bypassing it totally, because there might be other better and easier paths to the destination.

  • Immerse in related research projects

    When selecting projects to work on, try to focus on related problems as much as possible. Ideas from one problem may inspire new ones for other problems. Thus, focusing on related problems is a good way to keep pushing research forward. Similarly, go to relevant meetings and conferences, and talk to people who are working or have worked on related problems.
  • Teaching

    Good teachers don't simply show the results, they do it from scratch in front of the whole class, i.e. they re-construct the whole research (problem and solutions) and show it to their class. 1) They re-create the scene (the research background) at the time of invention, 2) they try to step by step work through the problem, identify characteristics of the problem, identify keys to solving it and propose possible solutions, and 3) they make it obvious for students to re-discover the textbook solution, or sometimes they find a more elegant solution.

    So teach in the area of your research, and teach fundamental problems and principles. I try my best to do the same. And in my limited teaching experience, I have already benefited a lot from it. Certainly, students appreciate that as well.

    Teaching can be toward students (grads, undergrads), or can be in a conference tutorial. In all cases, the trick is to assume that your audience knows nothing, so you need to make every step explicit, clear and intuitive.

  • Debugging (or solving problems) quickly & effectively

    This is about general problem solving, not just debugging computer programs. First, expect to get stuck in research or any work. Then, to stay efficient, you need to solve the problems quickly and move forward.

    You have probably noticed that the other tricks above try to bypass the roadblocks and hope they will resolve by themselves, while this section deals with the real mess.

    1. Characterize/generalize the problem: When faced with a problem (I'm talking about roadblocks, but I encourage you to apply this procedure to any important problem, no matter how trivial it seems, and you'll find non-trivially good solutions), the first thing to do is to closely inspect the scene (e.g. the "error messages") and try to characterize the problem. You may think you know what the problem is, but if it's an important problem that will matter in the long run, you should spend time getting to know what the problem really is. Here, writing helps. Ask yourself or other people what happened, and what the problem really is. As you try to answer these questions, eliminate irrelevant conditions, and describe the problem as generally as possible. Often, "generally" means concisely. By generalizing it, you may find other people who face the same problem. Real world problems, sometimes even program bugs, have solutions outside the problem itself, usually very good solutions. The more you generalize your problem, the wider the range of situations it applies to, and the more likely you are to find a solution.
      For example, there is an Eclipse plugin for Hadoop, but around Oct. 2009, the plugin was not well maintained and did not work with newer versions of Hadoop and Eclipse. This was one problem I faced. Formulated specifically, it is about fixing a buggy plugin, but more generally, it is about finding an easy-to-use programming tool for Hadoop, which is something all Hadoop users would be interested in.
    2. Identify possible solutions: Solutions come from various sources. An important source is other people who may have already solved the problem. First, try Googling the problem (use your general formulation as well as more specific formulations, because Google hasn't yet gained consciousness and human intelligence). Second, sometimes it's too hard to find relevant information, as in my Eclipse plugin case, where online documentation is, surprisingly, very sparse. Try asking relevant people: people around you, online mailing lists, Yahoo Answers. Depending on how well you generalized your problem, you may quickly get an answer from relevant people, or sometimes, if you are lucky, people may even ask you questions to help generalize the problem and identify a solution.
      For my plugin problem, I asked on the Hadoop users mailing list. Although I asked narrowly about the Hadoop Eclipse plugin, I was lucky: people tried to generalize my problem and provided useful solutions.
    3. Choose a solution: Given a set of possible choices, it is easy to make a decision. In my plugin case, it turned out there is a much nicer Hadoop plugin for NetBeans.
    4. Compromise / Procrastinate: The worst situation is that nobody answers your call. You then have to choose between compromising, by using less optimal hacks to bypass the problem, and delaying it to a later time. For an important problem, you may want to keep it in mind for some days. Maybe after several days, or even the next morning, a more general formulation of the problem pops up in your mind, and you see the solution. But until then, that is exactly when you'll need other projects or tasks to fill in your time slots.

    Similar to debugging, your problem solving skills will improve as you consciously accumulate experience. And you will find better solutions faster. Lastly, to generalize our debugging topic a little, here are some useful pointers to the general problem solving, decision making and conflict resolution literature.

  • Advice from other people

  • Larger topics

    For larger topics like what research topic to choose, how to do research, some good resources are here:

Le Zhao  (To the reader: if you have resources fitting the goal here, I'll be more than glad to hear from you!)
Last Update: 2011-12-21