puzzling.org · mary.gardiner.id.au · Macquarie University
Mary Gardiner · Weblog

Thu, 06 Nov 2008

Computational linguistics blogs
I write here very seldom, but I am at least trying to expand my web reading in CL. Blogs I am reading include those of Peter Turney, Hal Daumé III and Jason M. Adams. Here's a note of interest from Daumé, Using machine learning to answer scientific questions:

There are several issues with trying to answer such questions [of whether or not a particular feature is useful for a task]. One issue is that typically what you're actually looking at is the question: can I figure out a way to turn WordNet into a feature vector that is useful for the task of coreference resolution? This is probably a partial explanation for why our community seems not to like negative results to such questions: maybe you just weren't sufficiently clever in encoding WordNet as features. Or maybe WordNet features are useful but there is some other set of features that's more useful that's just swamping the benefits of WordNet.

My first question is whether we're even going about this the right way. The usual approach is to take some baseline system, add WordNet features, and see if predictive performance goes up (i.e., performance on test data). This seems like a bit of a round-about way to attack the problem. After all, this problem of "does some feature have an influence on the target concept" is a classical and very well studied area of statistics: most people have probably seen ANOVA (analysis of variance), but there are many many more ways to try to address this question. And, importantly, they don't hinge on the notions of predictive performance. (Which almost immediately ties us in to "my system is better than yours.")
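
To make the quoted point concrete, here is a minimal sketch of a one-way ANOVA F statistic in plain Python. It is an illustration only, not something from the quoted post: the function name one_way_anova_f and the scores are invented, and the set-up (a numeric outcome grouped by the value of one categorical feature) is an assumption about how the question might be posed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    k = len(groups)                      # number of groups (feature values)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Invented scores, one list per value of a hypothetical categorical feature.
scores_by_feature_value = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_statistic = one_way_anova_f(scores_by_feature_value)
```

A large F suggests the group means differ by more than the within-group noise would explain. Turning it into a p-value needs the F distribution with (k - 1, n - k) degrees of freedom, which this sketch does not compute.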



posted at: 20:37 | path: /reading | permanent link to this entry

Wed, 17 Sep 2008

Arguments in favour of published code in computer science/computational linguistics

The September 2008 issue of Computational Linguistics (volume 34:3) has a Last Words column by Ted Pedersen, Empiricism is Not a Matter of Faith. Pedersen has already made a PDF version available. In this article, Pedersen argues that computational linguistics is suffering because there is no tradition of publishing source code with research results.

In an email discussion about this article today, I was arguing in favour of the same idea (which you can take with a grain of salt as I have not published my source code to date). The opposing argument, by the way, was that everything important for a reimplementation should be described in written form in the paper, and if it isn't, then the paper is rubbish.

My arguments:

There are various cases where you want to reproduce something in computing research:

  1. you want to do a very similar task to published method X, but you have an approach Y that you think will work better. This is probably the case where source code is least useful, because you will be spending a lot of time implementing Y with or without the source code to X. However, it is sometimes useful:
    1. you want to contrast your method Y with published method X, but for example you want to do so on a slightly different dataset for some reason, perhaps because the task is slightly different and you want to see if X still works on the slightly different task
    2. your method Y includes some of method X... but the results you are getting aren't nearly as good as claimed in the paper about X and therefore you can't test the things you had hoped to about Y because you can't get a comparable baseline/input from X
    3. your method Y is heaps heaps better than X... and you'd like to prove it definitively by reproducing X's claimed numbers and running Y on the same data... but you can't.
    In 1.1, reimplementation is possible (in fact, I've spent most of my time on this) but source code would be a lot faster. And more results for the same effort is good for science. In 1.2 and 1.3, reimplementation is also slow, and it is VERY common to run into the problem of not being able to reproduce the numbers.
  2. you want to use result X as a stepping stone to result Y, e.g. you want to use the best possible stemmer to clean up things for your fancy new IR engine. In this case, access to the source code of X is very useful because it may save you weeks or months of debugging your own version of X when you wanted to be working on Y all along. (A working compiled version would do in most cases, I guess, but then if you want to adapt any aspect you must reimplement.) You also run the risk described in 1 (and by Pedersen) of your implementation of X not working nearly as well as claimed because X is more fragile than the authors realised, and of having spent a lot of time on X when X isn't actually your research problem.

Even if you don't need much of the code for X because Y is sufficiently different, for the case of reproducing results think of the code as backup. Sometimes, if you can't reproduce a result and not much time has passed, you get lucky: you can mail the authors and they send you the source code or help you out.

However, if time has passed and/or the authors have sloppy data preservation, have quit research, wiped their hard drives, got into drugs, died, or were frauds in the first place, and you realise you can't reproduce from the written description (even though arguably this makes it a bad paper), at the moment in CL you are out of luck.

Minimising being out of luck is a good thing for science.



posted at: 17:26 | path: /research-practice | permanent link to this entry

Wed, 20 Feb 2008

Spreadsheets are useful
I discovered this about eight months ago and then proceeded to forget I have a research weblog: spreadsheets are useful. For a first pass at data analysis, learn about the power of SUMIF and COUNTIF and play. Eventually they become more time-consuming to produce than coding a custom data analysis program, but only once you've got your analysis techniques bedded down for a particular experiment.
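
For comparison, here is a rough sketch of what SUMIF and COUNTIF amount to once you do move to code. The rows, labels and function names are made up for illustration; they simply mimic the spreadsheet functions over a list of (label, value) pairs.

```python
# Hypothetical experiment rows: (category label, numeric value).
rows = [
    ("noun", 3),
    ("verb", 5),
    ("noun", 4),
    ("adjective", 1),
]


def countif(rows, label):
    """Count the rows whose label matches, like COUNTIF over the label column."""
    return sum(1 for row_label, _ in rows if row_label == label)


def sumif(rows, label):
    """Sum the values of rows whose label matches, like SUMIF."""
    return sum(value for row_label, value in rows if row_label == label)


noun_count = countif(rows, "noun")
noun_total = sumif(rows, "noun")
```

The spreadsheet gets you the same numbers with no code at all, which is why it suits the first pass; a script like this only pays off once the analysis has stopped changing.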

posted at: 12:26 | path: /tools | permanent link to this entry

Mon, 18 Dec 2006

Publishing negative results

The very first thing people tell me when they hear that I don't have results from my experiments is that I should publish my lack of results. Negative results are interesting!

The trouble is that that's not actually true. I suspect I keep getting told it because people would like to believe that it's true (I hear it most from other PhD students). And sometimes, if you're lucky with your idea and your experimental design, the null hypothesis is somewhat interesting. I've even known people who had a null hypothesis that was more interesting than the hypothesis they were testing (hypothetical example: huh, it turns out that massive doses of radiation actually don't hurt lizards at all, I for one welcome our new lizard overlords).

But a lot of the time, the null hypothesis is actually very dull: you had a vague hunch based on a conversation with your supervisor at the pub that X and Y were correlated. They aren't. Damn.

I have taken to referring to this as the 'muffin effect', as in huh, what do you know, putting a blueberry muffin near the control panel has absolutely no effect whatsoever on the behaviour of the particles in the accelerator, hand over that LaTeX, I have to publish right now! That is, quite a few null hypotheses are really quite dull, and let's face it, not publishable. Enterprising PhD students should seek out these golden geese experiments that are publishable either way.



posted at: 22:40 | path: /travails | permanent link to this entry

Wed, 14 Jun 2006

Statistics in language studies
At various points in my research I am going to be conducting experiments on both human and computer competency at various linguistic tasks. Now, I have an undergraduate mathematics major of indifferent competence. The main thing I can say about it is that it gives some assurance that I can pick up mathematical techniques to an undergraduate standard with the help of a sufficiently good textbook; it doesn't say much about those techniques being stored ready-made in my mind. And even so, I don't have a statistics background beyond very elementary high school level probability (the probability of picking a particular k objects from n total objects given equal chance of any particular object being chosen, either ordered or unordered), thanks to foolishly deciding that pure mathematics subjects were more worthy. So, in order to have a reasonable grasp of the basics of experimental design I decided to have a look around for books. This is a difficult search: statistics textbooks range widely in readability, audience, quality and difficulty. At the moment I've settled on:

Woods, Anthony, Paul Fletcher & Arthur Hughes. 1986, Statistics in Language Studies, Cambridge University Press, Cambridge, United Kingdom.
Aside from dating from a time prior (probably just prior) to the assumption that computers knew everything and that one would never need tables of the normal distribution ever again, I've found it quite useful in elucidating the basic terms of statistics and sampling from an experimenter's point of view. I suspect I will want to read at least one follow-up about choosing samples from human populations, as the authors tend to recommend leaving this to a survey statistician and I'm curious. But it's gotten me the first step up the ladder.

posted at: 15:29 | path: /reading | permanent link to this entry

Wed, 07 Jun 2006

Abstract of the week: Turney 2002
I'm in the early stages of my literature review at the moment: the stage where the mountain of papers left to read seems to be growing exponentially, and they all blur into one another a bit. (Or at least, at the moment I'm hoping that this is a common stage, and not just me.) In this moment, a light in the darkness:

Turney, Peter D. 2002, Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL'02), Philadelphia, Pennsylvania, USA. July 8-10, 2002. pp. 417–424.
Specifically, go and look at the abstract of this paper. Isn't that delightful? It's a good, clear summary of the paper. After a week in the darkness, I'm happy to see that today.

posted at: 11:26 | path: /reading | permanent link to this entry

Mon, 08 May 2006

This weblog
Welcome to my research weblog. I'm not entirely sure what will go here while I'm still in the reading part of my PhD. I tend to at least try and write for an audience rather than myself, so I won't be making my reading notes, todo list or word count public. This will probably contain mainly software related posts and interesting notes on papers in the short term.

posted at: 17:30 | path: /about | permanent link to this entry
