Wed, 20 Feb 2008
Spreadsheets are useful
I discovered this about eight months ago and then proceeded to forget I have a research weblog: spreadsheets are useful. For a first pass at data analysis, learn about the power of SUMIF and COUNTIF and play. Eventually a spreadsheet becomes more time-consuming to build than a custom data analysis program, but only once you've got your analysis techniques bedded down for a particular experiment.
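For the point where a custom program does become the cheaper option, here is a minimal sketch of what SUMIF and COUNTIF do, written in Python. The helper names `countif` and `sumif`, the condition labels, and the scores are all hypothetical and made up purely for illustration; they are not from any real experiment.

```python
from typing import Callable, Iterable


def countif(values: Iterable, predicate: Callable[[object], bool]) -> int:
    """Count the values that satisfy the predicate (like COUNTIF)."""
    return sum(1 for v in values if predicate(v))


def sumif(criteria_values: Iterable, predicate: Callable[[object], bool],
          sum_values: Iterable[float]) -> float:
    """Sum entries of sum_values whose matching criteria value
    satisfies the predicate (like three-argument SUMIF)."""
    return sum(s for c, s in zip(criteria_values, sum_values) if predicate(c))


# Hypothetical data: one row per trial, made up for illustration.
conditions = ["control", "treated", "control", "treated", "treated"]
scores = [3, 5, 4, 6, 7]


def is_treated(condition: object) -> bool:
    return condition == "treated"


n_treated = countif(conditions, is_treated)
total_treated = sumif(conditions, is_treated, scores)
mean_treated = total_treated / n_treated
print(n_treated, total_treated, mean_treated)
```

Dividing a SUMIF by the matching COUNTIF gives a per-condition mean, which is often all a first pass needs.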
posted at: 12:26 | path: /tools | permanent link to this entry
Mon, 18 Dec 2006
The very first thing people tell me when they hear that I don't have results
from my experiments is that I should publish my lack of results. Negative
results are interesting!
The trouble is that that's not actually true. I suspect I keep getting told
it because people would like to believe that it's true (I hear it most from
other PhD students). And sometimes, if you're lucky with your idea and your
experimental design, the null hypothesis is somewhat interesting. I've even
known people who had a null hypothesis that was more interesting than the
hypothesis they were testing (hypothetical example: huh, it turns out that
massive doses of radiation actually don't hurt lizards at all, I for one
welcome our new lizard overlords).
But a lot of the time, the null hypothesis is actually very dull: you had a vague hunch based on a conversation with your supervisor at the pub that X and Y were correlated. They aren't. Damn.
I have taken to referring to this as the 'muffin effect', as in "huh, what
do you know, putting a blueberry muffin near the control panel has
absolutely no effect whatsoever on the behaviour of the particles in
the accelerator, hand over that LaTeX, I have to publish right now!"
That is, quite a few null hypotheses are really quite dull and, let's face
it, not publishable. Enterprising PhD students should seek out those
golden-goose experiments that are publishable either way.
posted at: 22:40 | path: /travails | permanent link to this entry
Wed, 14 Jun 2006
Statistics in language studies
At various points in my research I am going to be conducting experiments on
both human and computer competency at various linguistic tasks. Now, I have an
undergraduate mathematics major of indifferent competence. The main thing I can
say about it is that it gives some assurance that I can pick up mathematical
techniques to an undergraduate standard with the help of a sufficiently good
textbook; it doesn't say much about those techniques being stored ready-made in
my mind. And even given that, I don't have a statistics background beyond very
elementary high school level probability (the probability of picking a
particular k objects from n total objects given equal chance
of any particular object being chosen, either ordered or unordered) thanks to
foolishly deciding that pure mathematics subjects were more worthy.
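For concreteness, that high school calculation can be sketched in Python. This assumes drawing without replacement, with every outcome equally likely; the function name `prob_particular_selection` is my own, chosen for illustration.

```python
from fractions import Fraction
from math import comb, perm  # both available from Python 3.8


def prob_particular_selection(n: int, k: int, ordered: bool = False) -> Fraction:
    """Probability that a uniformly random draw of k objects from n
    (without replacement) is exactly one particular selection."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    # Unordered: C(n, k) equally likely subsets.
    # Ordered: n! / (n - k)! equally likely sequences.
    outcomes = perm(n, k) if ordered else comb(n, k)
    return Fraction(1, outcomes)


print(prob_particular_selection(5, 2))                # 1/10
print(prob_particular_selection(5, 2, ordered=True))  # 1/20
```

The ordered probability is smaller by a factor of k!, since each unordered selection can be arranged in k! different orders.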
So, in order to have a reasonable grasp of the basics of experimental design I
decided to have a look around for books. This is a difficult search:
statistical textbooks range widely in readability, audience, quality and
difficulty. At the moment I've settled on:
Woods, Anthony, Paul Fletcher & Arthur Hughes. 1986, Statistics in Language Studies, Cambridge University Press, Cambridge, United Kingdom.
Aside from dating from a time prior (probably just prior) to the assumption that computers knew everything and that one would never need tables of the normal distribution ever again, I've found it quite useful in elucidating the basic terms of statistics and sampling from an experimenter's point of view. I suspect I will want to read at least one follow-up about choosing samples from human populations, as the authors tend to recommend leaving this to a survey statistician and I'm curious. But it's gotten me the first step up the ladder.
posted at: 15:29 | path: /reading | permanent link to this entry
Wed, 07 Jun 2006
Abstract of the week: Turney 2002
I'm in the early stages of my literature review at the moment: the stage where
the mountain of papers left to read seems to be growing exponentially, and they
all blur into one another a bit. (Or at least, at the moment I'm hoping that
this is a common stage, and not just me.) In this moment, a light in the
darkness:
Turney, Peter D. 2002, Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews, in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL'02), Philadelphia, Pennsylvania, USA. July 8-10, 2002. pp. 417-424.
Specifically, go and look at the abstract of this paper? Isn't that delightful? It's a good, clear summary of the paper. After a week in the darkness, I'm happy to see that today.
posted at: 11:26 | path: /reading | permanent link to this entry
Mon, 08 May 2006
This weblog
Welcome to my research weblog. I'm not entirely sure what will go here while
I'm still in the reading part of my PhD. I tend to at least try and write for
an audience rather than myself, so I won't be making my reading notes, todo
list or word count public.
In the short term, this will probably contain mainly software-related posts
and interesting notes on papers.
posted at: 17:30 | path: /about | permanent link to this entry
