Pawel Mazur > Resources > Links



My Collection of Links to Online Resources
(constantly under construction and development)

Table of Contents
Associations and Organizations
Language Technology Companies
Conferences and Proceedings
Journals
Theses
Blogs
Software Architectures and Collections of Processing Components
Information Extraction, Text Mining, Text Analytics
Named Entity Recognition
Temporal Information Extraction
Morphological Analysers
POS Taggers
Syntactic Parsers
Treebanks
Machine Learning
Other
Other Collections of Resources
Prolog Interpreters
Programming in Java
Programming (other)
Mac OS X
Web
Graphics

Associations and Organizations

ACL The Association for Computational Linguistics
EACL The European Chapter of the ACL
NAACL The North American Chapter of the ACL
FoLLI The Association of Logic, Language and Information
ELSNET European Network of Excellence in Human Language Technologies
ICCL International Committee on Computational Linguistics
ACM SIGIR ACM Special Interest Group on Information Retrieval
ELRA European Language Resources Association
AAAI American Association for Artificial Intelligence
ALTA Australasian Language Technology Association
MITRE The MITRE Corporation

Language Technology Companies

MorphoLogic Hungarian company established in 1991.
Textkernel Dutch company focused on text mining and information extraction.
Basis Technology US company working on multilingual text processing.
Teragram US company, owned by SAS, developing multilingual natural language processing technologies.
Appen Australian company developing high-quality speech and language technology and applications.

Conferences and Proceedings

EACL calendar Forthcoming conferences in the field of computational linguistics and natural language processing.
ACL Anthology Large collection of proceedings at ACL.
LIDOS Literature Information and Documentation System.
ACM DL ACM Digital Library.
ACL Current and past conferences organized or co-organized by ACL.
CoNLL Conference on Computational Natural Language Learning
LREC International Conference on Language Resources and Evaluation
CoLing International Conference on Computational Linguistics
CICLing Conference on Intelligent Text Processing and Computational Linguistics
TREC Text REtrieval Conference
ALTA proceedings 2003, 2004, 2005, 2006
IJCNLP International Joint Conference on Natural Language Processing
IWCS International Workshop on Computational Semantics
ICoS International Workshop on Inference in Computational Semantics
CLC Computational Linguistics Conferences: a nice website summarising submission dates, notification dates, conference dates and conference locations.
The Linguist List Current Conferences by Linguistic Sub-field.
NLP Conf. A list of conferences on natural language processing maintained by Naoki Yoshinaga.
WikiCFP Very neat and useful.
IJCAI International Joint Conferences on Artificial Intelligence
AJCAI Australian Joint Conference on Artificial Intelligence
MICAI Mexican International Conference on Artificial Intelligence
TIME International Symposium on Temporal Representation and Reasoning. See also the archive at the ACM Portal.

Journals

CL Computational Linguistics
JRPIT Journal of Research and Practice in Information Technology
NLE Journal of Natural Language Engineering
IJL International Journal of Lexicography
ML Machine Learning
CSL Computer Speech and Language
CSc Journal of Cognitive Science
DP Discourse Processes Journal
CI Computational Intelligence
IEEE IS IEEE Intelligent Systems
J. of Web Sem. Journal of Web Semantics
J. of LLI Journal of Logic, Language and Information
Int. J. of IS International Journal of Intelligent Systems
Int. J. of HC Stud International Journal of Human-Computer Studies
IntCom Interacting with Computers

Theses

ALTA theses Theses written by Australasian Language Technology Association (ALTA) Members

Blogs on Language, Computational Linguistics and Natural Language Processing

Answer Me! A weblog about commercial question-answering technologies.
NLPers An NLP- and CL-related blog by Hal Daume III.
LingPipe Blog Alias-i's blog on Natural Language Processing and Text Analytics.
Alex's Outer Thoughts Alexandre Rafalovitch's blog on computational linguistics.
Data Mining: NLP Matthew Hurst's blog on text mining, visualization and social media.
Language Log A blog maintained by Mark Liberman; most of the posts are on language use in the media and popular culture.


Software Architectures and Collections of Processing Components

UIMA Unstructured Information Management Architecture, Java, open source (by IBM).
GATE General Architecture for Text Engineering, Java, open source (by University of Sheffield).
INTEX A linguistic development environment that includes large-coverage dictionaries and grammars (by Max Silberztein).
SProUT A general-purpose framework integrating finite-state and unification-based grammar formalisms (by DFKI, Germany).
Stanford Tools A collection of Java tools developed at the Stanford NLP Group: parser, POS tagger, NER, Chinese word segmenter and others.
OpenNLP A variety of Java-based tools which perform sentence detection, tokenization, POS tagging, chunking and parsing, named-entity detection, and coreference using the OpenNLP Maxent machine learning package.
NLTK A suite of open source Python modules, data sets and tutorials supporting research and development in natural language processing.

Information Extraction, Text Mining, Text Analytics

MUC-6 The 6th Message Understanding Conference.
MUC-7 The 7th Message Understanding Conference.
ACE Automatic Content Extraction (ACE) program run by NIST.
Text Analytics Wiki Wiki website about Text Analytics.
TextMine A collection of Perl tools useful in text mining tasks.
Megaputer One of the companies working on text mining.

Named Entity Recognition

Stanford NER Command line, GUI and Java API available. Recognizes person, organisation and location names.
LingPipe Alias-i's NER tool. See demo.
IdentiFinder Text Suite An NER tool from BBN Technologies.
C&C NER An NER tool that is part of the C&C Tools, developed by James Curran and Stephen Clark.
NET Named Entity Tagging by Kadri Hacioglu. See demo.
Cognitive Computation Group See their demo.
Natural Language Synergy Lab Named Entity Recognition in Biomedical Domain. See demo.
Spraakdata Named Entity Recognition (NE) for Swedish, without the use of gazetteers. See demo.
MUC-6 The 6th Message Understanding Conference.
MUC-7 The 7th Message Understanding Conference.
CoNLL-2002 Conference on Computational Natural Language Learning.
CoNLL-2003 Conference on Computational Natural Language Learning.
TIMEX2 Temporal information annotation standard.
TimeML Temporal information and event annotation standard.
NewsTracker NewsTracker allows you to navigate the news content by person, place, organization, company and a range of other elements.
Six Degrees Six Degrees finds the networks that link people, companies and other items together.
ClearForest Gnosis Firefox plugin for finding named entities in browsed webpages.
ClearForest SWS Mashup ClearForest SWS Mashup Competition Winners.
JavaRAP JavaRAP is an implementation of the classic Resolution of Anaphora Procedure (RAP) given by Lappin and Leass (1994). It resolves third-person pronouns and lexical anaphors, and identifies pleonastic pronouns.

Temporal Information Extraction

MUC-6 The 6th Message Understanding Conference.
MUC-7 The 7th Message Understanding Conference.
ACE Automatic Content Extraction (ACE) program run by NIST.
TIMEX2 Temporal information annotation standard.
TimeML Temporal information annotation standard.
TimexPortal A portal gathering the community interested in processing temporal expressions.

Morphological Analysers

XTAG XTAG morphology database with Berkeley DB interface and X11 maintenance tool. Also various XTAG tools for parsing, grammar development and others.

POS Taggers

TnT Statistical Part-of-Speech Tagging.

Syntactic Parsers

Minipar Shallow function dependency grammar parser. See demo.
Conexor Shallow function dependency grammar parser. See demo.
SS A fast CFG parser with chunk parsing developed at the University of Tokyo.
Charniak Statistical parser. See demo.
Collins Statistical parser.
Bikel Statistical parser, implemented in Java. See demo.
RASP Robust Accurate Statistical Parsing (Linux only).
CCG Statistical parsers (C&C and StatCCG) using Combinatory Categorial Grammar (CCG) formalism. See demo of C&C.
The Stanford Parser Java implementations of probabilistic natural language parsers, both highly optimized PCFG and dependency parsers, and a lexicalized PCFG parser. See demo.
MSTParser Dependency parser.
MaltParser Dependency parser, implemented in Java.
Apple Pie Parser A bottom-up probabilistic chart parser.
Link Parser A parser based on a link grammar.
Chunk Parser A shallow parser developed at the Cognitive Computation Group at the University of Illinois. See demo.
Penn2Malt A tool which converts phrase-structure parses to a dependency model.

Treebanks

Penn Treebank The Penn Treebank Project.
TüBa-D/Z The Tübingen Treebank of Written German.
Prague Czech-English Prague Czech-English Dependency Treebank 1.0.

Machine Learning

WEKA "A collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes. (Java)"
MaxEnt The OpenNLP Maximum Entropy Package (Java).
SVM Light An implementation of Support Vector Machines (SVMs) in C.
LIBSVM A library for Support Vector Machines with source code in C++ and Java and many interfaces.
C4.5 C4.5 Release 8.
OpenHTMM The Hidden Topic Markov Model.

Other

NSP Ngram Statistics Package (by Ted Pedersen and team).

Other Collections of Resources for NLP and CL

ACLWiki Wikipage owned by the Association for Computational Linguistics (ACL).
DFKI Registry Software registry maintained by DFKI.
Stanford Links Statistical natural language processing and corpus-based computational linguistics: an annotated list of resources.
CL-Demos A collection of links to online demos of CL tools, especially for German, English and French.


Prolog Interpreters

PrologCafe Implemented in Java: system independent, but not particularly fast. Open source.
SWI Prolog C++ implementation, portable to many platforms, including almost all Unix/Linux platforms, Windows (95/98/ME and NT/2000/XP), MacOS X (using X11 for graphics) and many more. Both 32-bit and 64-bit hardware are supported. Open source.
SICStus Commercial product.

Programming (Java)

Eclipse Open-source and free Java editor and much more.
NetBeans Open-source and free Java editor and much more.
Code conventions Code conventions for the Java programming language recommended by Sun.
Jakarta Commons Collection of reusable Java components.
Apache Logging Resources to facilitate logging functionality.
JETM Java Execution Time Measurement Library.
JGraph Java graph visualization and layout library.
SoftwareMetrics An Eclipse plugin for calculating various sorts of software metrics.
PDF A list of libraries for managing PDF at the Java code level.
XML Good online book on processing XML with Java.

Programming (other than Prolog and Java)

ActivePerl Ready-to-install distribution of Perl, available for AIX, HP-UX, Linux, Mac OS X, Solaris, and Windows (free download).
Perl Docs Perl's Documentation.
Tizag A Perl tutorial.
Perldoc Documentation of Perl.
Xah Lee Xah Lee's Perl and Python Tutorial.
Python Python is a dynamic object-oriented programming language; it offers strong support for integration with other languages and tools.
Lisp Lisp is an expression-oriented language, already with a loooong history.
XSLT FAQ XSLT Frequently Asked Questions.


Mac OS X

Apple Docs Apple support documentation.
Macworld The Mac experts
Mac OS X Hints Lots of hints for Mac OS X.
Many Tricks Tricks and utilities.
Mac Slash A discussion forum for Mac users.

Web

Firebug Very useful add-on for Firefox for web developers.

Graphics

Icons A large collection of free icons published by Axialis Software.




If you have any comments regarding the resources, or wish some information to be added here, please do not hesitate to contact me.


Last Modified: 25th November 2008