Categories and Ontologies

Most people planning to develop linked open data for development and crises, or to mine social media for crisis information, will need ontologies and stopwords.

Ontologies are lists of words that are relevant to the subject of interest. Although there is a lot of work on development data standards, there is much less on development data ontologies:

  • The FAO is active, with its geopolitical ontology and plans for work on agricultural ontologies. In its own words: “‘Agricultural Ontology Service Initiative’ with the Webportal on Agricultural Information and Knowledge Management Standards (http://www.fao.org/aims). Overarching goal of our initiative is to create agreed information exchange standards that make it possible to have common services on heterogeneous information repositories. – A good example of this is the collaboration between the AGRIS and GFIS (Global Forest..) Information Services. Both networks use the same Exchange Standard (AGRIS-AP) to describe publications. In this way GFIS can use AGRIS data and vice versa. – Another interesting story is AGROVOC, the multilingual thesaurus of FAO that now links semantically multiple information systems through common concepts, expressed in different languages. AGROVOC is now taken up also by systems not working strictly on science and technology. In India they have created a Hindi version of the thesaurus that makes it possible for extensionists to search for English material in Hindi. You will find more information at the AIMS website.”
  • Where no existing ontology covers your subject, the short answer may be to write your own. The Stanford University publication Ontology Development 101 is very useful for this.
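
As a rough illustration of what "writing your own" can look like at its very simplest, here is a minimal Python sketch that treats an ontology as the word lists described above, grouped by category, and uses them to tag a message. The categories, the terms, and the names CRISIS_TERMS and categorize are all made up for this example; a real ontology would also capture relationships between concepts, which this sketch ignores.

```python
import re

# Hypothetical hand-written mini-ontology: category -> relevant terms.
# Both the categories and the terms are illustrative assumptions only.
CRISIS_TERMS = {
    "water": {"water", "well", "borehole", "cholera"},
    "shelter": {"shelter", "tent", "camp"},
    "food": {"food", "rice", "maize", "hunger"},
}


def categorize(text, ontology=CRISIS_TERMS):
    """Return the sorted categories whose terms appear in the text."""
    # Lowercase the text and split it into a set of word tokens.
    words = set(re.findall(r"\w+", text.lower()))
    # A category matches if it shares at least one term with the text.
    return sorted(cat for cat, terms in ontology.items() if words & terms)


print(categorize("No clean water at the camp"))
```

Matching on exact words is crude (it misses plurals, misspellings and other languages), which is one reason shared, multilingual resources such as the FAO thesaurus mentioned above are attractive.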

Stopwords are lists of words that are common in text (a, and, the, etc.) but aren’t very useful in summaries and searches. Because of the rise in online search, there are quite a few stopword lists available in English and several in about 20 common languages, but far fewer for the other 300+ languages available online.
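
To make the idea concrete, here is a small Python sketch of stopword filtering. The tiny STOPWORDS set and the function name content_words are assumptions for illustration only; in practice you would load a published list for the language you are working in.

```python
import re

# Tiny illustrative English stopword list (an assumption for this sketch;
# published lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "are", "at"}


def content_words(text, stopwords=STOPWORDS):
    """Return the words of the text, in order, with stopwords removed."""
    # \w+ matches runs of letters and digits, including non-ASCII letters.
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stopwords]


print(content_words("The bridge to the market is flooded"))
```

The same function works for another language if you pass a different set as the stopwords argument, provided that language separates words in a way the simple tokenizer can handle.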
