Modules

Modules are like plugins to the system, usually providing additional functionality at some cost - needs additional dependencies, supports only specific language etc. That's why they are not included into the core system, but can be easily included into your rules.

eg.

!IMPORT("rita.modules.fuzzy")

FUZZY("squirrel") -> MARK("CRITTER")

NOTE: the import path can be any proper Python import. So this actually allows you to add extra functionality by not modifying RITA's source code. More on that in Extending section

Fuzzy

This is more as an example rather than proper module. The main goal is to generate possible misspelled variants of given word, so that match matches more cases. Very useful when dealing with actual natural language, eg. comments, social media posts. Word you can be automatically matched by proper you and u, for as for and 4 etc.

Usage:

!IMPORT("rita.modules.fuzzy")

FUZZY("squirrel") -> MARK("CRITTER")

Pluralize

Takes list (or single) words, and creates plural version of each of these.

Requires: inflect library (pip install inflect) before using. Works only on english words.

Usage:

!IMPORT("rita.modules.pluralize")

vehicles={"car", "motorbike", "bicycle", "ship", "plane"}
{NUM, PLURALIZE(vehicles)}->MARK("VEHICLES")

Tag

This module offers two new macros: TAG and TAG_WORD.

TAG is used for generating POS/TAG patterns based on a Regex e.g. TAG("^NN|^JJ") for nouns or adjectives.

Works only with spaCy engine

Usage:

!IMPORT("rita.modules.tag")

{WORD*, TAG("^NN|^JJ")}->MARK("TAGGED_MATCH")

TAG_WORD is for generating TAG patterns with a word or a list.

e.g. match only "proposed" when it is in the sentence a verb (and not an adjective):

!IMPORT("rita.modules.tag")

TAG_WORD("^VB", "proposed")

or e.g. match a list of words only to verbs

!IMPORT("rita.modules.tag")

words = {"percived", "proposed"}
{TAG_WORD("^VB", words)?}->MARK("LABEL")

Orth

Ignores case-insensitive configuration and checks words as written that means case-sensitive even if configuration is case-insensitive. Especially useful for acronyms and proper names.

Works only with spaCy engine

Usage:

!IMPORT("rita.modules.orth")

{ORTH("IEEE")}->MARK("TAGGED_MATCH")

Regex

Matches words based on a Regex pattern e.g. all words that start with an 'a' would be REGEX("^a")

!IMPORT("rita.modules.regex")

{REGEX("^a")}->MARK("TAGGED_MATCH")

Names

Takes list of full person names (First + Last, or First Middle Last) and generates shortened variations, eg. F. Last, First M. Last, F. M. Last etc.

!IMPORT("rita.modules.names")

names = {"Roy Jones junior", "Roy Jones senior", "Juan-Claude van Damme", "Jon Jones"}
NAMES(names)->MARK("NAME_MATCH")

Useful when matching against fixed set of names