Quex is a tool for generating lexical analyzers.
A lexical analyzer is a program that transforms a stream of characters into a stream of 'atomic chunks of meaning'.
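To make the idea of 'atomic chunks of meaning' concrete, here is a minimal hand-written sketch in Python. It is not Quex output and does not use Quex syntax; the token names, the patterns and the `tokenize` function are assumptions chosen purely for illustration.

```python
import re

# Hypothetical token names and patterns, chosen only for illustration.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[^\W\d]\w*"),   # Unicode letters are allowed in str patterns
    ("OPERATOR", r"[+\-*/=≠¬]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Yield (token_name, lexeme) pairs for the given character stream."""
    position = 0
    while position < len(text):
        match = MASTER.match(text, position)
        if match is None:
            raise ValueError(f"unexpected character {text[position]!r} at index {position}")
        if match.lastgroup != "SKIP":   # whitespace separates tokens but is not one
            yield match.lastgroup, match.group()
        position = match.end()


if __name__ == "__main__":
    for token in tokenize("größe ≠ 42"):
        print(token)
```

The parser that consumes these pairs no longer has to deal with individual characters or whitespace.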
Quex's main goal is to open the door to high-speed interpreted languages beyond the bounds of traditional ASCII character sets. It should be possible to build language constructs that include classical math symbols such as '≠' and '¬', as well as identifiers made up of characters from all scripts of the world.
The sophisticated analyzer modes are further intended to facilitate the implementation of languages with reduced redundancy.
That is, some tokens can be derived from the current state of the analyzer or from its state transitions and need not be triggered by source code elements. This helps to reduce the visual noise of a programming language.
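Indentation-based block structure is a familiar case of such state-derived tokens: the block delimiters never appear in the source text. The following Python sketch illustrates the principle only; it does not show how Quex implements it. The function name, the token names and the spaces-only indentation rule are assumptions.

```python
def indentation_tokens(lines):
    """Yield INDENT, DEDENT and LINE tokens for an iterable of source lines.

    INDENT and DEDENT are derived from the analyzer's state (a stack of
    indentation widths), not from any character in the input.
    Assumption: indentation consists of spaces only.
    """
    stack = [0]  # indentation widths of the currently open blocks
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            yield ("INDENT", None)
        while width < stack[-1]:
            stack.pop()
            yield ("DEDENT", None)
        if width != stack[-1]:
            raise ValueError(f"inconsistent indentation: {line!r}")
        yield ("LINE", stripped)
    while len(stack) > 1:  # close blocks that are still open at end of input
        stack.pop()
        yield ("DEDENT", None)


if __name__ == "__main__":
    for token in indentation_tokens(["if x", "    y", "z"]):
        print(token)
```

A language built this way needs no explicit block delimiters such as braces or 'begin'/'end' keywords.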
Here are some key features of "Quex":
· Produces directly coded lexical analyzers rather than table-based engines.
· Sophisticated lexical analyzer modes which allow mode inheritance and mode transitions.
· Sophisticated buffer management, including free tell/seek based on character indices, even with encodings in which characters have variable size (e.g. UTF-8, UTF-16).
· Support for a large variety of international character encodings relying on established conversion libraries (IBM's ICU or GNU's IConv).
· Support for include stacks.
· Inherent token handling (queue or single token). Support for customized token types.
· Event handlers allow actions to be triggered by mode transitions, indentation events and other analysis-related events.
· Many examples demonstrating its usage are provided along with the software.
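The character-index based tell/seek mentioned in the list above matters because, in UTF-8, the n-th character does not sit at the n-th byte. The Python sketch below shows one simple way to map character indices to byte offsets. It is not Quex's buffer implementation; the class `CharIndexedBuffer` and its methods are hypothetical, and the input is assumed to be valid UTF-8.

```python
class CharIndexedBuffer:
    """Hypothetical helper: tell/seek by character index over UTF-8 bytes."""

    def __init__(self, data: bytes):
        self._data = data
        # Byte offset of every character start. UTF-8 continuation bytes
        # have the bit pattern 10xxxxxx, so any other byte begins a character.
        self._starts = [i for i, byte in enumerate(data) if (byte & 0xC0) != 0x80]
        self._char_index = 0

    def tell(self) -> int:
        """Return the current position as a character index."""
        return self._char_index

    def seek(self, char_index: int) -> None:
        """Move to the given character index (0 up to the character count)."""
        if not 0 <= char_index <= len(self._starts):
            raise IndexError(f"character index {char_index} out of range")
        self._char_index = char_index

    def read(self, count: int = 1) -> str:
        """Read up to `count` characters and advance the position."""
        end_index = min(self._char_index + count, len(self._starts))
        begin = self._byte_offset(self._char_index)
        end = self._byte_offset(end_index)
        self._char_index = end_index
        return self._data[begin:end].decode("utf-8")

    def _byte_offset(self, char_index: int) -> int:
        if char_index < len(self._starts):
            return self._starts[char_index]
        return len(self._data)  # one past the last character


if __name__ == "__main__":
    buffer = CharIndexedBuffer("a≠b¬c".encode("utf-8"))
    buffer.seek(3)
    print(buffer.read(), buffer.tell())
```

This sketch trades memory for simplicity by storing one offset per character; a production buffer would more likely index only selected positions or convert to a fixed-width encoding.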
Requirements:
· Python