System for Processing Formation Patterns and Restrictions (PPR)
PPR (version 19.2) is a sample implementation in SetlX of the Pattern-and-Restriction Theory of word formation (PR) (Nolda 2012 a, 2018 a). It currently provides selected word-formation patterns and a very limited lexicon for spoken and written German systems. PPR is primarily intended as a testbed in which grammar writers can check the soundness of their theoretical and empirical hypotheses. It is by no means a production-scale system.
DWDSmor
DWDSmor is a toolbox for creating and applying a set of finite-state automata for morphological analysis and generation in written German. The automata are compiled from an SMOR-style grammar in SFST format and a lexicon which is derived at build time from XML sources of the online dictionary “Digitales Wörterbuch der deutschen Sprache” (DWDS). The compiled automata can be called from two supplied Python scripts for analysing tokenised corpus data or for generating inflectional paradigms.
EXMARaLDA’s Dulko tools
The Dulko tools of the EXMARaLDA Partitur-Editor provide transformation scenarios (actually, XSLT 2.0 stylesheets) for the annotation of data in learner corpora and beyond. They support tokenisation, part-of-speech tagging, lemmatisation, sentence-span computation, editing of target hypotheses, detection of differences between target hypotheses and the learner text, error analysis, and metadata management (Hirschmann and Nolda 2019, Nolda 2019 b).
Prior to release version 1.7 of the EXMARaLDA Partitur-Editor, the Dulko toolset was developed separately from the EXMARaLDA mainline under the name “EXMARaLDA (Dulko)” for the Dulko learner-corpus project at the University of Szeged.
For this work, I was awarded the Innovation Prize 2018 in the engineering category from the University of Szeged.
makeDulko
makeDulko (version 1.1) is a build system for generating ANNIS data from EXMARaLDA sources annotated with the EXMARaLDA (Dulko) tools.
XGrep
The Python 3 script xgrep.py (version 2.12) searches XML files for patterns specified in terms of XPath 1.0 expressions. Its options mimic the behaviour of GNU grep.
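The general idea of searching XML with path expressions can be illustrated with a minimal sketch. It is not the code of xgrep.py: the function name xml_grep and the sample data are hypothetical, and the standard-library ElementTree module used here supports only a subset of XPath 1.0, so a full XPath engine would be needed for arbitrary expressions.

```python
import xml.etree.ElementTree as ET


def xml_grep(xml_text, path):
    """Return the serialised elements matching `path`.

    Assumption: `path` stays within the XPath subset that ElementTree supports.
    """
    root = ET.fromstring(xml_text)
    return [ET.tostring(el, encoding="unicode").strip()
            for el in root.findall(path)]


# Hypothetical sample document, for illustration only.
SAMPLE = """<entries>
  <entry pos="noun"><form>Haus</form></entry>
  <entry pos="verb"><form>gehen</form></entry>
</entries>"""

matches = xml_grep(SAMPLE, ".//entry[@pos='noun']/form")
for match in matches:
    print(match)
```

Printing one match per line loosely follows the way grep reports matching lines.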
XDiff
The Python 3 script xdiff.py (version 2.4) compares XML files for structural or textual differences; differences in attribute order or whitespace formatting are ignored. Its output mimics the unified format of GNU diff.
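One possible way to obtain such a comparison is to normalise both documents before diffing them. The following sketch is an assumption about the technique, not the implementation of xdiff.py; the names canonical_lines and xml_diff are hypothetical. Attributes are sorted, surrounding whitespace is trimmed, and the result is passed to the standard-library difflib module.

```python
import difflib
import xml.etree.ElementTree as ET


def canonical_lines(xml_text):
    """Serialise XML one node per line, with sorted attributes and trimmed text."""
    def walk(el, depth):
        attrs = "".join(f' {k}="{v}"' for k, v in sorted(el.attrib.items()))
        yield f"{'  ' * depth}<{el.tag}{attrs}>"
        text = (el.text or "").strip()
        if text:
            yield f"{'  ' * (depth + 1)}{text}"
        for child in el:
            yield from walk(child, depth + 1)
            tail = (child.tail or "").strip()
            if tail:
                yield f"{'  ' * (depth + 1)}{tail}"
        yield f"{'  ' * depth}</{el.tag}>"
    return list(walk(ET.fromstring(xml_text), 0))


def xml_diff(old, new):
    """Return unified-diff lines for two XML strings after normalisation."""
    return list(difflib.unified_diff(canonical_lines(old), canonical_lines(new),
                                     "old", "new", lineterm=""))


# Hypothetical inputs: B differs from A only in attribute order and whitespace.
A = '<a x="1" y="2"><b>t</b></a>'
B = '<a y="2" x="1">\n  <b>t</b>\n</a>'
C = '<a x="1" y="2"><b>u</b></a>'

print(xml_diff(A, B))          # no differences after normalisation
print("\n".join(xml_diff(A, C)))
```

Note that trimming text nodes also discards whitespace that may be significant in mixed content; a real tool would have to decide how to treat such cases.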
PSGML-Utils
PSGML-Utils (version 2.1) is a set of extensions for Emacs’ PSGML mode. They provide additional editing functions, functions for running validation and transformation scenarios, as well as an XML mode derived from PSGML’s SGML mode.
TEI2X
TEI2X (version 2.16) provides XSLT 1.0 stylesheets for the generation of TeX files as well as DOCX files and HTML files from legacy TEI P4 source files, in a customised version with some P5 additions. The stylesheets are geared towards ‘born-digital’ documents, in particular technical documents in linguistics and other scientific fields.
TEIP4to5
The XSLT 1.0 stylesheet teip4to5.xsl (version 1.4) converts legacy TEI P4 documents (such as the sample TEI files in TEI2X) to TEI P5 documents.
Overlays
The overlays package (version 2.12) for LaTeX makes it possible to write presentations with incremental slides. It does not presuppose any specific document class. Rather, it is a lightweight alternative to full-fledged presentation classes like beamer.
Tagpair
The tagpair package (version 1.1) for LaTeX provides environments and commands for pairing lines, bottom lines, and tagged lines, intended to be used in particular for word-by-word glosses, translations, and bibliographic attributions, respectively.
Hang
The hang package (version 2.1) for LaTeX provides environments for hanging paragraphs and list items. In addition, it defines environments for labeled paragraphs and list items.
Lingua Franca
The Lingua Franca OpenType and Web Open fonts (version 1.20) are a modified version of the Heuristica font family, which in turn is based on the Utopia Type 1 fonts, designed by Robert Slimbach for Adobe and licensed to the TeX Users Group (TUG) for free modification and redistribution. The Lingua Franca fonts are particularly useful for documents in linguistics. The regular typeface includes all characters of the Unicode IPA extensions as well as many spacing or combining diacritics. In addition, the typefaces support various typographic features such as ligatures, proportional figures, etc.; a stylistic set provides longer slashes, matching the parentheses in height and depth.
Goal Column
The Goal Column macro bundle (version 1.0) for jEdit is inspired by Emacs’ set-goal-column function.
MAgenda
The Python 3 script magenda.py (version 2.0) creates an agenda of task-list items in GitHub Flavored Markdown files.
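As an illustration of what collecting task-list items involves, here is a minimal sketch. It does not reproduce magenda.py: the function name task_items, the regular expression, and the returned tuple layout are assumptions, and the pattern covers only the common case of a list marker followed by "[ ]" or "[x]".

```python
import re

# A list marker (-, *, + or an ordered-list number) followed by a checkbox.
TASK_ITEM = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+\[([ xX])\]\s+(.*)$")


def task_items(markdown_text):
    """Return (line number, done?, text) for each task-list item found."""
    items = []
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        match = TASK_ITEM.match(line)
        if match:
            done = match.group(1) != " "
            items.append((number, done, match.group(2).strip()))
    return items


# Hypothetical notes file content, for illustration only.
NOTES = """# Release
- [x] Update changelog
- [ ] Tag version
Some prose.
1. [ ] Announce
"""

for number, done, text in task_items(NOTES):
    status = "done" if done else "open"
    print(f"{number}: [{status}] {text}")
```

An agenda could then be built by filtering the open items and grouping them by file.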
Latin Square
The Bash script latin-square (version 1.1) prints the lines of a file arranged in a Latin square. It is intended for distributing experimental items over groups of subjects in Latin-square form.
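The arrangement itself can be sketched in a few lines. The example below is written in Python rather than Bash and is not the latin-square script; it assumes the simplest construction, a cyclic Latin square in which each row is the previous row rotated by one position, and the function name latin_square is hypothetical.

```python
def latin_square(items):
    """Return a cyclic Latin square: row r is `items` rotated left by r."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Hypothetical experimental conditions; each row could serve one subject group.
conditions = ["A", "B", "C"]
square = latin_square(conditions)

for group, row in enumerate(square, start=1):
    print(f"group {group}: {' '.join(row)}")
```

Every item occurs exactly once in each row and once in each column, so each group sees every item and each item appears in every position across groups.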