GAN is a Python script requiring Python (>3.6), IPython, itertools, pandas (>= 1.0), xlrd (>=1.2.0).
To install the required dependencies, one option is to use the Miniconda package manager (https://docs.conda.io/en/latest/miniconda.html) and create a new environment from the bundled environment.yml
file. The bundled scripts are then run inside that environment.
Commands:
# Create the environment (run it once)
conda env create -f environment.yml
# Activate the environment to use the scripts
conda activate gan
The repository comes with two scripts:
usage: gan-genus.py [-h] -1 FIRST -2 SECOND [-3 THIRD] -o OUTDIR [-p PREFIX] [-c CONNECTOR] [-v]
Generate bacterial genera with Excel input
optional arguments:
-h, --help show this help message and exit
-1 FIRST, --first FIRST
First Excel file in "GAN" format
-2 SECOND, --second SECOND
Second Excel file in "GAN" format
-3 THIRD, --third THIRD
Third Excel file in "GAN" format
-o OUTDIR, --outdir OUTDIR
Output directory
-p PREFIX, --prefix PREFIX
Output basename [default: 'gan']
-c CONNECTOR, --connector CONNECTOR
String connecting the explanatory strings [default: 'of']
-v, --verbose Increase output verbosity
The program requires two or three Excel tables, supplied with the -1, -2 and -3 arguments, respectively.
The program requires an output directory to be specified (via -o), and optionally an output "basename" prefix (via -p).
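The options listed above map naturally onto Python's argparse module. The following is a minimal sketch that mirrors the documented flags and defaults; it is not the source of gan-genus.py, and the function name build_parser is a hypothetical name chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="gan-genus.py",
        description="Generate bacterial genera with Excel input",
    )
    # -1, -2 and -o are shown without brackets in the usage line, so they are required.
    parser.add_argument("-1", "--first", required=True,
                        help="First Excel file in GAN format")
    parser.add_argument("-2", "--second", required=True,
                        help="Second Excel file in GAN format")
    parser.add_argument("-3", "--third",
                        help="Third Excel file in GAN format")
    parser.add_argument("-o", "--outdir", required=True,
                        help="Output directory")
    parser.add_argument("-p", "--prefix", default="gan",
                        help="Output basename")
    parser.add_argument("-c", "--connector", default="of",
                        help="String connecting the explanatory strings")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Increase output verbosity")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(
    ["-1", "table1.xlsx", "-2", "table2.xlsx", "-o", "test_output"]
)
print(args.prefix, args.connector, args.third)  # prints: gan of None
```

With only the required arguments given, the prefix and connector fall back to their documented defaults and the third table is left unset.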
The repository comes with small test files to check that the program is working properly. From the base directory of the repository:
mkdir test_output
./scripts/gan-genus.py -1 ./test/table1.xlsx -2 ./test/table2.xlsx -o ./test_output
This will produce three files in the test_output directory:
Each input file is an Excel workbook with at least one worksheet (any other worksheet is ignored). An empty template is provided in input_test/template.xlsx.
It should contain these columns (in any order):
Small example:
Language | Gender | Part | Word | Root | Definition | Explanation |
---|---|---|---|---|---|---|
L. | masc. | n. | admissarius | admissari | a stallion used for breeding | horses |
Gr. | masc. | n. | Balios | Balio | a mythical horse | horses |
L. | masc. | n. | caballus | caballi | a horse | horses |
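Before running the program it can be useful to check that a table has the expected header. The sketch below assumes that the seven columns of the example above are the expected set and that the first worksheet is the one that matters; the names REQUIRED_COLUMNS, missing_columns and load_table are hypothetical helpers, not part of the bundled scripts.

```python
# Assumption: the expected header is the one shown in the small example.
REQUIRED_COLUMNS = (
    "Language", "Gender", "Part", "Word", "Root", "Definition", "Explanation",
)


def missing_columns(columns):
    """Return the expected column names absent from `columns` (order-insensitive)."""
    present = {str(name).strip() for name in columns}
    return [name for name in REQUIRED_COLUMNS if name not in present]


def load_table(path):
    """Read the first worksheet of an Excel file and validate its header."""
    import pandas as pd  # imported here so missing_columns works without pandas

    table = pd.read_excel(path, sheet_name=0)
    missing = missing_columns(table.columns)
    if missing:
        raise ValueError(f"{path}: missing columns {missing}")
    return table


# Column order does not matter, only presence.
print(missing_columns(["Word", "Root", "Language"]))
# prints: ['Gender', 'Part', 'Definition', 'Explanation']
```

Calling load_table("./test/table1.xlsx") from the base directory of the repository would apply the same check to one of the bundled test files.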
The JSON output is an array of elements. Each element is a dictionary whose key is the compound name (e.g. _Admissaristercoradaptatus_) and whose value is an array of (type, value) pairs, where type specifies how to render the value. Some examples:
[ "glossary", "L." ]
[ "separator", " " ]
[ "italic", "admissarius" ]
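A consumer of the JSON file can walk these pairs and choose a rendering per type. The sketch below is one possible renderer, assuming that "italic" values are wrapped in an HTML em tag and that every other type (such as "glossary" and "separator") is emitted as plain text; render_html is a hypothetical helper, and the program's own HTML output may differ.

```python
import html
import json


def render_html(pairs):
    """Render a list of (type, value) pairs as an HTML fragment."""
    chunks = []
    for kind, value in pairs:
        text = html.escape(value)
        if kind == "italic":
            chunks.append(f"<em>{text}</em>")
        else:
            # Assumption: other types need no special markup.
            chunks.append(text)
    return "".join(chunks)


# The three example pairs shown above, parsed from JSON text.
pairs = json.loads(
    '[["glossary", "L."], ["separator", " "], ["italic", "admissarius"]]'
)
print(render_html(pairs))  # prints: L. <em>admissarius</em>
```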
An HTML formatted list of compound words and their etymology.
Each item is provided as:
Admissaristercoricola -- Etymology: L. masc. n. admissarius, a stallion used for breeding; L. neut. n. stercus, excrement; N.L. masc./fem. n. cola, an inhabitant; Admissaristercoricola: a microbe of the faeces of horses.
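Each component of the etymology line follows the pattern "Language Gender Part Word, Definition", using the columns of the input tables. A minimal sketch of that formatting, assuming rows are available as dictionaries keyed by column name (etymology_fragment and etymology are hypothetical helpers, not functions of the bundled scripts):

```python
def etymology_fragment(row):
    """Format one table row as 'Language Gender Part Word, Definition'."""
    return "{Language} {Gender} {Part} {Word}, {Definition}".format(**row)


def etymology(rows):
    """Join the fragments of several rows with '; '."""
    return "; ".join(etymology_fragment(row) for row in rows)


# Two rows taken from the small example table.
rows = [
    {"Language": "L.", "Gender": "masc.", "Part": "n.", "Word": "admissarius",
     "Root": "admissari", "Definition": "a stallion used for breeding",
     "Explanation": "horses"},
    {"Language": "L.", "Gender": "masc.", "Part": "n.", "Word": "caballus",
     "Root": "caballi", "Definition": "a horse", "Explanation": "horses"},
]
print(etymology(rows))
# prints: L. masc. n. admissarius, a stallion used for breeding; L. masc. n. caballus, a horse
```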
A LaTeX source that can be compiled to produce a PDF document. It requires a config.tex file (supplied in the docs/ directory) and can be compiled to PDF with this command:
pdflatex gan.tex
To install a full LaTeX distribution on Ubuntu (requires about 5 GB of disk space):
sudo apt install texlive-full