\_sh v3.0  374  Readme Notes

\id ADAPT2B
\s Norwegian sample with Parse and Generate
\p This sample expands on the previous Norwegian adaptation
sample by adding a Parse process and a Generate process. The
result is a full-blown adaptation setup.
\p Looking at the text, you can see that the setup first parses each
word into morphemes (\mb line) and assigns a category to each
morpheme (\ps line). It then converts each morpheme to the
target language (\e1 line) and rearranges the morphemes into
better target order (\e2 line). Finally, it puts the morphemes back
together to generate target text (\e line).
\p To keep the file size down, only the first three verses are done.
Feel free to process the other lines to see this full-blown
adaptation process in action. Comments in the text point out
the places where this process differs from the Adapt2a sample.
\s2 Interlinear Setup
\p Look at the Interlinear setup of the text. You can see that it
consists of a Parse process, two Lookup processes, a
Rearrange process and a Generate process.
\s3 Parse Processing
\p Looking at the details of the Parse process, you can see that it
parses from the \lx and \a fields in the lexicon, and outputs
the \u field if there is one. This is a standard parse setup.
Notice that if the parse fails, it outputs the original word rather
than a failure mark.
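\p The fallback behavior described above can be pictured with a small
Python sketch. This is not Shoebox code: the toy lexicon, the name
parse_word, and the greedy longest-match strategy are all assumptions
made for illustration. The sketch only shows a parse that returns the
original word when no segmentation is found, instead of a failure mark.

```python
# Hypothetical toy lexicon: surface forms (as from the \lx and \a fields)
# mapped to the form to output (as from a \u field, when present).
TOY_LEXICON = {
    "hus": "hus",
    "et": "-et",
    "ene": "-ene",
}


def parse_word(word, lexicon=TOY_LEXICON):
    """Split word into known morphemes, longest match first.

    Returns the list of morphemes, or [word] unchanged if the
    parse fails (no failure mark is produced).
    """
    morphemes = []
    rest = word
    while rest:
        for end in range(len(rest), 0, -1):
            piece = rest[:end]
            if piece in lexicon:
                morphemes.append(lexicon[piece])
                rest = rest[end:]
                break
        else:
            # No lexicon entry matches here: fall back to the original word.
            return [word]
    return morphemes
```

In this sketch, a word made up entirely of known pieces comes back as a
list of morphemes, and any other word comes back as a one-item list
holding the word itself.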
\s3 Lookup Processes
\p Looking at the details of the first Lookup, you can see that it is
not changed from the previous example. It looks up the \lx field
in the lexicon and outputs the contents of the \p field. If the
lookup fails, it outputs a failure mark rather than the original
word, since the word bears no resemblance to a part of speech. The
process is marked as an adaptation process because there is no
separate interlinearization phase in a setup like this.
\p The second Lookup process is also the same as it was in the
previous example.
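\p The failure handling of the first Lookup can be sketched in Python
as follows. The table contents, the name lookup_all, and the "***"
failure mark are assumptions for illustration, not values taken from
this project. The point is only that a failed lookup yields a mark
instead of echoing the input.

```python
FAIL_MARK = "***"  # assumed failure mark, for illustration only

# Hypothetical entries: morpheme -> part of speech
TOY_CATEGORIES = {"hus": "n", "-et": "def"}


def lookup_all(morphemes, table=TOY_CATEGORIES, fail_mark=FAIL_MARK):
    """Return one output per morpheme; unknown morphemes get fail_mark."""
    return [table.get(m, fail_mark) for m in morphemes]
```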
\s3 Rearrange Process
\p The Rearrange process is the same as in the previous example
except that it does not look at the \t line for punctuation (the
Punctuation Marker is set to "none"), because the Generate
process handles that in this example. Also, instead of outputting
the \e line, it outputs another intermediate line, \e2.
\p As mentioned in the previous example, unlike phonological
rules, rearrange rules cannot feed each other. But if you really
need that effect, one way of getting it is to add another
intermediate line and another Rearrange process with another
rule file with a new set of rearrange rules. For example, in this
setup we could add an \e3 line and add a Rearrange process to
go from \e2 to \e3. The rules in the rule file of the second
Rearrange process would apply to the output of the rules in the
first Rearrange process.
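\p The difference between one Rearrange process and two chained ones
can be modeled with a short Python sketch. This is a simplified model,
not the actual rule engine: the function rearrange, the rule format
(a pattern tuple and a replacement tuple), and the tokens A, B, C are
all invented for illustration. Within one pass, replaced material is
never re-examined, so rules cannot feed each other. A second pass sees
the output of the first.

```python
def rearrange(tokens, rules):
    """Apply rules in one left-to-right pass over tokens.

    Replaced material is copied to the output and never re-examined,
    so one rule cannot feed another within the same pass.
    """
    out = []
    i = 0
    while i < len(tokens):
        for pattern, replacement in rules:
            n = len(pattern)
            # Skip empty patterns so the scan always advances.
            if n and tuple(tokens[i:i + n]) == pattern:
                out.extend(replacement)
                i += n
                break
        else:
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical rule sets standing in for two rule files.
RULES_1 = [(("A", "B"), ("B", "A"))]
RULES_2 = [(("A", "C"), ("C", "A"))]

e1 = ["A", "B", "C"]

# One process holding both rules: the second rule never sees "A C".
one_pass = rearrange(e1, RULES_1 + RULES_2)

# Two processes: the second rule set applies to the first one's output.
e2 = rearrange(e1, RULES_1)
e3 = rearrange(e2, RULES_2)
```

Here one_pass and e2 are both B A C, while e3 is B C A, because only
the second pass gets to match the adjacent A C that the first pass
created.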
\s3 Generate Process
\p The Generate process applies a Phonological rule file named
ENGPHON.RUL to handle morphophonemics. It also looks at
the \t line to restore punctuation and capitalization.
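\p As a rough picture of what a Generate step does, here is a Python
sketch under several assumptions: the rule shown is an invented
stand-in and is not taken from ENGPHON.RUL, the name generate_word is
hypothetical, and each target word is assumed to line up with exactly
one source word. It joins morphemes, applies a rule at the morpheme
boundary, and then copies capitalization and trailing punctuation from
the source word.

```python
import re

# Invented stand-in for a phonological rule; not taken from ENGPHON.RUL.
TOY_PHON_RULES = [(re.compile(r"y\+s$"), "ies")]


def generate_word(morphemes, source_word, rules=TOY_PHON_RULES):
    """Build one target word and restore case and punctuation."""
    # Join morphemes with "+" as a boundary so rules can refer to it.
    form = "+".join(m.strip("-") for m in morphemes)
    for pattern, repl in rules:
        form = pattern.sub(repl, form)
    form = form.replace("+", "")
    # Restore capitalization from the source word.
    if source_word[:1].isupper():
        form = form[:1].upper() + form[1:]
    # Restore trailing punctuation from the source word.
    match = re.search(r"[.,;:!?]+$", source_word)
    if match:
        form += match.group(0)
    return form
```

For example, with the morphemes city and -s and the source word
"Byer." this sketch produces "Cities." as the target word.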
\p
\p NOTE: This example adaptation could be significantly
enhanced if we were more familiar with the syntactic structure
of Norwegian. A quick look through the Norwegian-English
lexicon file reveals several complex phrases that are entered
whole and rendered in English, not because the words that make up
the phrases are unknown or because the phrases are special
Norwegian idioms, but simply because the word order is so different.
Idioms need lexical entries, as do some standard phrases, but
simple word order issues are better handled as rearrange rules.
Doing this keeps the main lexicon from filling up with entries
that serve just "to get the adaptation process to work".
\p
\s To go to the next Adaptation tutorial, open the project in
the ADAPT3A folder.