\_sh v3.0  560  Readme Notes

\id ADAPT
\s Adaptation
\p Adaptation ("Adapt" for short) refers to the conversion of text from one language to
another. It works only between related languages because it does only limited
rearrangement of words and morphemes. Adaptation produces a very literal translation from
source to target. This works well when the source and target are closely related, so that the
word order and idiomatic expressions of the source text generally carry over to the target
language.
\p Adaptation has usually been done with a set of DOS programs called CARLA. Shoebox can
be used to manage the lexical data while running these programs. Shoebox also has features
of its own which can be used for adaptation. Most of these samples illustrate the use of the
Shoebox features. One shows how to use Shoebox with the DOS programs. (The DOS
programs are documented elsewhere.)
\p
\p
\s The ADAPT Tutorial Samples

\s2 Overview
\p The ADAPT folder contains a series of adaptation sample folders named ADAPT1A,
ADAPT1B, ADAPT2A, etc. The samples are numbered in the recommended order for
studying them, as they generally build on each other, with more advanced techniques shown
in the higher numbered samples.
\p The sample projects are tutorials that will teach you how to do adaptation with Shoebox.
Each sample project opens showing a README.TXT file that explains how the sample
works and points out things that are of special interest. Use Project, Open to get from one
example to the next.
\s3 ADAPT1A -- Prinderella and the Cince
\p A fanciful example that demonstrates converting text from one language to a very closely
related one. This example focuses on using the lexicon as the main conversion tool and
addresses how to deal with ambiguity.
\s3 ADAPT1B -- Prinderella and the Cince
\p The same story as ADAPT1A, but this example adds a parsing process to the adaptation
process. Doing this lightens the load the lexicon must bear by eliminating derivable forms.
This example also introduces the use of Phonological Rules to deal with peculiarities of
English spelling, which are similar to morphological problems.
\s3 ADAPT2A -- Norwegian - English adaptation
\p A more realistic adaptation process, this example takes a Norwegian translation of Mark 14
and converts it into English. It demonstrates the use of Rearrange Rules to account for
differences in word order between the two related languages.
\s3 ADAPT2B -- Norwegian - English adaptation
\p This is a full-blown adaptation process that takes the same Norwegian text of ADAPT2A
and uses Parsing, Rearrange Rules and Phonological Rules to make the adaptation more
predictive and accurate, thereby reducing the number of phrase-level lexical entries needed.
\s3 ADAPT3A -- Yawelmani Generative Morphology
\p This project demonstrates the power of the Generate process and the Phonological Rule files
that Shoebox uses: it takes Yawelmani underlying forms and, in one step, processes them
through a complex set of rules to produce the attested surface forms. This capability of the
Generate process is useful both in working out the phonological analysis of a language and
for adapting one language to another.
\s3 ADAPT3B -- Yawelmani Generative Morphology
\p This sample takes the same Yawelmani data and rules but splits the rules into seven different
Generate process steps to demonstrate the stages that an underlying form must go through to
reach the surface form.
\s3 ADAPT3C -- Yawelmani Generative Morphology
\p Again the same Yawelmani data and rules, but this one is a compromise between
the one-step process of ADAPT3A and the seven-step process of ADAPT3B. This setup
uses three Generate processes and rule files to derive the surface forms. This layout is more
manageable as a working environment.
\s3 ADAPT4A -- Shoebox support for DOS CARLA programs
\p This sample is mostly for those who have been using CARLA and Shoebox for DOS but
want to move to this version of Shoebox for their dictionaries.
\nt Note: ADAPT4A requires that you have the CARLA and CC programs on your DOS path.

\s2 Details
\p The Adaptation samples assume you know how to do the basic operations in Shoebox,
including interlinear. If you have not used Shoebox yet, you should go
through at least the Basic Features and Interlinear chapters of the Walkthrough before you go
through these samples.
\p To open a sample, choose Project, Open. Navigate to the sample folder, and open whatever
project file you see there. The name of the project file varies from sample to sample, but each
sample folder contains only one project file.
\p Each sample is a fully working adaptation setup that you can explore freely and use as a
model for your own setup. The word "observe" is used in comments to highlight areas of
special interest in the actual language sample. If you use Edit, Find to look for the word
"observe" in comment (\co) fields, you will see discussions of all the places that are of special
interest.
\p Some of the sample projects contain exercises that you can do to reinforce your
understanding of the techniques illustrated in the sample. The README file that opens with
each project explains the exercises.
\p
\s2 Tip
\p Each sample project is sized to fit on a standard VGA screen. If you have a larger screen,
the project will show in the upper left corner of the screen. The README file is placed in the
lower right corner so that if you have a larger screen, you can drag down the lower right
corner of the project window to enlarge it, and then drag down the lower right corner of the
README file so you can see more of the text at once. If you want to see even more at
once, you can print the README file by choosing File, Print.
\p
\p
\s Details on the Interlinear and Adaptation Processes

\s2 The difference between the "Interlinear" (Alt+I) and "Adaptation" (Alt+A) shortcut keys
\p It can be confusing that the Interlinear tab contains processes for both interlinearization and
adaptation. Basically, the Parse process is always an Interlinear process, and the
Rearrange or Generate processes are always Adaptation processes. A Lookup process is
usually an Interlinear process but can be marked as an Adapt process (useful if you are not
Parsing and will only be adapting the text). If the Interlinear setup has both Interlinear and
Adaptation processes, then here is how the shortcut keys work:
\p *  If the text line (e.g. \t) has not been interlinearized yet: 
\p           *  Alt+I (command-I on the Mac) will do only the Interlinear processes and will 
\p               process the entire line.
\p           *  Alt+A (command-A on the Mac) will do all of the processes, both Interlinear and
\p               Adaptation, on the entire line. It does the Interlinear steps in order to get the
\p               information it needs to do the adaptation.
\p *  If the text line has already been Interlinearized and Adapted:
\p           *  Alt+I will rerun the Interlinear processes on a word-by-word basis. This deletes
\p               any information in the Adaptation lines underneath the word being reinterlinearized.
\p           *  Alt+A reruns all of the Adaptation processes for the entire line; it assumes the
\p               interlinear information is correct.
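\p The shortcut behavior described above can be summarized as a small decision table. The
following Python sketch only restates those bullets; it is not Shoebox code, and the function
name and the returned labels are made up for illustration.

```python
def shortcut_action(key, already_processed):
    """Return (processes, scope) for a shortcut key, following the bullets above.

    key: "Alt+I" or "Alt+A"
    already_processed: True if the line has been interlinearized and adapted
    """
    if key == "Alt+I":
        if already_processed:
            # Reruns word by word; adaptation lines under the word are deleted.
            return ("interlinear", "word")
        return ("interlinear", "line")
    if key == "Alt+A":
        if already_processed:
            # Existing interlinear information is assumed to be correct.
            return ("adaptation", "line")
        return ("interlinear+adaptation", "line")
    raise ValueError("unknown shortcut: " + key)
```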
\p
\s2 The Interlinearize Toolbar Button 
\p The Interlinearize Button on the Toolbar always acts just like Alt+A. If only Interlinear
processes have been defined (no Adaptation processes), the Interlinearize toolbar button
runs the Interlinear processes as expected.
\p
\s2 Rearrange Process
\p The Rearrange process uses a "Rearrangement Rule File". This provides Shoebox with
needed syntactic information that enables it to reorder words and phrases in the adaptation
process. This file can contain any number of rearrangement rules, but only one rule at a time
can apply for any given Rearrange process; once a rule is matched, the output is produced
and the adaptation moves on to the next process. Defined symbols in a rearrangement file are
not put in square brackets in the rules and can be mixed freely with text, e.g. "N -the" can be
rearranged to "the N". You are encouraged to use the REARRANG.TYP file to build your own
Rearrangement rule files.
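\p The "first matching rule wins" behavior can be illustrated with a short Python sketch. This
is a simplified model, not Shoebox's rule syntax or matching engine: the symbol table, the word
lists, and the function names are assumptions made for the example. The sketch also assumes
that the matched rule is applied once, at the first place it matches; how Shoebox treats
repeated matches within a line is not covered here.

```python
# Hypothetical defined symbol: N stands for any word in this set.
SYMBOLS = {"N": {"house", "book"}}

# Ordered rules as (pattern, output); symbols are written bare, as in "N -the".
RULES = [
    (["N", "-the"], ["the", "N"]),
    (["N", "-a"], ["a", "N"]),
]

def match_at(words, start, pattern, symbols):
    """Return the symbol bindings if pattern matches at start, else None."""
    bound = {}
    for offset, token in enumerate(pattern):
        word = words[start + offset]
        if token in symbols:
            if word not in symbols[token]:
                return None
            bound[token] = word
        elif word != token:
            return None
    return bound

def rearrange(words, rules, symbols):
    """Apply the first rule that matches, then stop (one rule per process)."""
    for pattern, output in rules:
        for start in range(len(words) - len(pattern) + 1):
            bound = match_at(words, start, pattern, symbols)
            if bound is not None:
                new = [bound.get(token, token) for token in output]
                return words[:start] + new + words[start + len(pattern):]
    return list(words)
```

\p In this model, rearrange(["house", "-the", "book", "-a"], RULES, SYMBOLS) rewrites only the
first pair, because the process stops after the first rule that matches. A second Rearrange
process would be needed to handle the remaining pair.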
\p
\s2 Generate Process
\p The Generate process uses a "Phonological Rule File". This file looks very similar to the
Rearrangement rule file, but it works differently. It can contain any number of phonological
rules, and all of them are applied in order before the output is produced. Also, any defined symbols
in this file must be included in the rules in square brackets, e.g. [C]. You are encouraged to
use the PHONRULE.TYP file to build your own Phonological rule files.
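\p By contrast with the Rearrange process, every rule in a phonological rule file gets a turn.
The following Python sketch models that idea with regular expressions. It is an illustration
under stated assumptions, not Shoebox's rule format: the definition of [C], the "+" morpheme
boundary, the use of \1 to copy the matched symbol, and the two sample rules are all made up
for the example.

```python
import re

# Hypothetical defined symbol: [C] stands for any consonant letter.
DEFS = {"C": "bcdfghjklmnpqrstvwxz"}

# Ordered rules as (pattern, replacement). "+" marks a morpheme boundary.
RULES = [
    ("[C]y+s", r"\1ie+s"),  # consonant + y before -s: y becomes ie
    ("+", ""),              # last of all, drop the morpheme boundaries
]

def compile_rule(pattern, defs):
    """Turn a rule pattern into a regex; each [X] becomes a capture group."""
    regex = ""
    for part in re.split(r"(\[[^\]]+\])", pattern):
        if part.startswith("[") and part.endswith("]"):
            regex += "([" + re.escape(defs[part[1:-1]]) + "])"
        else:
            regex += re.escape(part)
    return re.compile(regex)

def generate(form, rules, defs):
    """Apply every rule in order; each rule sees the previous rule's output."""
    for pattern, replacement in rules:
        form = compile_rule(pattern, defs).sub(replacement, form)
    return form
```

\p Rule order matters in this model: if the boundary-dropping rule came first, the spelling
rule would never find its "+" and would not apply.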
\p
\p
\s To start the First Adaptation Tutorial
\p * Choose Project-Open 
\p * Go into the folder ADAPT1A
\p * Open the project file PRINDER.PRJ
\p
\nt Have Fun!
\p
\p
\s Making Your Own Adaptation Setup

\s2 How to copy and convert a sample project as a start
\p After you have studied these samples, choose the one that is most like what you want to do
and use it as a model to set up your own project. One way to do this is to copy the contents of
the sample folder to your own project folder and modify the sample to use your own file
names and your own setup. Load your own lexicon and link it into the Interlinear setup
processes (in place of the one used by the original setup). You might want to keep the
original tutorial around to check as an example if you are unsure how to mark Alternate and
Underlying forms.
\p You can also deactivate any rule or definition in a rule file (if you are using a setup with these)
by changing the \ru and \def fields to \dis (for "disable"). Doing this will gray out the rule or
definition, showing it is disabled. This allows you to keep the rules and definitions in the rule
file for reference, but Shoebox will ignore them when processing your data. Delete them
when you know you don't need them anymore.
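\p If many rules need to be disabled at once, the marker change can also be made outside
Shoebox on a copy of the rule file. The Python sketch below shows the idea on a list of lines.
The function name and the selection test are assumptions, and the field contents used with it
should be treated as placeholders rather than real rule syntax.

```python
def disable_fields(lines, should_disable):
    """Change the \\ru or \\def marker to \\dis on the selected fields.

    lines: lines of a rule file, one string per line
    should_disable: function that takes the field content and returns True
                    if that rule or definition should be disabled
    """
    result = []
    for line in lines:
        marker, _, content = line.partition(" ")
        if marker in ("\\ru", "\\def") and should_disable(content):
            # Only the marker changes; the field content is kept for reference.
            result.append("\\dis " + content)
        else:
            result.append(line)
    return result
```

\p Continuation lines of a field do not start with a marker, so they pass through unchanged
and stay attached to the disabled field.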
\p
\s2 How to import text for adaptation 
\p Once you've copied the tutorial you want and are ready to try adapting your own text, you'll
first need to import your text into Shoebox. To import text, use the File-Open command and
select the text file you want to import.
\p If the file has never been in Shoebox (or has been in an older version of Shoebox), you will
likely see the Import dialog box. Here you need to specify the database type for this file (if
you are using an adaptation sample as the basis for your own, this is the "Interlinear" type).
\p If you are importing plain, raw text, you will want to specify a Consistent Changes Table. The
table TEXTPREP.CCT is provided in the "SHOEBOX\STD_SET" folder for this purpose. If
you need to convert the file from ASCII to ANSI, you should do that first using
IBM_ANSI.CCT or your own customized table.
\p If you are importing standard format text that is organized with markers \id, \c, and \v, use the
table SCRPREP.CCT instead of TEXTPREP.CCT.
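\p The choice between the two tables can be expressed as a simple check on the markers found in
the file. The Python sketch below is only a restatement of that rule of thumb; the function
name is made up, and treating a file as standard format text only when all three markers are
present is an assumption.

```python
import re

def choose_cc_table(text):
    """Suggest a Consistent Changes Table for importing the given text."""
    # Collect the markers that appear at the start of a line, without the backslash.
    markers = set(re.findall(r"^\\(\w+)", text, flags=re.MULTILINE))
    if {"id", "c", "v"} <= markers:
        return "SCRPREP.CCT"   # standard format text with \id, \c, and \v
    return "TEXTPREP.CCT"      # plain, raw text
```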
\p
\s2 How to export adapted text
\p Once the file has been adapted to the target language, you might want to get rid of all of the
processing lines and just keep the resulting target text line (\e in the adaptation samples). The
easiest way to get this is to export the file. 
\s3 Quick File-Export
\p Choose File-Export, select Standard Format and click OK. In the SF Export Properties
dialog box, clear the All Fields check box. Now select only the field(s) you want exported
by clicking the Select Fields button and moving the fields you want to the Include side of the
Select Fields dialog box. Click OK. If you need a Consistent Changes Table, you can
specify one by using the Browse button. Once you are done setting up the export, click OK
in the Properties dialog box. Then give Shoebox a name for the export file, and Shoebox will
export the file for you.
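\p The effect of exporting only selected fields can be pictured with a short Python sketch that
filters a standard format file by marker. This is not how Shoebox performs the export; it is a
rough model that assumes every field starts on a line beginning with a backslash marker and
runs until the next such line. The function name is made up.

```python
def keep_fields(text, markers):
    """Return only the fields whose marker is in markers, e.g. {"\\e"}."""
    kept = []
    keeping = False
    for line in text.splitlines():
        if line.startswith("\\"):
            # A new field starts here; decide whether to keep it.
            marker = line.split(None, 1)[0]
            keeping = marker in markers
        if keeping:
            kept.append(line)
    return "\n".join(kept)
```

\p Calling keep_fields(text, {"\\e"}) on an adapted file would leave just the target text
lines, which is the result the export steps above are meant to produce.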
\s3 Create your own File-Export Option
\p If you think you will be exporting files such as this often, you can create a special export
option just for that purpose. To do this, choose File-Export, but rather than clicking OK,
click Add. Select Standard Format from the choices and click OK. Give the new export 
process a unique name, e.g. Adapted Text. Then set up this export option just like the one
above. When you click OK here, Shoebox will save this new export option so that it will be 
available next time you choose File-Export.