This application converts text that looks like this:
-NUMBER-
100002
-INPUT_DATE-
8301
-SUBJECT-
CARDIOVASCULAR
-TITLE-
EFFECTS OF CALCIUM ENTRY ANTAGONISTS IN HYPERTENSION
-AUTHOR-
KREBS H, GRAEFE K H, ZIEGLER R
-REFERENCE-
CLIN EXP HYPERTENS, 1982, 4:271-284
-DOCUMENT_TYPE-
REVIEW
-KEYWORDS-
CALCIUM ANTAGONISTS, HYPERTENSION, MODE OF ACTION,
VERAPAMIL, NIFEDIPINE
-END-
into text like this:
-NUMBER-
100002
-INPUT_DATE-
8301
-SUBJECT-
Cardiovascular
-TITLE-
Effects of calcium entry antagonists in hypertension
-AUTHOR-
Krebs H, Graefe K H, Ziegler R
-REFERENCE-
CLIN EXP HYPERTENS, 1982, 4:271-284
-DOCUMENT_TYPE-
Review
-KEYWORDS-
Calcium antagonists, hypertension, mode of action,
Verapamil, Nifedipine
-END-
If you wonder why anyone has text in all upper case then
you haven't worked in libraries. When they first computerised
(around 1983 for this example) it was often not practical to
use lower case letters.
If you wonder why a program is needed - think of 20,000
records like the one above. Think also that many of those
records will have 100 or 200 words of abstract included as
well.
Why bother?
Readability is significantly improved by having
words in the correct case.
Comprehension is improved when proper nouns and
trade names are correctly capitalised.
Professionalism is enhanced when printed documents
have correctly capitalised text and trade names are correctly
identified.
Why not use Word?
Of course you can select the text and press Shift+F3 to
change its case, but that will still leave an awful lot of
manual work.
You could use the spelling checker but that will still be
very labour intensive.
In either case, when you have finished one file, almost
none of that work carries forward to the next one.
How does it work?
The application was designed to deal with arbitrary
Personal Librarian source files and to give them the correct
capitalisation of terms. The examples above show a typical
Personal Librarian layout. [Personal Librarian was a major
Windows-based full-text retrieval application of the late
1980s. It is no longer commercially available - the company was
taken over by America Online and the text retrieval engine is
what is used on AOL.] The application can of course be extended
to cover other file types and layouts.
Correct capitalisation means that the first term in a
Personal Librarian field will have an initial capital, as will
any term following a full stop, question mark or exclamation
mark. A field is marked by -FIELDNAME-. Other terms (proper
nouns, abbreviations, trade names) will be capitalised only if
they are listed in the capitalisation dictionary or if they
are in one of the specialised fields such as Author or
Reference.
In the example above, Verapamil and Nifedipine, two terms in
the KEYWORDS field, are names of proprietary medicines and were
in the dictionary.
An example of such a dictionary:
ACTH
Africans
ANBPS
April
Arzneim
Arzneim-Forsch
Boston
BP
British
CAS
Nifedipine
P-Hydroxytriamterene
Verapamil
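The rules described above can be sketched in a few lines of
Python. This is an illustration only, not the application's
own code: the function names are hypothetical, and the
handling of the specialised fields (AUTHOR gets an initial
capital on every word, REFERENCE is left untouched) is an
assumption inferred from the sample record.

```python
import re

# Assumptions inferred from the sample record, not from the real program.
TITLE_CASE_FIELDS = {"AUTHOR"}     # every word gets an initial capital
UNTOUCHED_FIELDS = {"REFERENCE"}   # copied through exactly as supplied

FIELD_MARKER = re.compile(r"^-([A-Z_]+)-\s*$")      # e.g. -TITLE-
TOKEN = re.compile(r"[A-Za-z0-9][A-Za-z0-9'-]*")    # a term, hyphens allowed
LETTERS = re.compile(r"[A-Za-z]+")
SENTENCE_BREAK = re.compile(r"[.?!]\s+$")   # punctuation, then space, before a term
LINE_END_BREAK = re.compile(r"[.?!]\s*$")   # punctuation at the end of a line


def load_dictionary(terms):
    """Map each lower-cased term to its preferred spelling."""
    return {term.lower(): term for term in terms}


def recase_line(line, dictionary, sentence_start):
    """Lower-case a line, then restore capitals by rule and by dictionary.

    Returns the new line and whether the next term starts a sentence.
    """
    out = []
    pos = 0
    for match in TOKEN.finditer(line):
        between = line[pos:match.start()]
        if SENTENCE_BREAK.search(between):
            sentence_start = True
        out.append(between)
        word = match.group().lower()
        if word in dictionary:
            word = dictionary[word]                  # dictionary spelling wins
        elif sentence_start:
            word = word[:1].upper() + word[1:]       # initial capital
        out.append(word)
        sentence_start = False
        pos = match.end()
    tail = line[pos:]
    if LINE_END_BREAK.search(tail):
        sentence_start = True                        # sentence ended on this line
    out.append(tail)
    return "".join(out), sentence_start


def recase_record(text, dictionary):
    """Apply the capitalisation rules to one or more records."""
    field = None
    sentence_start = True
    result = []
    for line in text.splitlines():
        marker = FIELD_MARKER.match(line)
        if marker:
            field = marker.group(1)
            sentence_start = True                    # first term of a field
            result.append(line)
        elif field in UNTOUCHED_FIELDS:
            result.append(line)
        elif field in TITLE_CASE_FIELDS:
            result.append(LETTERS.sub(lambda m: m.group().capitalize(), line))
        else:
            new_line, sentence_start = recase_line(line, dictionary, sentence_start)
            result.append(new_line)
    return "\n".join(result)
```

Calling recase_record on the upper-case record shown at the
top, with a dictionary loaded from the list above, gives the
mixed-case record. The sketch is deliberately naive: an
abbreviation that ends in a full stop is treated as the end of
a sentence, which is one reason a real tool needs switches and
a well-maintained dictionary.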
To assist in preparing the capitalisation dictionary two
programs are provided - one to extract a list of new unique
terms from a file and another to merge two such lists.
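As a rough idea of what these two helpers do, here is a
minimal Python sketch. The function names and the exact
behaviour (case-insensitive matching, the first list winning
when two lists spell a term differently) are assumptions made
for illustration and are not taken from the programs
themselves.

```python
import re

FIELD_MARKER = re.compile(r"^-[A-Z_]+-\s*$")        # e.g. -TITLE-
TOKEN = re.compile(r"[A-Za-z][A-Za-z'-]*")          # a term, hyphens allowed


def extract_new_terms(text, known_terms):
    """List each term in the text once, skipping terms already known.

    Matching ignores case; field marker lines are not scanned.
    """
    known = {term.lower() for term in known_terms}
    seen = set()
    new_terms = []
    for line in text.splitlines():
        if FIELD_MARKER.match(line):
            continue
        for word in TOKEN.findall(line):
            key = word.lower()
            if key not in known and key not in seen:
                seen.add(key)
                new_terms.append(word)
    return sorted(new_terms, key=str.lower)


def merge_term_lists(first, second):
    """Merge two term lists into one sorted list without duplicates.

    When both lists hold the same term with different capitalisation,
    the spelling from the first list is kept.
    """
    merged = {}
    for term in list(first) + list(second):
        merged.setdefault(term.lower(), term)
    return sorted(merged.values(), key=str.lower)
```

The extracted list still needs a human pass: delete the
ordinary words, give the remaining terms the capitalisation
you want, then merge the result into the existing dictionary.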
With the three applications and their multiplicity of
command line switches, some batch files and some diligence in
properly formatting the terms you do want to be capitalised, a
large source file can quickly become much more usable by
giving most terms the correct case.
Make use of this expertise yourself by acquiring
the application, or ask us to do the work for
you as a service.