Copyright (c) 1993-2000 by Kostis Netzwerkberatung
Talstr. 25, D-63322 Rödermark, Tel. +49 6074 881056, FAX 881058
kosta@kostis.net (Kosta Kostis), http://www.kostis.net/
This information may be used free of charge at your own risk.
This is a snapshot of a work in progress. Things may change, and other things may be added or removed. Please wait for the release. Sorry, no date has been scheduled yet.
You may use this package free of charge at your own risk but you may not sell this package.
Should you be interested in using any component of this package in a commercial package, you must contact me first to agree on terms. Don't worry, it won't cost you an arm and a leg; I just want to make sure that if there is money made out of this, it doesn't pass me by completely. ;)
Currently there are 80 different Character Encoding Description Files supplied with this package, not counting the following files:
iso6429, iso646 (which are included by many other files)
iso10646, iso10646.mes (these are here for reference purposes)
trans covers 7-bit encodings such as ISO 646, 8-bit encodings such as many MS-DOS Codepages (also for IBM OS/2), Microsoft Windows Codepages, ISO 8859, HP, Adobe, Apple Macintosh, Atari, NeXTSTEP Character Encodings, some EBCDIC Encodings, koi8-r and a few more...
Should your favourite Character Encoding be missing, please contribute!
The latest version of this package should be available at:
To create translators, first use make to compile, link and install all trans tools. Installing means placing the executables in a directory included in your command search path. Before you do that, check the Makefile: it uses "/usr/local/bin" as the default directory for installing binaries, which you may want to change.
#
# use gcc
#
make # compile trans executables
Makefile offers a couple of options which you may want to use:
make install # this copies executables to /usr/local/bin
make clean # this deletes objects and executables - never mind warnings
make check # check cedf files (create error.log)
make html # create HTML tables from cedf files (check destination in Makefile)
make list # create list of cedf files (create encoding.lis)
make date # for my personal use only ;)
make pack # for my personal use only ;)
make uni # for my personal use only :) (unierror.lis)
SunOS using gcc seems to require
make COPTS='-DFILENAME_MAX=200 -DNO_STRUPR -O6'
After that, please change your working directory to the "bin" directory.
Please set an environment variable TRANS that points to the directory where this package resides on your computer, *including* the trailing directory separator character,
e.g.: TRANS="/usr/local/src/charsets/trans130/"
All Character Encoding Description Files reside in the cedf subdir.
If you don't set a variable TRANS the default location "/usr/local/lib/trans/" will be assumed (see file "tab.h", DIR_TRANS).
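The lookup rule described above can be sketched in a few lines of C. This is only an illustration, not the code in gettrans.c: the names sketch_trans_dir, sketch_cedf_path and SKETCH_DIR_TRANS are hypothetical, and treating an empty TRANS like an unset one is an assumption. It does show why the trailing separator matters: the directory is simply prepended to "cedf/" and the file name.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Default quoted in the text above; the real constant is DIR_TRANS in tab.h. */
#define SKETCH_DIR_TRANS "/usr/local/lib/trans/"

/* Return $TRANS if it is set (and, by assumption, non-empty), else the default. */
static const char *sketch_trans_dir(void)
{
    const char *dir = getenv("TRANS");
    return (dir != NULL && dir[0] != '\0') ? dir : SKETCH_DIR_TRANS;
}

/* Build "<dir>cedf/<name>" in buf. No separator is inserted after dir,
 * which is why TRANS must end with one.
 * Returns 0 on success, -1 if buf is too small. */
static int sketch_cedf_path(char *buf, size_t size,
                            const char *dir, const char *name)
{
    size_t need;

    assert(buf != NULL && dir != NULL && name != NULL);
    need = strlen(dir) + strlen("cedf/") + strlen(name) + 1;
    if (need > size)
        return -1;
    sprintf(buf, "%scedf/%s", dir, name);
    return 0;
}
```

Calling sketch_cedf_path(buf, sizeof buf, sketch_trans_dir(), "iso8859.1") would then yield the full path of the ISO 8859-1 description file for whichever directory is in effect.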
To test the translator generator after you have run at least "make" and "make install", type
cd "$TRANS"bin
one
This should generate two translators between ISO 8859-1 and MS-DOS Codepage 850. Each translator consists of three files (e.g.):
isox850.c the main program
isox850.h the header file
isox850.tab the translation table file
Each translator will #include the files
trans.c the main invariant program
trans.h the main invariant header file
You should be able to compile and link isox850.c and 850xiso.c easily. Read transtab.man to learn more about the syntax for transtab.
Have a look at maketabs for inspiration on program names.
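To give an idea of what a table-driven 8-bit translator does in principle, here is a minimal sketch. It is not the generated isox850.c or its table format: the names sketch_table, sketch_init_table and sketch_translate are hypothetical, and only two ISO 8859-1 to Codepage 850 mappings are filled in (the rest is left as identity, which is not correct for a real table).

```c
#include <assert.h>
#include <stddef.h>

/* One output byte for each of the 256 possible input bytes. */
static unsigned char sketch_table[256];

/* Start from the identity mapping, then override two sample entries. */
static void sketch_init_table(void)
{
    int i;

    for (i = 0; i < 256; i++)
        sketch_table[i] = (unsigned char)i;
    sketch_table[0xE4] = 0x84; /* a with diaeresis: ISO 8859-1 E4 -> CP850 84 */
    sketch_table[0xFC] = 0x81; /* u with diaeresis: ISO 8859-1 FC -> CP850 81 */
}

/* Translate len bytes in place by table lookup. */
static void sketch_translate(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        buf[i] = sketch_table[buf[i]];
}
```

A real table would be produced by the generator from the two Character Encoding Description Files rather than written by hand.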
This package is written in ANSI C and uses the two non-ANSI functions strdup() and strupr(). Sources for these functions are supplied in case your compiler/library does not provide them. Should you encounter any problems while trying to compile this package, your compiler is very likely not ANSI C compliant. Should your compiler be ANSI C compliant and still report warnings and/or errors, please let me know. I'll need the following data in order to help you:
- Version of this package (e. g. V1.30)
- Operating System and Version
- Compiler name and Version
- Compiler options used (if any)
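For readers unfamiliar with the two non-ANSI functions mentioned above, the following sketch shows what such fallbacks typically look like. It is not the supplied strdup.c or strupr.c; the sketch_ prefix is used here only to avoid clashing with library versions, and the behaviour (uppercasing in place via toupper() in the current locale) is the conventional one, assumed rather than taken from the package.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc()ed copy of s, or NULL if memory is exhausted.
 * The caller is responsible for free()ing the result. */
static char *sketch_strdup(const char *s)
{
    size_t n;
    char *copy;

    assert(s != NULL);
    n = strlen(s) + 1;              /* include the terminating '\0' */
    copy = (char *)malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Convert s to upper case in place and return s. */
static char *sketch_strupr(char *s)
{
    char *p;

    assert(s != NULL);
    for (p = s; *p != '\0'; p++)
        *p = (char)toupper((unsigned char)*p);
    return s;
}
```

The cast to unsigned char before toupper() matters for 8-bit characters on compilers where plain char is signed.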
The directory tree for this utility should look like this:
directory | file | description
./ | | contains the complete package
| index.htm | this file
| Makefile | sample makefile for U*IX using gcc
| encoding.lis | list of Character Encoding Description Files
| error.log | output created by checkall
| unierror.log | diffs between cedf and selected Unicode files
src/ | | contains the translation table generator source
| Makefile | makefile for gcc (e.g. Linux)
| comptran.c | compute translation table and output
| comptran.h | header file for comptran.c
| datatype.h | handy data types
| gettrans.c | get TRANS directory
| gettrans.h | header file for gettrans.c
| head_c.h | generic translator main program
| head_h.h | generic translator header file
| head_tab.h | generic translator table file header
| head_u.h | generic translator Unicode FormatA file header
| loadtab.c | read xlt binary table and Unicode FormatA
| loadtab.h | header file for loadtab.c
| os-stuff.h | OS/compiler dependent definitions
| readtab.c | read character encoding description file
| readtab.h | header file for readtab.c
| scanflag.c | parse program parameters and flags
| scanflag.h | header file for scanflag.c
| strdup.c | in case your compiler doesn't have it
| strdup.h | header file for strdup.c
| strupr.c | in case your compiler doesn't have it
| strupr.h | header file for strupr.c
| tab.h | table constants
| taberr.h | trans error codes and messages
| checkiso.c | checks character encoding description names
| checkiso.h | header file for above program
| checkiso.man | man page for above program
| checkuni.c | compares cedf file with Unicode Format A table
| checkuni.h | header file for above program
| checkuni.man | man page for above program - for internal use
| transiso.c | translator generator to ISO 10646 main program
| transiso.h | header file for above program
| transiso.man | man page for above program
| transtab.c | translator generator main program
| transtab.h | header file for above program
| transtab.man | man page for above program
| transce8.c | translator program (8-bit) main program
| transce8.h | header file for above program
| transce8.man | man page for above program
| transhtm.c | program that displays HTML tables
| transhtm.h | header file for above program
| transhtm.man | man page for above program
| checkall | check all tables
| chkuni | for internal use only
| mklist | create list of all tables
| mkhtml | create HTML table (mkxlt may be required before running this one)
| mkxlt | create XLT files (binary translation files)
bin/ | | contains the translator main program (invariant part) and a few scripts to create translators
| compile | compile one program
| makeall | compile all programs
| maketabs | create many translator sources
| one | create one translator
| trans.c | invariant main translator program
| trans.h | invariant main translator header file
| utf.c | convert from/to plain 16-bit Unicode/UTF
| utf.h | header for utf.c
| utimbuf.h | helps to keep file date stamps
htm/ | | contains information in HTML format about the description files and other more general information
cedf/ | | contains Character Encoding Description Files
| adobeiso | Adobe ISOLatin1Encoding Encoding Vector
| adobestd | Adobe StandardEncoding Encoding Vector
| adobesym | Adobe Symbol Encoding Vector
| applecro | Apple Macintosh Croatian
| applegk2 | Apple ][ Greek extended for Macintosh
| applegrk | Apple Macintosh Greek
| appleice | Apple Macintosh Icelandic
| applerom | Apple Macintosh Roman
| applerum | Apple Macintosh Romanian
| appletur | Apple Macintosh Turkish
| atarist | Atari ST/TT
| cp1250 | Microsoft Windows Codepage 1250 (EE)
| cp1251 | Microsoft Windows Codepage 1251 (Cyrl)
| cp1252 | Microsoft Windows Codepage 1252 (ANSI)
| cp1253 | Microsoft Windows Codepage 1253 (Greek)
| cp1254 | Microsoft Windows Codepage 1254 (Turk)
| cp1255 | Microsoft Windows Codepage 1255 (Hebr)
| cp1256 | Microsoft Windows Codepage 1256 (Arab)
| cp1257 | Microsoft Windows Codepage 1257 (BaltRim)
| cp1258 | Microsoft Windows Codepage 1258 (Viet)
| mslinedr | Microsoft Windows MS LineDraw
| symbol | Microsoft Windows Symbol Encoding Vector
| wingding | Microsoft Windows Wingdings Encoding Vector
| cp437 | IBM Codepage 437 (US)
| cp737 | IBM Codepage 737 (Greek defacto Standard)
| cp775 | IBM Codepage 775 (BaltRim)
| cp850 | IBM Codepage 850 (Multilingual Latin 1)
| cp851 | IBM Codepage 851 (Greece) - obsolete
| cp852 | IBM Codepage 852 (Multilingual Latin 2)
| cp853 | IBM Codepage 853 (Multilingual Latin 3)
| cp855 | IBM Codepage 855 (Russia) - obsolete
| cp857 | IBM Codepage 857 (Multilingual Latin 5)
| cp860 | IBM Codepage 860 (Portugal)
| cp861 | IBM Codepage 861 (Iceland)
| cp862 | IBM Codepage 862 (Israel)
| cp863 | IBM Codepage 863 (Canada (French))
| cp864 | IBM Codepage 864 (Arabic)
| cp865 | IBM Codepage 865 (Norway)
| cp866 | IBM Codepage 866 (Russia)
| cp869 | IBM Codepage 869 (Greece)
| cp874 | IBM Codepage 874 (Thai)
| cp895 | IBM Codepage 895 (Czech Kamenicky)
| decmcs | DEC Multinational Character Set (DEC MCS)
| ebc037 | EBCDIC Codepage 037
| ebc500 | EBCDIC Codepage 500
| ebc875 | EBCDIC Codepage 875 (Greek)
| ebc1026 | EBCDIC Codepage 1026 (Turkish)
| ebc1047 | EBCDIC Codepage 1047
| hp48 | HP 48 Character Set
| hproman8 | HP Roman-8
| iso10646 | ISO 10646 (sorted by name, 16-bit)
| iso10646.mes |
| iso6429 | ISO 6429 Control Characters (00-1F, 7F)
| iso646 | ISO 646 (common character base)
| iso646.ca | ISO 646 (French Canadian)
| iso646.ch | ISO 646 (Swiss)
| iso646.de | ISO 646 (German)
| iso646.es | ISO 646 (Spanish)
| iso646.fi | ISO 646 (Finnish)
| iso646.fr | ISO 646 (French)
| iso646.gb | ISO 646 (United Kingdom)
| iso646.irv | ISO 646 (International Reference Version)
| iso646.it | ISO 646 (Italian)
| iso646.nl | ISO 646 (Dutch)
| iso646.no | ISO 646 (Norwegian/Danish)
| iso646.pt | ISO 646 (Portuguese)
| iso646.se | ISO 646 (Swedish)
| iso8859.1 | ISO 8859-1 (Latin 1)
| iso8859.2 | ISO 8859-2 (Latin 2)
| iso8859.3 | ISO 8859-3 (Latin 3)
| iso8859.4 | ISO 8859-4 (Latin 4)
| iso8859.5 | ISO 8859-5 (Latin/Cyrillic)
| iso8859.6 | ISO 8859-6 (Latin/Arabic)
| iso8859.7 | ISO 8859-7 (Latin/Greek)
| iso8859.8 | ISO 8859-8 (Latin/Hebrew)
| iso8859.9 | ISO 8859-9 (Latin 5)
| iso8859.10 | ISO 8859-10 (Latin 6)
| iso8859.13 | ISO 8859-13 (Latin 7 - Baltic Rim)
| iso8859.14 | ISO 8859-14 (Latin 8 - Celtic)
| iso8859.15 | ISO 8859-15 (Latin 9)
| koi8-r | Cyrillic encoding as defined in RFC-1489
| nextstep | NeXTSTEP Encoding Vector
| tex-dcr.in | TeX dcr input (contains non-ISO 10646 names)
| tex-dcr.out | TeX dcr output (contains non-ISO 10646 names)
xlt/ | | contains binary conversion tables (default is little endian)
All files listed in cedf/ should be here, except for iso6429, iso646, iso10646 and iso10646.mes. Should your CPU not be "little endian" (as Intel i386, i486, Pentium and many other brands are), please do a "make bintab" to create the very same tables using your native byte order. This will most likely only work on U*IX (like) systems.