Packaging ICU

Overview

This chapter describes, for the advanced user, how to package ICU for distribution, whether alone or as part of an application.

Making ICU Smaller

The ICU project is intended to provide everything an application might need in order to process Unicode. However, in doing so, the results may become quite large on disk. A default build of ICU normally results in over 8 MB of data, and a substantial amount of object code. This section describes some techniques to reduce the size of ICU to only the items which are required for your application.

Reduce the number of libraries used

ICU consists of a number of different libraries. The library dependency chart can be used to understand and determine the exact set of libraries needed.

Disable ICU features

Certain features of ICU may be turned on and off through preprocessor defines. These switches are located in the file "uconfig.h"; setting one prevents the code for the corresponding feature from being built.

All of these switches are defined to '0' by default, unless overridden by the build environment, or by modifying uconfig.h itself.

Switch Name | Library | Effect if #defined to '1'
UCONFIG_ONLY_COLLATION | common & i18n | Turn off all other modules named here except collation and legacy conversion
UCONFIG_NO_LEGACY_CONVERSION | common | Turn off conversion apart from UTF, CESU-8, SCSU, BOCU-1, US-ASCII, and ISO-8859-1. Not possible to turn off legacy conversion on EBCDIC platforms.
UCONFIG_NO_BREAK_ITERATION | common | Turn off break iteration
UCONFIG_NO_COLLATION | i18n | Turn off collation and collation-based string search.
UCONFIG_NO_FORMATTING | i18n | Turn off all formatting (date, time, number, etc), and calendar/timezone services.
UCONFIG_NO_TRANSLITERATION | i18n | Turn off script-to-script transliteration
UCONFIG_NO_REGULAR_EXPRESSIONS | i18n | Turn off the regular expression functionality
Note: These switches do not necessarily disable data generation. For example, disabling formatting does not prevent formatting data from being built into the resource bundles. See the section on ICU data for information on changing data packaging.

Using UCONFIG switches with Environment Variables

This method involves setting an environment variable when ICU is built. For example, on a POSIX-like platform, settings may be chosen at the point runConfigureICU is run:

env CPPFLAGS="-DUCONFIG_NO_COLLATION=1 -DUCONFIG_NO_FORMATTING=1" \
   runConfigureICU SOLARISCC ...

Note that when end-user code is compiled, it must also have the same CPPFLAGS set, or else calling some functions may result in a link failure.
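
One way to keep the ICU build and the client builds in agreement is to generate the flag string in one place and reuse it. The following is a minimal sketch, not part of ICU: the helper name uconfig_flags and the variable ICU_CPPFLAGS are made up for illustration, and the commented-out client compile line (c++, myapp.cpp) is a hypothetical example.

```shell
# Hypothetical helper: turn a list of UCONFIG switch names into -D flags.
uconfig_flags() {
    flags=""
    for switch in "$@"; do
        # Append "-DNAME=1", separating entries with a single space.
        flags="${flags:+$flags }-D${switch}=1"
    done
    printf '%s\n' "$flags"
}

# Define the set of switches once...
ICU_CPPFLAGS=$(uconfig_flags UCONFIG_NO_COLLATION UCONFIG_NO_FORMATTING)
echo "$ICU_CPPFLAGS"
# prints: -DUCONFIG_NO_COLLATION=1 -DUCONFIG_NO_FORMATTING=1

# ...then pass the same value to both builds. These lines are comments
# because the client source file is a placeholder:
#   env CPPFLAGS="$ICU_CPPFLAGS" runConfigureICU SOLARISCC ...
#   c++ $ICU_CPPFLAGS -c myapp.cpp
```

Keeping the switches in a single variable makes it harder for the library and its clients to drift apart, which is the situation that leads to the link failures mentioned above.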

Using UCONFIG switches by changing uconfig.h

This method involves modifying the source file icu/source/common/unicode/uconfig.h directly, before ICU is built. It has the advantage that the configuration change is propagated to all clients who compile against this build of ICU. However, the altered file must be tracked and the change reapplied when the next version of ICU is installed.

Modify 'uconfig.h' to add the following lines before the first #ifndef UCONFIG_... section:



#ifndef UCONFIG_NO_COLLATION
#define UCONFIG_NO_COLLATION 1
#endif

#ifndef UCONFIG_NO_FORMATTING
#define UCONFIG_NO_FORMATTING 1
#endif
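
Because the altered file has to be re-created for each new ICU release, the edit can be scripted instead of made by hand. The sketch below is one possible approach, not an ICU tool: it uses awk to insert an override block before the first line that begins with "#ifndef UCONFIG_". The file names overrides.h and uconfig.h.new are hypothetical, and the uconfig.h written here is a tiny stand-in so that the example runs on its own; in practice the input would be icu/source/common/unicode/uconfig.h.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The override block to insert (hypothetical file name).
cat > overrides.h <<'EOF'
#ifndef UCONFIG_NO_COLLATION
#define UCONFIG_NO_COLLATION 1
#endif

#ifndef UCONFIG_NO_FORMATTING
#define UCONFIG_NO_FORMATTING 1
#endif
EOF

# Stand-in for the real uconfig.h: an include guard and a single switch.
cat > uconfig.h <<'EOF'
#ifndef __UCONFIG_H__
#define __UCONFIG_H__

#ifndef UCONFIG_ONLY_COLLATION
#   define UCONFIG_ONLY_COLLATION 0
#endif

#endif
EOF

# Read overrides.h first, then copy uconfig.h, emitting the overrides just
# before the first line that starts with "#ifndef UCONFIG_".
awk '
    NR == FNR { block = block $0 "\n"; next }
    !done && /^#ifndef UCONFIG_/ { printf "%s\n", block; done = 1 }
    { print }
' overrides.h uconfig.h > uconfig.h.new

cat uconfig.h.new
```

The include guard is left untouched because its name starts with two underscores and so does not match the pattern. The result is written to a new file rather than over the original, so the change can be reviewed before it replaces uconfig.h.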


Reduce ICU Data used

There are many ways in which ICU data may be reduced. If only certain locales or converters will be used, others may be removed. Additionally, data may be packaged as individual files or interchangeable archives (.dat files), allowing data to be installed and removed without rebuilding ICU. For details, see the ICU Data chapter.

ICU Versions

(This section assumes the reader is familiar with ICU version numbers as covered in the Design chapter, and filename conventions for libraries in the ReadMe.)

POSIX Library Names

The following table gives an example of the dynamically linked library and symbolic links built by ICU for the common ('uc') library, version 5.4.3, on Linux:

File | Links to | Purpose
libicuuc.so | libicuuc.so.54.3 | Required for link: Applications compiled with '-licuuc' will follow this symlink.
libicuuc.so.54 | libicuuc.so.54.3 | Required for runtime: This name is what applications actually link against.
libicuuc.so.54.3 | Actual library | Required for runtime and link. Contains the name 'libicuuc.so.54'.

Note: This discussion uses Linux as an example, but the scheme is typical of most platforms; AIX and 390 (z/OS) are exceptions.

An application compiled with '-licuuc' will follow the symlink from libicuuc.so to libicuuc.so.54.3, and will actually read the fully qualified file libicuuc.so.54.3. This library file has an embedded name (SONAME) of libicuuc.so.54, that is, with only the major and minor number. The linker will write this name into the client application, because binary compatibility is guaranteed between versions that share the same major+minor number.
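
The naming scheme described above can be written out as a small shell sketch. The function name icu_posix_names and the variables it sets are invented for this illustration; only the resulting file names come from the text.

```shell
# Hypothetical helper: derive the three POSIX names for an ICU library from
# its base name (e.g. "icuuc") and a version of the form major.minor.patch.
icu_posix_names() {
    base=$1
    version=$2
    major=${version%%.*}    # "5" from 5.4.3
    rest=${version#*.}      # "4.3"
    minor=${rest%%.*}       # "4"
    patch=${rest#*.}        # "3"

    # Name followed by the linker when '-licuuc' is given.
    link_name="lib${base}.so"
    # Embedded name (SONAME): major and minor only; looked up at run time.
    soname="lib${base}.so.${major}${minor}"
    # The actual library file.
    real_name="lib${base}.so.${major}${minor}.${patch}"
}

icu_posix_names icuuc 5.4.3
echo "$link_name -> $real_name"   # libicuuc.so -> libicuuc.so.54.3
echo "$soname -> $real_name"      # libicuuc.so.54 -> libicuuc.so.54.3
```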

If ICU version 5.4.7 is subsequently installed, the following files may be updated.

File | Links to | Purpose
libicuuc.so | libicuuc.so.54.7 | Required for link: Newly linked applications will follow this link, which should not cause any functional difference at link time.
libicuuc.so.54 | libicuuc.so.54.7 | Required for runtime: Because it now links to version .7, existing applications linked to version 5.4.3 will follow this link and use the 5.4.7 code.
libicuuc.so.54.7 | Actual library | Required for runtime and link. Contains the name 'libicuuc.so.54'.

If ICU version 5.6.3 or 3.2.9 were installed, they would not affect already-linked applications, because the major+minor numbers are different - 56 and 32, respectively, as opposed to 54. They would, however, replace the link 'libicuuc.so', which controls which version of ICU newly-linked applications use.
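
The effect of these successive installs on the symbolic links can be simulated in a scratch directory. This is only a sketch: empty files stand in for the real libraries, and the ln commands approximate what an ICU install would do rather than reproduce its actual install logic.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Simulate installing ICU 5.4.3: one library file plus two symbolic links.
: > libicuuc.so.54.3
ln -sf libicuuc.so.54.3 libicuuc.so.54
ln -sf libicuuc.so.54.3 libicuuc.so

# Simulate installing ICU 5.4.7: same major+minor, so both links move.
: > libicuuc.so.54.7
ln -sf libicuuc.so.54.7 libicuuc.so.54
ln -sf libicuuc.so.54.7 libicuuc.so

# Simulate installing ICU 5.6.3: different major+minor, so only the
# link-time name moves; libicuuc.so.54 is left alone.
: > libicuuc.so.56.3
ln -sf libicuuc.so.56.3 libicuuc.so.56
ln -sf libicuuc.so.56.3 libicuuc.so

for name in libicuuc.so libicuuc.so.54 libicuuc.so.56; do
    echo "$name -> $(readlink "$name")"
done
# prints:
#   libicuuc.so -> libicuuc.so.56.3
#   libicuuc.so.54 -> libicuuc.so.54.7
#   libicuuc.so.56 -> libicuuc.so.56.3
```

An application linked against 5.4.3 looks up libicuuc.so.54 at run time, so in this simulation it would pick up the 5.4.7 code and be unaffected by the 5.6.3 install.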

In summary, what files should an application distribute in order to include a functional runtime copy of ICU 5.4.3? The above application should distribute libicuuc.so.54.3 and the symbolic link libicuuc.so.54. (If symbolic links pose difficulty, libicuuc.so.54.3 may be renamed to libicuuc.so.54, and only libicuuc.so.54 distributed. This is less informative, but functional.)
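
As a rough sketch, the two distribution options could be staged as follows. The directory names are hypothetical, an empty file stands in for the installed library, and only the common ('uc') library is shown; any other ICU libraries the application uses would presumably be handled the same way.

```shell
workdir=$(mktemp -d)
installed="$workdir/lib"       # stand-in for the directory where ICU is installed
stage="$workdir/myapp/lib"     # hypothetical directory shipped with the application
mkdir -p "$installed" "$stage"
: > "$installed/libicuuc.so.54.3"    # empty placeholder for the real library

# Option 1: ship the library file plus the runtime symbolic link.
cp "$installed/libicuuc.so.54.3" "$stage/"
ln -s libicuuc.so.54.3 "$stage/libicuuc.so.54"

# Option 2 (no symbolic links): ship the library under its runtime name only.
stage_nolinks="$workdir/myapp-nolinks/lib"
mkdir -p "$stage_nolinks"
cp "$installed/libicuuc.so.54.3" "$stage_nolinks/libicuuc.so.54"

ls "$stage" "$stage_nolinks"
```

The link-time name libicuuc.so is not staged in either case, since it is needed only when compiling and linking, not at run time.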

POSIX Library suffix

The --with-library-suffix option may be used with runConfigureICU or configure, to distinguish on disk specially modified versions of ICU. For example, the option --with-library-suffix=myapp will produce libraries with names such as libicuucmyapp.so.54.3, thus preventing another ICU user from using myapp's custom ICU libraries.

While two or more versions of ICU may be linked into the same application as long as the major and minor numbers are different, changing the library suffix is not sufficient to allow the same version of ICU to be linked. In other words, linking ICU 5.4.3, 5.6.3, and 3.2.9 together is allowed, but 5.4.3 and 5.4.7 may not be linked together, nor may 5.4.3 and 5.4.3-myapp be linked together.
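
The rule above can be expressed as a simple check. The function below is an illustrative sketch with an invented name (can_link_together); it treats a "-suffix" after the version as irrelevant and reports a conflict whenever two entries share the same major+minor number.

```shell
# Hypothetical check: succeed only if no two of the given ICU versions share
# the same major+minor number. A trailing "-suffix" is ignored, because a
# library suffix does not make two copies of the same version linkable.
can_link_together() {
    seen=" "
    for v in "$@"; do
        v=${v%%-*}            # "5.4.3-myapp" -> "5.4.3"
        major=${v%%.*}
        rest=${v#*.}
        minor=${rest%%.*}
        key="$major.$minor"
        case "$seen" in
            *" $key "*) return 1 ;;   # this major+minor was already seen
        esac
        seen="$seen$key "
    done
    return 0
}

for combo in "5.4.3 5.6.3 3.2.9" "5.4.3 5.4.7" "5.4.3 5.4.3-myapp"; do
    # Word splitting of $combo into separate versions is intentional here.
    if can_link_together $combo; then
        echo "ok:       $combo"
    else
        echo "conflict: $combo"
    fi
done
# prints:
#   ok:       5.4.3 5.6.3 3.2.9
#   conflict: 5.4.3 5.4.7
#   conflict: 5.4.3 5.4.3-myapp
```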

Windows library names

Assuming ICU version 5.4.3, Windows library names will follow this pattern:

File | Purpose
icuuc.lib | Release link-time library. Needed for development. Contains 'icuuc54.dll' name internally.
icuuc54.dll | Release runtime library. Needed for runtime.
icuucd.lib | Debug link-time library. (The 'd' suffix indicates debug.)
icuuc54d.dll | Debug runtime library.

Debug applications must be linked with debug libraries, and release applications with release libraries.

When a new version of ICU is installed, the .lib files will be replaced so as to keep new compiles in sync with the newly installed header files, and the latest DLL. As well, if the new ICU version has the same major+minor version (such as 5.4.7), then DLLs will be replaced, as they are binary compatible. However, if an ICU with a different major+minor version is installed, such as 5.5, then new DLLs will be copied with names such as 'icuuc55.dll'.
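
The Windows naming pattern can be summarized in the same style as the POSIX commands elsewhere in this chapter. This is a sketch only: the function name icu_windows_names is invented, and the names it prints are derived from the table above rather than taken from any ICU build script.

```shell
# Hypothetical helper: print the link-time and runtime file names for an
# ICU library on Windows.
#   $1 = base name (e.g. icuuc)   $2 = version   $3 = "debug" or "release"
icu_windows_names() {
    base=$1
    version=$2
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    suffix=""
    if [ "$3" = "debug" ]; then
        suffix="d"            # the 'd' suffix marks debug libraries
    fi
    # The .lib name carries no version; the .dll name carries major+minor.
    echo "${base}${suffix}.lib ${base}${major}${minor}${suffix}.dll"
}

icu_windows_names icuuc 5.4.3 release   # icuuc.lib icuuc54.dll
icu_windows_names icuuc 5.4.3 debug     # icuucd.lib icuuc54d.dll
icu_windows_names icuuc 5.4.7 release   # icuuc.lib icuuc54.dll (DLL replaced in place)
icu_windows_names icuuc 5.5 release     # icuuc.lib icuuc55.dll (new DLL alongside)
```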

Modularization of ICU4J

Some clients may not wish to ship all of ICU4J with their application, since the application might only use a small part of ICU4J. ICU4J release 2.6 and later provide build options to build individual ICU4J 'modules' for a more compact distribution. The modules are based on a service and the APIs that define it, e.g., the normalizer module supports all the APIs of the Normalizer class (and some others). Tests can be run to verify that the APIs supported by the module function correctly. Because of internal code dependencies, a module contains extra classes that are not part of the module's core service API. Some or most of the APIs of these extra classes will not work. Only the module's core service API is guaranteed. Other APIs may work partially or not at all, so client code should avoid them.

Individual modules are not built directly into their own separate jar files. Since their dependencies often overlap, using separate modules to 'add on' ICU4J functionality would result in unwanted duplication of class files. Instead, building a module causes a subset of ICU4J's classes to be built and put into ICU4J's standard build directory. After one or more module targets are built, the 'moduleJar' target can then be built, which packages the class files into a 'module jar.' Other than the fact that it contains fewer class files, little distinguishes this jar file from a full ICU4J jar file, and in fact they share the same name.

Currently ICU4J can be divided into the following modules:

Key:

Module Name | Ant Targets | Test Package Supported | Size‡
Package*:Main Classes†

Note:
* com.ibm. should be prepended to the package name listed.
† Class in bold indicates core service API. Only APIs in this column are fully supported.
‡ Sizes are of the compressed jar file containing only this module. These sizes are approximate for release 3.6; they may change in future releases.

Modules:

Normalizer | normalizer, normalizerTests | com.ibm.icu.dev.test.normalizer | 465KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript
icu.text:BreakIterator, CanonicalIterator, Normalizer, Replaceable, ReplaceableString, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

Collator | collator, collatorTests | com.ibm.icu.dev.test.collator | 1,911KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript
icu.text:BreakDictionary, BreakIterator, CanonicalIterator, CollationElementIterator, CollationKey, Collator, DictionaryBasedBreakIterator, Normalizer, RawCollationKey, Replaceable, ReplaceableString, RuleBasedBreakIterator, RuleBasedCollator, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:ByteArrayWrapper, CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

Calendar | calendar, calendarTests | com.ibm.icu.dev.test.calendar | 2,176KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript
icu.math:BigDecimal, MathContext
icu.text:BreakIterator, CanonicalIterator, ChineseDateFormat, ChineseDateFormatSymbols, CollationElementIterator, CollationKey, Collator, DateFormat, DateFormatSymbols, DecimalFormat, DecimalFormatSymbols, MessageFormat, Normalizer, NumberFormat, PluralFormat, PluralRules, RawCollationKey, Replaceable, ReplaceableString, RuleBasedCollator, RuleBasedNumberFormat, RuleBasedTransliterator, SimpleDateFormat, SymbolTable, UCharacterIterator, UFormat, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:AnnualTimeZoneRule, BasicTimeZone, BuddhistCalendar, ByteArrayWrapper, Calendar, ChineseCalendar, CopticCalendar, Currency, CurrencyAmount, DateRule, DateTimeRule, EasterHoliday, EthiopicCalendar, Freezable, GregorianCalendar, HebrewCalendar, HebrewHoliday, Holiday, IndianCalendar, InitialTimeZoneRule, IslamicCalendar, JapaneseCalendar, Measure, MeasureUnit, RangeDateRule, RangeValueIterator, SimpleDateRule, SimpleHoliday, SimpleTimeZone, StringTokenizer, TaiwanCalendar, TimeZone, TimeZoneRule, TimeZoneTransition, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

BreakIterator | breakIterator, breakIteratorTests | com.ibm.icu.dev.test.breakiterator | 1,889KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript
icu.text:BreakDictionary, BreakIterator, CanonicalIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, SymbolTable, Transliterator, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

Formatting | format, formatTests | com.ibm.icu.dev.test.format | 3,443KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript
icu.math:BigDecimal, MathContext
icu.text:BreakIterator, CanonicalIterator, ChineseDateFormat, ChineseDateFormatSymbols, CollationElementIterator, CollationKey, Collator, DateFormat, DateFormatSymbols, DecimalFormat, DecimalFormatSymbols, DurationFormat, MeasureFormat, MessageFormat, Normalizer, NumberFormat, PluralFormat, PluralRules, RawCollationKey, Replaceable, ReplaceableString, RuleBasedCollator, RuleBasedNumberFormat, SimpleDateFormat, SymbolTable, UCharacterIterator, UFormat, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:AnnualTimeZoneRule, BasicTimeZone, BuddhistCalendar, ByteArrayWrapper, Calendar, ChineseCalendar, CopticCalendar, Currency, CurrencyAmount, DateTimeRule, EthiopicCalendar, Freezable, GregorianCalendar, HebrewCalendar, IndianCalendar, InitialTimeZoneRule, IslamicCalendar, JapaneseCalendar, Measure, MeasureUnit, RangeValueIterator, SimpleTimeZone, StringTokenizer, TaiwanCalendar, TimeArrayTimeZoneRule, TimeZone, TimeZoneRule, TimeZoneTransition, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

Basic Properties | propertiesBasic, propertiesBasicTests | com.ibm.icu.dev.test.lang | 554KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript, UScriptRun
icu.text:BreakDictionary, BreakIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

  

Full Properties | propertiesFull, propertiesFullTests | com.ibm.icu.dev.test.lang | 1,829KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript, UScriptRun
icu.text:BreakDictionary, BreakIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

StringPrep, IDNA | stringPrep, stringPrepTests | com.ibm.icu.dev.test.stringprep | 488KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript
icu.text:StringPrep, StringParseException, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

Transforms | transliterator, transliteratorTests | com.ibm.icu.dev.test.translit | 890KB
icu.lang:UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript
icu.text:BreakDictionary, BreakIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, RuleBasedCollator, RuleBasedTransliterator, StringTransform, SymbolTable, Transliterator, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16
icu.util:CaseInsensitiveString, CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo

Building any of these modules is as easy as specifying a build target to the Ant build system. For example, to build a module that contains only the Normalizer API:

  1. Build the module.
    ant normalizer

  2. Build the jar containing the module.
    ant moduleJar

  3. Build the tests for the module.
    ant normalizerTests

  4. Run the tests and verify that the self tests pass.
    java -classpath classes com.ibm.icu.dev.test.TestAll -nothrow -w

If more than one module is required, the module build targets can be concatenated, e.g.:

  1. Build the modules.
    ant normalizer collator

  2. Build the jar containing the modules.
    ant moduleJar

  3. Build the tests for the modules.
    ant normalizerTests collatorTests

  4. Run the tests and verify that they pass.
    java -classpath classes com.ibm.icu.dev.test.TestAll -nothrow -w

The jar should be built before the tests, since for some targets building the tests will cause additional classes to be compiled that are not strictly necessary for the module itself.
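
The sequence above lends itself to a small wrapper script. The following is a sketch that only prints the commands (a dry run) so that it can be tried without Ant installed; the variable names are arbitrary, and it relies on the convention visible in the module table that each test target is the module target with "Tests" appended.

```shell
# Modules to include; the names are Ant targets from the module table.
modules="normalizer collator"

# Build the list of matching test targets: <module>Tests for each module.
test_targets=""
for m in $modules; do
    test_targets="${test_targets:+$test_targets }${m}Tests"
done

# Dry run: print the commands in the recommended order, with the jar built
# before the tests. Remove the leading "echo" to actually run each step.
echo ant $modules
echo ant moduleJar
echo ant $test_targets
echo java -classpath classes com.ibm.icu.dev.test.TestAll -nothrow -w
# prints:
#   ant normalizer collator
#   ant moduleJar
#   ant normalizerTests collatorTests
#   java -classpath classes com.ibm.icu.dev.test.TestAll -nothrow -w
```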

Note: Regardless of whether ICU4J is built as a whole or as modules, the jar file produced is named icu4j.jar.
To ascertain whether an icu4j.jar contains all of ICU4J, see the manifest file in the jar.
The target moduleJar does not depend on any other target. It just creates a jar of all class files under $icu4j_root/classes/com/ibm/icu/, excluding the class files in the $icu4j_root/classes/com/ibm/icu/dev folder.
The list of module build targets can be obtained by running the command: ant -projecthelp



Copyright (c) 2000 - 2008 IBM and Others - Feedback: http://icu-project.org/contacts.html

User Guide for ICU v4.0 Generated 2008-06-02.