
Packaging ICU
Overview
This chapter describes, for the advanced user, how to package ICU for distribution, whether alone or as part of an application.
Making ICU Smaller
The ICU project is intended to provide everything an application might need in order to process Unicode. However, in doing so, the results may become quite large on disk. A default build of ICU normally results in over 8 MB of data, and a substantial amount of object code. This section describes some techniques to reduce the size of ICU to only the items which are required for your application.
Reduce the number of libraries used
ICU consists of a number of different libraries. The library dependency chart can be used to understand and determine the exact set of libraries needed.
Disable ICU features
Certain features of ICU may be turned on and off through preprocessor defines. These switches are located in the file "uconfig.h", and prevent the code for the corresponding features from being built.
All of these switches are defined to '0' by default, unless overridden by the build environment, or by modifying uconfig.h itself.
Switch Name | Library | Effect if #defined to '1' |
---|---|---|
UCONFIG_ONLY_COLLATION | common & i18n | Turn off all other modules named here except collation and legacy conversion |
UCONFIG_NO_LEGACY_CONVERSION | common | Turn off conversion apart from UTF, CESU-8, SCSU, BOCU-1, US-ASCII, and ISO-8859-1. Not possible to turn off legacy conversion on EBCDIC platforms. |
UCONFIG_NO_BREAK_ITERATION | common | Turn off break iteration |
UCONFIG_NO_COLLATION | i18n | Turn off collation and collation-based string search. |
UCONFIG_NO_FORMATTING | i18n | Turn off all formatting (date, time, number, etc), and calendar/timezone services. |
UCONFIG_NO_TRANSLITERATION | i18n | Turn off script-to-script transliteration |
UCONFIG_NO_REGULAR_EXPRESSIONS | i18n | Turn off the regular expression functionality |
Note: These switches do not necessarily disable data generation. For example, disabling formatting does not prevent formatting data from being built into the resource bundles. See the section on ICU data for information on changing data packaging.
Using UCONFIG switches with Environment Variables
This method involves setting an environment variable when ICU is built. For example, on a POSIX-like platform, settings may be chosen at the point runConfigureICU is run:
    env CPPFLAGS="-DUCONFIG_NO_COLLATION=1 -DUCONFIG_NO_FORMATTING=1" \
        runConfigureICU SOLARISCC ...
Note that when end-user code is compiled, it must also have the same CPPFLAGS set, or else calling some functions may result in a link failure.
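Because the ICU build and every client build must agree on these switches, it can help to generate the flag string in one place. The following is a minimal sketch of that idea; the function name uconfig_cppflags and the variable names are hypothetical and not part of ICU.

```shell
# Hypothetical helper (not part of ICU): turn a list of UCONFIG switch
# names into a CPPFLAGS string that defines each switch to 1.
uconfig_cppflags() {
    flags=""
    for name in "$@"; do
        # Add a separating space only when flags is already non-empty.
        flags="${flags:+$flags }-D${name}=1"
    done
    printf '%s\n' "$flags"
}

# Keep the switch list in one place so ICU and client code stay in sync.
ICU_SWITCHES="UCONFIG_NO_COLLATION UCONFIG_NO_FORMATTING"
ICU_CPPFLAGS=$(uconfig_cppflags $ICU_SWITCHES)
echo "$ICU_CPPFLAGS"
```

The resulting value could then be passed as CPPFLAGS both to runConfigureICU and to the compiler invocations for client code.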
Using UCONFIG switches by changing uconfig.h
This method involves modifying the source file icu/source/common/unicode/uconfig.h directly, before ICU is built. It has the advantage that the configuration change is propagated to all clients who compile against this build of ICU; however, the altered file must be tracked when the next version of ICU is installed.
Modify 'uconfig.h' to add the following lines before the first #ifndef UCONFIG_... section:
    #ifndef UCONFIG_NO_COLLATION
    #define UCONFIG_NO_COLLATION 1
    #endif
    #ifndef UCONFIG_NO_FORMATTING
    #define UCONFIG_NO_FORMATTING 1
    #endif
Reduce ICU Data used
There are many ways in which ICU data may be reduced. If only certain locales or converters will be used, others may be removed. Additionally, data may be packaged as individual files or interchangeable archives (.dat files), allowing data to be installed and removed without rebuilding ICU. For details, see the ICU Data chapter.
ICU Versions
(This section assumes the reader is familiar with ICU version numbers as covered in the Design chapter, and filename conventions for libraries in the ReadMe.)
POSIX Library Names
The following table gives an example of the dynamically linked library and symbolic links built by ICU for the common ('uc') library, version 5.4.3, on Linux:
File | Links to | Purpose |
---|---|---|
libicuuc.so | libicuuc.so.54.3 | Required for link: Applications compiled with '-licuuc' will follow this symlink. |
libicuuc.so.54 | libicuuc.so.54.3 | Required for runtime: This name is what applications actually link against. |
libicuuc.so.54.3 | Actual library | Required for runtime and link. Contains the name 'libicuuc.so.54'. |
Note: This discussion uses Linux as an example, but it is typical of most platforms; AIX and 390 (z/OS) are exceptions.
An application compiled with '-licuuc' will follow the symlink from libicuuc.so to libicuuc.so.54.3, and will actually read the fully qualified file libicuuc.so.54.3. This library file has an embedded name (SONAME) of libicuuc.so.54, that is, with only the major and minor number. The linker writes this name into the client application, because binary compatibility holds across versions that share the same major+minor number.
If ICU version 5.4.7 is subsequently installed, the following files may be updated.
File | Links to | Purpose |
---|---|---|
libicuuc.so | libicuuc.so.54.7 | Required for link: Newly linked applications will follow this link, which should not cause any functional difference at link time. |
libicuuc.so.54 | libicuuc.so.54.7 | Required for runtime: Because it now links to version .7, existing applications linked to version 5.4.3 will follow this link and use the 5.4.7 code. |
libicuuc.so.54.7 | Actual library | Required for runtime and link. Contains the name 'libicuuc.so.54'. |
If ICU version 5.6.3 or 3.2.9 were installed, they would not affect already-linked applications, because the major+minor numbers are different - 56 and 32, respectively, as opposed to 54. They would, however, replace the link 'libicuuc.so', which controls which version of ICU newly-linked applications use.
In summary, what files should an application distribute in order to include a functional runtime copy of ICU 5.4.3? The above application should distribute libicuuc.so.54.3 and the symbolic link libicuuc.so.54. (If symbolic links pose difficulty, libicuuc.so.54.3 may be renamed to libicuuc.so.54, and only libicuuc.so.54 distributed. This is less informative, but functional.)
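The distribution rule above can be sketched as a small shell function. This is an illustration only: stage_icu_runtime is a hypothetical name, not an ICU tool, and it assumes the major.minor.patch naming scheme used in the tables above.

```shell
# Hypothetical helper (not part of ICU): copy the real library for one ICU
# component into a distribution directory and recreate the runtime symlink.
#   $1 = directory containing the built libraries
#   $2 = distribution directory to populate
#   $3 = library base name, e.g. icuuc
#   $4 = ICU version as major.minor.patch, e.g. 5.4.3
stage_icu_runtime() {
    src=$1; dest=$2; lib=$3; ver=$4

    major=${ver%%.*}     # 5.4.3 -> 5
    rest=${ver#*.}       # 5.4.3 -> 4.3
    minor=${rest%%.*}    # 4.3   -> 4
    patch=${rest#*.}     # 4.3   -> 3

    real="lib${lib}.so.${major}${minor}.${patch}"   # actual library file
    soname="lib${lib}.so.${major}${minor}"          # name recorded in clients

    mkdir -p "$dest" &&
    cp "$src/$real" "$dest/$real" &&
    ln -sf "$real" "$dest/$soname"
}
```

For example, stage_icu_runtime ./lib ./dist icuuc 5.4.3 would place libicuuc.so.54.3 and the symlink libicuuc.so.54 in ./dist. The unversioned libicuuc.so link is deliberately left out, since it is only needed at link time.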
POSIX Library suffix
The --with-library-suffix option may be used with runConfigureICU or configure, to distinguish on disk specially modified versions of ICU. For example, the option --with-library-suffix=myapp will produce libraries with names such as libicuucmyapp.so.54.3, thus preventing another ICU user from using myapp's custom ICU libraries.
While two or more versions of ICU may be linked into the same application as long as the major and minor numbers are different, changing the library suffix is not sufficient to allow the same version of ICU to be linked. In other words, linking ICU 5.4.3, 5.6.3, and 3.2.9 together is allowed, but 5.4.3 and 5.4.7 may not be linked together, nor may 5.4.3 and 5.4.3-myapp be linked together.
Windows library names
Assuming ICU version 5.4.3, Windows library names will follow this pattern:
File | Purpose |
---|---|
icuuc.lib | Release Link-time library. Needed for development. Contains 'icuuc54.dll' name internally. |
icuuc54.dll | Release runtime library. Needed for runtime. |
icuucd.lib | Debug link-time library (The 'd' suffix indicates debug) |
icuuc54d.dll | Debug runtime library. |
Debug applications must be linked with debug libraries, and release applications with release libraries.
When a new version of ICU is installed, the .lib files will be replaced so as to keep new compiles in sync with the newly installed header files, and the latest DLL. As well, if the new ICU version has the same major+minor version (such as 5.4.7), then DLLs will be replaced, as they are binary compatible. However, if an ICU with a different major+minor version is installed, such as 5.5, then new DLLs will be copied with names such as 'icuuc55.dll'.
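As a rough illustration of the naming pattern, the following sketch derives the four Windows file names from a library base name and an ICU version. The function icu_windows_names is hypothetical and only performs string manipulation; it shows that the patch number does not appear in any file name.

```shell
# Hypothetical helper (not part of ICU): print the Windows file names for
# one ICU library, following the pattern in the table above.
#   $1 = library base name, e.g. icuuc
#   $2 = ICU version, e.g. 5.4.3 (any patch number is ignored)
icu_windows_names() {
    lib=$1; ver=$2
    major=${ver%%.*}     # 5.4.3 -> 5
    rest=${ver#*.}       # 5.4.3 -> 4.3
    minor=${rest%%.*}    # 4.3   -> 4
    printf '%s\n' \
        "${lib}.lib" \
        "${lib}${major}${minor}.dll" \
        "${lib}d.lib" \
        "${lib}${major}${minor}d.dll"
}
```

Calling it with versions 5.4.3 and 5.4.7 gives the same DLL names, which is why those DLLs replace each other on installation, whereas 5.5 yields new names.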
Modularization of ICU4J
Some clients may not wish to ship all of ICU4J with their application, since the application might only use a small part of ICU4J. ICU4J release 2.6 and later provide build options to build individual ICU4J 'modules' for a more compact distribution. The modules are based on a service and the APIs that define it, e.g., the normalizer module supports all the APIs of the Normalizer class (and some others). Tests can be run to verify that the APIs supported by the module function correctly. Because of internal code dependencies, a module contains extra classes that are not part of the module's core service API. Some or most of the APIs of these extra classes will not work. Only the module's core service API is guaranteed. Other APIs may work partially or not at all, so client code should avoid them.
Individual modules are not built directly into their own separate jar files. Since their dependencies often overlap, using separate modules to 'add on' ICU4J functionality would result in unwanted duplication of class files. Instead, building a module causes a subset of ICU4J's classes to be built and put into ICU4J's standard build directory. After one or more module targets are built, the 'moduleJar' target can then be built, which packages the class files into a 'module jar.' Other than the fact that it contains fewer class files, little distinguishes this jar file from a full ICU4J jar file, and in fact they share the same name.
Currently ICU4J can be divided into the following modules:
Key:
Module Name | Ant Targets | Test Package Supported | Size‡ |
---|---|---|---|
Package* | Main Classes† | ||
Note: *com.ibm. should be prepended to the package name listed.
†Class in bold indicates core service API. Only APIs in this column are fully supported.
‡Sizes are of the compressed jar file containing only this module. These sizes are approximate for release 3.6; they may change in future releases.
Modules:
Normalizer | normalizer, normalizerTests | com.ibm.icu.dev.test.normalizer | 465KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript | ||
icu.text: | BreakIterator, CanonicalIterator, Normalizer, Replaceable, ReplaceableString, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
Collator | collator, collatorTests | com.ibm.icu.dev.test.collator | 1,911KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript | ||
icu.text: | BreakDictionary, BreakIterator, CanonicalIterator, CollationElementIterator, CollationKey, Collator, DictionaryBasedBreakIterator, Normalizer, RawCollationKey, Replaceable, ReplaceableString, RuleBasedBreakIterator, RuleBasedCollator, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | ByteArrayWrapper, CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
Calendar | calendar, calendarTests | com.ibm.icu.dev.test.calendar | 2,176KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript | ||
icu.math: | BigDecimal, MathContext | ||
icu.text: | BreakIterator, CanonicalIterator, ChineseDateFormat, ChineseDateFormatSymbols, CollationElementIterator, CollationKey, Collator, DateFormat, DateFormatSymbols, DecimalFormat, DecimalFormatSymbols, MessageFormat, Normalizer, NumberFormat, PluralFormat, PluralRules, RawCollationKey, Replaceable, ReplaceableString, RuleBasedCollator, RuleBasedNumberFormat, RuleBasedTransliterator, SimpleDateFormat, SymbolTable, UCharacterIterator, UFormat, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | AnnualTimeZoneRule, BasicTimeZone, BuddhistCalendar, ByteArrayWrapper, Calendar, ChineseCalendar, CopticCalendar, Currency, CurrencyAmount, DateRule, DateTimeRule, EasterHoliday, EthiopicCalendar, Freezable, GregorianCalendar, HebrewCalendar, HebrewHoliday, Holiday, IndianCalendar, InitialTimeZoneRule, IslamicCalendar, JapaneseCalendar, Measure, MeasureUnit, RangeDateRule, RangeValueIterator, SimpleDateRule, SimpleHoliday, SimpleTimeZone, StringTokenizer, TaiwanCalendar, TimeZone, TimeZoneRule, TimeZoneTransition, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
BreakIterator | breakIterator, breakIteratorTests | com.ibm.icu.dev.test.breakiterator | 1,889KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript | ||
icu.text: | BreakDictionary, BreakIterator, CanonicalIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, SymbolTable, Transliterator, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
Formatting | format, formatTests | com.ibm.icu.dev.test.format | 3,443KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript | ||
icu.math: | BigDecimal, MathContext | ||
icu.text: | BreakIterator, CanonicalIterator, ChineseDateFormat, ChineseDateFormatSymbols, CollationElementIterator, CollationKey, Collator, DateFormat, DateFormatSymbols, DecimalFormat, DecimalFormatSymbols, DurationFormat, MeasureFormat, MessageFormat, Normalizer, NumberFormat, PluralFormat, PluralRules, RawCollationKey, Replaceable, ReplaceableString, RuleBasedCollator, RuleBasedNumberFormat, SimpleDateFormat, SymbolTable, UCharacterIterator, UFormat, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | AnnualTimeZoneRule, BasicTimeZone, BuddhistCalendar, ByteArrayWrapper, Calendar, ChineseCalendar, CopticCalendar, Currency, CurrencyAmount, DateTimeRule, EthiopicCalendar, Freezable, GregorianCalendar, HebrewCalendar, IndianCalendar, InitialTimeZoneRule, IslamicCalendar, JapaneseCalendar, Measure, MeasureUnit, RangeValueIterator, SimpleTimeZone, StringTokenizer, TaiwanCalendar, TimeArrayTimeZoneRule, TimeZone, TimeZoneRule, TimeZoneTransition, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
Basic Properties | propertiesBasic, propertiesBasicTests | com.ibm.icu.dev.test.lang | 554KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript, UScriptRun | ||
icu.text: | BreakDictionary, BreakIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
Full Properties | propertiesFull, propertiesFullTests | com.ibm.icu.dev.test.lang | 1,829KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript, UScriptRun | ||
icu.text: | BreakDictionary, BreakIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
StringPrep, IDNA | stringPrep, stringPrepTests | com.ibm.icu.dev.test.stringprep | 488KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript | ||
icu.text: | StringPrep, StringParseException, SymbolTable, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
Transforms | transliterator, transliteratorTests | com.ibm.icu.dev.test.translit | 890KB |
---|---|---|---|
icu.lang: | UCharacter, UCharacterCategory, UCharacterDirection, UCharacterEnums, UProperty, UScript | ||
icu.text: | BreakDictionary, BreakIterator, DictionaryBasedBreakIterator, Normalizer, Replaceable, ReplaceableString, RuleBasedBreakIterator, RuleBasedCollator, RuleBasedTransliterator, StringTransform, SymbolTable, Transliterator, UCharacterIterator, UForwardCharacterIterator, UnicodeFilter, UnicodeMatcher, UnicodeSet, UnicodeSetIterator, UTF16 | ||
icu.util: | CaseInsensitiveString, CompactByteArray, Freezable, RangeValueIterator, StringTokenizer, ULocale, UResourceBundle, UResourceBundleIterator, UResourceTypeMismatchException, ValueIterator, VersionInfo |
Building any of these modules is as easy as specifying a build target to the Ant build system, e.g.:
To build a module that contains only the Normalizer API:
1. Build the module.
    ant normalizer
2. Build the jar containing the module.
    ant moduleJar
3. Build the tests for the module.
    ant normalizerTests
4. Run the tests and verify that the self tests pass.
    java -classpath classes com.ibm.icu.dev.test.TestAll -nothrow -w
If more than one module is required, the module build targets can be concatenated, e.g.:
1. Build the modules.
    ant normalizer collator
2. Build the jar containing the modules.
    ant moduleJar
3. Build the tests for the modules.
    ant normalizerTests collatorTests
4. Run the tests and verify that they pass.
    java -classpath classes com.ibm.icu.dev.test.TestAll -nothrow -w
The jar should be built before the tests, since for some targets building the tests will cause additional classes to be compiled that are not strictly necessary for the module itself.
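The ordering described here (modules, then jar, then tests) can be captured in a small dry-run helper. This is a sketch only: icu4j_module_plan is a hypothetical name, not part of ICU4J, and it assumes that each test target is the module target with "Tests" appended, as in the module tables above.

```shell
# Hypothetical helper (not part of ICU4J): print, in order, the commands
# for building the given modules. Arguments are module Ant targets such
# as normalizer or collator.
icu4j_module_plan() {
    [ "$#" -gt 0 ] || { echo "usage: icu4j_module_plan MODULE..." >&2; return 1; }
    tests=""
    for module in "$@"; do
        # Each test target is the module target plus the suffix "Tests".
        tests="${tests:+$tests }${module}Tests"
    done
    echo "ant $*"          # build the modules
    echo "ant moduleJar"   # package the jar before any test classes exist
    echo "ant $tests"      # build the tests
    echo "java -classpath classes com.ibm.icu.dev.test.TestAll -nothrow -w"
}
```

The printed commands can be reviewed and then run from the ICU4J root directory, for example by piping the output to sh -e, assuming Ant and a JDK are available on the PATH.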
Note: Regardless of whether ICU4J is built as a whole or as modules, the jar file produced is named icu4j.jar. To ascertain whether an icu4j.jar contains all of ICU4J, see the manifest file in the jar.
The target moduleJar does not depend on any other target. It just creates a jar of all class files under $icu4j_root/classes/com/ibm/icu/, excluding the class files in the $icu4j_root/classes/com/ibm/icu/dev folder.
The list of module build targets can be obtained by running the command: ant -projecthelp
Copyright (c) 2000 - 2008 IBM and Others - PDF Version - Feedback: http://icu-project.org/contacts.html
User Guide for ICU v4.0 Generated 2008-06-02.