<LinkOptions>

The LinkOptions element specifies options for handling hyperlinks.

Full element example

    <LinkOptions>
      <MaximumDepth             value="1"/>
      <FollowOffsite            value="yes"/>
      <MaximumOffsiteDepth      value="1"/>
      <SubDirOnly               value="no"/>
      <UnresolvedDetail         value="include"/>
      <Exclude>
        <Pattern>RE::table[1-9].jpg</Pattern>
        <Pattern>WC::figures*plant?blue</Pattern>
      </Exclude>
      <Include>
        <Pattern>RE:C:Table8</Pattern>
      </Include>
      <ExternalDocuments>
        <Specification>
          <Path>DocumentB</Path>
          <Prefix>DocumentB</Prefix>
          <KeepPrefix             value="no"/>
          <MapFile>c:\Docs\DocumentB\DocumentB.map</MapFile>
          <Lookup                 value="ByID"/>
        </Specification>
      </ExternalDocuments>
    </LinkOptions>

Sub-element summary

Tag	Type	Default	Description
`<MaximumDepth>`	value	1	Maximum link depth to follow
`<FollowOffsite>`	value	yes	Whether to follow offsite links
`<MaximumOffsiteDepth>`	value	1	Maximum link depth for following offsite links. This option requires version 4.25 or later of iSiloX and iSiloXC.
`<SubDirOnly>`	value	no	Whether to limit followed links to subdirectories. This option requires version 3.15 or later of iSiloX and iSiloXC.
`<UnresolvedDetail>`	value	include	Whether to include unresolved URLs
`<Exclude>`	multi-string	n/a	URL exclusion filters. This option requires version 3.3 or later of iSiloX and iSiloXC.
`<Include>`	multi-string	n/a	URL exclusion exception filters. This option requires version 3.3 or later of iSiloX and iSiloXC.
`<ExternalDocuments>`	container	n/a	Holds the specifications for links to external documents. This option requires version 4.1 or later of iSiloX and iSiloXC.

Sub-element descriptions

<MaximumDepth>

Description
Provide a non-negative integer in the value attribute of the MaximumDepth element to specify how far to follow hyperlinks. If you do not specify the MaximumDepth element, it defaults to one. The root source files are considered to be at a depth of zero. Files to which they link are at a depth of one, files to which those files link are at a depth of two, and so on.

If you are creating a document based on a Web site, a maximum depth value of one is recommended, because each additional increment in depth beyond one will likely cause an exponential increase in the size of the document. For example, if the converted document is one megabyte in size at a link depth of one, it might be ten megabytes at a link depth of two and 100 megabytes at a link depth of three.

Examples
This example specifies a maximum depth value of three, which results in all content up to a link depth of three being included in the converted document.

  <MaximumDepth             value="3"/>

This example specifies a maximum depth value of zero, which results in no additional content other than the root source file content being included in the converted document.

  <MaximumDepth             value="0"/>

<FollowOffsite>

Description
In the value attribute of the FollowOffsite element, specify the value yes to follow off-site links or provide the value no to not follow off-site links.

An off-site link is defined as a link to a target in a different domain. iSiloX and iSiloXC treat all file paths as belonging to the same domain. For URLs, they treat the protocol (e.g., http://) and the hostname as comprising the domain. To tell iSiloX or iSiloXC not to follow links to targets in different domains, set the value attribute of the FollowOffsite element to no. This is useful for limiting the amount of irrelevant content brought into the document.

iSiloX and iSiloXC perform the off-site link check anew for each root source file, which means that you can have root source files in different domains. For example, you can have two root source files, one with the URL <http://www.iSilo.com> and another with the URL <http://www.palm.com>. Assuming that you have set the value attribute of the FollowOffsite element to no, then when iSiloX or iSiloXC converts the content at <http://www.iSilo.com>, only links from there with target URLs that begin with <http://www.iSilo.com> are followed. When either converts the content at <http://www.palm.com>, only links from there with target URLs that begin with <http://www.palm.com> are followed. If the content at <http://www.palm.com> has a link to <http://www.iSilo.com/whatsnew.htm>, that link is not followed.

Examples
This example specifies that off-site links should be followed.

  <FollowOffsite            value="yes"/>

This example specifies that off-site links should not be followed.

  <FollowOffsite            value="no"/>

<MaximumOffsiteDepth>

Description
Provide a non-negative integer in the value attribute of the MaximumOffsiteDepth element to specify how far to follow hyperlinks that go off-site. A value of zero is equivalent to setting the FollowOffsite element to no. If you do not specify the MaximumOffsiteDepth element, it defaults to one. The depth is relative to the source file containing the off-site link, rather than relative to the root source files. If the FollowOffsite element described above is set to no, then the MaximumOffsiteDepth element has no effect.

Note that the value of the MaximumDepth element still limits the total link depth. So if the MaximumDepth element is set to two and the MaximumOffsiteDepth element is set to one, and there is an off-site link from a source file at depth two, that link is not followed, because its target would be at a total depth of three, even though it is at a depth of one relative to the source file with that off-site link.

The MaximumOffsiteDepth element is useful in the case where you specify a MaximumDepth value greater than one in order to include more content from a given site but want to allow links to off-site articles.
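As a sketch of that case, the fragment below combines the two limits. The specific values are illustrative assumptions rather than recommendations, and the comments are explanatory only and can be removed.

```xml
<LinkOptions>
  <!-- Follow links up to two levels from the root source files -->
  <MaximumDepth             value="2"/>
  <!-- Off-site links must be enabled for MaximumOffsiteDepth to apply -->
  <FollowOffsite            value="yes"/>
  <!-- Follow each off-site link one level from the page containing it -->
  <MaximumOffsiteDepth      value="1"/>
</LinkOptions>
```

With these settings, an off-site link on a page at depth one is followed, since its target is at a total depth of two. An off-site link on a page at depth two is not followed, since its target would exceed the MaximumDepth limit.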

Examples
This example specifies a maximum off-site depth value of two, which results in all off-site content up to a link depth of two being included in the converted document.

  <MaximumOffsiteDepth      value="2"/>

This example specifies a maximum off-site depth value of zero, which results in no off-site content being included in the converted document.

  <MaximumOffsiteDepth      value="0"/>

<SubDirOnly>

Description
In the value attribute of the SubDirOnly element, specify the value yes to limit followed links to those matching the subdirectory of the root source path. Specify the value no to allow all links to be followed.

In many cases, websites are structured hierarchically within folders and sub-folders, and the URLs referencing the pages of such a site are usually organized the same way, with slashes separating the different levels of folders. For example, the iSiloX.com website has all support pages within a folder named "support". Within the support folder, there are sub-folders for different categories of support, such as a sub-folder named "manual" where the manuals are located. However, such sub-folder pages may also have links to pages outside of the folder. If you want to limit followed links to the folders of the root source pages and their sub-folders, set the value attribute of the SubDirOnly element to yes. iSiloX then follows only links that match any of the root source URLs up to its last slash.

As an example, if you wanted to get all the support pages from the iSiloX.com website, you might specify http://www.iSiloX.com/support/index.htm as the root source URL and set the value attribute of the SubDirOnly element to yes. The page http://www.iSiloX.com/support/index.htm has a reference to the home page of the site, http://www.iSiloX.com. Because SubDirOnly is set to yes, that link is not followed. A link such as http://www.iSiloX.com/support/faq.htm, to the frequently asked questions page, is followed.
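The scenario above might be configured as sketched below. This assumes that http://www.iSiloX.com/support/index.htm is given as the root source URL elsewhere in the document definition. The depth value is an illustrative assumption, and the comments are explanatory only.

```xml
<LinkOptions>
  <!-- Illustrative depth: reach pages in sub-folders such as support/manual/ -->
  <MaximumDepth             value="2"/>
  <!-- Follow only links that match the root source URL up to its last slash,
       that is, links beginning with http://www.iSiloX.com/support/ -->
  <SubDirOnly               value="yes"/>
</LinkOptions>
```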

Examples
This example specifies that only links to targets within subdirectories should be followed.

  <SubDirOnly               value="yes"/>

This example specifies that even links to targets outside of the root subdirectories can be followed.

  <SubDirOnly               value="no"/>

<UnresolvedDetail>

Description
In most cases, since you can tell iSiloX and iSiloXC to only follow links up to a given maximum depth and to not follow off-site links, you end up with a document that has hyperlinks to content not brought into the document. These hyperlinks are referred to as unresolved links. You can choose whether to include the target URLs of these unresolved links in the document or not by setting the value attribute of the UnresolvedDetail element to either include or exclude, respectively.

If you choose to include the unresolved link detail, iSiloX and iSiloXC create a document with an additional page at the end that lists the URLs of all unresolved links. Each unresolved link in the document jumps to its corresponding target URL on this last page. This is useful for later reference and for finding broken hyperlinks.

If you choose not to include the unresolved link detail, the unresolved hyperlinks essentially have no target. When you view the document in a reader and attempt to follow such a hyperlink, the reader tells you that the hyperlink is unresolved but gives no indication of the target URL.

The most common sources of unresolved links are the following:

Links that are at a depth greater than that specified by the MaximumDepth element.
Links that are outdated and thus are broken because the target has moved.
Links whose targets are specified incorrectly.
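For instance, to keep a document small while still recording where its unfollowed links point, the options might be combined as in this sketch. The particular combination of values is an assumption chosen for illustration, and the comments are explanatory only.

```xml
<LinkOptions>
  <!-- Bring in only the pages linked directly from the root source files -->
  <MaximumDepth             value="1"/>
  <!-- Leave off-site targets out of the document -->
  <FollowOffsite            value="no"/>
  <!-- List the URLs of all unresolved links on a page at the end -->
  <UnresolvedDetail         value="include"/>
</LinkOptions>
```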

Examples
This example specifies that unresolved link detail should be included.

  <UnresolvedDetail         value="include"/>

This example specifies that unresolved link detail should not be included.

  <UnresolvedDetail         value="exclude"/>

<Exclude>

Description
Within the <Exclude> start tag and the </Exclude> end tag, specify one or more Pattern elements. Each Pattern element is an exclusion filter specified using either a wildcard or regular expression pattern matching string. If the URL of an image matches against one of the exclusion patterns, it is not included in the document. If the target URL of a link matches against one of the exclusion patterns, the link is not followed and hence the target content is not included in the document. Exceptions to exclusions can be specified using the Include element.

The format of a pattern string is:

  type:options:pattern

In the above, type is either WC for a wildcard pattern or RE for a regular expression pattern, and pattern is the pattern to match against, in the format of the specified type. For options, specify C to perform a case-sensitive match, or leave the field empty for the default. By default, matching is case-insensitive, with the lowercase letters 'a' through 'z' matching the uppercase letters 'A' through 'Z'.

A pattern can be either a wildcard pattern or a regular expression pattern:

wildcard: A wildcard pattern provides a simple way to specify simple patterns. In such a pattern, the character '*' matches zero or more of any mix of characters and the character '?' matches exactly one of any character. A URL matches against a wildcard pattern if the pattern appears anywhere in the URL.
regular expression: Regular expression patterns use a powerful pattern matching language. This implementation uses the PCRE (Perl Compatible Regular Expressions) library, version 3.9. For more information about PCRE and the syntax for regular expressions, you can consult the PCRE website. In particular, follow the link there labeled "PCRE man page" and then go to the section with the heading "REGULAR EXPRESSION DETAILS".
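The fragment below is a sketch of how the type:options:pattern format combines the two pattern types with the C option. The patterns themselves are hypothetical, chosen only to illustrate the format, and the comments are explanatory only.

```xml
<Exclude>
  <!-- Wildcard, case-insensitive: any URL containing "banner" -->
  <Pattern>WC::banner</Pattern>
  <!-- Wildcard, case-sensitive: "/Ads/" matches but "/ads/" does not -->
  <Pattern>WC:C:/Ads/</Pattern>
  <!-- Regular expression, case-insensitive: URLs ending in .gif or .png -->
  <Pattern>RE::\.(gif|png)$</Pattern>
  <!-- Regular expression, case-sensitive: "/Archive/" followed by four digits -->
  <Pattern>RE:C:/Archive/[0-9]{4}</Pattern>
</Exclude>
```

In the third pattern, the backslash escapes the period so that it matches only a literal period rather than any character.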

Example
This example specifies two patterns.

  <Exclude>
    <Pattern>RE::table[1-9].jpg</Pattern>
    <Pattern>WC::figures*plant?blue</Pattern>
  </Exclude>

The first pattern specifies a regular expression pattern with no options, so the match will be case-insensitive. The pattern matches the text "table" followed by any digit character from '1' through '9', then any single character (an unescaped period in a regular expression matches any character, including a literal period), and then the text "jpg". So the pattern will match against any of the following:

Table1.jpg
http://www.acme.org/table3.jpg
c:\My Documents\table5.jpg
/home/acme/docs/TABLE9.jpg

But the pattern will not match against any of the following:

Table1.gif
http://www.acme.org/table0.jpg
c:\My Documents\table5
/home/acme/docs/tables.htm

The second pattern is a wildcard pattern with no options, so matching is also case-insensitive. The pattern matches the text "figures" followed by zero or more of any mix of characters, followed by the text "plant", followed by any single character, and finally followed by the text "blue". The pattern will thus match against any of the following:

http://blueflowers.com/figures/plantablue.htm
http://blueflowers.com/figuresplant1blue.htm

But the pattern will not match against any of the following:

http://blueflowers.com/figures/plantabblue.htm
http://blueflowers.com/figuresplantblue.htm

<Include>

Description
Within the <Include> start tag and the </Include> end tag, specify one or more Pattern elements. Each Pattern element is an inclusion filter specified using either a wildcard or regular expression pattern matching string. An inclusion filter serves as an exception to the exclusion filters. If a given URL matches against an exclusion filter, the inclusion filters are then applied to the URL, and if there is a match against an inclusion filter, the URL is not excluded. For details on how to specify the pattern, see the section on the Exclude element.

Example

  <Include>
    <Pattern>RE:C:Table8</Pattern>
  </Include>

If this example is taken in conjunction with the example given for the Exclude element, then although the exclusion filters exclude the URL "http://www.acme.org/Table8.jpg", this inclusion filter notes it as an exception and causes it not to be excluded. Note that in this inclusion pattern, the option C has been specified for a case-sensitive match, and so "http://www.acme.org/table8.jpg" would not be noted as an exception.
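As a further sketch of how the two filter lists interact, the hypothetical fragment below excludes every URL containing ".jpg" except those that also contain "/figures/". Both patterns are assumptions made for illustration, and the comments are explanatory only.

```xml
<LinkOptions>
  <Exclude>
    <!-- Exclude any URL containing ".jpg" (case-insensitive wildcard) -->
    <Pattern>WC::.jpg</Pattern>
  </Exclude>
  <Include>
    <!-- Exception: keep URLs that contain "/figures/" -->
    <Pattern>WC::/figures/</Pattern>
  </Include>
</LinkOptions>
```

With these filters, "http://www.acme.org/photos/cat.jpg" is excluded, while "http://www.acme.org/figures/chart.jpg" matches the inclusion filter and is not excluded.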

<ExternalDocuments> sub-element summary

The ExternalDocuments container element holds the specifications for links to external documents. A document may have links to zero or more external documents. Generally you will have one external document specification for each external document to which the document will link. Each such specification is represented by its own Specification container sub-element within the ExternalDocuments container element. See below for example external document specifications.

Tag	Type	Default	Description
`<Specification>`	container	n/a	Holds information about an external document specification.

<Specification> sub-element summary

The Specification container element specifies the relative path to an external document, a prefix string to match against for identifying a link as one to the external document, a possible map file to use during conversion for looking up identifying information for targets in the external document, and what method to use for looking up targets in the external document. It has the following sub-elements:

Tag	Type	Default	Description
`<Path>`	string	n/a	Relative path to the external document.
`<Prefix>`	string	n/a	External link identification prefix.
`<KeepPrefix>`	value	no	Whether to include the prefix for lookup.
`<MapFile>`	string	n/a	Path to a map file.
`<Lookup>`	value	ByName	Method to use for target lookup.

<Path>

Description
Between the <Path> start tag and the </Path> end tag specify the relative path to the external document as it will be when the user accesses it. If the external document is a .pdb file, the .pdb extension is optional. The reader application will attempt to open the file with the exact path name you provide first. If the open is unsuccessful, another attempt is made to open it with the .pdb extension if it was not provided or without the .pdb extension if it was provided.

Note that on Palm OS®, if a file is stored in the internal storage memory, the document title serves as the file name, so when converting external documents, it is best to ensure that the document title and document file name are the same. Also, on Palm OS®, when a document is stored in the internal storage memory, any external documents to which it links must also be stored in the internal memory, and in this case, the reader application ignores the directory part of external document paths.

Version 4.3 and later of iSilo™ support searching for the first of multiple possibilities. You can specify multiple names to search for by enclosing each name within double-quote characters and separating each double-quote enclosed name from the next with a space. When you do this, iSilo™ opens the first document that it finds in the order listed. This is especially useful on Palm OS®, where a document in the internal database storage memory is identified by its internal database name, since there is no notion of a file name, whereas a document on a memory card is identified by its file name.

Examples
This example specifies that the external document will be a file named "DocumentB" in the same directory.

  <Path>DocumentB</Path>

This example specifies that the external document will be a file named "Main Index.pdb" in the directory one level above.

  <Path>../Main Index.pdb</Path>

This example specifies that the external document will be a file named "The Art of War" in the directory named "Classics".

  <Path>Classics/The Art of War</Path>

This example specifies three different possibilities for the external document:

  <Path>"Gulliver's Travels" "Gulliver_s_Travels.pdb" "Gul. Travels"</Path>

<Prefix>

Description
Between the <Prefix> start tag and the </Prefix> end tag, specify the text to match links against for identifying a link as one to an external document. When performing the match, the converter takes the URL of the link and removes all leading periods, forward slashes, and backslashes. Then it performs a case-insensitive comparison of the prefix string against the beginning of the remaining URL. A match indicates that the link is to that external document.

Examples
This example specifies that all links whose URLs start with "DocumentB" are links to an external document.

  <Prefix>DocumentB</Prefix>

The prefix would match a URL such as "DocumentB/index.htm#TableOfContents" or a URL such as "../DocumentB/How-To/Write A Story.htm", because the leading periods and slashes are removed before the comparison.

<KeepPrefix>

Description
The setting of the value attribute of the <KeepPrefix> tag determines whether the prefix is kept for lookup. Set the value attribute of the tag to no to not include the prefix or to yes to include the prefix.

As an example of a scenario where the prefix should not be included in the lookup, consider two documents, call them document A and document B, that externally link to one another such that each document's content is wholly contained in its own directory. Say that the directory containing document A's content is named DirA and that the directory for document B's content is named DirB. In order for document A to link to document B, for document A, you would specify DirB as the prefix for identifying links as those to document B. For document B, you would specify DirA as the prefix for identifying links as those to document A. The target names within a given document are relative to the first source, which would presumably be some file immediately within the document's directory. Hence, the directory name would not be part of the target name and thus the prefix, which would be the same as the directory name, should not be included for lookup.
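For this first scenario, the external document specification used when converting document A might look like the sketch below. The file name DocumentB in the Path element is a hypothetical name for the converted document B, and the comments are explanatory only.

```xml
<ExternalDocuments>
  <Specification>
    <!-- Hypothetical file name of the converted document B -->
    <Path>DocumentB</Path>
    <!-- Links whose URLs begin with DirB are links to document B -->
    <Prefix>DirB</Prefix>
    <!-- Target names in document B do not include the directory name -->
    <KeepPrefix             value="no"/>
  </Specification>
</ExternalDocuments>
```

The specification for document B would mirror this one, with DirA as the prefix and the path of the converted document A.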

As an example of a scenario where the prefix should be included in the lookup, consider two documents, call them document A and document B, that externally link to one another such that each document's content is spread across two directories. Say that the directories containing document A's content are DirA1 and DirA2 and that the directories containing document B's content are named DirB1 and DirB2. Further, say that the directory containing all four directories is named DirAB. In addition, say that an index file immediately within DirAB links to content in all four subdirectories DirA1, DirA2, DirB1, and DirB2. To create the two documents that link externally to one another, for document A, you would specify two external document specifications, both for externally linking to document B. For the first specification, DirB1 would be the prefix. For the second specification, DirB2 would be the prefix. Because the index file is at the same level as those two directories, the target names include the directory name, so you would want to keep the prefix.
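For this second scenario, the two specifications used when converting document A might look like the sketch below. The file name DocumentB is a hypothetical name for the converted document B, and the comments are explanatory only.

```xml
<ExternalDocuments>
  <!-- Links into DirB1 go to document B, and the prefix is kept for lookup -->
  <Specification>
    <Path>DocumentB</Path>
    <Prefix>DirB1</Prefix>
    <KeepPrefix             value="yes"/>
  </Specification>
  <!-- Links into DirB2 also go to document B -->
  <Specification>
    <Path>DocumentB</Path>
    <Prefix>DirB2</Prefix>
    <KeepPrefix             value="yes"/>
  </Specification>
</ExternalDocuments>
```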

Examples
This example specifies that the prefix not be included for lookup.

  <KeepPrefix               value="no"/>

This example specifies that the prefix should be included for lookup.

  <KeepPrefix               value="yes"/>

<MapFile>

Description
Between the <MapFile> start tag and the </MapFile> end tag, specify the full path of the map file for the external document. When using the ByID or ByOffset lookup methods, the map file is necessary for determining the target ID or target offset value for links to the external document. The path must be a full path and can be an HTTP URL. This latter capability allows the targets of a document to be easily made public.

When converting the target external document, use the <Targets> element to generate the map file. If two documents link to one another, it is necessary to perform two conversion passes. The first pass generates the map files and the second pass uses the map files for looking up the associated target IDs or offsets.
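Because the map file path can also be an HTTP URL, a specification might refer to a published map file as in the sketch below. The URL is a hypothetical placeholder, and the comments are explanatory only.

```xml
<Specification>
  <Path>DocumentB</Path>
  <Prefix>DocumentB</Prefix>
  <KeepPrefix             value="no"/>
  <!-- Hypothetical URL of a map file published alongside DocumentB -->
  <MapFile>http://www.example.com/docs/DocumentB.map</MapFile>
  <!-- ByID, like ByOffset, requires the map file during conversion -->
  <Lookup                 value="ByID"/>
</Specification>
```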

Examples
This example specifies the full path of a map file on a Windows® based computer.

  <MapFile>c:\Docs\DocumentB\DocumentB.map</MapFile>

<Lookup>

Description
The setting of the value attribute of the <Lookup> tag determines the format in which the link information is stored as well as how the lookup is performed in the external document. Set it to one of the following:

ByName: The part of the URL of the link after the prefix is considered the target name and stored as the value to use to identify the target location within the external document when a jump to the target occurs. In order for the links to the target document to work properly, the target document must have been converted with the value attribute of the <Lookup> tag within its <Targets> element set to ByName.
ByID: A numeric value, also known as the target ID, that uniquely identifies the target is stored and used to identify the target location within the external document when a jump to the target occurs. A map file for the external document is needed to look up the target ID values of the external document during conversion.
ByOffset: A numeric value, also known as the target offset, that represents the location of the target in the external document is stored and used when a jump to the target occurs. A map file for the external document is needed to look up the target offset values of the external document during conversion.

The lookup methods each have their own individual advantages and disadvantages.

In terms of document storage space, the ByName method requires the largest amount of storage space in the linking document as well as in the targeted external document, unless the target names are few in number and short in length. The ByID and ByOffset methods require approximately the same amount of storage space as each other in the linking document. In the targeted external document, the ByOffset method requires no additional storage space, while the ByID method requires an amount of storage space that is generally less than the ByName method.

In terms of the speed of performing the lookup when a jump occurs to an external document, the difference perceived by the user is probably negligible. But the ByName method requires the most processing. The ByID method comes next, while the ByOffset method requires the least processing for lookup.

The other important tradeoff among the methods concerns synchronization between a document and the external documents to which it links. For the purposes of this discussion, let us say that we have a document named DocSource that has links to an external document named DocTarget and that DocTarget is updated independently of DocSource. The content and targets in DocTarget change periodically such that content and targets may be added and removed. Assume though that the targets in DocTarget to which DocSource links are always there, though the specific location of the targets within the content of DocTarget may change.

Given the scenario just described, if the lookup method is ByName, even though DocTarget may undergo many changes and DocSource stays the same, the links from DocSource to DocTarget will always work.

If the lookup method is ByID this may not be the case. The IDs assigned to each target within DocTarget depend to some extent on all other external targets within DocTarget. If DocTarget gets a new target or one is removed, the target IDs for the other targets may change. As a result, the target IDs stored in DocSource for the targets in DocTarget may become invalid. However, if only the content in DocTarget changes, the target IDs will still be valid.

If the lookup method is ByOffset, then neither the content nor the targets in DocTarget may change if the links from DocSource to DocTarget are to remain valid.

The ByName lookup method, though requiring the most storage space, is the best method to use for documents that can change independently of one another. The ByOffset lookup method requires the least amount of storage space and is a good method to use for documents that will change together. The ByID lookup method generally requires only a modest amount of storage space compared to the ByName method and is a good method to use when only changes to the content, such as minor corrections, are expected to occur in an external document.
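For the DocSource and DocTarget scenario discussed above, where DocTarget changes independently, the specification in DocSource might look like the following sketch. The prefix value is a hypothetical choice, and the comments are explanatory only.

```xml
<ExternalDocuments>
  <Specification>
    <Path>DocTarget</Path>
    <!-- Hypothetical prefix: links whose URLs begin with DocTarget -->
    <Prefix>DocTarget</Prefix>
    <!-- Look up by name so that links survive changes to DocTarget.
         No MapFile element is given, since only ByID and ByOffset need one. -->
    <Lookup                 value="ByName"/>
  </Specification>
</ExternalDocuments>
```

If the two documents were always converted together, the Lookup value could instead be ByOffset, in which case a MapFile element pointing to the map file generated for DocTarget would also be required.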

Examples
This example specifies that the target lookup be by name.

  <Lookup                   value="ByName"/>

This example specifies that the target lookup be by ID.

  <Lookup                   value="ByID"/>

This example specifies that the target lookup be by offset.

  <Lookup                   value="ByOffset"/>