PSTextMerge merges lists of tabular data into text templates to create fully populated text files. The output files can be Web pages, XML files, RSS feeds or any other kind of text file. The input data can be obtained from a tab-delimited data file, directly from a Microsoft Excel (‘.xls’) spreadsheet, or from a number of other sources. You can record scripts and play them back, making it easy to regenerate your output text files when your data changes.
The input data can be sorted and filtered before being used to generate output, and these operations can be scripted as well.
Tab-delimited files can be easily exported from most spreadsheets, databases, address books, and many other programs.
PSTextMerge can also perform other operations with tabular data, extracting and merging data from multiple bookmarks files, address books, and other formats.
PSTextMerge was formerly known as TDF Czar.
PSTextMerge is written in Java and can run on any reasonably modern operating system, including Mac OS X, Windows and Linux. PSTextMerge requires a Java Runtime Environment (JRE), also known as a Java Virtual Machine (JVM), at version 6 or later. Visit www.java.com to download a recent version for most operating systems. Installation happens a bit differently under Mac OS X, but will generally occur fairly automatically when you try to launch a Java app for the first time.
Because PSTextMerge may be run on multiple platforms, it may look slightly different on different operating systems, and will obey slightly different conventions (using the CMD key on a Mac, vs. an ALT key on a PC, for example).
PSTextMerge Copyright 1999 - 2014 by Herb Bowie
PSTextMerge is open source software. Source code is available at GitHub.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
PSTextMerge also incorporates or adapts the following open source software libraries.
JExcelAPI — Copyright 2002 Andrew Khan, used under the terms of the GNU General Public License.
parboiled — Copyright 2009-2011 Mathias Doenitz, used under the terms of the Apache License, Version 2.0.
pegdown — Copyright 2009-2011 Mathias Doenitz, used under the terms of the Apache License, Version 2.0.
Xerces — Copyright 1999-2012 The Apache Software Foundation, used under the terms of the Apache License, Version 2.0.
Saxon — Copyright Michael H. Kay, used under the terms of the Mozilla Public License, Version 1.0.
Download the latest version from PowerSurgePub.com. Decompress the downloaded file. Drag the resulting file or folder into the location where you normally store your applications. Double-click on the jar file (or the application, if you've downloaded the Mac app) to launch.
PSTextMerge has two arguments which allow the program to be run completely from the command line, without any graphical user interface. The first argument is specified as “-q” and tells the program to run in “quiet mode” — that is, without any GUI. When this option is specified, the log file will be written to disk, as described in the Logging section.
The second argument specifies the location and name of a script file to be played, the location being relative to the PSTextMerge program.
Together, these two options allow PSTextMerge to be executed from a script or batch file, without any user interaction.
PSTextMerge works with any data that can be represented in columns and rows. Each column has a title. The number of columns, and the titles of the columns, are completely variable.
File operations may be accessed via the File menu.
Most of PSTextMerge's user interface elements are laid out in a series of tabs that proceed in a natural progression from left to right. Most of the functions performed within these tabs may also be triggered by their corresponding script commands. See the Special Functions section for a description of each tab.
This function allows PSTextMerge to populate a list to be used in later processing. The following controls are available.
Clicking on this button will allow you to select the file or directory to be input. Make sure that all of the following input options are properly set before pressing this button. This function can also be invoked via the File/Open Menu item or with the O shortcut key.
Note that, when a directory (aka folder) is selected, then all eligible files within that directory will be processed, and the resulting list will consist of the concatenation of all the rows of data generated from all of the eligible files.
The default input is a tab-delimited text file, which may have been saved or exported from a spreadsheet, database, address book or other tabular data source. But other options are available as well. Select the one you want from the drop-down list.
Following are descriptions of each of the currently available options.
This is a text file. Each line in the file must use tabs (or commas, if the file extension is '.csv') to separate each field, or column, in the line. The first line of the file must contain column headings.
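As a rough illustration of the format just described, the following Python sketch parses delimited text whose first line holds the column headings. It is not PSTextMerge's own reader; the function name `read_table` and the sample data are assumptions made for the example.

```python
import csv
import io

def read_table(text, extension=".txt"):
    """Parse delimited text whose first line holds the column headings."""
    # Commas separate fields only for '.csv' files; tabs are used otherwise.
    delimiter = "," if extension.lower() == ".csv" else "\t"
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]

sample = "First Name\tLast Name\nHerb\tBowie\n"
rows = read_table(sample)
print(rows[0]["First Name"])  # prints Herb
```

Each data line becomes one record keyed by the headings from the first line.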
This is a text file format that can be used to help plan events for a club, such as an alumni club.
See the Club Planner specification for details.
Setting this option will cause the program to treat the input file as a Microsoft Excel spreadsheet, in Excel 97 - 2004 format (with an '.xls' extension). The first or only worksheet (tab) will be accessed. The first row will be expected to contain column headings, with data in following rows. The first blank row will terminate the list. Each row in the spreadsheet (after the first, containing headings) will be treated as a data record, and each column will be treated as a separate field. Columns containing hyperlinks will also generate fields containing the hyperlinks, and named by appending "link" to the column heading. For example, a column named "ISBN" could have its content accessed with the variable "isbn" and its link accessed with the variable "isbnlink".
Setting this option will also cause the program to treat the input file as a Microsoft Excel spreadsheet, in Excel 97 - 2004 format (with an '.xls' extension). The first or only worksheet (tab) will be accessed. The first blank row will terminate the list. With this option, each row in the spreadsheet will be returned as a single data field, identified by a variable name of "Table Row". The data returned will include beginning and ending td tags for each column, with appropriate formatting and cell dimensions and hyperlinks, mimicking the format of the Excel spreadsheet as closely as possible. The data returned does not include beginning or ending tr tags.
When you specify a file directory as your data source, each entry in that directory will then be treated as if it were a single record, or line, or row, from a data file. The "Maximum Directory Depth" field below will control the depth to which sub-directories are read.
See the File Directory specification for details.
Setting this option will cause the program to treat the input file as HTML. It will expect the file to contain bookmarks, using nested lists to indicate a folder hierarchy. A table will be created with the following columns. Many bookmark managers, including popular browsers, can save or export bookmarks into this format.
Setting this option will cause the program to treat the input file as HTML. It will expect the file to contain bookmarks, using varying levels of heading tags to indicate a folder hierarchy. A table will be created with the same columns as the HTML Bookmarks using Lists, described above. The contents of the varying heading levels will be placed in Category 1 through 6.
Setting this option will cause the program to treat the input file as HTML. It will expect the file to contain a table, using table, tr, th and/or td tags. The first row in the table will be expected to contain column headings, with following rows containing the corresponding data.
Note that many HTML pages are laid out using tables, not to present columns and rows of data, but to format the page in a desired fashion. Tables containing column headings and following data may often be embedded in such layout tables. In order to try to separate the two, PSTextMerge in this mode will look for the first table cell (within th or td tags) containing 1 to 40 characters of text, with the idea that text beyond either extreme is not likely to be a true column heading. During this search for the first cell of a data table, table cells with colspan or rowspan parameters greater than 1 will also be ignored.
In some cases, however, you may need to edit the prospective input file with your favorite text editor, and delete lines preceding and following the table containing the data you are interested in, saving the resulting file as a separate file to be input to PSTextMerge. If a true first column heading is longer than 40 characters, then you will also need to reduce its length to the acceptable 1-40 range.
Note that for any HTML input option, character entities found within HTML text will be translated to their equivalent ASCII characters. For now, translation is only provided for characters that are not platform-specific: "&nbsp;" (non-breaking space), "&lt;" (less than sign), "&gt;" (greater than sign), "&amp;" (ampersand) and "&quot;" (double quotation marks). Entities may be specified using mnemonics or their numeric equivalents.
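The entity translation described above might look roughly like this in Python. This is only a sketch, not the program's actual code: the table covers the five characters named above in their mnemonic and decimal numeric forms, and mapping the non-breaking space to a plain space is an assumption.

```python
import re

# Mnemonic and decimal numeric forms of the five supported entities.
# Translating the non-breaking space to a plain space is an assumption.
ENTITIES = {
    "nbsp": " ", "#160": " ",
    "lt": "<", "#60": "<",
    "gt": ">", "#62": ">",
    "amp": "&", "#38": "&",
    "quot": '"', "#34": '"',
}

def translate_entities(text):
    """Replace the supported entities and leave any others untouched."""
    def swap(match):
        return ENTITIES.get(match.group(1), match.group(0))
    return re.sub(r"&(#?\w+);", swap, text)

print(translate_entities("Tom &amp; Jerry &lt;b&gt; &copy;"))
# prints: Tom & Jerry <b> &copy;
```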
Setting this option will cause the program to treat the input file as HTML. It will extract all of the links within this file, identifying the from page, the link type, and to file, for each link found. Link type may be linkhref, imgsrc, or ahref, for the corresponding tag and attribute combinations.
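To make the three link types concrete, here is a small Python sketch built on the standard `html.parser` module. The class name `LinkExtractor` and the sample page are hypothetical, and the sketch makes no claim about how PSTextMerge itself parses HTML.

```python
from html.parser import HTMLParser

# (tag, attribute) combinations and the link type each one produces.
LINK_TYPES = {
    ("link", "href"): "linkhref",
    ("img", "src"): "imgsrc",
    ("a", "href"): "ahref",
}

class LinkExtractor(HTMLParser):
    """Collect (from page, link type, to file) rows from one HTML page."""

    def __init__(self, from_page):
        super().__init__()
        self.from_page = from_page
        self.rows = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            link_type = LINK_TYPES.get((tag, name))
            if link_type and value:
                self.rows.append((self.from_page, link_type, value))

extractor = LinkExtractor("index.html")
extractor.feed('<a href="about.html">About</a><img src="logo.png">')
print(extractor.rows)
```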
Selecting this option will cause the input file to be treated as an iTunes Library.xml file.
This file can generally be found in your Music / My Music folder, and then within your iTunes folder. See Apple Support articles Where are my iTunes files located? and What are the iTunes library files? for details.
This information can be used to publish your list of albums to a Web site, or to import your list of albums into a database, such as Bento.
This input routine will extract and summarize information about albums contained in your iTunes library. For each unique album title, the following fields will be extracted.
If you select a folder containing Markdown files, then this input option will extract metadata about each file and make them available in a list.
See the Markdown Metadata specification for details.
Setting this option will cause the program to treat the input file as XML. Each field will be returned as a separate record. Field names will be stored in columns 1 - 4, with column names "Name1" through "Name4", and field values will be stored in column 5, with column name "Data".
Setting this option will cause the program to treat the input file as XML. Each set of fields at the same level will be returned as a separate record, with the assumption that the XML file consists of a series of records with the same fields. Field names will be stored in columns 1 - 4, with column names "Name1" through "Name4", and field values will be stored in column 5, with column name "Data".
Clicking on this check box causes the program to look for a special data dictionary file to accompany the tab-delimited data source. The data dictionary file must have the same name as the primary file, but with a file extension of ".dic".
The data dictionary file itself is in tab-delimited format. Each row in the file (after the column headings) represents one column in the primary data file.
Note that the easiest way to create a data dictionary for a file is to input the file to PSTextMerge (without a dictionary), then Output it with a dictionary (see the Output section for details). You can then edit the resulting file, only modifying the values you wish to change.
A dictionary may have the following columns describing the fields in the primary file.
This is the column heading that identifies this field in the primary data file. It would normally include mixed-case, spaces and punctuation to make it readable.
This is really a lowest-common-denominator form of the column heading, with capitals, spaces and punctuation all removed. PSTextMerge uses it internally so that slight variations in capitalization, spacing or punctuation are not treated as separate names.
If the names previously given should be treated as an alias for another field name, then this column should contain the primary name for the field, and the remaining columns for the alias should be left blank. This feature doesn’t do much in this release of PSTextMerge, but should become more meaningful in later releases.
The default value here is "DataFormatRule", which causes no special formatting of the input data. But specifying another value can cause the input data to be formatted and even converted according to any of the following special rules.
If a valid function name is specified here, then the field value will be calculated using the specified function. The following functions are available. The functions make use of the parameters specified in the following columns.
Parm | Used As |
---|---|
Parm1 | Name of the file containing the lookup table. |
Parm2 | Name of the field in the lookup table that should be used as the table’s key. |
Parm3 | A value of "Yes" or "True" indicates that the key comparison should be case-sensitive. |
Parm4 | Name of the field in this file that should be used as the lookup key. |
Parm5 | Name of the field in the lookup file that should be returned and used as the value for this field. |
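The way the five parameters in the table fit together can be sketched in Python as follows. This is an illustration only: loading the lookup file named by Parm1 is skipped (the rows are given inline), the function names are hypothetical, and returning a blank value when no match is found is an assumption.

```python
def build_lookup(table_rows, key_field, value_field, case_sensitive=False):
    """Index lookup-table rows by key_field (Parm2), storing value_field (Parm5)."""
    index = {}
    for row in table_rows:
        key = row[key_field]
        index[key if case_sensitive else key.lower()] = row[value_field]
    return index

def lookup(index, record, record_key_field, case_sensitive=False):
    """Look up the record's key field (Parm4); Parm3 controls case sensitivity."""
    key = record[record_key_field]
    # A blank result for a missing key is an assumption of this sketch.
    return index.get(key if case_sensitive else key.lower(), "")

states = [{"Code": "AZ", "State Name": "Arizona"},
          {"Code": "WA", "State Name": "Washington"}]
index = build_lookup(states, "Code", "State Name")
print(lookup(index, {"Name": "Pat", "State": "az"}, "State"))  # prints Arizona
```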
Parm1 through Parm5: each of these five columns is used as input to the specified function.
Select one of the following radio buttons to indicate how and whether you want new data to be merged with existing data.
This is the default. The next data source to be input will overlay any data previously input.
The next input data source to be opened will be merged with the current data visible on the View Tab. Existing sort keys and filters may need to be reapplied. The new data and the old data should have column names that are at least partially overlapping, if not identical.
The next input data source to be opened will be merged with the current data visible on the View Tab. The program will not look for column headings in the new input file, but will instead assume that the column names for the new file are the same, and in the same order, as those currently visible on the View Tab. Existing sort keys and filters may need to be reapplied.
Note that this merge option is useful for programs (such as AppleWorks) that do not include column headings in their export files. If a separate file is created containing only column headings, then it can be input first, followed by the headerless file, with this merge option.
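The headerless merge can be pictured with the short Python sketch below, in which the column names already on the View tab are applied, in order, to each line of the new file. The function name and sample data are made up for illustration.

```python
def merge_headerless(columns, current_rows, new_lines):
    """Append tab-delimited lines, reusing the existing column names in order."""
    merged = list(current_rows)
    for line in new_lines:
        values = line.rstrip("\n").split("\t")
        merged.append(dict(zip(columns, values)))
    return merged

columns = ["Name", "City"]
current = []  # a headings-only file was input first, so there are no data rows yet
merged = merge_headerless(columns, current, ["Pat\tTucson\n", "Lee\tBoston\n"])
print(merged[1]["City"])  # prints Boston
```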
If you are about to read a file directory, then this field controls whether sub-directories are read, and to what depth. A value of 1 is the default, and indicates that only files and directories in the specified directory will be listed, with no sub-directory contents. A value of 2 indicates one level of sub-directories, and so forth. Use the Increment and Decrement buttons to change the depth.
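The effect of the depth value can be sketched in Python like this. The function `list_directory` is a hypothetical stand-in for the program's directory reader, and the temporary folder exists only to demonstrate it.

```python
import os
import tempfile

def list_directory(root, max_depth=1):
    """List entries under root, descending at most max_depth levels."""
    entries = []

    def visit(folder, depth):
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            entries.append(path)
            # Only open a sub-directory if the depth limit allows it.
            if os.path.isdir(path) and depth < max_depth:
                visit(path, depth + 1)

    visit(root, 1)
    return entries

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    open(os.path.join(root, "a", "note.txt"), "w").close()
    shallow = [os.path.relpath(p, root) for p in list_directory(root, 1)]
    deeper = [os.path.relpath(p, root) for p in list_directory(root, 2)]

print(shallow)  # depth 1: only the top-level entry "a"
print(deeper)   # depth 2: "a" plus the entries directly inside it
```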
This tab allows the user to view a tab-delimited data file that has been opened for input. Subsequent tabs, such as Sort and Filter, will affect the data that is displayed on the View tab.
The user can scroll from left to right and up and down, assuming there is more data available than will fit within the current window. Columns can also be resized by clicking on their right borders and dragging. Each initial column size will be approximately proportional to the largest data field within the column.
This tab allows the user to sort the data that has been input. Sorting is accomplished by using the following buttons that appear on this tab.
This is a drop down list of all the columns in your data. Select the next field name on which you wish to sort, starting with the most significant field and proceeding to less significant ones.
This is a drop down list. Pick either ascending (lower values towards the top, higher values towards the bottom) or descending. This sequence applies to the currently selected field name (see above).
Pressing this button will add the field and sequence currently specified to the current sort parameters being built. The sort parameters added will appear in the text area shown below on this tab. After pressing the Add button, the user may go back and specify additional fields to be used in the sort criteria.
Pressing this button will clear the sort parameters being built, so that you can start over.
Once your desired sort parameters have been completely built, by pressing the Add button one or more times, you must press the Set button to cause your parameters to be applied to the data you are currently processing.
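A multi-field sort of this kind, with a separate sequence for each field, can be sketched in Python as shown below. The parameter list mirrors what the Add button builds up; the function name and data are illustrative assumptions.

```python
def sort_rows(rows, sort_params):
    """sort_params lists (field, sequence) pairs, most significant first."""
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant
    # field first leaves the most significant field in control.
    for field, sequence in reversed(sort_params):
        result.sort(key=lambda row: row[field],
                    reverse=(sequence == "descending"))
    return result

rows = [{"Last": "Bowie", "Year": "1999"},
        {"Last": "Adams", "Year": "2001"},
        {"Last": "Bowie", "Year": "2014"}]
ordered = sort_rows(rows, [("Last", "ascending"), ("Year", "descending")])
print([r["Year"] for r in ordered])  # prints ['2001', '2014', '1999']
```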
After setting the desired sort sequence, you may optionally press this button to combine records with duplicate sort keys. The following buttons allow you to adjust the parameters controlling the combination process.
Record combination can be done with varying degrees of tolerance for data loss. Select one of the following radio buttons.
If you specify some data loss to be acceptable, then this field may be used to specify a minimum number of data (non-key) fields that must be lossless (equal or one blank) before combination will be allowed to take place. This should be used if the sort keys are not guaranteed to establish uniqueness. Specifying a non-zero value here may help to prevent completely disparate records from being inadvertently combined. For example, names can be used to identify people, but two different people may have the same name.
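The "lossless" test and the minimum count can be illustrated with the following Python sketch. It assumes just two tolerance settings (no loss allowed, or loss allowed subject to the minimum) and keeps the first record's value when two values conflict; both choices are assumptions, as are the function names.

```python
def is_lossless(a, b):
    """Two values combine without loss if they are equal or one is blank."""
    return a == b or a == "" or b == ""

def combine(first, second, key_fields, allow_loss=False, min_lossless=0):
    """Return the combined record, or None if combination is refused."""
    data_fields = [f for f in first if f not in key_fields]
    lossless = sum(is_lossless(first[f], second[f]) for f in data_fields)
    if not allow_loss and lossless < len(data_fields):
        return None
    if allow_loss and lossless < min_lossless:
        return None
    # On conflict the first record's value wins (an assumption).
    return {f: first[f] or second[f] for f in first}

first = {"Name": "Pat Smith", "Phone": "555-1234", "City": ""}
second = {"Name": "Pat Smith", "Phone": "", "City": "Tucson"}
combined = combine(first, second, ["Name"])
print(combined["City"])  # prints Tucson

other = {"Name": "Pat Smith", "Phone": "555-9999", "City": "Boston"}
refused = combine(combined, other, ["Name"], allow_loss=True, min_lossless=1)
print(refused)  # prints None
```

In the second call the two records share a name but disagree on every data field, so requiring at least one lossless field keeps two different people from being merged.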
This tab allows the user to filter the data that has been input, selecting some rows to appear and others to be suppressed. Filtering is accomplished by using the following buttons that appear on this tab.
This is a drop down list of all the columns in your data. Select the next field name on which you wish to filter.
This is a drop down list. Pick the operator that you want to use to compare your selected field to the following value. The following operators are available.
This is the value to which the selected field will be compared. Only rows that satisfy this comparison will be visible after the filtering operation. You may type in a desired value, or select from the drop down list. The drop down list will consist of all the values found in this field within your data.
Pressing this button will add the field, operator and value currently specified to the current filter parameters being built. The filter parameters added will appear in the text area shown below on this tab. After pressing the Add button, the user may go back and specify additional fields to be used in the filtering criteria.
Pressing this button will clear the filter parameters being built, so that you can start over.
Once your desired filter parameters have been completely built, by pressing the Add button one or more times, you must press the Set button to cause your parameters to be applied to the data you are currently processing.
If you specify more than one filter parameter, then you may specify whether all of them must be true (and) or only any one of them must be true (or) in order to satisfy the filtering criteria. This choice applies to the entire set of criteria, so this need only be selected once before pressing the Set button.
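The and/or choice can be sketched in Python as follows. The operator names in `OPERATORS` are hypothetical placeholders, not the program's actual operator list, and the rows are sample data.

```python
import operator

# Hypothetical operator names, for illustration only.
OPERATORS = {
    "equals": operator.eq,
    "not equals": operator.ne,
    "contains": lambda field, value: value in field,
}

def filter_rows(rows, params, mode="and"):
    """Keep rows satisfying all ('and') or any ('or') of the filter parameters."""
    satisfied = all if mode == "and" else any
    return [row for row in rows
            if satisfied(OPERATORS[op](row[field], value)
                         for field, op, value in params)]

rows = [{"Artist": "Dylan", "Year": "1965"},
        {"Artist": "Beatles", "Year": "1965"},
        {"Artist": "Dylan", "Year": "1975"}]
params = [("Artist", "equals", "Dylan"), ("Year", "equals", "1965")]
print(len(filter_rows(rows, params, "and")))  # prints 1
print(len(filter_rows(rows, params, "or")))   # prints 3
```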
This tab allows the user to save the current data to an output disk file. The data will be saved in the sequence set by the current sort parameters, if any. If a current filter is in effect, then only visible, filtered data will be saved. If any data transformation or formatting occurred on input, then the data will be saved in its new form. The following controls appear on this tab.
Clicking on this button allows you to select a location and name for the output file, and then writes the file (with optional dictionary) as specified. This function can also be invoked via the File/Save Menu item or with the S shortcut key.
Clicking on this check box causes the program to write a special data dictionary file to accompany the tab-delimited data source being output. The data dictionary file will have the same name as the primary file, but with a file extension of “.dic”. Clicking this check box again will cause it to revert to its original state, in which the dictionary will not be saved. See the Input section for a complete description of the dictionary file.
This tab allows the user to merge the currently loaded data into a template file, producing one or more output text files. The greatest anticipated use for this function is to create Web pages, based on input template files containing a mixture of HTML tags and special PSTextMerge tags. This allows tab-delimited data to be periodically merged into an HTML template that determines the format in which the data will be displayed on a Web site.
This screen contains the following buttons.
PSTextMerge supports the concept of a central template library where you can store reusable templates. The initial location for this folder is the “templates” folder within the PSTextMerge Folder that comes as part of the software distribution. However, this button can be used to allow you to select another folder as your template library. After installation of PSTextMerge, you may wish to copy the templates folder to another location, perhaps within your home folder, or your documents folder, and then use this button to specify that new location.
This button allows you to specify the location and name of the template file you wish to use. (This file must have previously been created using a text editor.) This function may also be invoked via the Template/Open Menu item or with the T shortcut key.
This button also opens a template file, but uses your template library as the starting location.
This button processes the template file you have selected, and creates whatever output file(s) you have specified in the template file. The function may also be invoked via the Template/Generate Menu item or with the G shortcut key.
See the Template File Format specification for details.
This tab allows the user to record and playback sequences of PSTextMerge commands. The following buttons and menu commands are available.
Clicking on this button once causes the program to begin recording your subsequent actions as part of a script that can be edited and played back later. This function may also be invoked via the Script/Record Menu item, or with the R shortcut key. You will need to specify the location and name for your script file. It is recommended that “.tcz” be used as a file extension for PSTextMerge script files (the original name for the program was “TDF Czar”). This will be supplied as a default if no extension is specified by the user.
Clicking on this button causes recording of the current script to stop. This function may also be invoked via the Script/End Recording Menu item, or with the E shortcut key. The script file will be closed, and can now be opened for editing, if desired, using the text editor, spreadsheet or database program of your choice.
This button allows you to select a script file to be played back. This function may also be invoked via the Script/Play Menu item, or with the P shortcut key. At the end of a script’s execution, the input file options will be reset to their initial default values, to ensure consistent execution when multiple scripts are executed consecutively.
This button allows you to replay the last script file either played or recorded. Using this button allows you to bypass the file selection dialog. It can be handy if you are developing, modifying or debugging a series of actions and associated files. This function may also be invoked via the Script/Play Again Menu item, or with the A shortcut key.
Clicking this button will allow you to select a script to be automatically played every time the application is launched.
After selecting a script to play automatically, the label of this button will change to “Turn Autoplay Off”.
Clicking this button will allow you to select a folder of scripts that you want easy access to. A new tab will then be added to the interface, labeled “Easy”. The new tab will contain a button for every script found in the folder. Clicking on a button will then play the corresponding script.
After selecting an Easy Play folder, the label of this button will change to “Turn Easy Play Off”.
This menu item, within the Script menu, allows you to select a recently played script to run. The most recent 10 scripts will be available to select from.
See the Script File Format specification for details.
This tab allows the user to control logging operations. PSTextMerge writes information about certain events to a log file. Reviewing this data can be useful, especially if the program is not performing as expected. The following sections appear on this tab.
This determines where the log output is sent. You can select any of the following options.
This determines the quantity and severity of messages that will appear in the log. You have the following options.
Input data files are often passed to the logger, primarily so that significant events that are data-related can include a display of the data record that generated the event. Checking the “Log All Data?” box will result in all data passed to the logger being written to the log. This may be helpful if the log is otherwise showing insufficient data to let you understand the workings of the program.
The following commands are available. Note that the first two commands open local documentation installed with your application, while the next group of commands will access the Internet and access the latest program documentation, where applicable.
Program History -- Opens the program's version history in your preferred Web browser.
User Guide -- Opens the program's user guide in your preferred Web browser.
Check for Updates -- Checks the PowerSurgePub web site to see if you're running the latest version of the application.
PSTextMerge Home Page -- Opens the PSTextMerge product page on the World-Wide Web.
Reduce Window Size -- Restores the main PSTextMerge window to its default size and location. Note that this command has a shortcut so that it may be executed even when the PSTextMerge window is not visible. This command may sometimes prove useful if you use multiple monitors, but occasionally in different configurations. On Windows in particular, this sometimes results in PSTextMerge opening on a monitor that is no longer present, making it difficult to see.
The following file formats are used by PSTextMerge.
This section describes the contents of a template file, used for producing formatted output from a table of rows and columns.
This program will look for two sorts of special strings embedded within the template file: variables and commands.
Beginning with version 3.0, PSTextMerge will recognize either of two sets of command and variable delimiters automatically. The choice of delimiters will be triggered by the first command beginning delimiters encountered. The new delimiters are generally recommended, since they are more likely to be treated kindly by various HTML editors on the market when you are editing your template files.
Meaning | Original Delimiters | New Delimiters |
---|---|---|
Start of Command | << | <? |
End of Command | >> | ?> |
Start of Variable | << | =$ |
End of Variable | >> | $= |
Start of Variable Modifiers | & | & |
Variables will be replaced by values taken from the corresponding columns of the current data record, or from an internal table of global variables. Variables must be enclosed in the chosen delimiters. Each variable name must match a column heading from the data file, or a global name specified in a SET command. The comparison ignores case (upper or lower), embedded spaces and embedded punctuation when looking for a matching column heading. So a column heading of "First Name" will match with a variable of "firstname", for example.
A variable, unlike a command, can appear anywhere within the template file, and need not be isolated on a line by itself. More than one variable can appear on the same line. Variables can be used within PSTextMerge commands, as well as other places within the template file.
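The name matching and substitution described above can be sketched in Python as follows, using the new variable delimiters. The sketch ignores variable modifiers and global variables, and it leaves unknown variables untouched; those simplifications, and the function names, are assumptions.

```python
import re

def common_name(name):
    """Reduce a name to lower-case letters and digits only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def substitute(line, record):
    """Replace =$name$= variables with values from the current data record."""
    values = {common_name(k): v for k, v in record.items()}

    def swap(match):
        # Unknown variables are left as they are (an assumption).
        return values.get(common_name(match.group(1)), match.group(0))

    return re.sub(r"=\$(.+?)\$=", swap, line)

record = {"First Name": "Herb", "Last Name": "Bowie"}
print(substitute("<p>=$firstname$= =$Last Name$=</p>", record))
# prints: <p>Herb Bowie</p>
```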
The following special variables are predefined and available for substitution, no matter what data source is being used.
A variable can be optionally followed (within the variable delimiters) by a modifier indicator and one or more modifiers. The default modifier character is the ampersand (&).
The letters "U" or "L" (in either upper- or lower-case) will indicate that the variable is to be converted, respectively, to upper- or lower-case. If the letter "i" is also supplied (again in either upper- or lower-case), then only the first character of the variable value will be converted to the requested case. (The letter "i" stands for "initial".)
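A minimal Python sketch of the case modifiers follows. It considers only the letters U, L and I and is not the program's own parser; the function name is hypothetical.

```python
def apply_case(value, modifiers):
    """Apply the U, L and I case modifiers to a variable value."""
    mods = modifiers.lower()
    if "u" in mods:
        convert = str.upper
    elif "l" in mods:
        convert = str.lower
    else:
        return value
    if "i" in mods:
        # Only the initial character is converted.
        return convert(value[:1]) + value[1:]
    return convert(value)

print(apply_case("herb", "ui"))  # prints Herb
print(apply_case("Herb", "U"))   # prints HERB
print(apply_case("Herb", "li"))  # prints herb
```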
The letter "X" will cause selected special characters to be translated to their equivalent XML entities. This is recommended, for example, when publishing an RSS (Really Simple Syndication) feed.
The letter "H" will cause selected special characters to be translated to their equivalent HTML entities.
Convert a URL to an HTML anchor tag with that URL as the href value.
Remove HTML break (br) tags from the string.
The letter "B" will cause the file extension, including the period, to be removed from a file name. This can be used, for example, to generate an output file name with the same name as the input data file (using the variable name "datafilename"), but with a different extension.
Converts a string to a conventional, universal file name by making all letters lower-case, converting white space to hyphens, and removing any odd characters.
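One way to picture this conversion is the Python sketch below. Which characters count as "odd" is not spelled out here, so the sketch assumes that anything other than lower-case letters, digits, periods, underscores and hyphens is removed.

```python
import re

def to_file_name(text):
    """Lower-case the text, hyphenate white space, and drop other characters."""
    text = re.sub(r"\s+", "-", text.strip().lower())
    # The set of characters kept here is an assumption of this sketch.
    return re.sub(r"[^a-z0-9._-]", "", text)

print(to_file_name("My Summer Vacation!"))  # prints my-summer-vacation
```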
Remove awkward punctuation characters.
The letter "R", in combination with a length modifier (see below), will cause the variable to be truncated to the given length, truncating characters on the left and keeping characters on the right.
One or more digits following the modifier indicator will be interpreted as the length to which the variable should be truncated or padded. If the length modifier is shorter than the variable length, then by default characters will be truncated on the right (and preserved on the left) of the variable to bring it to the specified length (if it is desired to keep characters on the right, then also use the "R" modifier, described above). If the length modifier is longer than the initial variable length, then the variable will be padded with zeroes on the left to bring it to the specified length.
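The truncation and padding rules can be sketched in Python as follows, with `keep_right` standing in for the "R" modifier. The function name is hypothetical.

```python
def apply_length(value, length, keep_right=False):
    """Truncate or zero-pad a value to exactly the given length."""
    if len(value) > length:
        if keep_right:
            return value[len(value) - length:]  # keep characters on the right
        return value[:length]                   # default: keep the left
    return value.rjust(length, "0")             # pad with zeroes on the left

print(apply_length("42", 5))                             # prints 00042
print(apply_length("PSTextMerge", 4))                    # prints PSTe
print(apply_length("PSTextMerge", 4, keep_right=True))   # prints erge
```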
An underscore character ("_") following the modifier indicator will cause all spaces in the variable to be replaced by underscores. This can be useful when creating a file name, for example.
Any punctuation character other than an underscore following the modifier indicator will be interpreted as a separator that will be placed before the current variable, if the variable is non-blank, and if the preceding variable was also non-blank and also marked by a similar variable modifier. A space will be added after the separator, and before the current variable, if the punctuation is not a forwards or backwards slash ("/" or "\"). This is an easy way to list several variables on a single line, separating non-blank ones from others with commas (or other punctuation).
If a variable may be interpreted as a series of "words," with the words delimited by white space, punctuation, or transitions from lower to upper case ("two words", "TWO_WORDS" or "twoWords"), then these variable modifiers may be used to change the way in which the words are delimited.
Letter | Meaning |
---|---|
c | This letter must begin the string, to indicate that modified word demarcation is desired. This should be followed by three letters, each with one of the following values. The first occurrence indicates what should be done with the first letter of the variable; the second occurrence indicates what should be done with the first letter of all other words; the third occurrence indicates what should be done with all other letters in the variable. |
u | This letter indicates that upper-case is desired. |
l | This letter indicates that lower-case is desired. |
a | This letter indicates that the case should be left as-is. |
- | Any character(s) following the 'c', other than 'u', 'l' or 'a', will be used as delimiters separating each word. |
For example, if the template file contained the following:
AM32;
And the name variable was equal to:
HERB BOWIE
Then the resulting name in the output text file would be:
Herb Bowie
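The word demarcation rules can be illustrated with a small Java sketch. It is not the program's actual implementation, and all names in it are hypothetical. It assumes the three case letters are passed separately, in the order described in the table above, and that the delimiter is a plain string placed between words.

```java
import java.util.ArrayList;
import java.util.List;

final class WordDemarcationSketch {

    /** Splits on white space, punctuation, and transitions from lower to upper case. */
    static List<String> splitWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        char previous = ' ';
        for (char c : text.toCharArray()) {
            boolean caseBreak = Character.isLowerCase(previous) && Character.isUpperCase(c);
            if ((!Character.isLetterOrDigit(c) || caseBreak) && word.length() > 0) {
                words.add(word.toString());
                word.setLength(0);
            }
            if (Character.isLetterOrDigit(c)) {
                word.append(c);
            }
            previous = c;
        }
        if (word.length() > 0) {
            words.add(word.toString());
        }
        return words;
    }

    /** 'u' = upper-case, 'l' = lower-case, anything else ('a') = leave as is. */
    static char applyCase(char c, char rule) {
        switch (rule) {
            case 'u': return Character.toUpperCase(c);
            case 'l': return Character.toLowerCase(c);
            default:  return c;
        }
    }

    static String demarcate(String text, char first, char wordStart, char others, String delimiter) {
        StringBuilder out = new StringBuilder();
        List<String> words = splitWords(text);
        for (int w = 0; w < words.size(); w++) {
            if (w > 0) {
                out.append(delimiter);
            }
            String word = words.get(w);
            for (int i = 0; i < word.length(); i++) {
                char rule = (i > 0) ? others : (w == 0 ? first : wordStart);
                out.append(applyCase(word.charAt(i), rule));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demarcate("HERB BOWIE", 'u', 'u', 'l', " "));
    }
}
```

In this sketch, upper-case for the first letter of every word, lower-case for all other letters, and a space as the delimiter turn "HERB BOWIE" into "Herb Bowie".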
A string of characters indicating how the variable is to be formatted. The formatting string, if specified, should follow any other variable modifiers. Any character other than those listed above will cause the remainder of the variable modifiers to be treated as a formatting string. Currently, a formatting string is valid only for dates -- either for the special variable today, or for any variable date in "mm/dd/yy" format.
A date formatting string follows the normal rules for Java date formatting. One or more occurrences of an upper-case "M" indicates a month, a lower-case "y" is used for a year, and a lower-case "d" is used for the day of the month. An upper-case "E" can be used for the day of the week. Generally, the number of occurrences of each letter you specify will be used to indicate the width of the field you want ("yyyy" for a 4-digit year, for example). Specifying more than two occurrences of "M" indicates you want the month represented by letters rather than numbers, with 4 or more occurrences indicating you want the month spelled out, and 3 occurrences indicating you want a three-letter abbreviation.
See below for full definition of allowable characters and their meanings.
Symbol | Meaning | Presentation | Example |
---|---|---|---|
G | era designator | Text | AD |
y | year | Number | 1996 |
M | month in year | Text & Number | July & 07 |
d | day in month | Number | 10 |
h | hour in am/pm (1~12) | Number | 12 |
H | hour in day (0~23) | Number | 0 |
m | minute in hour | Number | 30 |
s | second in minute | Number | 55 |
S | millisecond | Number | 978 |
E | day in week | Text | Tuesday |
D | day in year | Number | 189 |
F | day of week in month | Number | 2 (2nd Wed in July) |
w | week in year | Number | 27 |
W | week in month | Number | 2 |
a | am/pm marker | Text | PM |
k | hour in day (1~24) | Number | 24 |
K | hour in am/pm (0~11) | Number | 0 |
z | time zone | Text | Pacific Standard Time |
' | escape for text | Delimiter | |
'' | single quote | Literal | ' |
The count of pattern letters determines the format.
(Text): 4 or more pattern letters--use full form, < 4--use short or abbreviated form if one exists.
(Number): the minimum number of digits. Shorter numbers are zero-padded to this amount. Year is handled specially; that is, if the count of 'y' is 2, the Year will be truncated to 2 digits.
(Text & Number): 3 or over, use text, otherwise use number.
Any characters in the pattern that are not in the ranges of ['a'..'z'] and ['A'..'Z'] will be treated as quoted text. For instance, characters like ':', '.', ' ', '#' and '@' will appear in the resulting time text even if they are not enclosed within single quotes.
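To see how these pattern letters behave, the following Java sketch uses the standard `java.text.SimpleDateFormat` class to read a date in "mm/dd/yy" form and write it back out with a chosen pattern. The class and method names are hypothetical, and the use of `Locale.US` is an assumption made so that month and day names come out in English; this is not a copy of the program's own formatting code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

final class DateFormatSketch {

    /** Parses a date written as mm/dd/yy and re-formats it with the given pattern. */
    static String reformat(String mmddyy, String pattern) {
        try {
            Date date = new SimpleDateFormat("MM/dd/yy", Locale.US).parse(mmddyy);
            return new SimpleDateFormat(pattern, Locale.US).format(date);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date in mm/dd/yy form: " + mmddyy, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reformat("07/04/12", "EEEE, MMMM d, yyyy"));
        System.out.println(reformat("07/04/12", "EEE, MMM d, yyyy"));
        System.out.println(reformat("07/04/12", "yyyy-MM-dd"));
    }
}
```

Four or more letters give the full text form ("Wednesday", "July"), three give the abbreviation ("Wed", "Jul"), and two occurrences of "M" give a zero-padded month number.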
All commands must be enclosed in the chosen delimiters. In addition, all commands must appear on lines by themselves. Command names can be in upper- or lower-case. Each command may have zero or more operands. Operands may be separated by any of the following delimiters: space, comma (','), semi-colon (';') or colon (':'). Operands that contain any of these delimiters must be enclosed in single or double-quotation marks.
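The operand rules can be pictured with a short Java sketch that splits the text found between the command delimiters. It is an illustration only: the class and method names are hypothetical, and treating the command name as the first item in the returned list is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class OperandSketch {

    /** Splits on space, comma, semi-colon or colon, honoring single and double quotation marks. */
    static List<String> split(String text) {
        List<String> operands = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0;         // the open quotation mark, or 0 when outside quotes
        boolean quoted = false; // true if the current operand was quoted
        for (char c : text.toCharArray()) {
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;          // closing quotation mark
                } else {
                    current.append(c);  // delimiters inside quotes are kept
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
                quoted = true;
            } else if (c == ' ' || c == ',' || c == ';' || c == ':') {
                if (current.length() > 0 || quoted) {
                    operands.add(current.toString());
                }
                current.setLength(0);
                quoted = false;
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0 || quoted) {
            operands.add(current.toString());
        }
        return operands;
    }

    public static void main(String[] args) {
        System.out.println(split("output \"my file.html\""));
        System.out.println(split("set global = 0"));
    }
}
```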
The following commands are recognized. They are presented in the typical sequence in which they would be used.
<?delims new delimiters?>
<?output "filename.ext"?>
<?set global = 0?>
<?nextrec?>
<?include "filename.ext" ?>
<?ifchange ?>
<?if ?>
<?definegroup group-number ?>
<?ifendgroup group-number?>
<?ifendlist group-number?>
<?ifnewlist group-number?>
<?ifnewgroup group-number?>
<?else?>
<?endif?>
<?loop?>
If used at all, this command should be the first command in the template file. This command overrides the standard delimiters used to recognize the beginnings and ends of commands and variables, for the remainder of the current template file. The command can have one to five operands. Each operand will become a new delimiter. They should be specified in the following order.
Note that, when using this command, this command itself must use the standard delimiters. The new delimiters should only begin to be used on following lines.
This command names and opens the output file. The single operand is the name of the output file. filename.ext should be the desired name of your output file. This command would normally be the first line in your template file. Subsequent template records will be written to the output file. Note, however, that the filename can contain a variable name. In this case, the output command would immediately follow the nextrec command, and a new output file would be opened for each tab-delimited data record.
This command can define a global variable and set its value. This command would normally have three operands: the name of the global variable, an operator, and a value.
One intended use for the SET command is to support a line counter. By initializing the value to 0, and then adding to it whenever an output line is generated, the IF command can be used to check for page overflow (in a table column, for example), and then start a new page or column, resetting the counter to 0 again.
Another common use for the SET command is to preserve record variables in global variables so that they will be available within an IFENDGROUP block.
This command indicates the beginning of the code that will be written out once per data record. Lines prior to the nextrec command will only be written out once.
This command allows you to include text from another file into the output stream being generated by the template.
An optional operand of "copy" will ensure that the include file is included without conversion; otherwise, if the input and output file extensions are different, and are capable of conversion, the input file will be converted to the output file's format (for example, Markdown or Textile can be converted to html).
Markdown conversion will be done using the Pegdown processor, using the options for typographic conversions (as with SmartyPants) and table generation.
If converting from Markdown, then an optional operand of "nometa" will cause metadata lines to be skipped when generating the HTML output; otherwise, they will be included.
The filename may include variables, allowing you to tailor the included content based on one or more fields from your input data source. This is especially useful when you would like to include output from another template in the output generated by this template (effectively combining outputs from two separate templates into a single output). If an include file is not found, then it will simply be skipped and processing will continue, with a log message to note the event.
For any conversion resulting in HTML, a pseudo-tag of <toc> can be used to generate a table of contents based on following heading tags. An optional attribute of "from" can be used to specify the beginning of a range of heading levels to be included; an optional attribute of "through" or "thru" can be used to specify the end of a range of heading levels to be included. See the following example.
<toc from="h2" thru="h4" />
The ifchange command can be used to test a variable to see if it has a different value than it did on the last data record. If the variable has changed, then the following lines up to the closing endif command will be subjected to normal output processing. If the variable has not changed, then following lines will be skipped until the closing endif command is encountered. This command can be used to generate some special header information whenever a key field changes. Note that only one variable can be used with ifchange commands in one template file, since the value of any ifchange command is simply compared to the variable for the last ifchange command encountered.
The if command can be used to test a variable to see if it is non-blank. If the variable is non-blank, then the following lines up to the closing endif command will be subject to normal output processing. If the variable is blank, then following lines will be skipped until the closing endif command is encountered. In this case, the first and only operand would be the variable to be tested.
The if command can also be used to test a variable to compare it to one or more constants. In this case, the command would have three or more operands: the name of the variable, a logical operator, and one or more values.
This is the first of five commands that define key fields and then conditionally write output when there is a break on any of those fields. Up to ten group break fields can be defined. Each must be assigned a number from 1 to 10. Numbers should be assigned sequentially beginning with 1. Input data should normally be sorted by the same fields used in any definegroup commands. Definegroup commands should precede ifendgroup and ifnewgroup commands, and should generally be specified in ascending order by group number. The definegroup command has two operands.
Group Number. This must be a number from 1 to 10. Numbers should be assigned sequentially beginning with 1. Lower-numbered groups are considered more major than higher-numbered groups, in the sense that lower-numbered group breaks will automatically trigger higher-numbered group breaks.
Variable Name. This is the name of the key field variable.
This is the second of the five group commands. Lines following this command and preceding the next group or endif command will be written to the output file at the end of a group of records sharing a common value for this key field. Ifendgroup commands should follow definegroup commands and precede ifnewgroup commands, and should generally be specified in descending order by group number. The ifendgroup command has one operand.
Note that references to record variables within an IFENDGROUP block will retrieve the data from the record causing the break (i.e., the first record in the new group), not the last record in the group just ended. Use the SET command to save data in global variables if you need to later access it when a group break has been detected.
This is the third of the five group commands. Lines following this command and preceding the next group or endif command will be written to the output file at the end of a list of records containing this key field. The end of a list will be triggered by a change in key values at the next higher level, or by a record containing blanks at the current group level. Ifendlist commands should follow ifendgroup commands and precede ifnewlist commands, and should generally be specified in descending order by group number. The ifendlist command has one operand. Note that the ifendlist and ifnewlist commands can generally be used to insert HTML tags to end a list and begin a list.
Note that references to record variables within an IFENDLIST block will retrieve the data from the record causing the break (i.e., the first record in the new group), not the last record in the group just ended. Use the SET command to save data in global variables if you need to later access it when a list break has been detected.
This is the fourth of the five group commands. Lines following this command and preceding the next group or endif command will be written to the output file at the beginning of a new list of records at this group level. Ifnewlist commands should follow definegroup, ifendgroup and ifendlist commands, should precede ifnewgroup commands, and should generally be specified in ascending order by group number. The ifnewlist command has one operand.
This is the fifth of the five group commands. Lines following this command and preceding the next group or endif command will be written to the output file at the beginning of a group of records sharing a common value for this key field. Ifnewgroup commands should follow all other group commands, and should generally be specified in ascending order by group number. The ifnewgroup command has one operand.
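The way group breaks cascade can be sketched in Java. This is a simplified model rather than the program's own logic: the names are hypothetical, each record is reduced to its array of key field values (group 1 first), the input is assumed to be sorted on those keys, and list breaks and end-of-data handling are left out.

```java
import java.util.ArrayList;
import java.util.List;

final class GroupBreakSketch {

    /**
     * Returns the lowest group number (starting at 1) whose key differs between two records,
     * or 0 if no key changed. A break at one level also breaks every higher-numbered group.
     */
    static int firstBreak(String[] previousKeys, String[] currentKeys) {
        for (int g = 0; g < currentKeys.length; g++) {
            if (previousKeys == null || !previousKeys[g].equals(currentKeys[g])) {
                return g + 1;
            }
        }
        return 0;
    }

    /** Lists the end-group and new-group events, in order, for a sorted list of records. */
    static List<String> trace(List<String[]> records) {
        List<String> events = new ArrayList<>();
        String[] previous = null;
        for (String[] current : records) {
            int first = firstBreak(previous, current);
            if (first > 0) {
                if (previous != null) {
                    // end groups in descending order, using the saved keys of the prior record
                    for (int g = current.length; g >= first; g--) {
                        events.add("end group " + g + " (" + previous[g - 1] + ")");
                    }
                }
                // begin groups in ascending order
                for (int g = first; g <= current.length; g++) {
                    events.add("new group " + g + " (" + current[g - 1] + ")");
                }
            }
            events.add("record");
            previous = current;
        }
        return events;
    }

    public static void main(String[] args) {
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"Fruit", "Apple"});
        records.add(new String[] {"Fruit", "Pear"});
        records.add(new String[] {"Veg", "Leek"});
        trace(records).forEach(System.out::println);
    }
}
```

The sketch keeps the previous record's keys in a variable so that the end-group events can name the group that just ended, which plays the same role as saving values with the SET command.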
The else command terminates the scope of its corresponding if, ifchange, ifendgroup or ifnewgroup command, and applies the opposite logical condition to the following template lines.
The endif command terminates the scope of its corresponding if, ifchange, ifendgroup or ifnewgroup command.
This command indicates the end of the code that will be written out once per data record. Lines after the loop command will be written out once per output file created, at the end of each file.
The script file is a tab-delimited text file, and you can edit one using your favorite tool for such things. You can create one completely from scratch if you want, but it is usually easiest to record one first, and then edit the results.
The script file has the following columns.
Following is a complete list of all the allowable forms for script commands. Constants are displayed in normal type. Variables appear in italics. Blank cells indicate fields that are not applicable to a particular command, and therefore can be left blank or empty. Forward slashes are used to separate alternate values: only one of them must appear (without the slash) in an actual script command. Most of the values correspond directly to equivalent buttons on the tabs, as described elsewhere in this user guide. The one non-intuitive value is probably the Filter values for the andor object: True sets “and” logic on, while False sets “or” logic on.
Note that file names may begin with the literal “PATH” surrounded by “#” symbols. When recording a script, the program will automatically replace the path containing the script file with this literal. In addition, upwards references from the location of the script file will be indicated by two consecutive periods for each level in the folder hierarchy. On playback, the reverse decoding will occur. In effect this means that files within the same path structure as the script file, or a sub-folder, will have their locations identified relative to the location of the script file. Files on a completely different path will have their locations identified with absolute drive and path information. The overall effect of this is to make a script file, along with the input files referenced by the script file, portable packages that can be moved from one location to another, or executed with different drive identifiers, and still execute correctly. Normally all of this will be transparent to the user.
Similarly, the literal “#TEMPLATES#” will be used as a placeholder for the path to the current template library, as set with the Set Template Library button on the Template tab.
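The substitution of the path literal on recording, and its reversal on playback, might be pictured with the following Java sketch. It is a simplified illustration with hypothetical names: it works on plain strings, assumes the script folder is written with a trailing "/", and does not attempt the upward references marked by consecutive periods. The exact text the program writes after the literal is not confirmed here.

```java
final class ScriptPathSketch {

    static final String PATH_LITERAL = "#PATH#";

    /** When recording: replace the script folder prefix with the literal. */
    static String encode(String scriptFolder, String fileName) {
        if (fileName.startsWith(scriptFolder)) {
            return PATH_LITERAL + fileName.substring(scriptFolder.length());
        }
        return fileName; // a file on a different path keeps its absolute location
    }

    /** On playback: replace the literal with the folder that now holds the script. */
    static String decode(String scriptFolder, String recorded) {
        if (recorded.startsWith(PATH_LITERAL)) {
            return scriptFolder + recorded.substring(PATH_LITERAL.length());
        }
        return recorded;
    }

    public static void main(String[] args) {
        String recorded = encode("/home/me/site/", "/home/me/site/data/list.txt");
        System.out.println(recorded);
        System.out.println(decode("/Volumes/backup/site/", recorded));
    }
}
```

Because only the portion of the name below the script folder is stored, the same recorded value resolves correctly after the script and its input files are moved together.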
The “epubin” and “epubout” actions require some additional description, since they have no correlates on the Script tab just described. The former identifies a directory containing the contents of an e-book in the EPUB format; the latter identifies the “.epub” file to be created using that directory as input.
module | action | modifier | object | value |
---|---|---|---|---|
input | open | url | merge/blank | url name |
input | open | file | merge/blank | file name |
input | open | dir | merge/blank | directory name |
input | open | html1 | merge/blank | file name |
input | open | html2 | merge/blank | file name |
input | open | html3 | merge/blank | file name |
input | open | xml | merge/blank | file name |
input | open | xls | merge/blank | file name |
input | epubin | dir | blank | directory name |
input | epubout | file | blank | file name |
sort | add | Ascending/Descending | field name | |
sort | clear | | | |
sort | set | params | | |
combine | add | dataloss | integer | |
0 = no data loss, 1 = one record overrides, 2 = allow concatenation | | | | |
combine | add | precedence | integer | |
+1 = later overrides earlier, -1 = earlier overrides later | | | | |
combine | add | minnoloss | integer | |
combine | set | params | | |
filter | set | andor | True/False | |
filter | add | operator | field name | comparison value |
filter | clear | | | |
filter | set | params | | |
output | set | usedict | True/False | |
output | open | file name | | |
template | open | file | file name | |
template | generate | | | |
The following special column headings are predefined for file directory entries.
The following special column headings are predefined for metadata gathered from Markdown files.
Metadata is provided in the spirit of, although not in complete conformance with, the MultiMarkdown syntax. That is, special lines are expected at the top of the file, each line starting with a key, followed by a colon and then a value, as in the following example.
Title: Markdown Metadata
Author: Herb Bowie
Tags: Java, Documentation
Date: July 4, 2012
Note that there are two variants of this file type, one simply labeled "Markdown Metadata" and the other labeled "Markdown Metadata Tags". The first has only one row per Markdown file, and identifies all tags for that file. The second has one row per tag per file, and identifies only one tag at a time. The first file format would normally be used for a simple index of the files, while the second format would be used to generate an index by tag.
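A minimal Java sketch of reading such metadata lines, and of producing one row per tag, might look like this. The names are hypothetical and the details are assumptions: metadata is taken to end at the first line without a colon, and tags are taken to be separated by commas, as in the example above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MarkdownMetadataSketch {

    /** Reads "Key: Value" lines from the top of the text, stopping at the first line that does not fit. */
    static Map<String, String> parse(String text) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String line : text.split("\n")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                break; // a blank line or ordinary text ends the metadata block (assumption)
            }
            metadata.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        return metadata;
    }

    /** One row per tag per file, in the spirit of the "Markdown Metadata Tags" variant. */
    static List<String[]> rowsPerTag(String fileName, Map<String, String> metadata) {
        List<String[]> rows = new ArrayList<>();
        for (String tag : metadata.getOrDefault("Tags", "").split(",")) {
            if (!tag.trim().isEmpty()) {
                rows.add(new String[] {fileName, tag.trim()});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "Title: Markdown Metadata\nAuthor: Herb Bowie\n"
                + "Tags: Java, Documentation\nDate: July 4, 2012\n\nBody text.";
        Map<String, String> metadata = parse(text);
        System.out.println(metadata);
        for (String[] row : rowsPerTag("metadata.md", metadata)) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```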
This is a special text file format to allow easy representation of an outline structure. Indentation is used to indicate outline levels. The first character of the first line is assumed to be the "bullet" character that will subsequently identify all list items. Blank lines indicate paragraph breaks. A line beginning with "a:" (or simply with "http:") identifies a URL to be associated with the preceding outline item.
This structure is then converted to a columnar data structure, with one row for each paragraph/list item, and with the following columns.
Information about events and other items for club consideration is stored in text files, with one event/item per file.
Note that there are two variants of this file, one labeled "Club Planner" and the other labeled "Club Notes". The first has only one row per event, while the second has one row per note header in the Notes field for each event.
The text files should be organized into folders whose names provide additional data about the items.
The event files should be placed into something like the following folder structure.
The text files themselves consist of a series of field names, each followed by a colon, and then followed by the field data on the same and/or successive lines.
The following lines can be used as a template for creating a Club Planner event.
/*
Fill out this form for a new event, then save it with the name of the event. Use a plain
text editor to complete the form. Fill out as many fields as you can, and leave the
rest blank. Note that text such as this, between a slash asterisk and an asterisk slash,
is a comment and not content. Text following a pair of slashes on a line is also
treated as a comment. Comments may be deleted once the form is filled out, or left in.
*/
Type:
/*
Type should be one of the following:
Active / Board / Career Networking / Close Meeting /
Collaboration w/Other Clubs / Communication / Community / Culture /
Family / Finance / Membership / Open Meeting / Organization /
Scholarship / Social / Sports / Student Connections
*/
What: // Enter a short title for the event
When: // e.g., Sat Mar 31 at 8 PM
Where: // Name of the venue, street address, city state and zip
Who: // Name of primary contact at email address
Why: // Justification for approving this event
Teaser: // Two or three sentences hitting the high points
Blurb: // One or more paragraphs with additional details
Cost: // e.g., $43 per person
Purchase: // Instructions for purchasing tickets
Tickets: // How purchasers will receive tickets
Quantity: // e.g., Block of 20 seats
Planned Income: // e.g., $43 x 20 = $860
Planned Expense: // e.g., $43 x 20 = $860
Planned Attendance: // How many people do we expect to participate?
Actual Income:
Actual Expense:
Actual Attendance:
Recap: // How did the event go? Any lessons learned?
ID: // Enter the article ID from our Web site
Link: // URL with more info about the event
Venue: // Link with more info about the venue
Image: // URL for an image about the event
News Image: // URL for an image to use in the email newsletter
Discuss: // Points to be discussed at our next board meeting
Notes:
/*
Copy and paste notes about the event. Precede each note with a header line indicating
who it came from, on what date and via what medium. For example:
-- Will Dorchak, Feb 16, via email
Follow the header with a blank line, and then the text of the note, using blank lines
to separate paragraphs.
*/
Field names and definitions follow.
Type: This is the general type of the event. The following values are suggested.
What: A brief descriptive title for the event.
When: An indication of the date and time that the event will be held, in a format emphasizing human readability. This need not be a complete date. It need not and generally should not contain the year, since this can be inferred from the operating year identified in the higher level folder. If an exact date is known, then this field should generally start with a three-character abbreviation for the day of the week. Three-character abbreviations for the month are also recognized and encouraged. Following are perfectly good examples of dates.
Where: The location of the event, including the name of the venue and its address.
Who: Who is assigned to plan, coordinate and host the event. Can include multiple names. Can include email addresses and phone numbers.
Why: Why does the club think it would be a good idea to host the event? Why do we think this would be an event deserving of the club's resources?
Teaser: One to three sentences describing the event. Not intended to provide complete information, but intended to pique the reader's interest and motivate them to read further.
Blurb: Additional information about the event. Need not repeat information in the teaser, and need not repeat additional event details available from other fields, such as When and Where. This field can contain multiple paragraphs, separated by blank lines. Markdown formatting will be applied to this section.
Cost: The cost per person to attend the event. If the event is free, then leave this field blank.
Purchase: Instructions on how to purchase tickets to the event, if any.
Tickets: For purchasers, information on how they are to receive the tickets.
Quantity: Number of seats or tickets available for the event; maximum number of attendees.
Planned Income: The amount of money the club plans to receive for the event. For this and the following dollar amount fields, multiple dollar figures may be interspersed with descriptive words. "$20 x 40" will result in a planned income of $800.00, for example.
Planned Expense: The amount of money planned/budgeted to be spent on the event.
Planned Attendance: The number of attendees built into the club's planning assumptions.
Actual Income: The club's actual income for the event.
Actual Expense: The club's actual expenses for the event.
Actual Attendance: The actual number of people who attended the event.
Recap: A brief summary of how the event went. Can include lessons learned from the event.
ID: After the event has been added to the club web site, the ID assigned to the page by the Content Management System should be entered here. This might be identified in the URL for the event as the "articleid", as in "articleid=17", meaning that an ID of "17" should be entered here.
Link: A URL pointing to a Web page with more information about the event.
Venue: A URL pointing to a Web page with more information about the venue for the event.
Image: A URL pointing to an image that can be used to help advertise the event.
News Image: A URL pointing to an image suitable for use in our newsletter.
Discuss: Identification of any issues to be discussed at an upcoming club meeting.
Notes: One or more blocks of text with information about the event. This field can contain multiple paragraphs, separated by blank lines. Markdown formatting will be applied to this section.
Each block of text should be preceded by a line similar to the following example.
-- AAUM on Feb 21 via email
Note that each such header line contains the following elements:
The following fields will be calculated and placed in the resulting list.
Year: The operating year for the event, if available from one of the enclosing folders (see section above on folder structure).
Status: The event's status, based on its immediately enclosing folder name.
Seq: This is intended as a sort key, to create an agenda for a club meeting. A type of "Open Meeting" will result in a sequence of 1; a type of "Finance" will result in a sequence of 2; a type of "Communication" will result in a sequence of 8; a type of "Close Meeting" will result in a sequence of 9; any other type will result in a sequence of 5.
YMD: This will contain the event's date, or as much of it as is known, in a predictable "yyyy-mm-dd" format that can be used for sorting. The information here is calculated based on the club's operating year and the When field.
File Name: The name of the file, without any folder information, and without a file extension.
Blurb as HTML: The blurb field, converted from Markdown to HTML, suitable for insertion into a Web page or email.
Over/Under: The amount by which the club's actual income and expenses differed from the club's planned income and expenses. A positive amount indicates the club did better than expected, with lower expenses and/or higher income than planned; a negative amount means the club did worse than expected.
Finance Projection: The projected impact of this event on the club's finances, calculated based on the planned and actual income and expenses. A negative number decreases the club's funds, while a positive number increases them. This number is calculated using the planned values if no actuals are available, or the actual values once they are entered.
Short Date: This is a short, human-readable form of the date. It includes a three-letter abbreviation for the day of the week, a three-letter abbreviation for the month, and the 2-digit day of the month.
Notes as HTML: The entire notes field, converted from Markdown to HTML, suitable for insertion into a Web page or email.
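Three of the calculated fields above (Seq, Over/Under and Finance Projection) follow simple rules that can be sketched in Java. This is an illustration based only on the descriptions given here, with hypothetical names. It assumes that type names are matched exactly, that amounts have already been reduced to single figures, and that a `null` actual amount means no actuals have been entered.

```java
import java.math.BigDecimal;

final class ClubFieldsSketch {

    /** Agenda sort key derived from the event type. */
    static int seqFor(String type) {
        switch (type.trim()) {
            case "Open Meeting":  return 1;
            case "Finance":       return 2;
            case "Communication": return 8;
            case "Close Meeting": return 9;
            default:              return 5;
        }
    }

    /** Positive when income was higher and/or expenses were lower than planned. */
    static BigDecimal overUnder(BigDecimal plannedIncome, BigDecimal plannedExpense,
                                BigDecimal actualIncome, BigDecimal actualExpense) {
        return actualIncome.subtract(plannedIncome)
                .add(plannedExpense.subtract(actualExpense));
    }

    /** Income less expense, using actual values once entered and planned values otherwise. */
    static BigDecimal financeProjection(BigDecimal plannedIncome, BigDecimal plannedExpense,
                                        BigDecimal actualIncome, BigDecimal actualExpense) {
        boolean haveActuals = actualIncome != null && actualExpense != null;
        BigDecimal income = haveActuals ? actualIncome : plannedIncome;
        BigDecimal expense = haveActuals ? actualExpense : plannedExpense;
        return income.subtract(expense);
    }

    public static void main(String[] args) {
        BigDecimal planned = new BigDecimal("860");
        System.out.println(seqFor("Finance"));
        System.out.println(overUnder(planned, planned, new BigDecimal("900"), new BigDecimal("800")));
        System.out.println(financeProjection(planned, planned, null, null));
    }
}
```

For an event planned at $860 of income and $860 of expense, actuals of $900 and $800 give an Over/Under of 100 and a Finance Projection of 100; before any actuals are entered, the projection is 0.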
The following additional fields are extracted for the Club Notes file format, with one row for each note header.
Note For: The date of the note, in yyyy-mm-dd format, suitable for sorting and/or filtering, as extracted from the note header.
Note From: The source of the note, as extracted from the note header.
Note Via: The medium by which the note was communicated, as extracted from the note header.
Note: The text of the note, following the note header.
Note as HTML: The text of the note, converted from Markdown to HTML, suitable for insertion into a Web page or email.