database.sarang.net
UserID
Passwd
Database
ㆍDBMS
MySQL
PostgreSQL
Firebird
Oracle
Informix
Sybase
MS-SQL
DB2
Cache
CUBRID
LDAP
ALTIBASE
Tibero
DB 문서들
스터디
Community
공지사항
자유게시판
구인|구직
DSN 갤러리
도움주신분들
Admin
운영게시판
최근게시물
DBMS Tutorials 497 게시물 읽기
 News | Q&A | Columns | Tutorials | Devel | Files | Links
No. 497
Database Powered LaTeX Form Letters
작성자
정재익(advance)
작성일
2002-08-08 13:36
조회수
4,277
첨부파일: Latex-mail.zip (41,332bytes)

Powered LaTeX Form Letters

 

Use three common open source tools, LaTeX, PostgreSQL, and Python, to build database driven form letters

--------------------------------------------------------------------------------

 

Pages

1. The Objective

2. PostgreSQL database

3. LaTeX letter template

4. Python puts it together

5. The finished letter and printing

6. Other possibilities

7. The real purpose

 

The Objective

Information management is an important part of most businesses. Unfortunately, many businesses end up using tools that were not designed for data management to run their business (office-suite-data-management-syndrome). As a result,information is often difficult to find, impossible to use, or gets lost altogether. Ideally, it is best if information can be stored using specialized software that was specifically designed for the task---like databases. Many businesses are hesitant to use databases because once information is stored in a database the people using the information need the skills and the software to access it. This article will cover how to use Open Source software tools to create database driven form letters. The software that will be used is:

 

LaTeX (TeTeX v3.14159) - typesetting system capable of producing professional quality printed and electronic documents

PostgreSQL v7.1.3 - relational database system (MySQL or other database systems could also be used)

Python v2.1.2 - programming glue to stick it all together

 

PostgreSQL database

For this example a PostgreSQL database will be used to store customer information. For our simple letter we will only use the following customer information:

 

Title

First Name

Last Name

Address

City

State

Zip Code

 

There are a number of different PostgreSQL clients (graphical,web, and command line) that can be used to setup and manage the data in the PostgreSQL database. For this article we will use the stock command line psql client that comes with PostgreSQL. A script to create the table described in the article can be found here. For those not familiar with PostgreSQL and SQL should refer the PostgreSQL documentation .

 

To create our table and load some sample data into our database create a test database (or use an existing one) and run sample_cust_info.sql script.

 

psql sample_db < sample_cust_info.sql

 

This will create a table called sample_cust_info with the necessary fields for this example and load a few sample customer records.

 

LaTeX letter template

LaTeX is a document preparation system for producing high quality and professional looking documents. You can find more information about LaTeX here. LaTeX is a very powerful typesetting system, but it is not like writing a letter in a word processor. LaTeX allows authors to focus on the content of their document and not worry much about the final appearance (LaTeX takes care of that). LaTeX documents are written in plain text using special mark-up commands and syntax. These LaTeX files are then compiled into a printable format using the latex command. From one LaTeX source file you can create many different output formats. Some of the most common print formats are PDF, Postscript, and DVI. From the LaTeX source file you can also create non-printable documents like web pages, rich text, xml, or plain text files.

 

The disadvantages of using LaTeX is that a businesses must have someone who knows how to write LaTeX documents or pay to have some documents developed. Learning basic LaTeX is not difficult, but for people who are accustomed to using word processors this can be a big change.

 

 

The advantages of using LaTeX are:

 

text based---allows for easy versioning, searching, storing, and retrieving of the documents

many formats can be produced from one LaTeX source file (PDF,PS,XML,HTML,TXT,RTF,...)

produces high quality printed documents

format and layout of documents can be pre-defined and standardized for a consistent look

 

The LaTex letter template for this example is a very simple letter thanking a customer for expressing interest in our web site. Our template differs somewhat from a normal LaTeX document because we have inserted special variables like, %(firstname)s, into the document (template1.tex or a more complex example at template2.tex). Python will recognize these special variables and replace them with information from our database.

 

Python puts it together

 

Python is a very high level programming language that is fairly easy to learn. Python can be considered the super-glue of office automation. We will use python to combine the information from the database with the letter template. The python script uses the letter template to create a separate LaTeX letter file for each customer from from the database. These LaTeX letter files the python program generates can be used like any other LaTeX file. They can be used to generate printed documents, or they can be used to make "emailable" documents. They can also be stored back into the database (they are just text).

 

After the database is setup and the LaTeX letter template is ready we can run the process by doing:

 

psql -Ac "select * from sample_cust_info" sample_db | python latex_mail_merge.py -t template1.tex

 

After the command has finished there should be some LaTeX files named like output1.tex, output2.tex, ... These files look like:

 

\documentclass[ 10pt,letterpaper ]{letter}

 

\address{l4sb.com \\

12345 South Street \\

Dallas, TX 75219}

\signature{Eugene von Niederhausern\\

evonnied@l4sb.com}

\begin{document}

\begin{letter}{John Smith\\

12345 Some Street\\

Dallas, TX 76222}

\opening{Dear Mr. Smith}

We would thank you for expressing interest in the www.l4sb.com website.

We hope that you will benefit from the information we have collected. If you have \emph{any} other

questions please feel free to email use at \Large{l4sb@l4sb.com}

\closing{Sincerely}

\end{letter}

\end{document}

 

 

The python script takes input in a pipe (|) delimited text format, one line per customer record. We could actually take any pipe (|) delimited text file with the proper header record and do:

 

cat sample_input_file.txt | python latex_mail_merge.py

 

and get the same output. If you do not have access to a PostgreSQL database but would still like to try the example you can use the included sample input file. At this point we can manipulate and process the LaTeX text files.

 

The finished letter and printing

Once we have the LaTeX letter files we need to build the printable documents (output1.pdf or from template2, output2.pdf). This is done by using the LaTeX command:

 

latex output1.tex

 

This will output several files, but the one we are interested in is the output1.dvi. This is a printable document in LaTeX's special format. The output1.dvi file can be printed or converted to other printable formats. To print the file do:

dvips output1.dvi

 

Or produce a Postscript file by doing:

dvips -o output1.ps output1.dvi

 

and finally to a PDF document:

 

ps2pdf output1.ps

 

 

This may seem like a lot of extra steps just to get a printable document, but keep in mind that will a little extra scripting these steps can be automated (not done here to show how LaTeX works).

 

We can also take a shortcut and go directly from the LaTeX files to a printable PDF document by using a special version of LaTeX called pdflatex.

 

 

pdflatex output1.tex

 

This will directly create a output1.pdf file.

 

Other possibilities

This example is very simple to best illustrate how to use the tools. Just because this example is simple doesn't mean these tools can not be used in a similar way to create a more robust data driven document system. By taking advantage of the features of these tools we can also create more complex and different types of documents, deliver the same document in different ways (email,letter,fax), create different interfaces to the system (GUI,CLI, or web), or store searchable records of customer correspondence in a database.

 

LaTeX is a mature system and as a result there are many additional document types and add-ons that can be downloaded from CTAN (may or may not be free). Some of the different types of documents include envelopes/labels, invoices, articles, books, or exams. With LaTeX we can also produce several different document formats (html,PDF,text,PS,rtf,xml) from one LaTeX source. Because of the many different output formats we could deliver the document by email, fax or as a web page. With LaTeX we can also insert graphics, tables, references, and complex mathematical equations into a document.

 

We could easily keep a record of customer correspondence by storing the LaTeX files back into the database. These stored LaTeX files can then be used to reproduce or generate the same letter at a latter date. The correspondence records would remain searchable through the usual database tools, because they are just like any other text field in the database.

 

For example, suppose your company has entered into litigation with a former customer. The lawyers have requested all correspondence to the customers as well as correspondence to other customers relating to the topic in dispute. Because all of the information (including the correspondence) is stored in a database you can easily search the database for the records you need. Once you have located the information it is not difficult to regenerate the documents. You not only have the correspondence, but you also have the how, when, and where about the documents. This information can then be packaged and delivered to the attorneys working on the case. Because we can leverage the data management power of the relational database system finding information is not a difficult task.

 

The real purpose

Information management is one of the biggest challenges for businesses today. A fundamental goal of information management should be the separation of information from its presentation and the business logic that processes it. In this example just by the choice of tools we have already accomplished this goal. PostgreSQL stores the data. LaTeX provides the presentation. And, python handles the business logic. Each of these tools performs a specific task and they do it well (UNIX Philosophy 101).

 

Why would a business consider using these tools to produce form letters when many office suites will do the same thing? There is not an easy answer to this question and depending on the business needs an office suite might be a better solution. But, from an automation and data management perspective this solution has many benefits.

 

When dealing with large amounts of data or when a task is repeated frequently the more automation and hands-off processing the better. Almost all of the data manipulation and processing in this example is done with plain text. Only during the printing and final document output does LaTeX convert the document to a non-text format for printing. Text tends to be easier to manipulate and store than binary document formats. Text based processing is more dependable and easier to manage than overly complex binary data formats because it is so simple. Processing complex binary document formats is not necessary when a plain text format, like LaTeX, will accomplish the same thing. There is no need to drink from a fire hose. Automation will be easier and more dependable when processing plain text.

 

Graphical user interfaces are not conducive to office automation. A GUI implies that a user must be present to push the buttons and click the mouse. The programs in this example are not tied to a graphical interface. This does not mean that an interface cannot be created to interact, monitor and configure the process, but if automation is an objective, software that requires a graphical interface should be avoided. With these tools we are free to create the interface that best suites are needs. Web, GUI, and command line are all possible. The standard release of Python includes CGI capabilities and support for standard web document format interfaces. Python modules can be also found to create GUI's using Gtk, Qt, Tk, and MS Win32. We can also leverage many of the third party modules that have been created for Python.

 

With little effort we have successfully separated our data from its presentation and the business rules that drive the process. By keeping the processes separate we also benefit by not locking our data into software that was not designed for data storage. We free the data to be used in other applications or to be analyzed using the proper software. Another major benefit of storing information in a database system is security (accidental or deliberate). Most database systems include authentication and permission controls, while security is an after-thought for office suites. Many database systems offer ways to back up data on a regular basis so accidental loss of information is recoverable.

 

The tools used in this example, LaTeX,PostgreSQL and Python, are licensed as open source software and included with most general Linux distributions. These tools also run on other operating systems besides Linux, like FreeBSD and MS Windows (postgresql requires cygwin : http://www.cygwin.com/).

 

ATTACHMENTS

latex_mail_merg.py : http://www.l4sb.com/Members/Eugene%20von/latex_mail_merge.py/view

template2.tex : http://www.l4sb.com/Members/Eugene%20von/template2.tex/view

template1.tex : http://www.l4sb.com/Members/Eugene%20von/template1.tex/view

output1.pdf : http://www.l4sb.com/Members/Eugene%20von/output1.pdf/view

output2.pdf : http://www.l4sb.com/Members/Eugene%20von/output2.pdf/view

sample_input_data.txt : http://www.l4sb.com/Members/Eugene%20von/sample_input_data.txt/view

sample_cust_info.sql : http://www.l4sb.com/Members/Eugene%20von/sample_cust_info.sql/view

[Top]
No.
제목
작성자
작성일
조회
541관계형 데이터베이스에서 효율적인 데이터 연결 - 조인편
정재익
2002-09-07
9706
514Global Transactions - X/Open XA - Resource Managers
정재익
2002-08-23
4969
498Draft Specs of SQL 3 (PDF)
정재익
2002-08-08
4428
497Database Powered LaTeX Form Letters
정재익
2002-08-08
4277
494Design Transactions and Serializability
정재익
2002-08-06
4324
492파이썬에서 데이터베이스 사용하기
정재익
2002-08-05
12344
456ADO Connection Samples
정재익
2002-07-20
72166
Valid XHTML 1.0!
All about the DATABASE... Copyleft 1999-2024 DSN, All rights reserved.
작업시간: 0.016초, 이곳 서비스는
	PostgreSQL v16.4로 자료를 관리합니다