Skip to main content

Generating Data Matrix and PDF417 in Python

Β· 5 min read
Serhii Hrekov
software engineer, creator, artist, programmer, projects founder

While QR codes are great for marketing and consumers, industrial and government applications often require formats that are either more compact or hold significantly more data. This article covers how to generate Data Matrix (popular in healthcare and aerospace) and PDF417 (the standard for ID cards and shipping) using Python.

1. Data Matrix vs. PDF417: Which do you need?​

FormatAppearanceKey AdvantageCommon Use Case
Data MatrixSmall square/rectHighest data density; can be etched into metal.Medical devices, microchips, small parts.
PDF417Wide, stacked rowsHigh data capacity (up to 1.1KB); error correction.Driver's licenses, boarding passes, USPS labels.

2. The Universal Tool: treepoem​

For these advanced formats, the standard python-barcode library isn't enough. We use treepoem, which is a Python wrapper for the Barcode Writer in Pure PostScript (BWIPP). It supports virtually every barcode format in existence.

Prerequisite: You must have Ghostscript installed on your system for treepoem to render the images.

# Install the library
pip install treepoem

3. Generating a Data Matrix Code​

Data Matrix codes are preferred when space is at a premium. They can be read even with low contrast or if parts of the code are damaged.

import treepoem

def generate_data_matrix(data, output_file):
# Generate the image object
# 'datamatrix' is the symbology name in BWIPP
image = treepoem.generate_barcode(
barcode_type='datamatrix',
data=data
)

# Save the image
image.convert('1').save(output_file)
print(f"Data Matrix saved as {output_file}")

generate_data_matrix("PART-NX-7782-B", "industrial_part.png")

4. Generating a PDF417 (Stacked Barcode)​

PDF417 stands for "Portable Data File." It is essentially a "stacked" barcode that can encode huge amounts of data, making it ideal for offline verification (like reading a driver's license in a remote area).

import treepoem

def generate_pdf417(data, output_file):
# PDF417 is ideal for encoding long strings or binary data
image = treepoem.generate_barcode(
barcode_type='pdf417',
data=data,
options={"columns": 4} # You can define the width/columns
)

image.convert('1').save(output_file)
print(f"PDF417 Barcode saved as {output_file}")

# Example encoding a multi-line "ID card" string
id_data = "ID:9982\nEXP:2030-12-31\nCLASS:C\nAUTH:ADMIN"
generate_pdf417(id_data, "id_badge.png")

5. Summary: Which Library for Which Task?​

If you are building a complete identification suite, keep this hierarchy in mind:

  • Linear Barcodes (EAN, Code 128): Use python-barcode.
  • Standard QR Codes: Use qrcode or segno.
  • Industrial/Government (Data Matrix, PDF417, Aztec): Use treepoem.

πŸ“š Sources & Further Reading​

Barcode Generation (General):

  1. [1.1] python-barcode Documentation - Official guide for linear barcodes.
  2. [1.2] Pillow (PIL) Documentation - Required for image rendering in most barcode libraries.

Advanced 2D Formats (Data Matrix/PDF417): 3. [2.1] treepoem GitHub Repository - Main documentation for the BWIPP Python wrapper. 4. [2.2] Barcode Writer in Pure PostScript (BWIPP) - The engine behind treepoem that supports 100+ formats. 5. [2.3] GS1 Data Matrix Guideline - Global standard for Data Matrix usage in supply chains. 6. [2.4] PDF417 Standard (ISO/IEC 15438) - Technical specifications for stacked barcodes.

Implementation Guides: 7. [3.1] Real Python: Generating Barcodes in Python - A comparison of different library approaches.