In today’s fast-paced business environment, efficiency is key. For back-office teams handling multiple PDF reports generated from different systems, automating the process of merging and renaming these files can save a significant amount of time and reduce errors. In this blog post, we’ll explore how to use the pdftk command in Linux to merge multiple PDF files and rename the final output based on customer names. We’ll also provide a step-by-step guide to implementing this automation using bash scripts.
The Problem ⚠️
Imagine you have a situation where multiple systems produce PDF reports for different customers. Before distributing these reports, you need to merge all the system-generated PDF reports for each customer. These reports are named after the customer’s ID and are stored in various subfolders (e.g., Part1, Part2, Part3). Your task is to:
- Merge the PDF parts for each customer into a single PDF file.
- Store the merged file in a Final subfolder.
- Rename the merged file from the customer ID to the customer’s name.
- Automate this process for multiple customers using a CSV file that maps customer IDs to their names.
The Solution 💡
We’ll use the pdftk command-line tool to merge the PDF files and bash scripts to automate the process. Here’s how you can achieve this:
Step 1: Install pdftk
If you don’t already have pdftk installed, you can install it using your package manager. For example, on Ubuntu, you can run:
sudo apt-get install pdftk
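The package name may differ between distributions and releases, so it can be worth confirming that the tool is actually on your PATH before going further. The following is a minimal sketch; the helper name require_tool is hypothetical and not part of pdftk.

```shell
#!/bin/bash
# require_tool is a hypothetical helper: it succeeds when the named
# command can be found on PATH and prints a message to stderr otherwise.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "Missing required tool: $1" >&2
    return 1
}

# Usage: stop early when pdftk is not installed.
# require_tool pdftk || exit 1
```

Placing a check like this at the top of a script makes a missing installation fail with a clear message instead of partway through a batch.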
Step 2: Create the Bash Scripts
We’ll create two bash scripts: GenReport.sh and GenReportInBulk.sh.
GenReport.sh
This script will merge the PDF parts for a given customer ID and store the merged file in the Final folder.
#!/bin/bash
# Filename: GenReport.sh
# Merge the three PDF parts for the customer ID passed as the first argument
pdftk A="./Part1/$1.pdf" B="./Part2/$1.pdf" C="./Part3/$1.pdf" cat A B C output "./Final/$1.pdf"
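Because the merge depends on all three parts being present, a pre-flight check can give a clearer message when one system has not produced its report yet. This is a sketch only, assuming the Part1, Part2, and Part3 layout used in this post; the helper name missing_parts is hypothetical.

```shell
#!/bin/bash
# missing_parts is a hypothetical helper: it prints each expected part
# file that does not exist for the given customer ID, one path per line.
missing_parts() {
    for dir in Part1 Part2 Part3; do
        [ -f "./$dir/$1.pdf" ] || echo "./$dir/$1.pdf"
    done
}

# Usage inside GenReport.sh, before the pdftk call:
# missing=$(missing_parts "$1")
# if [ -n "$missing" ]; then
#     echo "Cannot merge $1, missing: $missing" >&2
#     exit 1
# fi
```

Exiting with a non-zero status here lets the bulk script report the failure for that customer and carry on with the next one.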
GenReportInBulk.sh
This script will read a CSV file containing customer information, call GenReport.sh for each customer, and rename the merged PDF file based on the customer’s name.
#!/bin/bash
# Filename: GenReportInBulk.sh

# CSV file with the columns: PDF filename, Customer Name, Customer ID
input="sample_data.csv"

# Loop through each line in the CSV file
while IFS=, read -r pdf_filename customer_name customer_id
do
    # Skip the header line
    if [ "$pdf_filename" != "PDF filename" ]; then
        # Pass the customer ID to GenReport.sh and check that the merge succeeded
        if ./GenReport.sh "$customer_id"; then
            # Rename the merged file (customer_id.pdf) in the Final subfolder to the value in column 1
            mv "Final/${customer_id}.pdf" "Final/$pdf_filename"
        else
            echo "Failed to generate report for customer ID: $customer_id" >&2
        fi
    fi
done < "$input"
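CSV files exported from spreadsheets often use Windows line endings or lack a final newline, and either can make a plain read loop mangle the customer ID or skip the last row. The sketch below shows one way to read the same three columns more defensively; list_customers is a hypothetical helper name, and it still assumes that no field contains a comma.

```shell
#!/bin/bash
# list_customers is a hypothetical helper: for every data row of the CSV
# file passed as $1 it prints "customer_id<TAB>pdf_filename".
list_customers() {
    # tr removes carriage returns left by Windows line endings.
    # "|| [ -n ... ]" keeps a final row that has no trailing newline.
    tr -d '\r' < "$1" |
    while IFS=, read -r pdf_filename customer_name customer_id || [ -n "$pdf_filename" ]; do
        # Skip the header row and blank lines (customer_name is read but unused).
        case "$pdf_filename" in
            ""|"PDF filename") continue ;;
        esac
        printf '%s\t%s\n' "$customer_id" "$pdf_filename"
    done
}

# Usage:
# list_customers sample_data.csv
```

The output can be fed into the same merge-and-rename steps, which keeps the parsing concerns in one place.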
Step 3: Prepare the CSV File
The CSV file (sample_data.csv) should map each customer ID to the customer’s name and to the filename to use for the merged PDF. Here’s an example:
PDF filename,Customer Name,Customer ID
john_doe.pdf,John Doe,12345678
jane_smith.pdf,Jane Smith,87654321
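Since a malformed row would only show up as a failed merge later on, it may help to validate the file first. The following sketch uses awk to flag rows that do not have exactly three fields and customer IDs that appear more than once; check_csv is a hypothetical helper name, and the check assumes the three-column layout shown above.

```shell
#!/bin/bash
# check_csv is a hypothetical helper: it reports rows without exactly three
# comma-separated fields and repeated customer IDs, and returns non-zero
# when any problem was found.
check_csv() {
    awk -F, '
        NR == 1    { next }                 # skip the header row
                   { sub(/\r$/, "") }       # tolerate Windows line endings
        $0 == ""   { next }                 # ignore blank lines
        NF != 3    { printf "line %d: expected 3 fields, found %d\n", NR, NF; bad = 1; next }
        seen[$3]++ { printf "line %d: duplicate customer ID %s\n", NR, $3; bad = 1 }
        END        { exit bad }
    ' "$1"
}

# Usage:
# check_csv sample_data.csv || exit 1
```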
Step 4: Give Execute Permissions to the Scripts
Before running the scripts, make sure they have execute permissions:
chmod +x GenReport.sh
chmod +x GenReportInBulk.sh
Step 5: Run the Script
Now, you can run the GenReportInBulk.sh script to process all the customer reports:
./GenReportInBulk.sh
Expected Folder Structure After Execution
Before running the script, your folder structure might look like this:
.
├── Final
├── GenReportInBulk.sh
├── GenReport.sh
├── Part1
│   ├── 12345678.pdf
│   └── 87654321.pdf
├── Part2
│   ├── 12345678.pdf
│   └── 87654321.pdf
├── Part3
│   ├── 12345678.pdf
│   └── 87654321.pdf
└── sample_data.csv
After running the script, the Final folder will contain the merged and renamed PDF files:
.
├── Final
│   ├── jane_smith.pdf
│   └── john_doe.pdf
├── GenReportInBulk.sh
├── GenReport.sh
├── Part1
│   ├── 12345678.pdf
│   └── 87654321.pdf
├── Part2
│   ├── 12345678.pdf
│   └── 87654321.pdf
├── Part3
│   ├── 12345678.pdf
│   └── 87654321.pdf
└── sample_data.csv
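With more than a handful of customers, checking the Final folder by eye gets tedious. As a rough sketch, the expected filenames can be read from column 1 of the CSV and compared against what is on disk; missing_outputs is a hypothetical helper name.

```shell
#!/bin/bash
# missing_outputs is a hypothetical helper: it prints every filename from
# column 1 of the CSV ($1) that is absent from the output folder ($2).
missing_outputs() {
    # Drop carriage returns, skip the header row, keep only column 1.
    tr -d '\r' < "$1" | tail -n +2 | cut -d, -f1 |
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue
        [ -f "$2/$name" ] || echo "$name"
    done
}

# Usage after running GenReportInBulk.sh:
# missing_outputs sample_data.csv Final
```

An empty result means every customer listed in the CSV has a merged file in the output folder.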
Conclusion
By using pdftk and a couple of bash scripts, you can automate the process of merging and renaming PDF files, saving your back-office team valuable time and reducing the risk of errors. This approach can be easily adapted to more complex scenarios, such as handling more parts of the report.
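As one possible adaptation for reports with a varying number of parts, the sketch below collects every Part*/ID.pdf with a glob instead of naming three folders. It relies on pdftk accepting a plain list of input files before cat; the helper name merge_all_parts and the DRY_RUN variable are hypothetical additions for illustration.

```shell
#!/bin/bash
# merge_all_parts is a hypothetical helper: it merges every ./Part*/<id>.pdf
# into ./Final/<id>.pdf. Set DRY_RUN=1 to print the command instead of running it.
merge_all_parts() {
    id=$1
    # The glob expands in lexical order, so Part10 would sort before Part2.
    set -- ./Part*/"$id".pdf
    # If nothing matched, the pattern stays unexpanded and names no real file.
    if [ ! -f "$1" ]; then
        echo "No parts found for customer ID: $id" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo pdftk "$@" cat output "./Final/$id.pdf"
    else
        pdftk "$@" cat output "./Final/$id.pdf"
    fi
}

# Usage:
# DRY_RUN=1 merge_all_parts 12345678
```

Note the trade-off: unlike the fixed three-part script, this version merges whatever parts exist and will not notice when one is missing, so it suits cases where the number of parts genuinely varies per customer.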
We’d love to hear from you! Have you tried automating PDF workflows in your back office? Do you have other tips or tools that have helped increase operational efficiency? Share your experiences, challenges, or success stories in the comments below. Let’s start a discussion and learn from each other to make our workflows even better!
Feel free to modify the scripts to suit your specific needs, and happy automating!