Can we convert a CSV file to PDF using Python Scripting?
Hi,
I am trying to convert CSV file data into a PDF report, following the tutorial linked below. I have imported all the required libraries, but when testing I encountered the error shown in the image. Is this the right approach? Please let me know whether this is achievable, or suggest any alternative solutions. Thanks!
https://github.com/TECH-SAVVY-GUY/csv2pdf?tab=readme-ov-file
Best Answers
-
@Mohit Valluri, did you try this library?
import pandas as pd
from fpdf import FPDF

# Function to convert CSV to PDF
def csv_to_pdf(csv_file, pdf_file):
    # Read the CSV file
    df = pd.read_csv(csv_file)
    # Create a PDF object
    pdf = FPDF()
    pdf.set_auto_page_break(auto=True, margin=15)
    pdf.add_page()
    # Set the font
    pdf.set_font("Arial", size=12)
    # Add a cell for each column name
    for column in df.columns:
        pdf.cell(40, 10, column, 1)
    pdf.ln()
    # Add a cell for each value in the dataframe
    for row in df.itertuples(index=False):
        for value in row:
            pdf.cell(40, 10, str(value), 1)
        pdf.ln()
    # Save the PDF to a file
    pdf.output(pdf_file)

# Usage example
csv_file = "example.csv"
pdf_file = "output.pdf"
csv_to_pdf(csv_file, pdf_file)
print(f"CSV file {csv_file} was converted to PDF file {pdf_file}.")
0 -
Be aware that you do not have access to a file system in a Python script in ION. You can pass in a CSV document as a string, convert it and pass it out as a PDF binary object. So the library you are using needs to support that rather than directly working on files only.
The Python script can then be used in a document flow that reads the CSV file, passes it to the script, and writes the PDF.
Depending on your use case, another option might be to use the Document Output API in IDM.
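A minimal sketch of the string-in, binary-out idea, assuming the fpdf package from the earlier answer is available to the script. The function names are hypothetical, and how ION passes the CSV in and takes the PDF out is not shown. It swaps pandas for the standard csv module to keep the dependencies small.

```python
import csv
import io


def parse_csv_text(csv_text):
    """Parse CSV content held in a string into a list of rows."""
    return list(csv.reader(io.StringIO(csv_text)))


def csv_text_to_pdf_bytes(csv_text, col_width=40, row_height=10):
    """Render CSV text as a bordered table and return the PDF as bytes."""
    # Imported here so parse_csv_text() works even without fpdf installed.
    from fpdf import FPDF

    pdf = FPDF()
    pdf.set_auto_page_break(auto=True, margin=15)
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)

    # One bordered cell per value, one table row per CSV row.
    for row in parse_csv_text(csv_text):
        for value in row:
            pdf.cell(col_width, row_height, value, 1)
        pdf.ln()

    # dest="S" asks for the document itself instead of writing a file.
    # Assumption: the classic fpdf package returns a str here, while
    # fpdf2 returns a bytearray, so both cases are normalized to bytes.
    out = pdf.output(dest="S")
    if isinstance(out, str):
        return out.encode("latin-1")
    return bytes(out)
```

Calling csv_text_to_pdf_bytes("item,qty\nbolt,4\n") should give bytes that start with %PDF. The built-in fonts only cover Latin-1 text, so other characters would need an embedded font.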
1
Answers
-
As Carsten suggested, you need to find a Python library that can produce the output as a binary string. After that, use a file connector with binary type to save the file to a file location.
1 -
@Fabiano Silva Thanks for the response. I tried your script and am getting errors with numpy and pytz in the pandas library. Please refer to the error below. Do you have the correct pandas whl file that supports Python 3.9? Thanks.
0 -
0
-
It seems NumPy doesn't support Python 3.9 anymore. I am unable to find a matching whl file. Please refer to the image below.
0 -
You will need to use a NumPy version that does support Python 3.9. It looks like that is version 2.0.2.
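When picking a file from a PyPI files page, the wheel filename itself says which interpreter it targets: the last three dash-separated fields are the Python tag, the ABI tag, and the platform tag. A tag of cp39 means CPython 3.9, while pp39 means PyPy 3.9. A rough sketch for checking a filename against the running interpreter follows; the helper names are hypothetical, and which interpreter ION actually runs is not stated in this thread.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into its python, ABI and platform tags."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # The last three dash-separated fields are python-abi-platform.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def interpreter_tag():
    """Return the tag for the running interpreter, e.g. 'cp39' or 'pp39'."""
    prefix = {"cpython": "cp", "pypy": "pp"}.get(sys.implementation.name, "py")
    return f"{prefix}{sys.version_info.major}{sys.version_info.minor}"


# Example: compare a candidate wheel with the interpreter running this code.
candidate = "numpy-2.0.2-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
matches = wheel_tags(candidate)[0] == interpreter_tag()
```

If the script environment turns out to be CPython, a wheel whose Python tag is cp39 would be the one to look for on the same files page.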
https://pypi.org/project/numpy/2.0.2/#files
https://files.pythonhosted.org/packages/12/46/de1fbd0c1b5ccaa7f9a005b66761533e2f6a3e560096682683a223631fe9/numpy-2.0.2-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
1 -
Thanks, Brandon, for sharing the URL!
1 -
Update: I successfully generated the binary data and downloaded the PDF file from the script. I have now created a new data flow that picks up the CSV file from the sFTP location, converts the data into binary format using the script, and writes the result to the output location over sFTP using a document of binary type. However, I am encountering the issue below.
Can anyone help me with this? Thanks!
com.infor.ion.container.common.exception.ServiceContainerException: Failed to build dom document from the message. Cause: Content is not allowed in prolog.
at com.infor.ion.plugins.file.adapter.internal.format.BodToRawData.createData(BodToRawData.java:165)
at com.infor.ion.plugins.file.adapter.inbound.Pusher.insertEntry(Pusher.java:272)
at com.infor.ion.plugins.file.adapter.internal.FileAdapter.insertEntry(FileAdapter.java:965)
at com.infor.ion.plugins.file.adapter.internal.FileAdapter.handleMessage(FileAdapter.java:470)
at com.infor.ion.container.element.base.adapter.BaseElement.handleMessage(BaseElement.java:459)
at com.infor.ion.container.element.base.adapter.BaseElement.process(BaseElement.java:228)
at com.infor.ion.broker.process.channel.queue.md.MDDontOwnQueuePollOneQueue.handleMessage(MDDontOwnQueuePollOneQueue.java:1490)
at com.infor.ion.broker.process.channel.queue.md.MDDontOwnQueuePollOneQueue.lambda$bodyOfReceivingLoop$1(MDDontOwnQueuePollOneQueue.java:945)
at com.infor.ion.container.common.memory.MemoryProtectionUtils$MemoryProtectionContext.doWithProtectionOfHeap(MemoryProtectionUtils.java:1214)
at com.infor.ion.container.common.memory.MemoryProtectionUtils$MemoryProtectionContext.access$800(MemoryProtectionUtils.java:895)
at com.infor.ion.container.common.memory.MemoryProtectionUtils.doWithProtectionOfHeap(MemoryProtectionUtils.java:558)
at com.infor.ion.broker.process.channel.queue.md.MDDontOwnQueuePollOneQueue.bodyOfReceivingLoop(MDDontOwnQueuePollOneQueue.java:820)
at com.infor.ion.broker.process.channel.queue.md.MDDontOwnQueuePollOneQueue$1.run(MDDontOwnQueuePollOneQueue.java:754)
at com.infor.ion.shared.tenant.ThreadLocalRunnable.run(ThreadLocalRunnable.java:193)
at com.infor.ion.grid.common.util.ObservableSingletonThread.lambda$start$0(ObservableSingletonThread.java:115)
at com.infor.ion.shared.tenant.ThreadLocalRunnable.run(ThreadLocalRunnable.java:193)
at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at java.xml/com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:262)
at java.xml/com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:342)
at com.infor.ion.plugins.file.adapter.internal.format.BodToData.getNodesFromMessage(BodToData.java:260)
at com.infor.ion.plugins.file.adapter.internal.format.BodToRawData.createData(BodToRawData.java:130)
... 16 more
0 -
You will not want to write it as binary. What I have typically done is convert the binary file to a Base64-encoded string, which you can then drop into a File Template XML object. That BOD can be sent to the Connection Point, which will automatically Base64-decode it and drop the file.
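"Content is not allowed in prolog" is what an XML parser typically reports when a message does not begin with XML markup, which is the case for raw PDF bytes and also for a bare Base64 string. A small sketch of the difference, using Python's own XML parser as a stand-in (its error text differs from the Java one, and the helper names are hypothetical):

```python
import base64
import xml.etree.ElementTree as ET


def to_base64_text(data):
    """Encode raw bytes as an ASCII Base64 string that is safe inside XML."""
    return base64.b64encode(data).decode("ascii")


def is_well_formed_xml(payload):
    """Return True if the payload parses as XML, False otherwise."""
    try:
        ET.fromstring(payload)
        return True
    except ET.ParseError:
        return False


# Stand-in for real PDF output; only the leading bytes matter here.
pdf_bytes = b"%PDF-1.4 example bytes"
encoded = to_base64_text(pdf_bytes)

raw_ok = is_well_formed_xml(pdf_bytes)    # raw binary is not XML
bare_ok = is_well_formed_xml(encoded)     # Base64 alone is not XML either
wrapped_ok = is_well_formed_xml(f"<RawData>{encoded}</RawData>")
```

Only the wrapped form parses, which is why the encoded string needs to sit inside an XML document before it reaches the file adapter.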
1 -
Thanks for your input. I have changed my script to produce a Base64-encoded string, created a new template of type XML, and used it in the data flow. Now I am encountering the errors below in the Confirm BOD. Do you have any thoughts?
Confirm BOD1 occurs when the script outputs only the Base64-encoded data.
Confirm BOD2 occurs when the script outputs the Base64-encoded data inside an XML tag.
0 -
<SyncFileTransfer_B64_XML releaseID="9.2"><ApplicationArea><Sender><LogicalID>infor.file.inventoryfile_den</LogicalID><ComponentID>External</ComponentID><ConfirmationCode>OnError</ConfirmationCode></Sender><CreationDateTime>2024-11-01T13:32:31.897Z</CreationDateTime><BODID>infor.file.inventoryfile_den:1730467951896:b80da6d7-5aa5-4fe8-a814-791a3d232690</BODID></ApplicationArea><DataArea><Sync><AccountingEntityID/><LocationID/><ActionCriteria><ActionExpression actionCode="Replace"/></ActionCriteria></Sync><FileTransfer_B64_XML FileName="SupplierInvoice" FileExtension="pdf" FilePath="InforOSShare/SupplierInvoice"><DocumentID>Invoice12345</DocumentID><RawData>VGhpcyBpcyBhIHRlc3Qu</RawData></FileTransfer_B64_XML></DataArea></SyncFileTransfer_B64_XML>
You seem to be outputting only the Base64-encoded string as the whole document. If you create a Binary File Template, like the one in the screenshot above, it will create XML like the code block above. The Base64-encoded string goes in the RawData node, and you can set the file name and extension.
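If the script itself has to emit that structure, the noun element could be built along these lines. This is a sketch only: the element and attribute names are copied from the sample document in this thread rather than from a documented schema, the function name is hypothetical, and the ApplicationArea and Sync parts of the full document are left out.

```python
import base64
import xml.etree.ElementTree as ET


def build_file_transfer_noun(pdf_bytes, file_name, document_id):
    """Build the noun element carrying a Base64 payload in RawData."""
    # Names below mirror the sample document; treat them as assumptions
    # about the file template, not as a fixed schema.
    noun = ET.Element(
        "FileTransfer_B64_XML",
        {"FileName": file_name, "FileExtension": "pdf"},
    )
    ET.SubElement(noun, "DocumentID").text = document_id
    ET.SubElement(noun, "RawData").text = base64.b64encode(pdf_bytes).decode("ascii")
    return ET.tostring(noun, encoding="unicode")


xml_text = build_file_transfer_noun(b"This is a test.", "SupplierInvoice", "Invoice12345")
```

With the placeholder bytes used here, the RawData value comes out the same as in the sample document, since that sample payload is the Base64 form of "This is a test."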
0 -
Thanks! Could you please share the input file template as well? I am able to generate the PDF now, but its content somehow includes XML data along with the input data.
Currently I am using the file template below for input.
0 -
The input file could technically use the same File Template, with just a change to the base64, filename and extension. This is how you can pick up files with Connection Points. You will need to change the File Type to Binary and the Format Type to Raw Data. Look at the left panel, where yours shows a File Type of Text.
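XML showing up inside the generated PDF suggests the script may be treating the whole incoming message as CSV. Assuming the script receives a document shaped like the sample earlier in this thread, with the file content Base64-encoded in a RawData node, the CSV text could be pulled out first, roughly like this (the function name and the wrapper element names in the demo message are hypothetical):

```python
import base64
import xml.etree.ElementTree as ET


def extract_raw_data(xml_text):
    """Return the decoded bytes held in the first RawData element."""
    root = ET.fromstring(xml_text)
    node = root.find(".//RawData")
    if node is None or node.text is None:
        raise ValueError("no RawData element with content found")
    return base64.b64decode(node.text.strip())


# Hypothetical inbound message wrapping a two-line CSV file.
sample_csv = "item,qty\nbolt,4\n"
payload = base64.b64encode(sample_csv.encode("utf-8")).decode("ascii")
message = f"<Doc><DataArea><Noun><RawData>{payload}</RawData></Noun></DataArea></Doc>"

# Decode the payload back to text before handing it to the CSV-to-PDF step.
csv_text = extract_raw_data(message).decode("utf-8")
```

The UTF-8 decoding at the end is also an assumption; the actual encoding depends on how the source CSV file was written.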
0