Question
I retrieve CSV files from an S3 bucket using the Read File/Folder operation.
When I checked the files saved to HULFT Square's storage, their sizes did not match the sizes of the original files in S3.
There is no problem with small files, but for large files, the files in storage are smaller than the original files in S3.
Is there a limit to the file size that the S3 connector can handle?
Answer
The behavior you describe most likely corresponds to the following limitation.
HULFT Integrate Service may run out of disk space when reading a large file or files in file reading operations. Even when disk space is insufficient, script execution does not result in an error; the process continues and the resulting file is missing data.
A disk space shortage can occur when a single HULFT Integrate Service executes multiple file operations at the same time, or executes an operation on files that exceed the maximum size it can handle.
As of May 2023, the total file size that a HULFT Integrate Service can handle simultaneously is approximately 12 GB.
However, for cloud connectors the limit is approximately 6 GB, because they create temporary files internally.
The connectors related to the file operations are as follows:
Basic
・Assertion
・Compare File
File
・CSV
・Excel
・Excel (POI)
・XML
・Fixed-Length
・Variable-Length
・File Operations
・Filesystem
Network
・FTP
Cloud
・Amazon S3*
・Azure BLOB Storage*
・Google BigQuery*
・Google Cloud Storage*
・Google Drive*
・Google Sheets*
・Box*
・SharePoint*
Encryption
・PGP
*For cloud connectors, disk consumption may be up to twice the size of the file being handled.
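As a rough illustration of the limits above, the check below estimates whether a set of files risks exceeding the approximate per-service capacity. This is a hypothetical sketch, not part of HULFT Square: the function name and record shapes are made up, and the limit figures are the May 2023 approximations quoted in this article.

```python
# Hypothetical pre-check (not a HULFT Square feature): flag file sets that
# risk a disk space shortage, using the approximate limits from this article.
GENERAL_LIMIT_BYTES = 12 * 1024**3  # ~12 GB per HULFT Integrate Service
CLOUD_LIMIT_BYTES = 6 * 1024**3     # ~6 GB for cloud connectors
                                    # (temporary files can double disk use)

def exceeds_limit(file_sizes_bytes, cloud_connector=False):
    """Return True if the combined size exceeds the approximate limit.

    For cloud connectors the effective limit is ~6 GB, since internal
    temporary files can consume up to twice the handled file size.
    """
    limit = CLOUD_LIMIT_BYTES if cloud_connector else GENERAL_LIMIT_BYTES
    return sum(file_sizes_bytes) > limit
```

For example, a single 7 GB file is under the general ~12 GB limit but over the ~6 GB cloud-connector limit, so it could be truncated when read through Amazon S3.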
Please also refer to the following page for more information on this limitation.
Known Issues
https://www.hulft.com/help/en-us/HULFTSquare/Content/TOP/KnownIssues.htm
Supplementary information
When a disk space shortage occurs, neither the execution log in Designer nor the script's event log will indicate it.
However, for the following cloud connectors, you can detect a disk space shortage by monitoring the value of the component's output schema [file] - [status]: "Error" is stored in the schema when disk space is insufficient.
By combining the value of the output schema [file] - [status] with the Conditional Branch operation, you can implement error handling.
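In HULFT Square itself this branching is done graphically with the Conditional Branch operation; the snippet below only sketches the equivalent logic. The record shape is a simplified stand-in for the output schema (the article confirms only that "Error" is stored in [file] - [status]; the "Success" value here is a placeholder, not the documented schema).

```python
# Illustrative only: mimic branching on the output schema [file] - [status].
# The dict layout is an assumption for this example, not the real schema.
def failed_files(results):
    """Return the names of files whose [status] is "Error".

    Per the article, "Error" is stored in [file] - [status] when the
    read was cut short by a disk space shortage.
    """
    return [r["name"] for r in results if r["status"] == "Error"]

results = [
    {"name": "small.csv", "status": "Success"},  # placeholder value
    {"name": "large.csv", "status": "Error"},
]
print(failed_files(results))  # the large file was truncated
```

Any file returned by such a check should be treated as incomplete and re-read (or the job failed explicitly), rather than passed downstream.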
Please refer to the following pages for more information on output schema values.
Amazon S3 - Read File/Folder
https://www.hulft.com/help/en-us/HULFTSquare/Content/Designer/Connector/amazons3_get.htm
Azure BLOB Storage - Read File/Directory
https://www.hulft.com/help/en-us/HULFTSquare/Content/Designer/Connector/azure_storage_blob_get.htm
Google Cloud Storage - Read File/Folder
https://www.hulft.com/help/en-us/HULFTSquare/Content/Designer/Connector/google_cloud_storage_get.htm
Google Drive - Read File/Folder
https://www.hulft.com/help/en-us/HULFTSquare/Content/Designer/Connector/google_drive_get.htm
Related FAQs
Are there any recommended permission settings when connecting to AWS S3?
https://support.square.hulft.com/hc/articles/12145594708116