The following information applies only to the Unstructured Ingest CLI and the Unstructured Ingest Python library. The Unstructured SDKs for Python and JavaScript/TypeScript and the Unstructured open-source library do not support this functionality.
Task
You want to process only files with specified extensions, only files at or below a specified size, or both.
Approach
For the Ingest CLI, use the following command options. For the Ingest Python library, use the following parameters for the FiltererConfig object.
- Use --file-glob (CLI) or file_glob (Python) to specify the list of file extensions to process.
- Use --max-file-size (CLI) or max_file_size (Python) to specify the maximum size of files to process, in bytes.
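To make the effect of these two filters concrete, the following is a minimal standard-library sketch of the selection logic described above. It is not the Ingest library's code and does not call it: `select_files` is a hypothetical helper, and it assumes that the extensions are written as glob patterns such as `*.pdf` and that a file exactly at the size limit is kept.

```python
import fnmatch
import os


def select_files(paths, file_glob=None, max_file_size=None):
    """Return the paths that pass both filters.

    Hypothetical helper for illustration only; it mimics the behavior
    described for file_glob and max_file_size, not the library's internals.
    """
    selected = []
    for path in paths:
        name = os.path.basename(path)
        # Extension filter: keep the file only if its name matches
        # at least one of the glob patterns (assumed form: "*.pdf").
        if file_glob and not any(fnmatch.fnmatch(name, p) for p in file_glob):
            continue
        # Size filter: keep the file only if it is at or below the
        # limit, which is expressed in bytes.
        if max_file_size is not None and os.path.getsize(path) > max_file_size:
            continue
        selected.append(path)
    return selected
```

For instance, calling `select_files(paths, file_glob=["*.pdf", "*.eml"], max_file_size=100_000)` would keep only `.pdf` and `.eml` files of 100,000 bytes or less, taking 1 KB as 1,000 bytes. Omitting either argument disables that filter, which corresponds to the "or both" wording in the task.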
To run this example
The following example processes only .pdf and .eml files that have a file size of 100 KB or less. To run this example, you should have a directory with a mixture of files, including at least one .pdf file and one .eml file, where at least one of these files is 100 KB or smaller.

