- KAFKA_BOOTSTRAP_SERVER - The hostname of the bootstrap Kafka cluster to connect to, represented by --bootstrap-server (CLI) or bootstrap_server (Python).
- KAFKA_PORT - The port number of the cluster, represented by --port (CLI) or port (Python).
- KAFKA_TOPIC - The unique name of the topic to read messages from and write messages to on the cluster, represented by --topic (CLI) or topic (Python).
- KAFKA_API_KEY - The Kafka API key value, represented by --kafka-api-key (CLI) or kafka_api_key (Python).
- KAFKA_SECRET - The secret value for the Kafka API key, represented by --secret (CLI) or secret (Python).

- --confluent (CLI) or confluent (Python): True to indicate that the cluster is running Confluent Kafka.
- --num-messages-to-consume (CLI) or num_messages_to_consume (Python): The maximum number of messages to get from the topic. The default is 1 if not otherwise specified.
- --timeout (CLI) or timeout (Python): The maximum amount of time, in seconds, to wait for the response to a request to the topic. The default is 1.0 if not otherwise specified.
- --group-id (CLI) or group_id (Python): The ID of the consumer group, if any, that is associated with the target Kafka cluster. (A consumer group allows a pool of consumers to divide the consumption of data over topics and partitions.) The default is default_group_id if not otherwise specified.

Use the --partition-by-api option (CLI) or partition_by_api (Python) parameter to specify where files are processed:
- To process files locally, omit --partition-by-api (CLI) or partition_by_api (Python), or explicitly specify partition_by_api=False (Python).

  Local file processing does not use an Unstructured API key or API URL, so you can also omit the following, if they appear:

  - --api-key $UNSTRUCTURED_API_KEY (CLI) or api_key=os.getenv("UNSTRUCTURED_API_KEY") (Python)
  - --partition-endpoint $UNSTRUCTURED_API_URL (CLI) or partition_endpoint=os.getenv("UNSTRUCTURED_API_URL") (Python)
  - The environment variables UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL
- To send files to the Unstructured Partition Endpoint for processing, specify --partition-by-api (CLI) or partition_by_api=True (Python).

  Unstructured also requires an Unstructured API key and API URL, which you provide by adding the following:

  - --api-key $UNSTRUCTURED_API_KEY (CLI) or api_key=os.getenv("UNSTRUCTURED_API_KEY") (Python)
  - --partition-endpoint $UNSTRUCTURED_API_URL (CLI) or partition_endpoint=os.getenv("UNSTRUCTURED_API_URL") (Python)
  - The environment variables UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL, representing your API key and API URL, respectively.

  The API URL for the Unstructured Partition Endpoint is https://api.unstructuredapp.io/general/v0/general. If you do not have an API key, get one now.

  If the Unstructured API is self-hosted, the process for generating Unstructured API keys, and the Unstructured API URL that you use, are different. For details, contact Unstructured Sales at sales@unstructured.io.
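As an illustration only, the settings described above could be gathered in Python along the following lines. The helper names kafka_settings and partition_settings are hypothetical and are not part of the Unstructured Ingest library; the dictionary keys simply mirror the Python parameter names listed above, and converting the port to an integer is an assumption.

```python
import os


def kafka_settings(env=None):
    """Collect the Kafka connection settings from environment variables.

    Hypothetical helper: the keys mirror the Python parameter names in
    this section, and the defaults are the documented ones.
    """
    env = os.environ if env is None else env
    return {
        "bootstrap_server": env["KAFKA_BOOTSTRAP_SERVER"],
        "port": int(env["KAFKA_PORT"]),  # assumption: port is passed as an int
        "topic": env["KAFKA_TOPIC"],
        "kafka_api_key": env["KAFKA_API_KEY"],
        "secret": env["KAFKA_SECRET"],
        # Documented defaults, written out explicitly for clarity.
        "num_messages_to_consume": 1,
        "timeout": 1.0,
        "group_id": "default_group_id",
    }


def partition_settings(partition_by_api, env=None):
    """Build the partitioning settings for local or API-based processing.

    Hypothetical helper: local processing needs no API key or URL, while
    API-based processing requires both environment variables to be set.
    """
    env = os.environ if env is None else env
    if not partition_by_api:
        return {"partition_by_api": False}

    required = ("UNSTRUCTURED_API_KEY", "UNSTRUCTURED_API_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))

    return {
        "partition_by_api": True,
        "api_key": env["UNSTRUCTURED_API_KEY"],
        "partition_endpoint": env["UNSTRUCTURED_API_URL"],
    }
```

Calling kafka_settings() and partition_settings(True) with no env argument reads from the process environment. Passing a plain dictionary instead makes it easy to check the behavior without touching real credentials.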