title: "Instill Artifact"
lang: "en-US"
draft: false
description: "Learn how to set up a VDP Instill Artifact component https://github.com/instill-ai/instill-core"
The Instill Artifact component is a data component that lets users manage files and data in the artifact store and run smart search over them.
It can carry out the following tasks:

- Upload File
- Upload Files
- Get Files Metadata
- Get Chunks Metadata
- Get File in Markdown
- Match File Status
- Retrieve
- Ask

To use the Artifact component with a self-hosted deployment of Instill Core, you need to set up an OpenAI API key.
You can do this by setting the OPENAI_API_KEY environment variable.
Please refer to configuring-the-embedding-feature for details.
On Instill Cloud, you do not need to set up the OpenAI API key.
## Release Stage

Alpha

## Configuration

The component definition and tasks are defined in the definition.json and tasks.json files respectively.

## Supported Tasks
### Upload File

Upload a file and process it into chunks in a catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_UPLOAD_FILE |
| Options (required) | options | object | Choose whether to upload the file to an existing catalog or to create a new catalog |

The options object must fulfill one of the following schemas:
#### Existing Catalog

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Catalog ID | catalog-id | string | The ID of the existing catalog to upload the file to |
| File | file | string | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to upload to the catalog |
| File Name | file-name | string | The name of the file, including its file extension, e.g. 'example.pdf' |
| Namespace | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| Option | option | string | Must be "existing catalog" |
#### Create New Catalog

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Catalog ID | catalog-id | string | The ID of the new catalog you want to create |
| Description | description | string | Description of the catalog |
| File | file | string | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to upload to the catalog |
| File Name | file-name | string | The name of the file, including its file extension, e.g. 'example.pdf' |
| Namespace | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| Option | option | string | Must be "create new catalog" |
| Tags | tags | array | Tags for the catalog |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| File | file | object | Result of uploading the file to the catalog |
| Status | status | boolean | Whether file processing was triggered; true if it succeeded |
#### Output Objects in Upload File

##### File

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Catalog ID | catalog-id | string | The ID of the catalog the file was uploaded to |
| Create Time | create-time | string | The creation time of the file in ISO 8601 format |
| File Name | file-name | string | The name of the file |
| Type | file-type | string | The type of the file |
| File UID | file-uid | string | The unique identifier of the file |
| Size | size | number | The size of the file in bytes |
| Update Time | update-time | string | The update time of the file in ISO 8601 format |
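
The two options schemas above can be sketched in Python as follows. This is a minimal illustration, not the component's own code: the helper name `build_upload_file_input` is hypothetical, and the returned mapping is assumed to mirror the `input:` block used in the Ask recipe at the end of this page. Only the field IDs and the two option strings come from the tables.

```python
import base64
import os


def build_upload_file_input(file_name, content, namespace, catalog_id,
                            create_new=False, description="", tags=None):
    """Build an input mapping for TASK_UPLOAD_FILE (hypothetical helper).

    `content` is the raw bytes of the file; it is base64-encoded here.
    """
    if not os.path.splitext(file_name)[1]:
        # The File Name field is expected to include the extension.
        raise ValueError("file-name needs an extension, e.g. 'example.pdf'")

    options = {
        "option": "create new catalog" if create_new else "existing catalog",
        "namespace": namespace,
        "catalog-id": catalog_id,
        "file-name": file_name,
        "file": base64.b64encode(content).decode("ascii"),
    }
    if create_new:
        # These two fields only appear in the Create New Catalog schema.
        options["description"] = description
        options["tags"] = list(tags or [])
    return {"options": options}
```

To upload a file from disk, the bytes could be read with `pathlib.Path(path).read_bytes()` and passed as `content`.
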
### Upload Files

Upload files and process them into chunks in a catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_UPLOAD_FILES |
| Options (required) | options | object | Choose whether to upload the files to an existing catalog or to create a new catalog |

The options object must fulfill one of the following schemas:
#### Existing Catalog

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Catalog ID | catalog-id | string | The ID of the existing catalog to upload the files to |
| File Names | file-names | array | The names of the files, each including its file extension, e.g. 'example.pdf' |
| Files | files | array | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to upload to the catalog |
| Namespace | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| Option | option | string | Must be "existing catalog" |
#### Create New Catalog

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Catalog ID | catalog-id | string | The ID of the new catalog you want to create |
| Description | description | string | Description of the catalog |
| File Names | file-names | array | The names of the files, each including its file extension, e.g. 'example.pdf' |
| Files | files | array | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to upload to the catalog |
| Namespace | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| Option | option | string | Must be "create new catalog" |
| Tags | tags | array | Tags for the catalog |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Files | files | array[object] | Metadata of the files in the catalog |
| Status | status | boolean | Whether file processing was triggered; true only if it succeeded for all files |
#### Output Objects in Upload Files

##### Files

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Catalog ID | catalog-id | string | The ID of the catalog the files were uploaded to |
| Create Time | create-time | string | The creation time of the file in ISO 8601 format |
| File Name | file-name | string | The name of the file |
| Type | file-type | string | The type of the file |
| File UID | file-uid | string | The unique identifier of the file |
| Size | size | number | The size of the file in bytes |
| Update Time | update-time | string | The update time of the file in ISO 8601 format |
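
For several files, the `file-names` and `files` arrays have to be filled together. The sketch below assumes that the two arrays pair up by position, which the tables suggest but do not state. The helper name `build_upload_files_input` is hypothetical, and the mapping shape is assumed to mirror the `input:` block of the Ask recipe at the end of this page.

```python
import base64


def build_upload_files_input(files, namespace, catalog_id):
    """Build an input mapping for TASK_UPLOAD_FILES, Existing Catalog schema.

    `files` maps each file name (with extension) to the file's raw bytes.
    """
    names = list(files)
    return {
        "options": {
            "option": "existing catalog",
            "namespace": namespace,
            "catalog-id": catalog_id,
            "file-names": names,
            # Built in the same order as file-names, so entry i of both
            # arrays describes the same file (assumed pairing).
            "files": [base64.b64encode(files[name]).decode("ascii")
                      for name in names],
        }
    }
```

Building both arrays from one mapping keeps them the same length and in the same order.
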
### Get Files Metadata

Get the metadata of the files in a catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_GET_FILES_METADATA |
| Namespace (required) | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| Catalog ID (required) | catalog-id | string | The ID of the catalog to search for files |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Files | files | array[object] | Metadata of the files in the catalog |
#### Output Objects in Get Files Metadata

##### Files

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Catalog ID | catalog-id | string | The ID of the catalog the file was uploaded to |
| Create Time | create-time | string | The creation time of the file in ISO 8601 format |
| File Name | file-name | string | The name of the file |
| Type | file-type | string | The type of the file |
| File UID | file-uid | string | The unique identifier of the file |
| Size | size | number | The size of the file in bytes |
| Update Time | update-time | string | The update time of the file in ISO 8601 format |
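
As an illustration of working with the `files` output, the sketch below totals the `size` field and picks the most recently updated file by parsing the ISO 8601 `update-time`. The sample records are made up for the example and only carry the fields it uses. The helper name `summarize_files` is hypothetical.

```python
from datetime import datetime


def summarize_files(files):
    """Return (total size in bytes, name of the most recently updated file)."""
    def updated(entry):
        # Older Python versions of fromisoformat() reject a trailing "Z",
        # so it is rewritten as an explicit UTC offset first.
        return datetime.fromisoformat(entry["update-time"].replace("Z", "+00:00"))

    total_size = sum(entry["size"] for entry in files)
    newest = max(files, key=updated)["file-name"] if files else None
    return total_size, newest


# Hypothetical output records, shaped like the Files table above.
sample_files = [
    {"file-name": "a.pdf", "size": 1200, "update-time": "2024-08-01T10:00:00Z"},
    {"file-name": "b.docx", "size": 800, "update-time": "2024-08-03T09:30:00Z"},
]
print(summarize_files(sample_files))  # (2000, 'b.docx')
```
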
### Get Chunks Metadata

Get the metadata of the chunks from a file in a catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_GET_CHUNKS_METADATA |
| Catalog ID (required) | catalog-id | string | The ID of the catalog to search for files |
| Namespace (required) | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| File UID (required) | file-uid | string | The unique identifier of the file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Chunks | chunks | array[object] | Metadata of the file's chunks in the catalog |
#### Output Objects in Get Chunks Metadata

##### Chunks

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Chunk UID | chunk-uid | string | The unique identifier of the chunk |
| Create Time | create-time | string | The creation time of the chunk in ISO 8601 format |
| End Position | end-position | integer | The end position of the chunk in the file |
| File UID | original-file-uid | string | The unique identifier of the file |
| Retrievable | retrievable | boolean | The retrievable status of the chunk |
| Start Position | start-position | integer | The start position of the chunk in the file |
| Token Count | token-count | integer | The token count of the chunk |
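
A short sketch of how the `chunks` output might be used: ordering chunks by `start-position` and counting the tokens of the chunks marked `retrievable`. The sample records are invented for the example, and both helper names are hypothetical.

```python
def in_file_order(chunks):
    """Return the chunks sorted by where they start in the file."""
    return sorted(chunks, key=lambda chunk: chunk["start-position"])


def retrievable_token_count(chunks):
    """Sum token-count over the chunks whose retrievable flag is true."""
    return sum(chunk["token-count"] for chunk in chunks if chunk["retrievable"])


# Hypothetical output records, shaped like the Chunks table above.
sample_chunks = [
    {"chunk-uid": "c2", "start-position": 500, "end-position": 900,
     "token-count": 120, "retrievable": True},
    {"chunk-uid": "c1", "start-position": 0, "end-position": 499,
     "token-count": 100, "retrievable": True},
    {"chunk-uid": "c3", "start-position": 901, "end-position": 1200,
     "token-count": 80, "retrievable": False},
]
print([chunk["chunk-uid"] for chunk in in_file_order(sample_chunks)])
print(retrievable_token_count(sample_chunks))
```
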
### Get File in Markdown

Get the file content in Markdown format.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_GET_FILE_IN_MARKDOWN |
| Catalog ID (required) | catalog-id | string | The ID of the catalog to search for files |
| Namespace (required) | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| File UID (required) | file-uid | string | The unique identifier of the file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| File UID | original-file-uid | string | The unique identifier of the file |
| Content | content | string | The content of the file in Markdown format |
| Create Time | create-time | string | The creation time of the source file in ISO 8601 format |
| Update Time | update-time | string | The update time of the source file in ISO 8601 format |
### Match File Status

Check whether processing of the specified file has finished.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_MATCH_FILE_STATUS |
| Catalog ID (required) | catalog-id | string | The ID of the catalog in which to check the file's processing status |
| Namespace (required) | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| File UID (required) | file-uid | string | The unique identifier of the file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Status | succeeded | boolean | The status of the file processing; true if it succeeded |
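
Because file processing may not be finished right after an upload, a caller might check the `succeeded` output repeatedly. The sketch below shows one way to poll with a timeout. The `check_status` callable is a placeholder for whatever runs TASK_MATCH_FILE_STATUS in your setup and returns its output mapping; it is an assumption, not part of the component.

```python
import time


def wait_until_processed(check_status, timeout_s=60.0, interval_s=2.0,
                         sleep=time.sleep):
    """Poll check_status() until it reports succeeded=True or time runs out.

    Returns True if processing succeeded, False if the timeout passed first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        output = check_status()
        if output.get("succeeded"):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` parameter exists only so the wait can be replaced in tests; the default is `time.sleep`.
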
### Retrieve

Search the chunks in a catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_RETRIEVE |
| Catalog ID (required) | catalog-id | string | The ID of the catalog to search for files |
| Namespace (required) | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| Text Prompt (required) | text-prompt | string | The prompt string used to search the chunks |
| Top K | top-k | integer | The number of top chunks to return, from 1 to 20. The default is 5 |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Chunks | chunks | array[object] | Chunk data from smart search |
#### Output Objects in Retrieve

##### Chunks

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Chunk UID | chunk-uid | string | The unique identifier of the chunk |
| Similarity | similarity-score | number | The similarity score of the chunk |
| Source File Name | source-file-name | string | The name of the source file |
| Text Content | text-content | string | The text content of the chunk |
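
The `top-k` range and default from the input table can be checked before a request is built. The following sketch does that and also filters returned chunks by `similarity-score`. The helper names and the threshold value are hypothetical, and the mapping shape is assumed to mirror the `input:` block of the Ask recipe at the end of this page.

```python
def build_retrieve_input(namespace, catalog_id, text_prompt, top_k=5):
    """Build an input mapping for TASK_RETRIEVE, validating top-k."""
    if not 1 <= top_k <= 20:
        raise ValueError("top-k must be between 1 and 20")
    return {
        "namespace": namespace,
        "catalog-id": catalog_id,
        "text-prompt": text_prompt,
        "top-k": top_k,
    }


def keep_relevant(chunks, min_score):
    """Keep only chunks whose similarity-score is at least min_score."""
    return [chunk for chunk in chunks if chunk["similarity-score"] >= min_score]


# Hypothetical output records, shaped like the Chunks table above.
sample_chunks = [
    {"chunk-uid": "c1", "similarity-score": 0.82,
     "source-file-name": "a.pdf", "text-content": "..."},
    {"chunk-uid": "c2", "similarity-score": 0.31,
     "source-file-name": "b.pdf", "text-content": "..."},
]
print(keep_relevant(sample_chunks, 0.5))
```

A suitable threshold depends on how the scores are distributed for your data; 0.5 here is only a placeholder.
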
### Ask

Answer questions based on the files in a catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_ASK |
| Catalog ID (required) | catalog-id | string | The ID of the catalog to search for files |
| Namespace (required) | namespace | string | Your namespace, which you can find in the tab for switching namespaces |
| Question (required) | question | string | The question to answer |
| Top K | top-k | integer | The number of top answers to return, from 1 to 20. The default is 5 |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Answer | answer | string | Answer data from smart search |
| Chunks (optional) | chunks | array[object] | Chunk data used to answer the question |
#### Output Objects in Ask

##### Chunks

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Chunk UID | chunk-uid | string | The unique identifier of the chunk |
| Similarity | similarity-score | number | The similarity score of the chunk |
| Source File Name | source-file-name | string | The name of the source file |
| Text Content | text-content | string | The text content of the chunk |
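
Since `chunks` is optional in the Ask output, code that consumes it should handle its absence. The sketch below appends the distinct source file names to the answer when chunks are present. The helper name `format_answer` and the "Sources:" layout are hypothetical choices for the example.

```python
def format_answer(output):
    """Return the answer text, followed by its source files if chunks exist."""
    sources = []
    for chunk in output.get("chunks") or []:
        name = chunk["source-file-name"]
        if name not in sources:
            # Keep first-seen order and drop repeated file names.
            sources.append(name)
    if not sources:
        return output["answer"]
    return output["answer"] + "\n\nSources: " + ", ".join(sources)
```
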
## Example Recipes
Recipe for the Ask your Catalog pipeline.

    version: v1beta
    component:
      artifact-0:
        type: instill-artifact
        task: TASK_ASK
        input:
          catalog-id: ${variable.catalog_name}
          namespace: ${variable.namespace}
          question: ${variable.question}
          top-k: 5
    variable:
      catalog_name:
        title: catalog-name
        description: The name of your catalog e.g. "instill-ai"
        instill-format: string
      namespace:
        title: namespace
        description: The namespace of your catalog e.g. "instill-ai"
        instill-format: string
      question:
        title: question
        description: The question to ask your catalog e.g. "What is Instill AI doing?", "What is Artifact?"
        instill-format: string
    output:
      answer:
        title: answer
        value: ${artifact-0.output.answer}