# README
title: "Audio"
lang: "en-US"
draft: false
description: "Learn about how to set up a VDP Audio component https://github.com/instill-ai/instill-core"
The Audio component is an operator component that allows users to extract and manipulate audio from different sources. It can carry out the following tasks: Chunk Audios and Slice Audio.
## Release Stage

Alpha

## Configuration

The component definition and tasks are defined in the definition.json and tasks.json files respectively.
## Supported Tasks

### Chunk Audios

Split an audio file into chunks.

| Input | ID | Type | Description |
|---|---|---|---|
| Task ID (required) | `task` | string | `TASK_CHUNK_AUDIOS` |
| Audio (required) | `audio` | string | Base64 encoded audio file to be split |
| Chunk count (required) | `chunk-count` | integer | Number of chunks to equally split the audio into |

| Output | ID | Type | Description |
|---|---|---|---|
| Audios | `audios` | array[string] | A list of base64 encoded audios |

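As a rough illustration of the Chunk Audios input and output shapes, the following Python sketch base64-encodes raw audio bytes and assembles an object keyed by the IDs from the tables above. The helper names (`build_chunk_audios_input`, `decode_audios`), the positive-integer check, and the flat JSON layout are assumptions made for illustration; they are not taken from the component's source.

```python
import base64
import json


def build_chunk_audios_input(audio_bytes: bytes, chunk_count: int) -> dict:
    """Build an input object using the field IDs listed for TASK_CHUNK_AUDIOS."""
    # Assumption: the chunk count must be at least 1 to be meaningful.
    if chunk_count < 1:
        raise ValueError("chunk-count must be a positive integer")
    return {
        "task": "TASK_CHUNK_AUDIOS",
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "chunk-count": chunk_count,
    }


def decode_audios(output: dict) -> list:
    """Decode the `audios` output (a list of base64 strings) back to raw bytes."""
    return [base64.b64decode(item) for item in output["audios"]]


# Placeholder bytes stand in for a real audio file read from disk.
payload = build_chunk_audios_input(b"placeholder-audio-bytes", chunk_count=3)
print(json.dumps(payload, indent=2))
```

In practice the `audio` value would come from reading a real file in binary mode; the placeholder bytes only keep the example self-contained.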
### Slice Audio

Specify a time range to slice an audio file.

| Input | ID | Type | Description |
|---|---|---|---|
| Task ID (required) | `task` | string | `TASK_SLICE_AUDIO` |
| Audio (required) | `audio` | string | Base64 encoded audio file to be sliced |
| Start time (required) | `start-time` | integer | Start time of the slice in seconds |
| End time (required) | `end-time` | integer | End time of the slice in seconds |

| Output | ID | Type | Description |
|---|---|---|---|
| Audio | `audio` | string | Base64 encoded audio slice |

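To make the meaning of `start-time` and `end-time` concrete, here is a minimal local sketch of slicing by seconds using Python's standard `wave` module. It is not the component's implementation: it assumes uncompressed WAV input, treats the range as half-open (`start-time` included, `end-time` excluded), and the function names `make_silent_wav` and `slice_wav` are hypothetical.

```python
import base64
import io
import wave


def make_silent_wav(seconds: int, rate: int = 8000) -> bytes:
    """Create a mono 16-bit WAV of silence, used here only as sample input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * rate * seconds)
    return buf.getvalue()


def slice_wav(audio_b64: str, start_time: int, end_time: int) -> str:
    """Return the [start_time, end_time) range, in seconds, of a base64 WAV."""
    if not 0 <= start_time < end_time:
        raise ValueError("expected 0 <= start-time < end-time")
    with wave.open(io.BytesIO(base64.b64decode(audio_b64)), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Seconds map to frame offsets through the sample rate.
        src.setpos(min(start_time * rate, src.getnframes()))
        frames = src.readframes((end_time - start_time) * rate)
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)  # the frame count in the header is fixed on close
        dst.writeframes(frames)
    return base64.b64encode(out.getvalue()).decode("ascii")


source = base64.b64encode(make_silent_wav(5)).decode("ascii")
clip = slice_wav(source, start_time=1, end_time=3)
with wave.open(io.BytesIO(base64.b64decode(clip)), "rb") as r:
    print(r.getnframes() / r.getframerate())  # clip duration in seconds
```

Because both times are integers in seconds, the slice boundaries always fall on whole-second frame offsets, whatever the sample rate.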
Recipe for the Audio Transcription Generator pipeline.

    version: v1beta
    component:
      audio-spliter:
        type: audio
        task: TASK_SLICE_AUDIO
        input:
          audio: ${variable.audio}
          end-time: ${variable.end_time}
          start-time: ${variable.start_time}
      get-transcription:
        type: openai
        task: TASK_SPEECH_RECOGNITION
        input:
          audio: ${audio-spliter.output.audio}
          model: whisper-1
        setup:
          api-key: ${secret.INSTILL_SECRET}
    variable:
      audio:
        title: audio
        description: the audio you want to get the transcription from
        instill-format: audio/*
      end_time:
        title: end-time
        description: the end time you want to extract in seconds i.e. 2 mins is 120 seconds
        instill-format: number
      start_time:
        title: start-time
        description: the start time you want to extract in seconds i.e. 2 mins is 120 seconds
        instill-format: number
    output:
      result:
        title: result
        value: ${get-transcription.output.text}

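The recipe above declares three variables: `audio`, `start_time`, and `end_time`. The sketch below shows one way to collect matching values before running the pipeline. The helper name `build_recipe_variables`, the range check, and the choice to pass the audio as a plain base64 string are assumptions; how the values are actually submitted depends on the pipeline platform and is not shown here.

```python
import base64
import json


def build_recipe_variables(audio_bytes: bytes, start_time: int, end_time: int) -> dict:
    """Collect values for the recipe's `variable` section."""
    # Assumption: a valid slice needs a non-negative start before the end.
    if not 0 <= start_time < end_time:
        raise ValueError("expected 0 <= start_time < end_time")
    return {
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "start_time": start_time,  # seconds, e.g. 1 minute is 60
        "end_time": end_time,      # seconds, e.g. 2 minutes is 120
    }


# Placeholder bytes stand in for a real audio file.
variables = build_recipe_variables(b"placeholder-audio-bytes", 60, 120)
print(json.dumps(variables, indent=2))
```

The keys use underscores because they mirror the recipe's variable names, while the Slice Audio task inputs they feed (`start-time`, `end-time`) use hyphens.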
# Type aliases
Base64 encoded audio.