pkg.gl

Categorygithub.com/instill-ai/pipeline-backendpkgcomponentoperatoraudiov0

package

0.40.0-beta

Repository: https://github.com/instill-ai/pipeline-backend.git

Documentation: pkg.go.dev

# README

title: "Audio" lang: "en-US" draft: false description: "Learn about how to set up a VDP Audio component https://github.com/instill-ai/instill-core"

The Audio component is an operator component that allows users to extract and manipulate audio from different sources. It can carry out the following tasks:

Chunk Audios
Slice Audio

Release Stage

Alpha

Configuration

The component definition and tasks are defined in the definition.json and tasks.json files respectively.

Supported Tasks

Chunk Audios

Split audio file into chunks

Input	ID	Type	Description
Task ID (required)	`task`	string	`TASK_CHUNK_AUDIOS`
Audio (required)	`audio`	string	Base64 encoded audio file to be split
Chunk count (required)	`chunk-count`	integer	Number of chunks to equally split the audio into

Output	ID	Type	Description
Audios	`audios`	array[string]	A list of base64 encoded audios

Slice Audio

Specify a time range to slice an audio file

Input	ID	Type	Description
Task ID (required)	`task`	string	`TASK_SLICE_AUDIO`
Audio (required)	`audio`	string	Base64 encoded audio file to be sliced
Start time (required)	`start-time`	integer	Start time of the slice in seconds
End time (required)	`end-time`	integer	End time of the slice in seconds

Output	ID	Type	Description
Audio	`audio`	string	Base64 encoded audio slice

## Example Recipes

Recipe for the Audio Transcription Generator pipeline.

version: v1beta
component:
  audio-spliter:
    type: audio
    task: TASK_SLICE_AUDIO
    input:
      audio: ${variable.audio}
      end-time: ${variable.end_time}
      start-time: ${variable.start_time}
  get-transcription:
    type: openai
    task: TASK_SPEECH_RECOGNITION
    input:
      audio: ${audio-spliter.output.audio}
      model: whisper-1
    setup:
      api-key: ${secret.INSTILL_SECRET}
variable:
  audio:
    title: audio
    description: the audio you want to get the transcription from
    instill-format: audio/*
  end_time:
    title: end-time
    description: the end time you want to extract in seconds i.e. 2 mins is 120 seconds
    instill-format: number
  start_time:
    title: start-time
    description: the start time you want to extract in seconds i.e. 2 mins is 120 seconds
    instill-format: number
output:
  result:
    title: result
    value: ${get-transcription.output.text}