# README
## ✨ NEW ✨
### Google Gemini Multimodal Live support
Introducing support for the Gemini Multimodal Live feature. Here's an example Multimodal Live server showing realtime conversation and video streaming: code
## Google Gen AI Go SDK
The Google Gen AI Go SDK enables developers to use Google's state-of-the-art generative AI models (like Gemini) to build AI-powered features and applications. This SDK supports use cases like:
- Generate text from text-only input
- Generate text from text-and-images input (multimodal)
- ...
For example, with just a few lines of code, you can access Gemini's multimodal capabilities to generate text from text-and-image input.
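
    // Build a multimodal prompt: one text part plus one inline JPEG image.
    parts := []*genai.Part{
        {Text: "What's this image about?"},
        {InlineData: &genai.Blob{Data: imageBytes, MIMEType: "image/jpeg"}},
    }
    // Send the request; check err before reading result.
    result, err := client.Models.GenerateContent(ctx, "gemini-2.0-flash-exp", []*genai.Content{{Parts: parts}}, nil)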
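The snippet above assumes that `imageBytes` and its MIME type are already available. As a minimal sketch of one way to prepare both with only the standard library, the following reads a file and sniffs its type with `http.DetectContentType`. The helper names `loadImage` and `sniffMIME` are hypothetical and are not part of the SDK.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// sniffMIME wraps http.DetectContentType, which inspects at most the
// first 512 bytes of data and falls back to "application/octet-stream".
func sniffMIME(data []byte) string {
	return http.DetectContentType(data)
}

// loadImage is a hypothetical helper (not part of the SDK). It reads the
// file at path and returns its bytes together with the sniffed MIME type.
func loadImage(path string) ([]byte, string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, "", fmt.Errorf("read image %q: %w", path, err)
	}
	return data, sniffMIME(data), nil
}

func main() {
	// The first bytes of a JPEG file are enough for detection.
	jpegHeader := []byte{0xFF, 0xD8, 0xFF, 0xE0}
	fmt.Println(sniffMIME(jpegHeader)) // prints "image/jpeg"
}
```

The returned bytes and MIME type could then be passed to the `Data` and `MIMEType` fields shown in the example.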
### Installation and usage
Add the SDK to your module with `go get google.golang.org/genai`.
### Create Clients
#### Imports

    import "google.golang.org/genai"

Gemini API Client:

    client, err := genai.NewClient(ctx, &genai.ClientConfig{
        APIKey:  apiKey,
        Backend: genai.BackendGoogleAI,
    })

Vertex AI Client:

    client, err := genai.NewClient(ctx, &genai.ClientConfig{
        Project:  project,
        Location: location,
        Backend:  genai.BackendVertexAI,
    })
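The two client snippets set different fields depending on the backend. The sketch below shows one way an application might check its own settings before constructing a client, assuming each backend needs exactly the fields shown in its snippet. The `Backend` and `config` types here are local stand-ins that mirror the field names above; they are not the SDK's `genai.Backend` or `genai.ClientConfig`, and the string values are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a local stand-in, NOT the SDK type. Values are illustrative.
type Backend string

const (
	BackendGoogleAI Backend = "google-ai"
	BackendVertexAI Backend = "vertex-ai"
)

// config mirrors the field names used in the client snippets (assumption:
// APIKey for the Gemini API, Project and Location for Vertex AI).
type config struct {
	APIKey   string
	Project  string
	Location string
	Backend  Backend
}

// validate reports an error when a field used by the chosen backend is empty.
func (c config) validate() error {
	switch c.Backend {
	case BackendGoogleAI:
		if c.APIKey == "" {
			return errors.New("APIKey is required for the Gemini API backend")
		}
	case BackendVertexAI:
		if c.Project == "" || c.Location == "" {
			return errors.New("Project and Location are required for the Vertex AI backend")
		}
	default:
		return fmt.Errorf("unknown backend %q", c.Backend)
	}
	return nil
}

func main() {
	c := config{Backend: BackendVertexAI, Project: "my-project"}
	if err := c.validate(); err != nil {
		fmt.Println("invalid config:", err)
	}
}
```

Failing fast on a missing field keeps configuration mistakes out of the first API call.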
### License
The contents of this repository are licensed under the Apache License, version 2.0.
# Constants
BackendGoogleAI is the Google AI backend.
BackendUnspecified causes the backend to be determined automatically.
BackendVertexAI is the Vertex AI backend.
Candidates blocked because they contain terms from the terminology blocklist.
Candidates blocked for other reasons.
Candidates blocked due to prohibited content.
Candidates blocked due to safety.
Unspecified blocked reason.
Run retrieval only when the system decides it is necessary.
Always trigger retrieval.
Token generation stopped because the content contains forbidden terms.
The function call generated by the model is invalid.
Token generation reached the configured maximum output tokens.
All other reasons that stopped the token generation.
Token generation stopped for potentially containing prohibited content.
Token generation stopped because of potential recitation.
Token generation stopped because the content potentially contains safety violations.
Token generation stopped because the content potentially contains Sensitive Personally Identifiable Information (SPII).
Token generation reached a natural stopping point or a configured stop sequence.
The finish reason is unspecified.
The model is constrained to always predict function calls only.
Default model behavior: the model decides whether to predict function calls or a natural language response.
Model will not predict any function calls.
The function calling config mode is unspecified.
The harm block method uses the probability score.
The harm block method uses both probability and severity scores.
The harm block method is unspecified.
Block low threshold and above.
Block medium threshold and above.
Block none.
Block only high threshold.
Turn off the safety filter.
Unspecified harm block threshold.
The harm category is civic integrity.
The harm category is dangerous content.
The harm category is harassment.
The harm category is hate speech.
The harm category is sexually explicit content.
The harm category is unspecified.
High level of harm.
Low level of harm.
Medium level of harm.
Negligible level of harm.
Harm probability unspecified.
High level of harm severity.
Low level of harm severity.
Medium level of harm severity.
Negligible level of harm severity.
Harm severity unspecified.
Python >= 3.10, with numpy and simpy available.
Unspecified language.
Media resolution set to high (zoomed reframing with 256 tokens).
Media resolution set to low (64 tokens).
Media resolution set to medium (256 tokens).
Media resolution has not been set.
Run retrieval only when the system decides it is necessary.
Always trigger retrieval.
Code execution ran for too long, and was cancelled.
Code execution finished but with a failure.
Code execution completed successfully.
Unspecified status.
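The finish reasons described in this section (natural stop, token limit, safety, recitation, and so on) usually call for different handling in application code. The following is a minimal sketch of grouping them into a few coarse outcomes. `FinishReason`, its constant names, and its string values are local stand-ins assumed for illustration; they are not copied from the SDK, and the grouping is one possible policy rather than the library's behavior.

```go
package main

import "fmt"

// FinishReason is a local stand-in type. Names and values are assumptions.
type FinishReason string

const (
	FinishStop          FinishReason = "STOP"
	FinishMaxTokens     FinishReason = "MAX_TOKENS"
	FinishSafety        FinishReason = "SAFETY"
	FinishRecitation    FinishReason = "RECITATION"
	FinishBlocklist     FinishReason = "BLOCKLIST"
	FinishProhibited    FinishReason = "PROHIBITED_CONTENT"
	FinishSPII          FinishReason = "SPII"
	FinishMalformedCall FinishReason = "MALFORMED_FUNCTION_CALL"
	FinishOther         FinishReason = "OTHER"
)

// classify maps a finish reason to a coarse outcome an application can act on.
func classify(r FinishReason) string {
	switch r {
	case FinishStop:
		return "complete" // natural stop or configured stop sequence
	case FinishMaxTokens:
		return "truncated" // output hit the configured token limit
	case FinishSafety, FinishRecitation, FinishBlocklist, FinishProhibited, FinishSPII:
		return "filtered" // generation stopped by a content check
	case FinishMalformedCall:
		return "invalid function call"
	default:
		return "other" // unspecified or any other reason
	}
}

func main() {
	fmt.Println(classify(FinishMaxTokens)) // prints "truncated"
}
```

A caller might retry with a larger token limit on "truncated" and surface a user-facing message on "filtered".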
# Structs
Content blob.
Class containing a response candidate generated from the model.
Source attributions for content.
Class for citation information when the model quotes another source.
Client is the GenAI client.
ClientConfig is the configuration for the GenAI client.
ClientError is an error that occurs when the GenAI API receives an invalid request from a client.
Result of executing the [ExecutableCode].
Contains the multi-part content of a message.
Configuration for a Control reference image.
Class that represents a Control reference image.
Describes the options to customize dynamic retrieval.
Code generated by the model that is meant to be executed, and the result returned to the model.
URI based data.
A function call.
Function calling config.
Defines a function that the model can generate JSON inputs for.
A function response.
Class for configuring optional model parameters.
Class for configuring the content of the request to the model.
Response message for PredictionService.GenerateContent.
Content filter results for a prompt sent in the request.
Usage metadata about response(s).
Generation config.
The configuration for routing the request to a specific model.
When automated routing is specified, the routing will be determined by the pretrained routing model and customer provided model routing preference.
When manual routing is set, the specified model will be used directly.
Tool to support Google Search in Model.
Tool to retrieve public web data for grounding, powered by Google.
Grounding chunk.
Chunk from context retrieved by the retrieval tools.
Chunk from the web.
Metadata returned to client when grounding is enabled.
Grounding support.
Class that represents an image.
Live struct encapsulates the configuration for realtime interaction with the Generative Language API.
Incremental update of the current conversation delivered from the client.
Messages sent by the client in the API call.
User input that is sent in real time.
Message contains configuration that will apply for the duration of the streaming session.
Client generated response to a `ToolCall` received from the server.
Config class for the session.
Incremental server update generated by the model in response to client messages.
Response message for API call.
Sent in response to a `LiveGenerateContentSetup` message from the client.
Request for the client to execute the `function_calls` and return the responses with the matching `id`s.
Notification for the client that a previously issued `ToolCallMessage` with the specified `id`s should not have been executed and should be cancelled.
Logprobs Result.
Candidate for the logprobs token and score.
Candidates with top log probabilities at each decoding step.
Configuration for a Mask reference image.
Class that represents a Mask reference image.
A datatype containing media content.
The configuration for the prebuilt speaker to use.
Class that represents a Raw reference image.
Defines a retrieval tool that the model can call to access external knowledge.
Metadata related to retrieval in the grounding flow.
Safety rating corresponding to the generated content.
Safety settings.
Schema that defines the format of input and output data.
Google search entry point.
Segment of the content.
ServerError is an error that occurs when the GenAI API encounters an unexpected server problem.
Session struct represents a realtime connection to the API.
The speech generation configuration.
Configuration for a Style reference image.
Class that represents a Style reference image.
Configuration for a Subject reference image.
Class that represents a Subject reference image.
Tool details of a tool that the model may use to generate a response.
Tool that executes code generated by the model, and automatically returns the result to the model.
Tool config.
Used to override the default configuration.
Configuration for upscaling an image.
User-facing config UpscaleImageParameters.
Retrieve from Vertex AI Search datastore for grounding.
Retrieve from Vertex RAG Store for grounding.
The definition of the RAG resource.
Metadata describes the input video content.
The configuration for the voice to use.
# Type aliases
Backend is the GenAI backend to use for the client.
Blocked reason.
Enum representing the control type of a control reference image.
Config class for the dynamic retrieval config mode.
The reason why the model stopped generating tokens.
Config class for the function calling config mode.
Specify if the threshold is used for probability or severity score.
The harm block threshold.
Harm category.
Harm probability levels in the content.
Harm severity levels in the content.
Programming language of the `code`.
Enum representing the mask mode of a mask reference image.
The media resolution to use.
The mode of the predictor to be used in dynamic retrieval.
Outcome of the code execution.
Enum representing the subject type of a subject reference image.
A basic data type.
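The harm probability levels and harm block thresholds listed in this document imply an ordering: a threshold blocks content at its level and above. As a rough illustration of that relationship, the sketch below models both as ordered integers. All type and constant names are hypothetical, the SDK's own types may be represented differently, and treating an unspecified threshold as "block nothing" is an assumption of this sketch, not documented behavior.

```go
package main

import "fmt"

// HarmProbability is a local stand-in, ordered from least to most likely.
type HarmProbability int

const (
	ProbUnspecified HarmProbability = iota
	ProbNegligible
	ProbLow
	ProbMedium
	ProbHigh
)

// BlockThreshold is a local stand-in for the harm block threshold.
type BlockThreshold int

const (
	ThresholdUnspecified BlockThreshold = iota
	BlockLowAndAbove
	BlockMediumAndAbove
	BlockOnlyHigh
	BlockNone
)

// shouldBlock reports whether content with probability p would be blocked
// under threshold t in this simplified model.
func shouldBlock(p HarmProbability, t BlockThreshold) bool {
	switch t {
	case BlockLowAndAbove:
		return p >= ProbLow
	case BlockMediumAndAbove:
		return p >= ProbMedium
	case BlockOnlyHigh:
		return p >= ProbHigh
	default:
		// BlockNone and ThresholdUnspecified: nothing is blocked here (assumption).
		return false
	}
}

func main() {
	fmt.Println(shouldBlock(ProbMedium, BlockLowAndAbove)) // prints "true"
	fmt.Println(shouldBlock(ProbMedium, BlockOnlyHigh))    // prints "false"
}
```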