Releases: cequence-io/openai-scala-client
Version 1.2.0
1. ResponsesAPI
- Unified function/endpoint combining chat simplicity with tool use and state management.
- Out-of-the-box tool support - natively supports first-party tools like `web_search`, `file_search`, and `computer_use`, enabling you to invoke these capabilities without additional orchestration.
- Built-in multi-turn conversation chaining - use the `previous_response_id` parameter to link requests into a chain of turns, and the `instructions` parameter to inject or override system/developer messages on a per-call basis.
- Multimodal input and output - beyond text, the API accepts images and audio in the same request, letting you build fully multimodal, tool-augmented experiences in a single call.
2. Core and OpenAI Enhancements
- JSON mode handling improvements and a fallback json-repair implementation (port of `json-repair` by @mangiucugna)
- New models: `o3`, `o4-mini`, `gpt-4.1`, and `gpt-4.5` series
- Web search support (`gpt-4o-search-preview`)
- Chat completion parameters expanded (`store`, `reasoning_effort`, `service_tier`, `parallel_tool_calls`, `metadata`)
- Streaming and non-streaming IO conversion adapters developed and enhanced
- Token counting updated (`jtokkit` v1.1.0)
- Usage analytics improved
3. Anthropic Platform Enhancements
- Thinking and streaming settings integration
- Claude 3.7 Sonnet (Vanilla and via Bedrock)
- Citations handling, text blocks encoding improvements
- Caching support
- Enhanced token-limit error handling and mapped Anthropic to OpenAI exceptions
- A ton of new examples (also for Vision and PDF processing)
4. Google Gemini Integration
- New Google Gemini module and models introduced (Gemini 2.5 / 2.0 Pro and Flash)
- Gemini JSON schema handling improved, including OpenAI wrapper integration
- System message caching, domain content management, and usage tracking adjustments
- Note that Google Vertex now also supports JSON schema mode
5. Perplexity Sonar Integration
- New Perplexity Sonar module and models introduced (sonar-deep-research, reasoning-pro, sonar-pro, etc.)
- Sonar JSON and regex response support, and citations formatting/handling
- OpenAI chat completion wrappers
6. Other Providers: Deepseek, Groq, Grok, FireworksAI, and Novita
- Groq JSON handling unified and adjusted, with `deepseek-r1-distill-llama-70b` integration
- JSON schema handling for Grok models
- FireworksAI improvements (document inlining), Deepseek model integrations
- Message conversions, filtering thinking tokens, reasoning effort examples
- Llama 4 family
- New Deepseek models (deepseek-r1, DeepSeek-R1 distill) across providers (FireworksAI, Groq, Together AI), plus other models such as Phi-3-vision-128k-instruct, Deepseek-v2-lite-chat, and Llama-3.3-70b
- New chat completion provider: Novita — Welcome to the family!
7. General Project Setup and CI/CD
- Build setup adjustments (build.sbt registrations, env helpers)
- GitHub CI - upload-artifact version bump (to v4)
- Example datasets added (e.g., norway_wiki dump), imports optimized
- README extended with more examples
Version 1.1.2
- Amazon Bedrock support: Anthropic models, payload encoding, and AWS stream decoders
- O1 models support: system/developer messages, JSON mode, etc.
- New non-OpenAI models: Deepseek v3, Gemini 2.0 Flash (w. thinking), and Llama3
- Adapters: chat completion intercept (for logging and benchmarking)
- Other: new examples, Google Vertex upgrade, custom `parseJson` (for `createChatCompletionWithJSON`)
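The JSON helper mentioned above is typically used along these lines. This is a sketch under assumptions: the implicit `Format`, message type, and exact imports mirror the library's chat completion API but are not verified here - see the repo's JSON examples for the precise usage.

```scala
// Sketch: deserializing a chat completion directly into a case class
// via createChatCompletionWithJSON. The Play-JSON Format and the
// UserMessage type are assumptions taken from the library's style.
import play.api.libs.json.{Format, Json}
import scala.concurrent.ExecutionContext.Implicits.global

case class CityInfo(city: String, country: String, population: Long)
implicit val cityInfoFormat: Format[CityInfo] = Json.format[CityInfo]

service
  .createChatCompletionWithJSON[CityInfo](
    messages = Seq(UserMessage("Return basic facts about Oslo as JSON."))
  )
  .map(info => println(s"${info.city} (${info.country}): ${info.population}"))
```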
Version 1.1.1
- New Models:
  - Claude 3.5 Sonnet / Haiku, Llama 3.2, grok-beta, gpt-4o-2024-11-20, etc.
- New Providers:
  - Grok and Deepseek (with examples)
- Enhanced Anthropic Integration:
  - Message/block caching
  - System prompt generalized as a "message"
  - PDF block processing
- Better Support for JSON: `createChatCompletionWithJSON`
- Adapters:
  - Failover models support for chat completion (`createChatCompletionWithFailover`)
  - Core adapters/wrappers abstracted and relocated to `ws-client`
- Fixes:
  - Scala 2.13 JSON schema serialization
- Refactoring:
  - `ChatCompletionBodyMaker` (removed `WSClient` dependency)
  - Removed `org.joda.time.DateTime`
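The failover helper can be sketched as follows. The parameter names are assumptions based on these release notes rather than verified signatures, and the fallback model id is purely illustrative.

```scala
// Sketch: fall back to alternative models when the primary model's
// chat completion call fails. `failoverModels` and the settings shape
// are assumptions; the fallback model id below is illustrative only.
import scala.concurrent.ExecutionContext.Implicits.global

service
  .createChatCompletionWithFailover(
    messages = Seq(UserMessage("Summarize this release in one line.")),
    settings = CreateChatCompletionSettings(ModelId.gpt_4o),
    failoverModels = Seq("grok-beta")
  )
  .map(response => println(response.choices.head.message.content))
```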
Version 1.1.0
- Support for O1 Models: Introduced special handling for settings and conversion of system messages.
- New Endpoints: Added endpoints for runs, run steps, vector stores, vector store files, and vector store file batches.
- Structured Output: Enhanced output with JSON schema and introduced experimental reflective inference from a case class.
- New Providers: Added support for Google Vertex AI, TogetherAI, Cerebras, and Mistral.
- Fixes: Resolved issues related to fine-tune jobs format, handling of log-probs with special characters, etc.
- Examples/Scenarios: A plethora of new examples!
Lastly, a huge thanks 🙏 to our contributors: @bburdiliak, @branislav-burdiliak, @vega113, and @SunPj!
Version 1.0.0
API Updates
- New functions/endpoints for assistants, threads, messages, and files (thanks @branislav-burdiliak) 🎉
- Fine-tuning API updated (checkpoints, explicit dimensions, etc.)
- Chat completion with images (GPT vision), support for logprobs
- Audio endpoint updates - transcription, translation, speech
- New message hierarchy
- Support for tool calling
- New token count subproject (thanks @branislav-burdiliak, @piotrkuczko)
New Models and Providers
- Improved support for Azure OpenAI services
- Support for Azure AI models
- Support for Groq, Fireworks, OctoAI, Ollama, and FastChat (with examples) 🎉
- New Anthropic client project with OpenAI-compatible chat completion adapter (thanks @branislav-burdiliak) 🎉
- `NonOpenAIModelId` const object introduced, holding the most popular non-OpenAI models (e.g., Llama3-70b, Mixtral-8x22B)
Factories and Adapters
- Full, core, and chat completion factories refactored to support easier and more flexible streaming extension
- Service wrappers refactored and generalized to adapters
- Route adapter allowing the use of several providers (e.g., Groq and Anthropic) alongside OpenAI models for chat-completions 🎉
- Other adapters added: round-robin/random-order load balancing, logging, retry, chat-to-completion, and settings/messages adapters
Bug Fixes and Improvements
- Fixed null bytes handling in TopLogprobInfo JSON format
- WS request - handling slash at the end of URLs
- Made "data:" prefix handling in `WSStreamRequestHelper` more robust
- `MultipartWritable` - added extension-implied content type for Azure file upload
- Fixed chat completion's `response_format_type`
Examples and Tests
- New example project demonstrating usage of most of the features, providers, and adapters (more than 50)
- Tests for token counts and JSON (de)serialization
Breaking Changes/Migrations:
- Instead of the deprecated `MessageSpec`, use the typed `{System, User, Assistant, Tool}Message`
- Instead of `createChatFunCompletion` with `FunctionSpec`(s), migrate to `createChatToolCompletion` using `ToolSpec`(s)
- Use the new factory `OpenAIChatCompletionServiceFactory` with a custom URL and auth headers when only chat completion is supported by a provider
- Migrate streaming service creation from `OpenAIServiceStreamedFactory` to `OpenAIStreamedServiceFactory` or `OpenAIServiceFactory.withStreaming()` (required import: `OpenAIStreamedServiceImplicits._`)
- Note that `OpenAIServiceStreamedExtra.listFineTuneEventsStreamed` has been removed
- Migrate `OpenAIMultiServiceAdapter.ofRoundRobin` and `ofRandomOrder` to `adapters.roundRobin` and `adapters.randomOrder` (where `adapters` is, e.g., `OpenAIServiceAdapters.forFullService`)
- Migrate `OpenAIRetryServiceAdapter` to `adapters.retry`
- `CreateImageSettings` -> `CreateImageEditSettings`
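The multi-service adapter migration above can be sketched like this. It assumes `service1` and `service2` are already-created service instances; the `adapters.retry` arguments shown (underlying service plus an optional log function) are an assumption to be checked against the README.

```scala
// Sketch of the adapter migration: adapters.roundRobin/randomOrder
// replace the removed OpenAIMultiServiceAdapter.ofRoundRobin/ofRandomOrder.
// Assumes service1 and service2 are existing OpenAIService instances.
val adapters = OpenAIServiceAdapters.forFullService

// Before (removed): OpenAIMultiServiceAdapter.ofRoundRobin(service1, service2)
val balanced = adapters.roundRobin(service1, service2)
// or: adapters.randomOrder(service1, service2)

// Before (removed): OpenAIRetryServiceAdapter(...)
val resilient = adapters.retry(balanced, Some(println(_)))
```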
Version 0.5.0
- Fine-tuning functions/endpoints updated to match the latest API:
  - Root URL changed to `/fine_tuning/jobs`
  - Fine-tune job holder (`FineTuneJob`) adjusted - e.g., `finished_at` added, `training_file` vs `training_files` (⚠️ Breaking Changes)
  - `OpenAIService.createFineTune` - settings (`CreateFineTuneSettings`) adapted - all attributes except `n_epochs`, `model`, and `suffix` dropped, e.g., `batch_size`, `learning_rate_multiplier`, `prompt_loss_weight` (⚠️ Breaking Changes)
  - `OpenAIService.listFineTunes` - new optional params: `after` and `limit`
  - `OpenAIService.listFineTuneEvents` - new optional params: `after` and `limit`
- Azure service factory functions fixed: `forAzureWithApiKey` and `forAzureWithAccessToken`
- New models added: `gpt-3.5-turbo-instruct` (with 0914 snapshot), `davinci-002`, and `babbage-002`
- Deprecations: `OpenAIService.createEdit` and `OpenAIServiceStreamedExtra.listFineTuneEventsStreamed` (no longer supported)
- Links to the official OpenAI documentation updated
Version 0.4.1
- Retries:
  - `RetryHelpers` trait and a retry adapter implementing non-blocking retries introduced (thanks @phelps-sg)
  - `RetryHelpers` - fixed exponential delay, logging, and the `fun(underlying)` call; removed unused implicits
  - New exceptions introduced (`OpenAIScalaUnauthorizedException`, `OpenAIScalaRateLimitException`, etc.) with error handling/catching, plus "registering" those that should be automatically retried. ⚠️ The old/deprecated `OpenAIRetryServiceAdapter` was removed in favor of a new adapter in the client module (migrate if necessary).
- Support for "OpenAI-API-compatible" providers such as FastChat (Vicuna, Alpaca, LLaMA, Koala, fastchat-t5-3b-v1.0, mpt-7b-chat, etc.), Azure, or any other similar service with a custom URL. Explicit factory methods for Azure (with API Key or Access Token) provided.
- Scalafmt and Scalafix configured and applied to the entire source base
- Fixed test dependencies (e.g., `akka-actor-testkit-typed`, `scala-java8-compat`) and replaced `mockito-scala-scalatest` with `scalatestplus` mockito
- Configured GitHub CI with crossScalaVersions, newer Java, etc. (thanks @phelps-sg)
Version 0.4.0
- Function call support for chat completion with a JSON output:
  - Function `createChatFunCompletion` introduced, with functions defined by a new data holder `FunctionSpec`
  - To stay compatible with the standard `createChatCompletion` function, the existing `MessageSpec` has not been changed
  - Important ⚠️: to harmonize (redundant) response classes, `ChatMessage` was replaced by the identical `MessageSpec` class
- New `gpt-3.5-turbo` and `gpt-4` models for the `0613` snapshot added to `ModelId`. Old models deprecated.
- `OpenAIService.close` method declared with parentheses (thanks @phelps-sg)
- Scala `2.12` bumped to `2.12.18` and Scala `2.13` to `2.13.11`
- sbt version bumped to `1.9.0` (thanks @phelps-sg)
- `Command` renamed to `EndPoint` and `Tag` renamed to `Param`
- `OpenAIMultiServiceAdapter` - `ofRotationType` deprecated in favor of the new name `ofRoundRobinType`
- `OpenAIMultiServiceAdapter` - `ofRandomAccessType` deprecated in favor of `ofRandomOrderType`
Version 0.3.3
- Scala 3 compatibility refactorings - `JsonFormats`, Scala-Guice lib upgrade, Guice modules, Enums migrations
- `OpenAIService.close` function for the underlying ws client exposed
- OpenAI service wrapper with two adapters introduced:
  1. Multi-service (load distribution) - rotation or random order
  2. Retry service - custom number of attempts, wait on failure, and logging
Version 0.3.2
- Removing deprecated/unused command/endpoint (`engines`) and settings (`ListFineTuneEventsSettings`)
- New exception `OpenAIScalaTokenCountExceededException` introduced - thrown for completion or chat completion when the number of tokens needed is higher than allowed
- Migrating `Command` and `Tag` enums to sealed case objects to simplify Scala 3 compilation
- Resolving the `scala-java8-compat` eviction problem for Scala 2.13 and Scala 3