sttp-openai

sttp is a family of Scala HTTP-related projects, and currently includes:

sttp client: The Scala HTTP client you always wanted!
sttp tapir: Typed API descRiptions
sttp openai: this project. Unofficial Scala client wrapper for the OpenAI (and OpenAI-compatible) API. Use the power of ChatGPT inside your code!

Intro

sttp-openai uses sttp client to describe requests and responses used in OpenAI (and OpenAI-compatible) endpoints.

Quickstart with sbt

Add the following dependency:

"com.softwaremill.sttp.openai" %% "core" % "0.2.6"

sttp-openai is available for Scala 2.13 and Scala 3.

Project content

OpenAI API Official Documentation https://platform.openai.com/docs/api-reference/completions

Example

Examples are runnable using scala-cli.

To use ChatGPT

//> using dep com.softwaremill.sttp.openai::core:0.2.6

import sttp.openai.OpenAISyncClient
import sttp.openai.requests.completions.chat.ChatRequestResponseData.ChatResponse
import sttp.openai.requests.completions.chat.ChatRequestBody.{ChatBody, ChatCompletionModel}
import sttp.openai.requests.completions.chat.message._

object Main extends App {
  val apiKey = System.getenv("OPENAI_KEY")
  val openAI = OpenAISyncClient(apiKey)

  // Create body of Chat Completions Request
  val bodyMessages: Seq[Message] = Seq(
    Message.UserMessage(
      content = Content.TextContent("Hello!"),
    )
  )

  // use ChatCompletionModel.CustomChatCompletionModel("gpt-some-future-version") 
  // for models not yet supported here
  val chatRequestBody: ChatBody = ChatBody(
    model = ChatCompletionModel.GPT4oMini,
    messages = bodyMessages
  )

  // be aware that calling `createChatCompletion` may throw an OpenAIException
  // e.g. AuthenticationException, RateLimitException and many more
  val chatResponse: ChatResponse = openAI.createChatCompletion(chatRequestBody)

  println(chatResponse)
  /*
      ChatResponse(
       chatcmpl-79shQITCiqTHFlI9tgElqcbMTJCLZ,chat.completion,
       1682589572,
       gpt-4o-mini,
       Usage(10,10,20),
       List(
         Choices(
           Message(assistant, Hello there! How can I assist you today?), stop, 0)
       )
      )
  */
}
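The example above reads the key with System.getenv, which returns null when the variable is unset, so a missing key only surfaces once a request fails. A minimal sketch of a safer lookup, assuming the key lives in OPENAI_KEY as in the example. ApiKeyLookup and resolveApiKey are hypothetical names used for illustration and are not part of sttp-openai:

```scala
object ApiKeyLookup {
  // Hypothetical helper, not part of sttp-openai.
  // Returns Right(key) when the variable is present and non-blank,
  // otherwise Left with a human-readable explanation.
  def resolveApiKey(
      env: Map[String, String] = sys.env,
      name: String = "OPENAI_KEY"
  ): Either[String, String] =
    env
      .get(name)
      .map(_.trim)
      .filter(_.nonEmpty)
      .toRight(s"Environment variable $name is not set or is empty")
}
```

A Right value can be passed to OpenAISyncClient(apiKey) as in the example above, while a Left can be reported before any request is sent.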

To use Ollama or Groq (OpenAI-compatible APIs)

Ollama with sync backend:

//> using dep com.softwaremill.sttp.openai::core:0.2.6

import sttp.model.Uri._
import sttp.openai.OpenAISyncClient
import sttp.openai.requests.completions.chat.ChatRequestResponseData.ChatResponse
import sttp.openai.requests.completions.chat.ChatRequestBody.{ChatBody, ChatCompletionModel}
import sttp.openai.requests.completions.chat.message._

object Main extends App {
  // Create an instance of OpenAISyncClient, providing any API key
  // and the base URL of a locally running Ollama instance
  val openAI: OpenAISyncClient = OpenAISyncClient("ollama", uri"http://localhost:11434/v1")

  // Create body of Chat Completions Request
  val bodyMessages: Seq[Message] = Seq(
    Message.UserMessage(
      content = Content.TextContent("Hello!"),
    )
  )
  
  val chatRequestBody: ChatBody = ChatBody(
    // assuming one has already executed `ollama pull mistral` in console
    model = ChatCompletionModel.CustomChatCompletionModel("mistral"),
    messages = bodyMessages
  )

  // be aware that calling `createChatCompletion` may throw an OpenAIException
  // e.g. AuthenticationException, RateLimitException and many more
  val chatResponse: ChatResponse = openAI.createChatCompletion(chatRequestBody)

  println(chatResponse)
  /*
    ChatResponse(
      chatcmpl-650,
      List(
        Choices(
          Message(Assistant, """Hello there! How can I help you today?""", List(), None),
          "stop",
          0
        )
      ),
      1714663831,
      "mistral",
      "chat.completion",
      Usage(0, 187, 187),
      Some("fp_ollama")
    )
  */
}

Groq with cats-effect based backend:

//> using dep com.softwaremill.sttp.openai::core:0.2.6
//> using dep com.softwaremill.sttp.client4::cats:4.0.0-M17

import cats.effect.{ExitCode, IO, IOApp}
import sttp.client4.httpclient.cats.HttpClientCatsBackend

import sttp.model.Uri._
import sttp.openai.OpenAI
import sttp.openai.OpenAIExceptions.OpenAIException
import sttp.openai.requests.completions.chat.ChatRequestResponseData.ChatResponse
import sttp.openai.requests.completions.chat.ChatRequestBody.{ChatBody, ChatCompletionModel}
import sttp.openai.requests.completions.chat.message._

object Main extends IOApp {
  override def run(args: List[String]): IO[ExitCode] = {
    val apiKey = System.getenv("OPENAI_KEY")
    val openAI = new OpenAI(apiKey, uri"https://api.groq.com/openai/v1")

    val bodyMessages: Seq[Message] = Seq(
      Message.UserMessage(
        content = Content.TextContent("Hello!"),
      )
    )

    val chatRequestBody: ChatBody = ChatBody(
      model = ChatCompletionModel.CustomChatCompletionModel("gemma-7b-it"),
      messages = bodyMessages
    )
    
    HttpClientCatsBackend.resource[IO]().use { backend =>
      val response: IO[Either[OpenAIException, ChatResponse]] =
        openAI
          .createChatCompletion(chatRequestBody)
          .send(backend)
          .map(_.body)
      val rethrownResponse: IO[ChatResponse] = response.rethrow
      val redeemedResponse: IO[String] = rethrownResponse.redeem(
        error => error.getMessage,
        chatResponse => chatResponse.toString
      )
      redeemedResponse.flatMap(IO.println)
        .as(ExitCode.Success)
    }
  } 
  /*
    ChatResponse(
      "chatcmpl-e0f9f78c-5e74-494c-9599-da02fa495ff8",
      List(
        Choices(
          Message(Assistant, "Hello! 👋 It's great to hear from you. What can I do for you today? 😊", List(), None),
          "stop",
          0
        )
      ),
      1714667435,
      "gemma-7b-it",
      "chat.completion",
      Usage(16, 21, 37),
      Some("fp_f0c35fc854")
    )
  */
}

Available client implementations:

OpenAISyncClient, which provides high-level methods to interact with OpenAI. All of its methods send requests synchronously, block the calling thread, and may throw an OpenAIException.
OpenAI, which provides raw sttp-client4 Requests and parses Responses as Either[OpenAIException, A].
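Because the synchronous client reports failures by throwing, callers that prefer values over exceptions can wrap each call. A minimal sketch using only the Scala standard library. SafeCall and attempt are hypothetical names, not part of sttp-openai, and the error is flattened to a String purely for illustration:

```scala
import scala.util.{Failure, Success, Try}

object SafeCall {
  // Hypothetical helper, not part of sttp-openai.
  // Evaluates a blocking call and converts a thrown non-fatal exception
  // into a Left that carries the exception class name and message.
  def attempt[A](call: => A): Either[String, A] =
    Try(call) match {
      case Success(value) => Right(value)
      case Failure(error) => Left(s"${error.getClass.getSimpleName}: ${error.getMessage}")
    }
}
```

With a client in scope, this could be used as SafeCall.attempt(openAI.createChatCompletion(chatRequestBody)). Try only catches non-fatal throwables, so fatal errors such as OutOfMemoryError still propagate.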

If you want to use a different effect type, use OpenAI and pass your chosen backend directly to request.send(backend).

To customize a request when using the OpenAISyncClient, e.g. by adding a header, or changing the timeout (via request options), you can use the .customizeRequest method on the client.

The example below uses HttpClientCatsBackend as the backend. Make sure to add it to your dependencies, or use a backend of your choice.

//> using dep com.softwaremill.sttp.openai::core:0.2.6
//> using dep com.softwaremill.sttp.client4::cats:4.0.0-M17

import cats.effect.{ExitCode, IO, IOApp}
import sttp.client4.httpclient.cats.HttpClientCatsBackend

import sttp.openai.OpenAI
import sttp.openai.OpenAIExceptions.OpenAIException
import sttp.openai.requests.completions.chat.ChatRequestResponseData.ChatResponse
import sttp.openai.requests.completions.chat.ChatRequestBody.{ChatBody, ChatCompletionModel}
import sttp.openai.requests.completions.chat.message._

object Main extends IOApp {
  override def run(args: List[String]): IO[ExitCode] = {
    val apiKey = System.getenv("OPENAI_KEY")
    val openAI = new OpenAI(apiKey)

    val bodyMessages: Seq[Message] = Seq(
      Message.UserMessage(
        content = Content.TextContent("Hello!"),
      )
    )

    val chatRequestBody: ChatBody = ChatBody(
      model = ChatCompletionModel.GPT35Turbo,
      messages = bodyMessages
    )
    
    HttpClientCatsBackend.resource[IO]().use { backend =>
      val response: IO[Either[OpenAIException, ChatResponse]] =
        openAI
          .createChatCompletion(chatRequestBody)
          .send(backend)
          .map(_.body)
      val rethrownResponse: IO[ChatResponse] = response.rethrow
      val redeemedResponse: IO[String] = rethrownResponse.redeem(
        error => error.getMessage,
        chatResponse => chatResponse.toString
      )
      redeemedResponse.flatMap(IO.println)
        .as(ExitCode.Success)
    }
  } 
  /*
    ChatResponse(
      chatcmpl-79shQITCiqTHFlI9tgElqcbMTJCLZ,chat.completion,
      1682589572,
      gpt-3.5-turbo-0301,
      Usage(10,10,20),
      List(
        Choices(
          Message(assistant, Hello there! How can I assist you today?), stop, 0)
      )
    )
  */
}

Create completion with streaming:

To enable streaming support for the Chat Completion API using server-sent events, you must include the appropriate dependency for your chosen streaming library. We provide support for the following libraries: fs2, ZIO, Akka / Pekko Streams and Ox.

For example, to use fs2 add the following dependency & import:

// sbt dependency
"com.softwaremill.sttp.openai" %% "fs2" % "0.2.6"

// import 
import sttp.openai.streaming.fs2._

Example below uses HttpClientFs2Backend as a backend:

//> using dep com.softwaremill.sttp.openai::fs2:0.2.6

import cats.effect.{ExitCode, IO, IOApp}
import fs2.Stream
import sttp.client4.httpclient.fs2.HttpClientFs2Backend

import sttp.openai.OpenAI
import sttp.openai.streaming.fs2._
import sttp.openai.OpenAIExceptions.OpenAIException
import sttp.openai.requests.completions.chat.ChatChunkRequestResponseData.ChatChunkResponse
import sttp.openai.requests.completions.chat.ChatRequestBody.{ChatBody, ChatCompletionModel}
import sttp.openai.requests.completions.chat.message._

object Main extends IOApp {
  override def run(args: List[String]): IO[ExitCode] = {
    val apiKey = System.getenv("OPENAI_KEY")
    val openAI = new OpenAI(apiKey)

    val bodyMessages: Seq[Message] = Seq(
      Message.UserMessage(
        content = Content.TextContent("Hello!"),
      )
    )

    val chatRequestBody: ChatBody = ChatBody(
      model = ChatCompletionModel.GPT35Turbo,
      messages = bodyMessages
    )

    HttpClientFs2Backend.resource[IO]().use { backend =>
      val response: IO[Either[OpenAIException, Stream[IO, ChatChunkResponse]]] =
        openAI
          .createStreamedChatCompletion[IO](chatRequestBody)
          .send(backend)
          .map(_.body)

      response
        .flatMap {
          case Left(exception) => IO.println(exception.getMessage)
          case Right(stream)   => stream.evalTap(IO.println).compile.drain
        }
        .as(ExitCode.Success)
    }
  }
  /*
    ...
    ChatChunkResponse(
      "chatcmpl-8HEZFNDmu2AYW8jVvNKyRO4W4KcO8",
      "chat.completion.chunk",
      1699118265,
      "gpt-3.5-turbo-0613",
      List(
        Choices(
          Delta(None, Some("Hi"), None),
          null,
          0
        )
      )
    )
    ...
    ChatChunkResponse(
      "chatcmpl-8HEZFNDmu2AYW8jVvNKyRO4W4KcO8",
      "chat.completion.chunk",
      1699118265,
      "gpt-3.5-turbo-0613",
      List(
        Choices(
          Delta(None, Some(" there"), None),
          null,
          0
        )
      )
    )
    ...
   */
}
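Each streamed chunk carries only a fragment of the reply ("Hi", " there", ...), so the full message has to be assembled by the caller. A minimal sketch of that assembly over already-collected chunks. Delta, Choice and Chunk below are simplified stand-ins defined only for this sketch, not the library's ChatChunkResponse types, and keep just the fields needed here:

```scala
object ChunkAssembly {
  // Simplified stand-ins for the streamed chunk types (assumption:
  // only the text fragment and the choice index matter for assembly).
  final case class Delta(content: Option[String])
  final case class Choice(delta: Delta, index: Int)
  final case class Chunk(choices: List[Choice])

  // Concatenates the text fragments of one choice, in arrival order.
  // Chunks without content (for example a final chunk) are skipped.
  def assemble(chunks: Seq[Chunk], choiceIndex: Int = 0): String =
    chunks
      .flatMap(_.choices)
      .filter(_.index == choiceIndex)
      .flatMap(_.delta.content)
      .mkString
}
```

For the two chunks shown in the output above, this would yield "Hi there". With a real stream, the same idea can be applied by folding over the stream instead of a collected Seq.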

To use direct-style streaming (requires Scala 3) add the following dependency & import:

// sbt dependency
"com.softwaremill.sttp.openai" %% "ox" % "0.2.6"

// import 
import sttp.openai.streaming.ox.*

Example code:

//> using dep com.softwaremill.sttp.openai::ox:0.2.6

import ox.*
import ox.either.orThrow
import sttp.client4.DefaultSyncBackend
import sttp.openai.OpenAI
import sttp.openai.requests.completions.chat.ChatRequestBody.{ChatBody, ChatCompletionModel}
import sttp.openai.requests.completions.chat.message.*
import sttp.openai.streaming.ox.*

object Main extends OxApp:
  override def run(args: Vector[String])(using Ox): ExitCode =
    val apiKey = System.getenv("OPENAI_KEY")
    val openAI = new OpenAI(apiKey)
    
    val bodyMessages: Seq[Message] = Seq(
      Message.UserMessage(
        content = Content.TextContent("Hello!")
      )
    )
    
    val chatRequestBody: ChatBody = ChatBody(
      model = ChatCompletionModel.GPT35Turbo,
      messages = bodyMessages
    )
    
    val backend = useCloseableInScope(DefaultSyncBackend())
    openAI
      .createStreamedChatCompletion(chatRequestBody)
      .send(backend)
      .body // this gives us an Either[OpenAIException, Flow[ChatChunkResponse]]
      .orThrow // we choose to throw any exceptions and fail the whole app
      .runForeach(el => println(el.orThrow))
    
    ExitCode.Success

See also the ChatProxy example application.

Structured Outputs/JSON Schema support

To take advantage of OpenAI's Structured Outputs and support for JSON Schema, you can use ResponseFormat.JsonSchema when creating a completion.

The example below produces a JSON object:

//> using dep com.softwaremill.sttp.openai::core:0.2.6

import scala.collection.immutable.ListMap
import sttp.apispec.{Schema, SchemaType}
import sttp.openai.OpenAISyncClient
import sttp.openai.requests.completions.chat.ChatRequestResponseData.ChatResponse
import sttp.openai.requests.completions.chat.ChatRequestBody.{ChatBody, ChatCompletionModel, ResponseFormat}
import sttp.openai.requests.completions.chat.message._

object Main extends App {
  val apiKey = System.getenv("OPENAI_KEY")
  val openAI = OpenAISyncClient(apiKey)

  val jsonSchema: Schema =
    Schema(SchemaType.Object).copy(properties =
      ListMap(
        "steps" -> Schema(SchemaType.Array).copy(items =
          Some(Schema(SchemaType.Object).copy(properties =
            ListMap(
              "explanation" -> Schema(SchemaType.String),
              "output" -> Schema(SchemaType.String)
            )
          ))
        ),
        "finalAnswer" -> Schema(SchemaType.String)
      ),
    )

  val responseFormat: ResponseFormat.JsonSchema =
    ResponseFormat.JsonSchema(
      name = "mathReasoning",
      strict = true,
      schema = jsonSchema
    )

  val bodyMessages: Seq[Message] = Seq(
    Message.SystemMessage(content = "You are a helpful math tutor. Guide the user through the solution step by step."),
    Message.UserMessage(content = Content.TextContent("How can I solve 8x + 7 = -23"))
  )

  // Create body of Chat Completions Request, using our JSON Schema as the `responseFormat`
  val chatRequestBody: ChatBody = ChatBody(
    model = ChatCompletionModel.GPT4oMini,
    messages = bodyMessages,
    responseFormat = Some(responseFormat)
  )

  val chatResponse: ChatResponse = openAI.createChatCompletion(chatRequestBody)

  println(chatResponse.choices)
  /*
    List(
      Choices(
        Message(
          Assistant,
          {
            "steps": [
              {"explanation": "Start with the original equation: 8x + 7 = -23", "output": "8x + 7 = -23"},
              {"explanation": "Subtract 7 from both sides to isolate the term with x.", "output": "8x + 7 - 7 = -23 - 7"},
              {"explanation": "This simplifies to: 8x = -30", "output": "8x = -30"},
              {"explanation": "Now, divide both sides by 8 to solve for x.", "output": "x = -30 / 8"},
              {"explanation": "Simplify -30 / 8 to its simplest form. Both the numerator and denominator can be divided by 2.", "output": "x = -15 / 4"}
            ],
            "finalAnswer": "x = -15/4"
          },
          List(),
          None
        ),
        stop,
        0
      )
    )
  */
}

Deriving a JSON Schema with tapir

To derive the same math reasoning schema used above, you can use tapir's support for generating a JSON Schema from a tapir schema:

//> using dep com.softwaremill.sttp.tapir::tapir-apispec-docs:1.11.7

import sttp.apispec.{Schema => ASchema}
import sttp.tapir.Schema
import sttp.tapir.docs.apispec.schema.TapirSchemaToJsonSchema
import sttp.tapir.generic.auto._

case class Step(
  explanation: String,
  output: String
)

case class MathReasoning(
  steps: List[Step],
  finalAnswer: String
)

val tSchema = implicitly[Schema[MathReasoning]]

val jsonSchema: ASchema = TapirSchemaToJsonSchema(
  tSchema,
  markOptionsAsNullable = true
)

Contributing

If you have a question or hit a problem, feel free to post on our community forum: https://softwaremill.community/c/open-source/

Or, if you encounter a bug, or something is unclear in the code or documentation, don't hesitate to open an issue on GitHub.

Commercial Support

We offer commercial support for sttp and related technologies, as well as development services. Contact us to learn more about our offer!
