Integrating Ollama’s APIs with Flutter: Building a Local ChatGPT Flutter App

Lavkant Kachhwaha


LLM Conversations: Ollama + Flutter for Affordable ChatGPT Apps

In this tutorial, we’ll explore how to harness Ollama’s local APIs within a Flutter app to create a conversational AI experience similar to ChatGPT.

Prerequisites :

Before we dive in, make sure you have:

  • Basic understanding of Flutter development
  • Basic understanding of Ollama and LLMs
  • Basic knowledge of RESTful APIs

Setting up Ollama :

  • Download the Ollama application for your operating system (Mac, Windows, or Linux) from the official website.
  • Install the downloaded Ollama application by following the on-screen instructions.
  • The CLI tools needed for local development are installed automatically alongside the Ollama application.
  • You’re now ready to start using Ollama locally for your development needs!

curl --location 'http://localhost:11434/'

After running this curl command, you should receive the message “Ollama is running”.
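
The same check can be made from Dart before the app sends its first prompt. This is only a minimal sketch, assuming the default address localhost:11434 used in the curl command above; the function name isOllamaRunning is hypothetical and is not part of Ollama.

```dart
import 'dart:convert';
import 'dart:io';

/// Returns true when a GET request to the server root answers with HTTP 200
/// and a body containing "Ollama is running".
Future<bool> isOllamaRunning({String host = 'localhost', int port = 11434}) async {
  final client = HttpClient();
  try {
    final request = await client.get(host, port, '/');
    final response = await request.close();
    // Read the whole (short) body as one string.
    final body = await response.transform(utf8.decoder).join();
    return response.statusCode == HttpStatus.ok && body.contains('Ollama is running');
  } on SocketException {
    // Nothing is listening on that port, so the server is not running.
    return false;
  } finally {
    client.close();
  }
}
```

The app could call this once at startup and show a hint to launch Ollama when it returns false.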

Setting Up Flutter Project for Ollama Integration :

Create a New Flutter Project

  • Open your terminal or command prompt.
  • Run the command flutter create <project_name> to create a new Flutter project.
  • Navigate into the project directory using cd <project_name>.

Add Dependencies:

  • Open the pubspec.yaml file in your Flutter project.
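  • Add the dependencies for dependency injection (get_it, injectable), routing (auto_route), networking (dio), and state management (flutter_bloc, equatable). The code generators go under dev_dependencies:
dependencies:
  auto_route: ^7.8.4
  dio: ^5.4.1
  equatable: ^2.0.5
  flutter_bloc: ^8.1.4
  get_it: ^7.6.7
  injectable: ^2.3.2

dev_dependencies:
  auto_route_generator: ^7.3.2
  build_runner: ^2.4.8
  injectable_generator: ^2.4.1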

We will follow Clean Architecture for the app:

| |____datasource
| | |____remote_chat_datasource.dart
| |____repository
| | |____chat_repository.dart
| |____repository
| | |____chat_repository.dart
| |____entity
| | |____chat_response_entity.dart
| |____usecase
| | |____get_chat_response_usecase.dart
| |____cubit
| | |____chat_cubit.dart
| | |____chat_state.dart
| |____page
| | |____chat_page.dart

We will use injectable_generator for dependency injection and module registration.

We will use auto_route_generator for router navigation.

Datasource : Chat Datasource

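import 'dart:convert';
import 'dart:io';

abstract class ChatDatasource {
  Stream<ChatResponseEntity> getChatResponseFromServer({required String userInput});
  Future<void> abortCurrentRequest();
}

class RemoteChatDatasource extends ChatDatasource {
  RemoteChatDatasource(this._client);

  final HttpClient _client;
  late HttpClientRequest request;

  @override
  Stream<ChatResponseEntity> getChatResponseFromServer({required String userInput}) async* {
    try {
      // The host argument is missing from the original listing; baseHost is a
      // placeholder name for it (for a local Ollama server, 'localhost').
      request = await _client.post(baseHost, basePort, basePath);
      Map<String, dynamic> jsonMap = {"model": baseModel, "prompt": userInput};
      String jsonString = json.encode(jsonMap);
      List<int> bodyBytes = utf8.encode(jsonString);
      request.headers.contentType = ContentType.json;
      request.add(bodyBytes);
      HttpClientResponse response = await request.close();

      // Ollama streams one JSON object per line, so split on line breaks before
      // decoding: a network chunk may hold a partial object or several objects.
      await for (final line in response.transform(utf8.decoder).transform(const LineSplitter())) {
        final resp = json.decode(line);
        if (!resp['done']) {
          // Partial answer: forward the text fragment.
          yield ChatResponseEntity.fromJson(resp);
        } else {
          // Final object: it also carries the context field.
          yield ChatResponseEntity.fromJson(resp);
        }
      }
    } catch (e) {
      // Surface the failure to listeners as a stream error.
      rethrow;
    }
  }

  @override
  Future<void> abortCurrentRequest() async {
    try {
      // abort() cancels the in-flight request. addError() would throw here
      // because the request sink has already been closed.
      request.abort();
    } catch (e) {
      rethrow;
    }
  }
}
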
  • The getChatResponseFromServer method sends a message to the server and waits for responses. It then converts these responses into objects called ChatResponseEntity.
  • If the server indicates that the conversation isn’t finished (resp['done'] == false), the method yields the response as a partial chunk of the answer.
  • If the conversation is finished (resp['done'] == true), the final response also carries a context field: an encoding of the conversation that can be sent with a later request to remember earlier results.
    Note: managing the context and passing it back in the request for memorisation is not covered in this part; it will be covered in a later part.
  • The abortCurrentRequest method tries to stop a message being sent to the server. If it can't, it reports an error.
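
The ChatResponseEntity class itself is not shown in this article. A minimal sketch of what it might look like follows, assuming it only needs the response, done, and context fields of the JSON objects streamed by /api/generate; the helper decodeChatStream is a hypothetical name added here to show the line-by-line decoding on its own.

```dart
import 'dart:async';
import 'dart:convert';

/// Hypothetical entity for one streamed JSON object from /api/generate.
class ChatResponseEntity {
  const ChatResponseEntity({required this.response, required this.done, this.context});

  /// Builds the entity from a decoded JSON map, tolerating missing fields.
  factory ChatResponseEntity.fromJson(Map<String, dynamic> json) {
    return ChatResponseEntity(
      response: json['response'] as String? ?? '',
      done: json['done'] as bool? ?? false,
      context: (json['context'] as List<dynamic>?)?.cast<int>(),
    );
  }

  final String response; // text fragment of the answer
  final bool done; // true only on the final object
  final List<int>? context; // present on the final object
}

/// Decodes newline-delimited JSON text chunks into entities. LineSplitter
/// buffers partial lines, so a chunk may end in the middle of an object.
Stream<ChatResponseEntity> decodeChatStream(Stream<String> chunks) {
  return chunks
      .transform(const LineSplitter())
      .where((line) => line.trim().isNotEmpty)
      .map((line) => ChatResponseEntity.fromJson(json.decode(line) as Map<String, dynamic>));
}
```

Splitting on lines before decoding matters because the network layer does not guarantee that each chunk contains exactly one complete JSON object.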

Repository : ChatRepository

class ChatRepositoryImpl extends ChatRepository {
  ChatRepositoryImpl(this.ds);
  final ChatDatasource ds;

  @override
  Stream<ChatResponseEntity> getChatResponseStream({required String userInput}) {
    return ds.getChatResponseFromServer(userInput: userInput).asBroadcastStream();
  }

  @override
  Future<void> abortRequest() => ds.abortCurrentRequest();
}

  • It relies on another part called ChatDatasource to talk to the server.
  • When the chat app wants to send a message to the server, it uses getChatResponseStream(). This method tells the ChatDatasource to send the message, and it gets back a stream of responses from the server. Think of it like sending a message and waiting for replies.
  • If, for some reason, the chat app wants to cancel sending a message, it uses abortRequest(). This method tells the ChatDatasource to stop whatever it's doing.
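
The repository wraps the datasource stream with asBroadcastStream(), which allows more than one listener on the same response. The sketch below illustrates that behaviour with a fake datasource that emits plain strings; fakeDatasourceStream and collectTwice are hypothetical names used only for this illustration.

```dart
import 'dart:async';

/// Stand-in for the datasource: emits three text fragments.
Stream<String> fakeDatasourceStream() async* {
  yield 'Hel';
  yield 'lo';
  yield '!';
}

/// Mirrors the repository method: exposes the datasource stream as a
/// broadcast stream so that several listeners can subscribe to it.
Stream<String> getChatResponseStream() => fakeDatasourceStream().asBroadcastStream();

/// Attaches two listeners to the same stream and returns what each collected.
Future<List<String>> collectTwice() async {
  final stream = getChatResponseStream();
  // Both listeners subscribe before the first fragment is delivered.
  final first = stream.join();
  final second = stream.join();
  return [await first, await second];
}
```

Without asBroadcastStream(), the second listener would throw a StateError, because a single-subscription stream accepts only one listener. A listener that subscribes late to a broadcast stream misses the fragments emitted before it subscribed.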

UseCase and Bloc

class GetChatResponseUsecase {
  GetChatResponseUsecase(this.repository);
  final ChatRepository repository;

  Stream<ChatResponseEntity> execute({required String userInput}) =>
      repository.getChatResponseStream(userInput: userInput);

  Future<void> abortRequest() => repository.abortRequest();
}

  • It takes an instance of ChatRepository as a dependency.
  • The execute method triggers the process of getting chat responses from the repository based on user input. It returns a stream of chat responses.
  • The abortRequest method is used to cancel an ongoing request if needed.
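
import 'dart:async';

class ChatCubit extends Cubit<ChatState> {
  ChatCubit(this.usecase) : super(ChatInitial());

  final GetChatResponseUsecase usecase;
  StreamSubscription? _responseSubscription;

  void getChatResponse({required String userInput}) {
    _responseSubscription = usecase.execute(userInput: userInput).listen(
      (response) {
        // The emit calls are missing from the original listing; the state
        // constructors are assumed from how the chat page reads the states.
        if (response.done == false) {
          emit(ChatNewResponse(response));
        } else {
          emit(ChatLoaded(response));
        }
      },
      onError: (error) => emit(ChatError(error.toString())),
    );
  }

  void abortRequest() {
    usecase.abortRequest();
    _responseSubscription?.cancel();
    emit(const ChatError('request aborted try again'));
  }

  @override
  Future<void> close() {
    _responseSubscription?.cancel();
    return super.close();
  }
}
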
  • It relies on a GetChatResponseUsecase for fetching chat responses and managing requests.
  • The getChatResponse method initiates the process of fetching chat responses based on user input. It listens for responses from the use case and updates the state accordingly.
  • The abortRequest method cancels an ongoing request if needed, updating the state accordingly.
  • The close method cancels the subscription to response streams when the cubit is closed.
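
The chat_state.dart file is not shown in this article. The following is a rough sketch of the states the cubit and the chat page refer to, assuming ChatNewResponse and ChatLoaded expose an entity field and ChatError exposes an error field, as the chat page reads them. The real states may extend Equatable, since equatable is in the dependency list; plain classes are used here, together with a reduced stand-in for the entity, so the block stands alone. The helper stateFor is a hypothetical name.

```dart
/// Reduced stand-in for the response entity (assumed fields only).
class ChatResponseEntity {
  const ChatResponseEntity({required this.response, required this.done});
  final String response;
  final bool done;
}

abstract class ChatState {
  const ChatState();
}

/// State before any prompt has been sent.
class ChatInitial extends ChatState {
  const ChatInitial();
}

/// A partial chunk of the answer has arrived.
class ChatNewResponse extends ChatState {
  const ChatNewResponse(this.entity);
  final ChatResponseEntity entity;
}

/// The final object of the answer has arrived.
class ChatLoaded extends ChatState {
  const ChatLoaded(this.entity);
  final ChatResponseEntity entity;
}

/// The request failed or was aborted.
class ChatError extends ChatState {
  const ChatError(this.error);
  final String error;
}

/// Maps one streamed entity to the state the cubit would emit for it.
ChatState stateFor(ChatResponseEntity entity) =>
    entity.done ? ChatLoaded(entity) : ChatNewResponse(entity);
```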

UI : Chat Page

BlocConsumer<ChatCubit, ChatState>(
  listener: (context, state) {
    if (state is ChatNewResponse) {
      // Append the streamed fragment to the reply being built.
      messages[messages.length - 1] += state.entity.response;
      if (!_isUserDragging) _scrollDown();
    } else if (state is ChatLoaded) {
      messages[messages.length - 1] += state.entity.response;
    } else if (state is ChatError) {
      messages[messages.length - 1] += state.error;
    } else {
      // Remaining states: update the loading indicator
      // (this branch's body is missing from the original listing).
    }
  },
  builder: (context, state) => ListView.builder(
    controller: scrollController,
    physics: const AlwaysScrollableScrollPhysics(),
    itemCount: messages.length,
    // The last two messages (latest prompt and reply) are black, older ones grey.
    itemBuilder: (context, index) =>
        index == messages.length - 1 || index == messages.length - 2
            ? ListTile(
                title: Text(messages[index], style: const TextStyle(color: Colors.black)),
              )
            : ListTile(
                title: Text(messages[index], style: const TextStyle(color: Colors.grey)),
              ),
  ),
)

The listener function updates the UI based on different states:

  • If a new response is received (ChatNewResponse), it appends the response to the last message and updates the UI.
  • If the chat is loaded (ChatLoaded), it appends the final response to the last message and updates the UI.
  • If there’s an error (ChatError), it adds the error message to the last message and updates the UI.
  • It also handles cases where there are no changes in state by updating the loading indicator.

The builder function builds the UI based on the current state:

  • It displays the chat messages in a ListView, where the last two messages have black text and the rest have grey text.
  • It ensures the list automatically scrolls to the bottom if the user isn’t manually scrolling.
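
The listener's bookkeeping on the messages list can be sketched as two pure functions. This is an illustration under one assumption: before each request, the page adds the user's prompt and an empty placeholder for the reply, which would explain why the last two entries are highlighted. Both function names are hypothetical.

```dart
/// Starts a new exchange: the user's prompt followed by an empty
/// placeholder that the streamed reply is appended to.
List<String> startExchange(List<String> messages, String prompt) =>
    [...messages, prompt, ''];

/// Returns a new list in which [chunk] is appended to the last message.
List<String> appendToLastMessage(List<String> messages, String chunk) {
  if (messages.isEmpty) return [chunk];
  return [
    ...messages.sublist(0, messages.length - 1),
    messages.last + chunk,
  ];
}
```

Returning new lists instead of mutating messages in place keeps the logic easy to unit test outside the widget tree.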

Demo App

Thank You :

A big thank you for reading our blog! We hope you found it helpful.
Keep coding, and remember that learning never stops!

Bonus : Simplified Version of Dart & Ollama Using Dart CLI

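import 'dart:convert';
import 'dart:io';

var client = HttpClient();

Future<void> apiCall() async {
  try {
    HttpClientRequest request = await client.post('localhost', 11434, '/api/generate');
    Map<String, dynamic> jsonMap = {"model": "llama2", "prompt": "hi"};
    String jsonString = json.encode(jsonMap);
    List<int> bodyBytes = utf8.encode(jsonString);
    request.headers.contentType = ContentType.json;
    request.add(bodyBytes);
    HttpClientResponse response = await request.close();
    final responseMessage = [];
    final context = [];

    // Decode one JSON object per line; asFuture() completes when the stream
    // ends, so the client is not closed before the whole answer has arrived.
    await response.transform(utf8.decoder).transform(const LineSplitter()).listen((event) {
      final resp = json.decode(event);
      if (resp['done'] == false) {
        // Partial answer: collect the text fragment.
        responseMessage.add(resp['response']);
      } else {
        // Final object: keep the context for a follow-up request.
        context.addAll(resp['context'] ?? const []);
      }
    }).asFuture<void>();

    // Printing the joined answer is an assumption; the original listing does
    // not show what happens with the collected fragments.
    print(responseMessage.join());
  } finally {
    client.close();
  }
}

void main() => apiCall();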
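
The context collected from the final object can be sent back with the next prompt to give the model a short conversational memory. The sketch below only builds the request body; it assumes the Ollama version in use still accepts a context field on /api/generate, and buildGenerateBody is a hypothetical helper name.

```dart
import 'dart:convert';

/// Builds the JSON body for /api/generate. The context field is included
/// only when a previous response provided one.
String buildGenerateBody({
  required String model,
  required String prompt,
  List<int> context = const [],
}) {
  final body = <String, dynamic>{'model': model, 'prompt': prompt};
  if (context.isNotEmpty) {
    body['context'] = context;
  }
  return json.encode(body);
}
```

The returned string can be passed through utf8.encode and written to the request in place of the hard-coded map.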



Written by Lavkant Kachhwaha

Flutter Enthusiast & Engineering @ CoinDCX
