Integrating Ollama’s APIs with Flutter: Building a Local ChatGPT Flutter App
LLM Conversations: Ollama + Flutter for Affordable ChatGPT Apps
In this tutorial, we’ll explore how to harness Ollama’s local APIs within a Flutter app to create a conversational AI experience similar to ChatGPT.

Prerequisites:
Before we dive in, make sure you have:
- Basic understanding of Flutter development
- Basic understanding of Ollama and LLMs
- Basic knowledge of RESTful APIs
Setting Up Ollama:

- Download the Ollama application for your operating system (Mac, Windows, or Linux) from the official website.
- Install the downloaded Ollama application by following the on-screen instructions.
- Once installed, the CLI tools necessary for local development will be automatically installed alongside the Ollama application.
- You’re now ready to start using Ollama locally for your development needs!
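The examples later in this post use the llama2 model, so pull it first with the Ollama CLI:

ollama pull llama2

Then confirm the local server is reachable: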

curl --location 'http://localhost:11434/'
After running this curl command, you should receive the message "Ollama is running".
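You can also test the generate endpoint that the Flutter app will call, using the same JSON body the Dart code sends later:

curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": "hi"}'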
Setting Up the Flutter Project for Ollama Integration:
Create a New Flutter Project:
- Open your terminal or command prompt.
- Run flutter create <project_name> to create a new Flutter project.
- Navigate into the project directory with cd <project_name>.
Add Dependencies:
- Open the pubspec.yaml file in your Flutter project.
- Add the dependencies for clean architecture, GetIt, auto_route, and flutter_bloc:

dependencies:
  auto_route: ^7.8.4
  dio: ^5.4.1
  equatable: ^2.0.5
  flutter_bloc: ^8.1.4
  get_it: ^7.6.7
  injectable: ^2.3.2

dev_dependencies:
  auto_route_generator: ^7.3.2
  build_runner: ^2.4.8
  injectable_generator: ^2.4.1
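After adding these, fetch the packages and, once the annotated classes below are in place, run the code generators (standard build_runner usage):

flutter pub get
dart run build_runner build --delete-conflicting-outputs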
We'll be following Clean Architecture for the app; the lib folder is structured like this:
|____data
| |____datasource
| | |____remote_chat_datasource.dart
| |____repository
| | |____chat_repository.dart
|____domain
| |____repository
| | |____chat_repository.dart
| |____entity
| | |____chat_response_entity.dart
| |____usecase
| | |____get_chat_response_usecase.dart
|____presentation
| |____cubit
| | |____chat_cubit.dart
| | |____chat_state.dart
| |____page
| | |____chat_page.dart
We'll be using the injectable generator for dependency injection and for registering modules, as sketched below.
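A minimal sketch of that setup, assuming injectable's default generated file name (injection.config.dart) and a module that provides the HttpClient our datasource needs:

import 'dart:io';

import 'package:get_it/get_it.dart';
import 'package:injectable/injectable.dart';

import 'injection.config.dart'; // generated by injectable_generator

final getIt = GetIt.instance;

@InjectableInit()
void configureDependencies() => getIt.init();

// Third-party types that can't be annotated directly are provided via a module.
@module
abstract class RegisterModule {
  HttpClient get httpClient => HttpClient();
}

Call configureDependencies() in main() before runApp().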


We'll be using the AutoRoute generator for navigation, as sketched below.
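A minimal sketch for auto_route 7.x (the ChatRoute class is generated from a @RoutePage() annotation on ChatPage; the names here are assumptions, not code from the original project):

import 'package:auto_route/auto_route.dart';

import 'app_router.gr.dart'; // generated by auto_route_generator

@AutoRouterConfig()
class AppRouter extends $AppRouter {
  @override
  List<AutoRoute> get routes => [
        AutoRoute(page: ChatRoute.page, initial: true),
      ];
}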

Datasource: ChatDatasource
import 'dart:async';
import 'dart:convert';
import 'dart:io';

import 'package:flutter/foundation.dart';

// Assumed values, matching the CLI example at the end of this post.
const baseUrl = 'localhost';
const basePort = 11434;
const basePath = '/api/generate';
const baseModel = 'llama2';

abstract class ChatDatasource {
  Stream<ChatResponseEntity> getChatResponseFromServer({required String userInput});
  Future<void> abortCurrentRequest();
}

class RemoteChatDatasource extends ChatDatasource {
  final HttpClient _client;
  RemoteChatDatasource(this._client);

  late HttpClientRequest request;
  // Accumulators for the streamed text chunks and the final context embeddings.
  final responseMessage = <String>[];
  final context = <dynamic>[];

  @override
  Stream<ChatResponseEntity> getChatResponseFromServer({required String userInput}) async* {
    try {
      request = await _client.post(baseUrl, basePort, basePath);
      // Ollama's /api/generate expects a JSON body with the model and prompt.
      Map<String, dynamic> jsonMap = {"model": baseModel, "prompt": userInput};
      request.add(utf8.encode(json.encode(jsonMap)));
      HttpClientResponse response = await request.close();
      // The endpoint streams newline-delimited JSON, one object per line.
      await for (final line in response.transform(utf8.decoder).transform(const LineSplitter())) {
        if (line.isEmpty) continue;
        final resp = json.decode(line);
        if (resp['done'] == false) {
          yield ChatResponseEntity.fromJson(resp);
          responseMessage.add(resp['response'].toString());
        } else {
          yield ChatResponseEntity.fromJson(resp);
          // The final chunk carries the context embeddings for follow-ups.
          context.add(resp['context']);
        }
      }
    } catch (e) {
      debugPrint(e.toString());
      rethrow;
    }
  }

  @override
  Future<void> abortCurrentRequest() async {
    try {
      // abort() completes the in-flight request with the given error.
      request.abort('request aborted');
    } catch (e) {
      debugPrint(e.toString());
    }
  }
}
- The getChatResponseFromServer method sends a message to the server and converts each streamed response into a ChatResponseEntity object.
- If the server indicates the response isn't finished (resp['done'] == false), it yields the entity and appends the chunk to a list of messages.
- If the response is finished (resp['done'] == true), it stores the response's context; these are context embeddings that can be used to remember earlier results.

Note: managing context and passing it back with the request for memory is not covered in this part; we'll cover it in a later part.

- The abortCurrentRequest method tries to stop an in-flight request. If it can't, it logs the error.
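The code above calls ChatResponseEntity.fromJson, which isn't shown in this post; here's a sketch of the entity, with fields inferred from Ollama's /api/generate stream (response, done, and the final chunk's context):

import 'package:equatable/equatable.dart';

class ChatResponseEntity extends Equatable {
  final String response;    // the streamed text chunk
  final bool done;          // true on the final chunk
  final List<int>? context; // context embeddings, present only when done

  const ChatResponseEntity({required this.response, required this.done, this.context});

  factory ChatResponseEntity.fromJson(Map<String, dynamic> json) => ChatResponseEntity(
        response: json['response']?.toString() ?? '',
        done: json['done'] as bool? ?? false,
        context: (json['context'] as List?)?.cast<int>(),
      );

  @override
  List<Object?> get props => [response, done, context];
}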
Repository: ChatRepository
class ChatRepositoryImpl extends ChatRepository {
  final ChatDatasource ds;
  ChatRepositoryImpl(this.ds);

  @override
  Stream<ChatResponseEntity> getChatResponseStream({required String userInput}) {
    // Broadcast so more than one listener can subscribe to the same response.
    return ds.getChatResponseFromServer(userInput: userInput).asBroadcastStream();
  }

  @override
  Future<void> abortRequest() async {
    return ds.abortCurrentRequest();
  }
}
- It relies on the ChatDatasource to talk to the server.
- When the app wants to send a message, it calls getChatResponseStream(). This method asks the ChatDatasource to send the message and returns a stream of responses from the server; think of it as sending a message and waiting for the replies.
- If the app needs to cancel a message in flight, it calls abortRequest(), which tells the ChatDatasource to stop whatever it's doing.
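For reference, the domain-layer contract this class implements (inferred from the overrides above) looks like this:

abstract class ChatRepository {
  Stream<ChatResponseEntity> getChatResponseStream({required String userInput});
  Future<void> abortRequest();
}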
UseCase and Bloc
class GetChatResponseUsecase {
  final ChatRepository repository;
  GetChatResponseUsecase(this.repository);

  Stream<ChatResponseEntity> execute({required String userInput}) =>
      repository.getChatResponseStream(userInput: userInput);

  void abortRequest() => repository.abortRequest();
}
- It takes an instance of ChatRepository as a dependency.
- The execute method triggers fetching chat responses from the repository for the given user input and returns a stream of chat responses.
- The abortRequest method cancels an ongoing request if needed.
@injectable
class ChatCubit extends Cubit<ChatState> {
  final GetChatResponseUsecase usecase;
  StreamSubscription? _responseSubscription;

  ChatCubit(this.usecase) : super(ChatInitial());

  void getChatResponse({required String userInput}) {
    // Cancel any in-flight stream before starting a new request.
    _responseSubscription?.cancel();
    _responseSubscription = usecase.execute(userInput: userInput).listen(
      (response) {
        if (response.done == false) {
          emit(ChatLoading());
          emit(ChatNewResponse(response));
        } else {
          emit(ChatLoaded(response));
        }
      },
      onError: (error) {
        emit(ChatError(error.toString()));
      },
    );
  }

  void abortRequest() {
    emit(ChatLoading());
    usecase.abortRequest();
    emit(const ChatError('request aborted, try again'));
  }

  @override
  Future<void> close() {
    _responseSubscription?.cancel();
    return super.close();
  }
}
- It relies on a GetChatResponseUsecase for fetching chat responses and managing requests.
- The getChatResponse method starts fetching responses for the user's input; it listens to the stream from the use case and updates the state accordingly.
- The abortRequest method cancels an ongoing request, updating the state accordingly.
- The close method cancels the subscription to the response stream when the cubit is closed.
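The states the cubit emits live in chat_state.dart, which isn't shown in this post; here's a sketch reconstructed from how the cubit and the UI use them:

abstract class ChatState extends Equatable {
  const ChatState();
  @override
  List<Object?> get props => [];
}

class ChatInitial extends ChatState {}

class ChatLoading extends ChatState {}

class ChatNewResponse extends ChatState {
  final ChatResponseEntity entity;
  const ChatNewResponse(this.entity);
  @override
  List<Object?> get props => [entity];
}

class ChatLoaded extends ChatState {
  final ChatResponseEntity entity;
  const ChatLoaded(this.entity);
  @override
  List<Object?> get props => [entity];
}

class ChatError extends ChatState {
  final String error;
  const ChatError(this.error);
  @override
  List<Object?> get props => [error];
}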
UI: Chat Page
BlocConsumer<ChatCubit, ChatState>(
  listener: (context, state) {
    if (state is ChatNewResponse) {
      // Append the streamed chunk to the message currently being built.
      messages[messages.length - 1] += state.entity.response;
      messagesController.add(messages);
      loadingController.add(true);
      if (!_isUserDragging) _scrollDown();
    } else if (state is ChatLoaded) {
      messages[messages.length - 1] += state.entity.response;
      messagesController.add(messages);
      loadingController.add(false);
    } else if (state is ChatError) {
      messages[messages.length - 1] += state.error;
      messagesController.add(messages);
      loadingController.add(false);
    } else {
      loadingController.add(false);
    }
  },
  builder: (context, state) => ListView.builder(
    controller: scrollController,
    physics: const AlwaysScrollableScrollPhysics(),
    itemCount: messages.length,
    itemBuilder: (context, index) =>
        index == messages.length - 1 || index == messages.length - 2
            ? ListTile(
                title: Text(messages[index], style: const TextStyle(color: Colors.black)),
              )
            : ListTile(
                title: Text(messages[index], style: const TextStyle(color: Colors.grey)),
              ),
  ),
)
The listener function updates the UI based on the state:
- When a new response chunk arrives (ChatNewResponse), it appends the chunk to the last message, updates the UI, and keeps the loading indicator on.
- When the stream finishes (ChatLoaded), it appends the final chunk and turns the loading indicator off.
- On an error (ChatError), it appends the error message to the last message and updates the UI.
- Any other state simply turns the loading indicator off.

The builder function renders the chat:
- It displays the messages in a ListView, where the latest two messages (the prompt and the in-progress reply) are black and older messages are grey.
- The list automatically scrolls to the bottom (via _scrollDown in the listener) unless the user is dragging it manually.
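One detail this snippet assumes: before a request starts, the page must append the user's message plus an empty placeholder string that the listener appends streamed chunks to. A sketch of that wiring (the _sendMessage helper is an assumption, not code from the original project):

// Hypothetical helper on the chat page's state class: seeds the message list
// and kicks off the streamed request via the cubit.
void _sendMessage(BuildContext context, String userInput) {
  messages.add(userInput); // the user's own message, shown immediately
  messages.add('');        // empty placeholder the listener appends chunks to
  messagesController.add(messages);
  context.read<ChatCubit>().getChatResponse(userInput: userInput);
}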
Demo App



Thank You:
A big thank you for reading our blog! We hope you found it helpful.
Keep coding, and remember that learning never stops!
https://www.linkedin.com/in/lavkant-kachhawaha-075a90b4/
References:
- Ollama Documentation: https://github.com/ollama/ollama
- Clean Architecture: https://resocoder.com/flutter-clean-architecture-tdd/
- GetIt/Injectable Packages: https://pub.dev/packages/injectable, https://pub.dev/packages/get_it
- AutoRoute Package: https://pub.dev/packages/auto_route
- Flutter Bloc Package: https://pub.dev/packages/flutter_bloc
Code Reference:

Bonus: A Simplified Dart + Ollama Example Using the Dart CLI
import 'dart:convert';
import 'dart:io';

final client = HttpClient();

Future<void> apiCall() async {
  try {
    // POST to Ollama's local generate endpoint.
    HttpClientRequest request = await client.post('localhost', 11434, '/api/generate');
    Map<String, dynamic> jsonMap = {"model": "llama2", "prompt": "hi"};
    request.add(utf8.encode(json.encode(jsonMap)));
    HttpClientResponse response = await request.close();

    final responseMessage = <String>[];
    final context = [];
    // The response is a stream of newline-delimited JSON objects.
    await response
        .transform(utf8.decoder)
        .transform(const LineSplitter())
        .listen((line) {
      if (line.isEmpty) return;
      final resp = json.decode(line);
      if (resp['done'] == false) {
        responseMessage.add(resp['response'].toString());
      } else {
        // The final object carries the context embeddings.
        context.add(resp['context']);
      }
    }).asFuture();
    print(responseMessage.join(''));
  } finally {
    client.close();
  }
}

Future<void> main() async {
  await apiCall();
}
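Save this as main.dart and run it with dart run main.dart while Ollama is running locally; once the stream completes, the model's full reply should be printed to the console.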