Integrating Ollama’s APIs with Flutter: Building a Local ChatGPT Flutter App
LLM Conversations: Ollama + Flutter for Affordable ChatGPT Apps
In this tutorial, we’ll explore how to harness Ollama’s local APIs within a Flutter app to create a conversational AI experience similar to ChatGPT.

Prerequisites:
Before we dive in, make sure you have:
- Basic understanding of Flutter development
- Basic understanding of Ollama and LLMs
- Basic knowledge of RESTful APIs
Setting Up Ollama:

- Download the Ollama application for your operating system (Mac, Windows, or Linux) from the official website.
- Install the downloaded Ollama application by following the on-screen instructions.
- Once installed, the CLI tools necessary for local development will be automatically installed alongside the Ollama application.
- You’re now ready to start using Ollama locally for your development needs!
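The examples later in this post use the llama2 model, so pull it first with the Ollama CLI:

ollama pull llama2

Then confirm the local server is reachable: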

curl --location 'http://localhost:11434/'
After running this curl command, you should receive the message "Ollama is running".
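You can also test the generate endpoint that the Flutter app will call, using the same JSON body the Dart code sends later:

curl http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": "hi"}'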
Setting Up the Flutter Project for Ollama Integration:
Create a New Flutter Project:
- Open your terminal or command prompt.
- Run flutter create <project_name> to create a new Flutter project.
- Navigate into the project directory with cd <project_name>.
Add Dependencies:
- Open the pubspec.yaml file in your Flutter project.
- Add the dependencies for clean architecture, GetIt, auto_route, and flutter_bloc:

dependencies:
  auto_route: ^7.8.4
  dio: ^5.4.1
  equatable: ^2.0.5
  flutter_bloc: ^8.1.4
  get_it: ^7.6.7
  injectable: ^2.3.2

dev_dependencies:
  auto_route_generator: ^7.3.2
  build_runner: ^2.4.8
  injectable_generator: ^2.4.1
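After adding these, fetch the packages and, once the annotated classes below are in place, run the code generators (standard build_runner usage):

flutter pub get
dart run build_runner build --delete-conflicting-outputs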
We'll be following Clean Architecture for the app; the lib folder is structured like this:
|____data
| |____datasource
| | |____remote_chat_datasource.dart
| |____repository
| | |____chat_repository.dart
|____domain
| |____repository
| | |____chat_repository.dart
| |____entity
| | |____chat_response_entity.dart
| |____usecase
| | |____get_chat_response_usecase.dart
|____presentation
| |____cubit
| | |____chat_cubit.dart
| | |____chat_state.dart
| |____page
| | |____chat_page.dart
We'll be using the injectable generator for dependency injection and for registering modules, as sketched below.
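A minimal sketch of that setup, assuming injectable's default generated file name (injection.config.dart) and a module that provides the HttpClient our datasource needs:

import 'dart:io';

import 'package:get_it/get_it.dart';
import 'package:injectable/injectable.dart';

import 'injection.config.dart'; // generated by injectable_generator

final getIt = GetIt.instance;

@InjectableInit()
void configureDependencies() => getIt.init();

// Third-party types that can't be annotated directly are provided via a module.
@module
abstract class RegisterModule {
  HttpClient get httpClient => HttpClient();
}

Call configureDependencies() in main() before runApp().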


We'll be using the AutoRoute generator for navigation, as sketched below.
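A minimal sketch for auto_route 7.x (the ChatRoute class is generated from a @RoutePage() annotation on ChatPage; the names here are assumptions, not code from the original project):

import 'package:auto_route/auto_route.dart';

import 'app_router.gr.dart'; // generated by auto_route_generator

@AutoRouterConfig()
class AppRouter extends $AppRouter {
  @override
  List<AutoRoute> get routes => [
        AutoRoute(page: ChatRoute.page, initial: true),
      ];
}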

Datasource: ChatDatasource
import 'dart:async';
import 'dart:convert';
import 'dart:io';

import 'package:flutter/foundation.dart';

// Assumed values, matching the CLI example at the end of this post.
const baseUrl = 'localhost';
const basePort = 11434;
const basePath = '/api/generate';
const baseModel = 'llama2';

abstract class ChatDatasource {
  Stream<ChatResponseEntity> getChatResponseFromServer({required String userInput});
  Future<void> abortCurrentRequest();
}

class RemoteChatDatasource extends ChatDatasource {
  final HttpClient _client;
  RemoteChatDatasource(this._client);

  late HttpClientRequest request;
  // Accumulators for the streamed text chunks and the final context embeddings.
  final responseMessage = <String>[];
  final context = <dynamic>[];

  @override
  Stream<ChatResponseEntity> getChatResponseFromServer({required String userInput}) async* {
    try {
      request = await _client.post(baseUrl, basePort, basePath);
      // Ollama's /api/generate expects a JSON body with the model and prompt.
      Map<String, dynamic> jsonMap = {"model": baseModel, "prompt": userInput};
      request.add(utf8.encode(json.encode(jsonMap)));
      HttpClientResponse response = await request.close();
      // The endpoint streams newline-delimited JSON, one object per line.
      await for (final line in response.transform(utf8.decoder).transform(const LineSplitter())) {
        if (line.isEmpty) continue;
        final resp = json.decode(line);
        if (resp['done'] == false) {
          yield ChatResponseEntity.fromJson(resp);
          responseMessage.add(resp['response'].toString());
        } else {
          yield ChatResponseEntity.fromJson(resp);
          // The final chunk carries the context embeddings for follow-ups.
          context.add(resp['context']);
        }
      }
    } catch (e) {
      debugPrint(e.toString());
      rethrow;
    }
  }

  @override
  Future<void> abortCurrentRequest() async {
    try {
      // abort() completes the in-flight request with the given error.
      request.abort('request aborted');
    } catch (e) {
      debugPrint(e.toString());
    }
  }
}
- The getChatResponseFromServer method sends a message to the server and converts each streamed response into a ChatResponseEntity object.
- If the server indicates the response isn't finished (resp['done'] == false), it yields the entity and appends the chunk to a list of messages.
- If the response is finished (resp['done'] == true), it stores the response's context; these are context embeddings that can be used to remember earlier results.

Note: managing context and passing it back with the request for memory is not covered in this part; we'll cover it in a later part.

- The abortCurrentRequest method tries to stop an in-flight request. If it can't, it logs the error.
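The code above calls ChatResponseEntity.fromJson, which isn't shown in this post; here's a sketch of the entity, with fields inferred from Ollama's /api/generate stream (response, done, and the final chunk's context):

import 'package:equatable/equatable.dart';

class ChatResponseEntity extends Equatable {
  final String response;    // the streamed text chunk
  final bool done;          // true on the final chunk
  final List<int>? context; // context embeddings, present only when done

  const ChatResponseEntity({required this.response, required this.done, this.context});

  factory ChatResponseEntity.fromJson(Map<String, dynamic> json) => ChatResponseEntity(
        response: json['response']?.toString() ?? '',
        done: json['done'] as bool? ?? false,
        context: (json['context'] as List?)?.cast<int>(),
      );

  @override
  List<Object?> get props => [response, done, context];
}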
Repository: ChatRepository
class ChatRepositoryImpl extends ChatRepository {
  final ChatDatasource ds;
  ChatRepositoryImpl(this.ds);

  @override
  Stream<ChatResponseEntity> getChatResponseStream({required String userInput}) {
    // Broadcast so more than one listener can subscribe to the same response.
    return ds.getChatResponseFromServer(userInput: userInput).asBroadcastStream();
  }

  @override
  Future<void> abortRequest() async {
    return ds.abortCurrentRequest();
  }
}
- It relies on the ChatDatasource to talk to the server.
- When the app wants to send a message, it calls getChatResponseStream(). This method asks the ChatDatasource to send the message and returns a stream of responses from the server; think of it as sending a message and waiting for the replies.
- If the app needs to cancel a message in flight, it calls abortRequest(), which tells the ChatDatasource to stop whatever it's doing.
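For reference, the domain-layer contract this class implements (inferred from the overrides above) looks like this:

abstract class ChatRepository {
  Stream<ChatResponseEntity> getChatResponseStream({required String userInput});
  Future<void> abortRequest();
}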
UseCase and Bloc
class GetChatResponseUsecase {
  final ChatRepository repository;
  GetChatResponseUsecase(this.repository);

  Stream<ChatResponseEntity> execute({required String userInput}) =>
      repository.getChatResponseStream(userInput: userInput);

  void abortRequest() => repository.abortRequest();
}
- It takes an instance of ChatRepository as a dependency.
- The execute method triggers fetching chat responses from the repository for the given user input and returns a stream of chat responses.
- The abortRequest method cancels an ongoing request if needed.
@injectable
class ChatCubit extends Cubit<ChatState> {
  final GetChatResponseUsecase usecase;
  StreamSubscription? _responseSubscription;

  ChatCubit(this.usecase) : super(ChatInitial());

  void getChatResponse({required String userInput}) {
    // Cancel any in-flight stream before starting a new request.
    _responseSubscription?.cancel();
    _responseSubscription = usecase.execute(userInput: userInput).listen(
      (response) {
        if (response.done == false) {
          emit(ChatLoading());
          emit(ChatNewResponse(response));
        } else {
          emit(ChatLoaded(response));
        }
      },
      onError: (error) {
        emit(ChatError(error.toString()));
      },
    );
  }

  void abortRequest() {
    emit(ChatLoading());
    usecase.abortRequest();
    emit(const ChatError('request aborted, try again'));
  }

  @override
  Future<void> close() {
    _responseSubscription?.cancel();
    return super.close();
  }
}
- It relies on a GetChatResponseUsecase for fetching chat responses and managing requests.
- The getChatResponse method starts fetching responses for the user's input; it listens to the stream from the use case and updates the state accordingly.
- The abortRequest method cancels an ongoing request, updating the state accordingly.
- The close method cancels the subscription to the response stream when the cubit is closed.
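The states the cubit emits live in chat_state.dart, which isn't shown in this post; here's a sketch reconstructed from how the cubit and the UI use them:

abstract class ChatState extends Equatable {
  const ChatState();
  @override
  List<Object?> get props => [];
}

class ChatInitial extends ChatState {}

class ChatLoading extends ChatState {}

class ChatNewResponse extends ChatState {
  final ChatResponseEntity entity;
  const ChatNewResponse(this.entity);
  @override
  List<Object?> get props => [entity];
}

class ChatLoaded extends ChatState {
  final ChatResponseEntity entity;
  const ChatLoaded(this.entity);
  @override
  List<Object?> get props => [entity];
}

class ChatError extends ChatState {
  final String error;
  const ChatError(this.error);
  @override
  List<Object?> get props => [error];
}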
UI: Chat Page
BlocConsumer<ChatCubit, ChatState>(
  listener: (context, state) {
    if (state is ChatNewResponse) {
      // Append the streamed chunk to the message currently being built.
      messages[messages.length - 1] += state.entity.response;
      messagesController.add(messages);
      loadingController.add(true);
      if (!_isUserDragging) _scrollDown();
    } else if (state is ChatLoaded) {
      messages[messages.length - 1] += state.entity.response;
      messagesController.add(messages);
      loadingController.add(false);
    } else if (state is ChatError) {
      messages[messages.length - 1] += state.error;
      messagesController.add(messages);
      loadingController.add(false);
    } else {
      loadingController.add(false);
    }
  },
  builder: (context, state) => ListView.builder(
    controller: scrollController,
    physics: const AlwaysScrollableScrollPhysics(),
    itemCount: messages.length,
    itemBuilder: (context, index) =>
        index == messages.length - 1 || index == messages.length - 2
            ? ListTile(
                title: Text(messages[index], style: const TextStyle(color: Colors.black)),
              )
            : ListTile(
                title: Text(messages[index], style: const TextStyle(color: Colors.grey)),
              ),
  ),
)
The listener function updates the UI based on the state:
- When a new response chunk arrives (ChatNewResponse), it appends the chunk to the last message, updates the UI, and keeps the loading indicator on.
- When the stream finishes (ChatLoaded), it appends the final chunk and turns the loading indicator off.
- On an error (ChatError), it appends the error message to the last message and updates the UI.
- Any other state simply turns the loading indicator off.

The builder function renders the chat:
- It displays the messages in a ListView, where the latest two messages (the prompt and the in-progress reply) are black and older messages are grey.
- The list automatically scrolls to the bottom (via _scrollDown in the listener) unless the user is dragging it manually.
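One detail this snippet assumes: before a request starts, the page must append the user's message plus an empty placeholder string that the listener appends streamed chunks to. A sketch of that wiring (the _sendMessage helper is an assumption, not code from the original project):

// Hypothetical helper on the chat page's state class: seeds the message list
// and kicks off the streamed request via the cubit.
void _sendMessage(BuildContext context, String userInput) {
  messages.add(userInput); // the user's own message, shown immediately
  messages.add('');        // empty placeholder the listener appends chunks to
  messagesController.add(messages);
  context.read<ChatCubit>().getChatResponse(userInput: userInput);
}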
Demo App



Thank You:
A big thank you for reading our blog! We hope you found it helpful.
Keep coding, and remember that learning never stops!
https://www.linkedin.com/in/lavkant-kachhawaha-075a90b4/
References:
- Ollama Documentation: https://github.com/ollama/ollama
- Clean Architecture: https://resocoder.com/flutter-clean-architecture-tdd/
- GetIt/Injectable Packages: https://pub.dev/packages/injectable, https://pub.dev/packages/get_it
- AutoRoute Package: https://pub.dev/packages/auto_route
- Flutter Bloc Package: https://pub.dev/packages/flutter_bloc
Code Reference:

Bonus: A Simplified Dart + Ollama Example Using the Dart CLI
import 'dart:convert';
import 'dart:io';

final client = HttpClient();

Future<void> apiCall() async {
  try {
    // POST to Ollama's local generate endpoint.
    HttpClientRequest request = await client.post('localhost', 11434, '/api/generate');
    Map<String, dynamic> jsonMap = {"model": "llama2", "prompt": "hi"};
    request.add(utf8.encode(json.encode(jsonMap)));
    HttpClientResponse response = await request.close();

    final responseMessage = <String>[];
    final context = [];
    // The response is a stream of newline-delimited JSON objects.
    await response
        .transform(utf8.decoder)
        .transform(const LineSplitter())
        .listen((line) {
      if (line.isEmpty) return;
      final resp = json.decode(line);
      if (resp['done'] == false) {
        responseMessage.add(resp['response'].toString());
      } else {
        // The final object carries the context embeddings.
        context.add(resp['context']);
      }
    }).asFuture();
    print(responseMessage.join(''));
  } finally {
    client.close();
  }
}

Future<void> main() async {
  await apiCall();
}
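Save this as main.dart and run it with dart run main.dart while Ollama is running locally; once the stream completes, the model's full reply should be printed to the console.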