Demonstrates Voice Recognition, Text to Speech, Language Translation, OAuth2, Image Generation, Face Detection and Voice Chatbot.
Source code and Documentation for my ADUG Symposium Talk presented on the 28th of April 2023.
I have since added to the code, and continue to enhance it, to further demonstrate the capabilities of AI, including features that weren't available at the time.
The goal of this project is to enable Delphi users to use AI technology in their applications. There are many different types of AI and thousands of different models. This project aims to create generalized interfaces to the different types of AI models and make them easily accessible.
Artificial intelligence (AI) is an interdisciplinary field that combines computer science, mathematics, and cognitive psychology to create intelligent systems capable of performing complex tasks. Its rapid advancements have led to a wide array of applications demonstrating AI's versatility.
Language translation is one such application: AI-powered tools translate efficiently between languages, simplifying tasks such as translating software programs for global audiences. AI also handles human-like conversation, with interactive applications that understand and respond to human language naturally. Voice recognition and real-time speech-to-text convert spoken audio into text and enable voice-based interaction, making AI-driven applications more accessible and user-friendly.
In creative and artistic domains, AI can generate images based on textual descriptions, showcasing its capacity to understand and produce visual content. AI's computer vision capabilities enable it to accurately recognize faces and other objects in photographs and documents, illustrating its potential in visual recognition tasks and diverse applications like security and automation.
AI's ability to analyze and process data and to generate comprehensive reports makes it valuable in many domains. AI-powered tools can also transcribe audio files into written text, making transcription tasks more efficient and accurate.
The example programs below are an attempt to demonstrate the capabilities available to Delphi programmers today. I have worked on creating generic APIs so that different providers can be swapped in or out.
| Feature | GPT-4o | Azure OpenAI Service | Groq | xAI's Grok | Anthropic's Claude | Google's Gemini | Mistral |
|---|---|---|---|---|---|---|---|
| Vision Support | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Function Calling | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Image Generation | ✔ % | | | | | | |
| Audio Output | | | | | | | |
| Structured Outputs | | | | | | | |
*Feature not currently supported/implemented
% Supported via separate Image Generation API.
Create an issue and I will respond to it.
whisper.cpp Voice Recognition
RDOpenAI Delphi implementation of ChatGPT - an event based component
ChatGPT OpenAI ChatGPT
DelphiOpenAI a Delphi Library for OpenAI
ChatGPTPluginForLazarus An OpenAI (ChatGPT) plug-in for Lazarus IDE.
ChatGPT a Firemonkey ChatGPT interface written in Delphi.
AI-Playground-DesktopClient A Firemonkey language model playground to access language models like StableLM, ChatGPT, and more.
AI-Code-Translator Use GPT to translate between programming languages
TOpenALPR Open Source Number Plate recognition
PgVector PgVector allows storing and querying of Vectors/Embeddings in an SQL database
CommonVoice Public dataset of recordings for Voice Recognition