Virtually all organizations store and process massive amounts of data in varied formats. Tools, standards, and methodologies for putting this data to use have been under constant development in recent years. The promise of "data-driven decisions," "data products," or a "data economy" is still alive and still drives many to find the solution that best fits a given business. The data (r)evolution is on, and a technique has recently emerged that was not seen before: generative AI.
The introduction of Large Language Models (LLMs) opens up new methods for managing access to, and making use of, all available data. In the past, many organizations strove to build vertical models with neural networks and machine learning for individual use cases. Now that large language models are available, we can use them in several ways to get things done with our own data:
• Use natural-language queries in any language to ask for insights and facts from your data.
• See and examine how the AI solves a problem, with its proposed chain of thought and the actions it takes.
• Ask for charts, diagrams, or summaries, or simply ask a question.
• Rest assured that the tool uses all available data, internal or public, structured or unstructured, in a secure way.
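The first capability above, answering natural-language questions from your own data rather than from a model's guess, can be sketched in a few lines. The function and dataset names (`answer`, `SALES`) are illustrative assumptions, not part of AI Buddy; a real system would route the question to an LLM with retrieved context.

```python
# Hypothetical sketch: answer a natural-language question from a small
# in-memory table by keyword matching, instead of letting a model guess.
SALES = [
    {"region": "north", "revenue": 120},
    {"region": "south", "revenue": 80},
]

def answer(question: str) -> str:
    """Return the stored figure for any region mentioned in the question."""
    q = question.lower()
    for row in SALES:
        if row["region"] in q:
            return f"Revenue for {row['region']}: {row['revenue']}"
    # Nothing in the data matches: refuse rather than invent an answer.
    return "I have no data about it."
```

In a production system the lookup would be replaced by queries against a warehouse or an API, but the principle is the same: answers come from data the tool can actually see.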
LLMs (Large Language Models) such as ChatGPT are used to understand a problem and propose a sequence of actions leading to a solution (a "chain of thought"). This process, driven by specialized prompting, is transparent, so people can see how a result is generated and assess its correctness. AI Buddy uses Retrieval-Augmented Generation (RAG) to ground Large Language Models in relevant data retrieved at query time, rather than retraining them.
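A minimal sketch of the RAG idea follows. The document store, the word-overlap scoring, and the prompt wording are all simplifying assumptions (a real pipeline would use embeddings and a vector index); the point is that retrieved facts are placed into the prompt, together with an instruction to reason step by step.

```python
# Illustrative RAG sketch, not AI Buddy's actual pipeline: retrieve the
# snippet most relevant to the query, then build a prompt that asks the
# model to answer only from that context and show its reasoning.
DOCS = [
    "EUR/USD rate on 2024-01-02 was 1.0956.",
    "Company headcount at year end was 412.",
]

def retrieve(query: str) -> str:
    """Pick the document sharing the most words with the query."""
    words = set(query.lower().split())
    return max(DOCS, key=lambda d: len(words & set(d.lower().split())))

def build_prompt(query: str) -> str:
    context = retrieve(query)
    return (
        "Answer using only the context below. "
        "Explain your reasoning step by step.\n"
        f"Context: {context}\n"
        f"Question: {query}"
    )
```

The assembled prompt would then be sent to the LLM; because the context travels with every request, the model's knowledge stays current without any fine-tuning.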
The data and information provided by AI Buddy tools are up to date and accurate, and can also include internal information. A tool provides the requested information from a database, a data warehouse, public data, or any API (Application Programming Interface) needed in the context of AI Buddy. This can be internal or public data, static or frequently changing: currency rates, weather forecasts, transaction data, and so on. We make sure that the tools are safe and respect all security policies. This approach mitigates hallucination: if the tools provide no relevant information for a given problem, the application can respond "I have no data about it." To that end, prompts are engineered and tuned to return only exact and relevant information.
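The refusal behavior described above can be sketched as a small tool registry. The tool names, their keyword triggers, and the stubbed return values are hypothetical stand-ins for live API calls; the structure shows how the application answers only when some tool supplies data.

```python
# Sketch of tool-grounded answering: each tool returns data or None,
# and the assistant refuses rather than invents when no tool responds.
def currency_tool(query: str):
    if "currency" in query or "rate" in query:
        return "EUR/USD: 1.0956"  # a real tool would call a rates API
    return None

def weather_tool(query: str):
    if "weather" in query:
        return "Warsaw: 3 C, cloudy"  # a real tool would call a forecast API
    return None

TOOLS = [currency_tool, weather_tool]

def respond(query: str) -> str:
    for tool in TOOLS:
        result = tool(query.lower())
        if result is not None:
            return result
    # No tool had relevant data for this problem.
    return "I have no data about it."
```

Security policies are enforced at the tool boundary: each tool only exposes data the caller is allowed to see, so the model never handles anything beyond its mandate.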