"GPT-3 [Generative Pre-trained Transformer] is an AI that can generate text by predicting the words that should follow a given word or phrase."
Is your knowledge base cataloged or indexed, so that words and phrases appropriate to calculus, such as "derivative" and "integral," are available to be retrieved and used to compose your answer? Is your large language model divided into subsets, based on the perceived topic?
[Google Gemini's response]
That's an excellent set of questions that touches upon the internal workings of large language models like me!
Is your knowledge base cataloged or indexed?
Yes, absolutely, though not in a way that directly maps to human-readable "catalogs" or "indexes" in the traditional sense.
Think of it less like a library with books arranged by subject and more like a vast, multi-dimensional web of interconnected concepts and relationships. When I'm trained, I don't just memorize isolated facts. Instead, I learn the statistical relationships between words, phrases, ideas, and contexts.
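
As a rough illustration of "statistical relationships between words," the sketch below counts which word most often follows another in a tiny made-up corpus and uses those counts to predict the next word. It is a toy bigram counter under stated assumptions, not how Gemini or GPT-3 is actually built: real models learn far richer relationships with neural networks. The corpus and the names `train_bigrams` and `predict_next` are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration, not real training data).
corpus = (
    "the derivative of a function measures its rate of change . "
    "the integral of a function measures the area under its curve . "
    "the derivative of a constant is zero ."
)

def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if the word was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = train_bigrams(corpus)
print(predict_next(model, "derivative"))  # most frequent word after "derivative"
print(predict_next(model, "calculus"))    # word absent from the corpus
```

Nothing in this sketch is filed under a "calculus" heading. The word "derivative" simply ends up associated with the words that appeared near it, which is the sense in which the knowledge is related statistically instead of cataloged by subject.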
ACCESS | TRAINING DATA and SOURCE MATERIAL | USE CASES |
---|---|---|
Free | Scrapes text off of the Internet. | AI generated writing (Example: ChatGPT) |
Subscription | Premium versions may claim to be based on newer, larger, and cleaner training datasets. | For users who wish to have access to: |
AI tools incorporated into other subscription services / products | Articles contained within an enclosed environment, under copyright protection, and secured by vendor database constraints and contracts. Generative AI, search engines, and databases: "Answering your AI questions: Search engines, PDF readers, and special collections." | AI summaries of journal article(s) |