I like to watch tarot readings on YouTube sometimes, and while chatbots usually know about the most popular tarot and oracle decks, I love expanding their knowledge with more unique indie decks! To do this, I want to use RAG (retrieval-augmented generation) through Mistral by adding two indie decks that I bought through TarotStack: the Prisma Visions Tarot and the Your Wise Animal Body Oracle.
RAG Definition
According to Mistral’s documentation, Retrieval-augmented generation (RAG) is an AI framework that synergizes the capabilities of LLMs and information retrieval systems. It’s useful to answer questions or generate content leveraging external knowledge. There are two main steps in RAG:
- Retrieval: retrieve relevant information from a knowledge base with text embeddings stored in a vector store.
- Generation: insert the relevant information into the prompt so the LLM can generate an answer.
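To make the two steps concrete, here is a minimal sketch with a toy knowledge base. The 2-d vectors are made up by hand purely for illustration (real embeddings come from a model and are much larger), and the helper names `retrieve` and `build_prompt` are my own, not part of any library.

```python
import numpy as np

# Toy knowledge base: each entry pairs a text with a hand-made 2-d "embedding".
# Real embeddings come from a model and have many more dimensions.
knowledge_base = [
    ("Three of Swords: heartbreak, grief, separation.", np.array([0.9, 0.1])),
    ("Death: transformation, endings, new beginnings.", np.array([0.1, 0.9])),
    ("Ace of Chalices: emotional abundance, joy.", np.array([0.6, 0.6])),
]

def retrieve(query_vector, k=1):
    """Retrieval step: return the k texts whose vectors are closest to the query."""
    distances = [np.linalg.norm(query_vector - vec) for _, vec in knowledge_base]
    nearest = np.argsort(distances)[:k]
    return [knowledge_base[i][0] for i in nearest]

def build_prompt(question, context_chunks):
    """Generation step: place the retrieved text in the prompt sent to the LLM."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Pretend this vector is the embedding of the question below
query_vector = np.array([0.2, 0.8])
chunks = retrieve(query_vector)
prompt = build_prompt("What does Death mean?", chunks)
print(prompt)
```

The rest of this post follows the same shape, with Mistral producing the vectors and LanceDB doing the nearest-neighbor search.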
Setup
We will store our vectors in LanceDB, an open-source vector database for AI designed to store, manage, query, and retrieve embeddings on large-scale multi-modal data. The code was executed using Jupyter Notebook.
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage
import requests
import pandas as pd
import numpy as np
import lancedb
import os
# Initialize the Mistral client
mistral_api_key = os.getenv('MISTRAL_API_KEY')
client = MistralClient(api_key=mistral_api_key)
Get Data
Assume the tarot/oracle data is in CSV format. Since the function converts everything into a dictionary and then a string, the columns don’t need to match exactly between different decks. However, it’s recommended to include at least these columns: Deck Name, Card Name, Image Description, and Interpretation. You can add additional columns as needed for each deck.
# Read tarot/oracle files
def get_data(file_path):
    df = pd.read_csv(file_path)
    # Convert to records so each row is a dictionary, and it's a list of dictionaries
    df_dict = df.to_dict(orient='records')
    # Combine keys and values into a single string
    texts = []
    for d in df_dict:
        text = ''
        for key, value in d.items():
            text += f"{key}: {value}; "
        texts.append(text.strip())
    return df_dict, texts
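One thing to be aware of: pandas reads empty CSV cells as NaN, so they end up in the text as the literal string "nan" (you can see "Card Number: nan" in the outputs further down). The sketch below shows that behavior on a tiny in-memory CSV and one possible way to skip missing values. The deck data and the `row_to_text` helper are invented for this example, and whether dropping missing fields actually helps retrieval is something you would need to test.

```python
import io
import pandas as pd

# A made-up two-card CSV; the second row has an empty Card Category cell
csv_text = """Deck Name,Card Name,Card Category,Interpretation
Demo Oracle,ATTEND,GROUND,Grounding and stability.
Demo Oracle,LISTEN,,Listening to intuition.
"""

def row_to_text(row, skip_missing=True):
    """Join a row's keys and values into one string, optionally dropping NaN cells."""
    parts = []
    for key, value in row.items():
        if skip_missing and pd.isna(value):
            continue
        parts.append(f"{key}: {value};")
    return " ".join(parts)

df = pd.read_csv(io.StringIO(csv_text))
rows = df.to_dict(orient="records")

texts = [row_to_text(r) for r in rows]                    # missing cells dropped
raw = [row_to_text(r, skip_missing=False) for r in rows]  # same format as get_data

print(raw[1])
print(texts[1])
```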
The Prisma Visions Tarot CSV may look something like this:
| Deck Name | Category | Card Name | Card Number | Image Description | Interpretation |
|---|---|---|---|---|---|
| Prisma Visions Tarot | Major arcana | The Fool | 0 | A figure in white with a bird perched on their head, standing on a blue-toned platform amidst clouds. | Manifestation, willpower, transformation, inspired action. |
| Prisma Visions Tarot | Major arcana | The Magician | I | A silhouetted figure in black, casting magic with a glowing hand against a blue background with trees. | Intuition, inner wisdom, secrets, unconscious mind. |
| Prisma Visions Tarot | Major arcana | The High Priestess | II | A shadowy figure partially obscured by a sheer curtain, in a blue and green forest setting. | Intuition, inner wisdom, secrets, unconscious mind. |
Let’s process the Prisma Visions Tarot CSV:
# The Prisma Visions Tarot
file_path = 'data/prisma_visions_tarot.csv'
tarot_df_dict, tarot_texts = get_data(file_path)
len(tarot_df_dict), len(tarot_texts)
Response:
(81, 81)
There are 81 cards in this deck.
Next let’s add the Your Wise Animal Body Oracle, which may look something like this:
| Deck Name | Card Name | Card Category | Image Description | Interpretation |
|---|---|---|---|---|
| Your Wise Animal Body Oracle | ATTEND | GROUND | Pink and purple gradient background. Inside a thin-lined, upside down triangle is a square containing a downward pointing triangle. Surrounding the shapes are spiraling lines that get wider apart as they move outward. | This card likely encourages grounding oneself, connecting with the earth, and finding stability. |
| Your Wise Animal Body Oracle | LISTEN | | Orange and grey gradient background. Thin black lines form a square with a crescent moon in the center. Surrounding the moon are wavy lines that get closer together as they reach the outer square. | This card likely encourages listening to one’s intuition, paying attention to subtle cues, and connecting with inner guidance. |
Let’s process the data:
# Your Wise animal body oracle
file_path = 'data/your_wise_animal_body_oracle.csv'
oracle_df_dict, oracle_texts = get_data(file_path)
len(oracle_df_dict), len(oracle_texts)
Response:
(37, 37)
There are 37 cards in the deck.
Next let’s combine the text strings and convert to a data frame:
# Combine texts
texts = tarot_texts + oracle_texts
# Convert to dataframe
df = pd.DataFrame(texts, columns=['text'])
Create Embeddings for Each Card
We create text embeddings for each card, using the text strings created above. According to Mistral’s tutorial, text embeddings are numeric representations of text in a vector space, where texts with similar meanings tend to be closer together (a shorter distance in the vector space). We use Mistral’s mistral-embed model, which produces 1024-dimensional embeddings:
def get_text_embedding(input):
    embeddings_batch_response = client.embeddings(
        model="mistral-embed",
        input=input
    )
    return embeddings_batch_response
Get embeddings in batches for efficiency:
# Get embeddings in batches
# Might take some time
def get_embeddings_by_chunks(data, batch_size=50):
    batches = [data[x:x + batch_size] for x in range(0, len(data), batch_size)]
    embeddings_response = [get_text_embedding(c) for c in batches]
    return [d.embedding for e in embeddings_response for d in e.data]
# Get embeddings for all rows
df["vector"] = get_embeddings_by_chunks(texts)
df.head()
Response:
| | text | vector |
|---|---|---|
| 0 | Deck Name: Prisma Visions Tarot; Category: Cup… | [-0.038909912109375, 0.00914764404296875, 0.02… |
| 1 | Deck Name: Prisma Visions Tarot; Category: Cup… | [-0.04095458984375, 0.00315093994140625, 0.026… |
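If the slicing in the batching function looks cryptic, this small sketch runs the same logic on placeholder strings instead of real card texts. With 81 + 37 = 118 texts and a batch size of 50, it produces three batches. The `split_into_batches` name and the placeholder data are mine, not from the Mistral client.

```python
def split_into_batches(items, batch_size=50):
    """Slice a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# 81 tarot texts + 37 oracle texts = 118 placeholder strings
placeholder_texts = [f"card {i}" for i in range(81 + 37)]

batches = split_into_batches(placeholder_texts)
batch_sizes = [len(b) for b in batches]
print(batch_sizes)  # [50, 50, 18]

# Flattening the batches restores the original order, which is what lets the
# i-th embedding be stored next to the i-th text in the data frame.
flattened = [item for batch in batches for item in batch]
```

The assignment to the "vector" column relies on that ordering, so the number of embeddings returned should always equal the number of texts sent.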
Load Data into Vector Database
A common practice is to store embeddings in a vector database for efficient processing and retrieval. There are several vector databases to choose from. In this example, we use LanceDB, an open-source vector database.
db = lancedb.connect("./lancedb")
table = db.create_table(
    "tarot_cards",
    data=df,
    mode="overwrite"
)
table.head()
Response (what the table looks like; notice the text and vector):
pyarrow.Table
text: string
vector: fixed_size_list<item: float>[1024]
  child 0, item: float
----
text: [["Deck Name: Prisma Visions Tarot; Category: Cups; Card Name: Ace of Chalices; Card Number: nan; Image Description: A chalice overflowing with water stands in the center of a pool beneath a starry night sky. Swirls of blue, teal, and white dominate the sky and water, while the chalice and surrounding trees feature warm oranges and yellows.; Interpretation: New beginnings, emotional abundance, joy, and fulfillment.;","Deck Name: Prisma Visions Tarot; Category: Cups; Card Name: Two of Chalices; Card Number: nan; Image Description: Two figures embrace under a tree, their backs to the viewer. Warm orange, yellow, and red tones fill the sky and background, contrasting with the cooler blues and greens of the figures and foreground.; Interpretation: Partnership, connection, harmony, and mutual attraction.;","Deck Name: Prisma Visions Tarot; Category: Cups; Card Name: Three of Chalices; Card Number: nan; Image Description: Three figures dance joyfully under a swirling sky reminiscent of Van Gogh's Starry Night. Blues, yellows, and greens dominate the scene.; Interpretation: Celebration, friendship, community, and shared joy.;","Deck Name: Prisma Visions Tarot; Category: Cups; Card Name: Four of Chalices; Card Number: nan; Image Description: A lone figure sits beneath a tree with their back turned, seemingly apathetic to the offered chalice above. The scene uses muted tones of blue, green, and brown.; Interpretation: Apathy, boredom, missed opportunities, and emotional stagnation.;","Deck Name: Prisma Visions Tarot; Category: Cups; Card Name: Five of Chalices; Card Number: nan; Image Description: A cloaked figure walks away from five overflowing chalices. Cool blues and greens dominate the scene with hints of yellow and orange in the sky.; Interpretation: Loss, grief, disappointment, and dwelling on the negative.;"]]
vector: [[[-0.038909912,0.009147644,0.02406311,-0.008163452,0.038085938,...,0.026885986,0.0017938614,-0.007980347,0.03439331,-0.013824463],[-0.04095459,0.00315094,0.02619934,-0.026260376,0.032562256,...,0.026367188,0.009567261,-0.0058403015,0.04046631,-0.02180481],[-0.04006958,0.00029683113,0.02456665,-0.03857422,0.030944824,...,0.030670166,0.0011463165,-0.022705078,0.038391113,-0.020080566],[-0.0435791,0.013908386,0.01651001,0.0003979206,0.040527344,...,0.02067566,0.014083862,-0.019638062,0.03729248,-0.022369385],[-0.03640747,0.007209778,0.025527954,-0.024887085,0.04284668,...,0.026412964,0.018447876,-0.015281677,0.028167725,-0.027572632]]]
Create Embeddings for a Question
Whenever a user asks a question, we also need to generate embeddings for it using the same embedding model as before. To improve retrieval accuracy, we first identify any cards mentioned in the question and create embeddings for those card names rather than for the whole question.
First let’s create a function to interface with Mistral:
def run_mistral(user_message, model="mistral-medium-latest"):
    messages = [
        ChatMessage(role="user", content=user_message)
    ]
    chat_response = client.chat(
        model=model,
        messages=messages
    )
    return (chat_response.choices[0].message.content)
Create a question for interpreting cards:
question = """
What is the interpretation of the following cards?
Tarot cards: 3 of swords, death
Your wise animal body oracle cards: void, wash over
"""
Identify the cards:
# identify the cards mentioned
rsp = run_mistral("""
Identify all the cards in the message below. They may be tarot or oracle cards. They may be non-standard tarot cards. Show the cards only, in comma-separated format, no notes or any other information.
Message: """+question)
rsp
Response:
'3 of swords, death, void, wash over'
Four cards were mentioned. Let’s create embeddings of these cards:
cards = [c.strip() for c in rsp.split(',')]
cards_embeddings = np.array([get_text_embedding(card).data[0].embedding for card in cards])
cards_embeddings
Response:
array([[-0.029953 , 0.03170776, 0.06018066, ..., -0.02897644, -0.01853943, -0.03079224], [-0.05535889, 0.00128269, 0.01238251, ..., -0.02259827, 0.00425339, -0.02177429], [-0.00959778, 0.01316071, 0.02059937, ..., -0.03820801, -0.03050232, -0.00777435], [-0.00681686, 0.01556396, 0.04187012, ..., 0.01171112, 0.01226044, -0.01138306]])
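The simple comma split works when the model replies with a clean list, but LLM output is not guaranteed to be that tidy. Here is a hedged sketch of a slightly more defensive parser that drops empty pieces, stray quotes, trailing periods, and duplicates before anything is sent to the embedding endpoint. The `parse_card_list` helper and the messy sample reply are made up for illustration.

```python
def parse_card_list(reply):
    """Turn a comma-separated model reply into a clean list of card names."""
    cards = []
    seen = set()
    # Treat newlines as separators too, in case the model lists one card per line
    for piece in reply.replace("\n", ",").split(","):
        # Remove surrounding whitespace, quotes, and a trailing period
        name = piece.strip().strip("'\"").rstrip(".").strip()
        key = name.lower()
        # Skip empty pieces and case-insensitive duplicates
        if name and key not in seen:
            seen.add(key)
            cards.append(name)
    return cards

clean = parse_card_list("3 of swords, death, void, wash over")
messy = parse_card_list(" 'Death', death,\nVoid, , wash over. ")
print(clean)
print(messy)
```

Skipping empty strings matters most here, since an empty card name would otherwise produce a meaningless embedding and a random-looking retrieval result.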
Retrieve Similar Data from the Vector Database
We search the vector database with each card embedding and keep only the closest match per card (the code requests two results but uses just the first). Each result row carries the original text, so we can collect the relevant card descriptions directly:
retrieved_chunks = []
for embedding in cards_embeddings:
    results = table.search(embedding).limit(2).to_pandas()
    # only take the first result of each card retrieved
    retrieved_chunks.append(results['text'][0])
retrieved_chunks
Response (note the cards retrieved correctly):
['Deck Name: Prisma Visions Tarot; Category: Swords; Card Name: Three of Swords; Card Number: nan; Image Description: Three swords piercing a heart in front of snow-covered mountains. Dark blue, gray, and touches of gold.; Interpretation: Heartbreak, betrayal, grief, sorrow, pain, separation;',
'Deck Name: Prisma Visions Tarot; Category: Major arcana; Card Name: Death; Card Number: XIII; Image Description: A skeletal hand holds a white rose against a dark blue background with swirling patterns.; Interpretation: Transformation, endings, new beginnings, letting go of the old.;',
'Deck Name: Your Wise Animal Body Oracle; Card Name: VOID; Card Category: nan; Image Description: A black circle on a blue-green gradient background.; Interpretation: Embracing emptiness, stillness, and potential.;',
'Deck Name: Your Wise Animal Body Oracle; Card Name: WASH OVER; Card Category: nan; Image Description: Thin white lines make an abstract shape on a black circle on a pink and purple vertical gradient background.; Interpretation: Allowing emotions to flow through you without resistance.;']
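Taking the first result always returns something, even when the user mentions a card that is not in either deck. One possible guard is to check the `_distance` column that LanceDB includes in its vector search results and reject matches that are too far away. The sketch below uses mocked data frames in place of real search output; the distance values and the 0.5 cut-off are invented, so treat them as placeholders to tune against your own data.

```python
import pandas as pd

def pick_best_match(results, max_distance=0.5):
    """Return the closest text, or None if even the best match is too far away.

    `results` is expected to look like the output of
    table.search(...).to_pandas(): one row per hit, nearest first, with a
    `_distance` column. max_distance is a hypothetical cut-off.
    """
    if results.empty:
        return None
    best = results.iloc[0]
    if best["_distance"] > max_distance:
        return None
    return best["text"]

# Mocked search results standing in for LanceDB output (distances are made up)
good_hit = pd.DataFrame({
    "text": ["Card Name: Death", "Card Name: Three of Swords"],
    "_distance": [0.21, 0.48],
})
poor_hit = pd.DataFrame({
    "text": ["Card Name: VOID"],
    "_distance": [0.93],
})

print(pick_best_match(good_hit))
print(pick_best_match(poor_hit))
```

When the function returns None, the app could tell the user the card was not found instead of passing unrelated context to the model.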
Prompt with Context and Question
Finally we combine the context (retrieved cards’ info) and the original question to form the final prompt:
# Join the chunks first: a backslash inside an f-string expression
# is a SyntaxError before Python 3.12
context = '\n'.join(retrieved_chunks)
prompt = f"""
Context information is below.
---------------------
{context}
---------------------
Answer the query by interpreting the cards with the context information above and any synchronicities through the images, colors, numbers, etc.
Query: {question}
Answer:
"""
Send to Mistral:
rsp = run_mistral(prompt, model="mistral-large-latest")
print(rsp)
Final response:
The combination of these cards suggests a journey of emotional pain, transformation, and eventual healing. Let's break it down:
The Three of Swords from the Prisma Visions Tarot signifies heartbreak, betrayal, grief, or sorrow. The image of three swords piercing a heart is a powerful symbol of emotional pain. The dark blue and gray colors in the image further emphasize feelings of sadness and despair.
The Death card, also from the Prisma Visions Tarot, represents transformation, endings, new beginnings, and letting go of the old. Although the name and skeletal imagery might seem ominous, this card does not signify physical death but rather symbolizes the end of a phase or aspect of your life. The white rose held by the skeletal hand symbolizes new beginnings and hope amidst change.
The VOID card from the Your Wise Animal Body Oracle deck suggests embracing emptiness, stillness, and potential. This card encourages you to accept the current emptiness or silence in your life as it holds immense potential for growth and transformation.
The WASH OVER card from the same deck advocates for allowing emotions to flow through you without resistance. This card suggests that you should not suppress your feelings but let them flow naturally, which can lead to healing and self-understanding.
In summary, these cards collectively indicate a period of emotional turmoil and change. However, they also suggest that by accepting and embracing these emotions and the changes happening in your life, you can move towards healing and new beginnings.
That’s a good start! We can further improve this with better data and prompting techniques such as few-shot learning to guide the model’s answers.
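As a rough idea of what few-shot prompting could look like here, the sketch below places one or more worked example readings ahead of the real query so the model can imitate their style. The `build_few_shot_prompt` helper and the example answer are invented for illustration; in practice you would write the examples yourself in the voice you want.

```python
def build_few_shot_prompt(context, question, examples):
    """Assemble a prompt that shows worked examples before the real query."""
    parts = []
    for example in examples:
        parts.append(f"Query: {example['query']}\nAnswer: {example['answer']}")
    shots = "\n\n".join(parts)
    return (
        "Context information is below.\n"
        "---------------------\n"
        f"{context}\n"
        "---------------------\n"
        "Here are examples of the style of answer expected:\n\n"
        f"{shots}\n\n"
        f"Query: {question}\n"
        "Answer:"
    )

# A hand-written example reading (made up for this sketch)
examples = [
    {
        "query": "What is the interpretation of the Death card?",
        "answer": "Death points to an ending that clears space for something new.",
    },
]

prompt = build_few_shot_prompt(
    context="Card Name: VOID; Interpretation: Embracing emptiness, stillness, and potential.;",
    question="What is the interpretation of the VOID card?",
    examples=examples,
)
print(prompt)
```

The resulting string can be passed to the chat function in place of the plain prompt.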
Create an App with Streamlit
To take this a step further, let’s create an app for a better user interface. We use Streamlit, an open-source framework for building and sharing data apps all in Python. Here’s what the interface looks like with a user question and Mistral’s response:
Here’s the full Python code:
import streamlit as st
import pandas as pd
import numpy as np
import lancedb
import requests
import json
import os
from mistralai.client import MistralClient
from mistralai.models.chat_completion import ChatMessage
# Initialize the Mistral client
mistral_api_key = os.getenv('MISTRAL_API_KEY')
client = MistralClient(api_key=mistral_api_key)
# Read tarot/oracle files
def get_data(file_path):
    df = pd.read_csv(file_path)
    # Convert to records so each row is a dictionary, and it's a list of dictionaries
    df_dict = df.to_dict(orient='records')
    # Combine keys and values into a single string
    texts = []
    for d in df_dict:
        text = ''
        for key, value in d.items():
            text += f"{key}: {value}; "
        texts.append(text.strip())
    return df_dict, texts
# The tarot cards
file_path = 'data/prisma_visions_tarot.csv'
tarot_df_dict, tarot_texts = get_data(file_path)
# The oracle cards
file_path = 'data/your_wise_animal_body_oracle.csv'
oracle_df_dict, oracle_texts = get_data(file_path)
# Combine texts
texts = tarot_texts + oracle_texts
df = pd.DataFrame(texts, columns=['text'])
# Text embedding
def get_text_embedding(input):
    embeddings_batch_response = client.embeddings(
        model="mistral-embed",
        input=input
    )
    return embeddings_batch_response #.data[0].embedding
# Get embeddings in batches
# Might take some time
def get_embeddings_by_chunks(data, chunk_size=50):
    chunks = [data[x:x + chunk_size] for x in range(0, len(data), chunk_size)]
    embeddings_response = [get_text_embedding(c) for c in chunks]
    return [d.embedding for e in embeddings_response for d in e.data]
# Get embeddings for all rows
df["vector"] = get_embeddings_by_chunks(texts)
# Create LanceDB database and table
db = lancedb.connect("./lancedb")
table = db.create_table("tarot_cards", data=df, mode="overwrite")
# Function to chat with Mistral
def run_mistral(user_message, model="mistral-large-latest"):
    messages = [
        ChatMessage(role="user", content=user_message)
    ]
    chat_response = client.chat(
        model=model,
        messages=messages
    )
    return (chat_response.choices[0].message.content)
# Streamlit app
st.title("Tarot Reading App with Mistral AI RAG")
# User input
query = st.text_area("Enter your query with tarot/oracle cards:",
                     height=150,
                     help="You can enter multiple lines of text. Include your question and any specific cards you want to ask about.")
# Prompts
# identify the cards mentioned
identify_cards_prompt = """
Identify all the cards in the message below. They may be tarot or oracle cards. They may be non-standard tarot cards.
Show the cards only, in comma-separated format, no notes or any other information.
Message: """
# query with retrieved info
rag_prompt = """
Context information is below.
---------------------
{}
---------------------
Answer the query by interpreting cards and incorporating:
- context information above
- notable synchronicities on the objects of card images
- notable synchronicities on colors of the card images
- notable synchronicities on the numbers of the cards
- any other notable features of these cards
The final interpretation should incorporate all individual cards' interpretations.
Query: {}
Answer:
"""
if st.button("Get Reading"):
    if query:
        with st.spinner("Fetching cards information..."):
            # Identify the cards mentioned
            response = run_mistral(identify_cards_prompt+query)
            cards = [c.strip() for c in response.split(',')]
            st.write("Identified cards:", cards)
            # Get embeddings of requested cards
            cards_embeddings = np.array([get_text_embedding(card).data[0].embedding for card in cards])
            # Retrieve similar chunks from LanceDB
            retrieved_chunks = []
            for embedding in cards_embeddings:
                results = table.search(embedding).limit(2).to_pandas()
                # only take the first result of each card retrieved
                retrieved_chunks.append(results['text'][0])
            response = run_mistral(rag_prompt.format('\n'.join(retrieved_chunks), query),
                                   model="mistral-large-latest")
        st.success(response)
        # Display retrieved chunk
        st.subheader("Cards Retrieved")
        st.json(retrieved_chunks)
    else:
        st.warning("Please enter a query before getting a reading.")
That’s all the fun for now!