Cache Management#
Introduction#
When building your testing and evaluation pipeline, we strongly recommend taking advantage of ARTKIT’s built-in caching functionality. Caching model responses allows you to reference them later without hitting the API again — keeping runs consistent and saving both time and money during development.
Overview#
ARTKIT implements caching via model wrapper classes, which standardize the caching implementation across models without changing their underlying behavior. Caches are stored in a SQLite database, a lightweight, disk-based database that doesn’t require a separate server process.
Summarized below are some details on the classes used for the caching implementation:
CachedGenAIModel is an abstract wrapper class for an ARTKIT model. Concrete subclasses are introduced for the different modalities (see the sketch after this list):
CachedChatModel adds a cache to a ChatModel
CachedCompletionModel adds a cache to a CompletionModel
CachedDiffusionModel adds a cache to a DiffusionModel
CachedVisionModel adds a cache to a VisionModel
Cached model responses are stored in a CacheDB object. The CacheDB can be configured to store results in memory or in a SQLite database.
Responses in the CacheDB are indexed by the input message, chat history, system prompt, and all model parameters.
The CacheDB also records the creation and last access time of each entry, which can be used for cache cleanup.
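As a quick preview of the wrapping pattern, the sketch below wraps an OpenAI chat model in a CachedChatModel backed by a SQLite file. The cache path used here is a hypothetical placeholder; the walkthrough in the next section applies the same pattern step by step.

# Minimal sketch of the wrapping pattern (the cache path is a placeholder)
import artkit.api as ak

cached_llm = ak.CachedChatModel(
    model=ak.OpenAIChat(model_id="gpt-3.5-turbo"),
    database="cache/overview_example.db",  # hypothetical SQLite cache file
)
# The wrapped model exposes the same interface as the underlying ChatModel;
# responses are stored in and served from its CacheDB transparently.

The other modalities follow the same pattern, with CachedCompletionModel, CachedDiffusionModel, and CachedVisionModel wrapping their respective model classes.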
Working with a Cached Model#
In this example, we’ll demonstrate how to initialize a CachedChatModel and work with its CacheDB. We start by initializing a CachedChatModel that reads from an existing cache located in cache/cache_management.db:
[1]:
# Basic setup and imports
from datetime import datetime, timedelta
from dotenv import load_dotenv
import logging
import json
import pandas as pd
import artkit.api as ak

load_dotenv()
logging.basicConfig(level=logging.WARNING)
pd.set_option("display.max_colwidth", None)

# Initialize a cached chat model backed by a SQLite cache file
cached_openai_llm = ak.CachedChatModel(
    model=ak.OpenAIChat(model_id="gpt-3.5-turbo"),
    database="cache/cache_management.db"
)
Basic model call: When we call get_response, the model will automatically return a cached response if one exists for this model_id and prompt:
[2]:
await cached_openai_llm.get_response(message="What color is the sky?")
[2]:
['The color of the sky can vary depending on the time of day and weather conditions. During the day, the sky is typically blue, but it can appear different shades depending on the amount of moisture and particles in the atmosphere. At sunrise and sunset, the sky can range from pink and orange to red and purple. At night, the sky appears dark blue or black with twinkling stars.']
We can validate that this response is also stored in the cache:
[3]:
cached_openai_llm.cache.get_entry(model_id="gpt-3.5-turbo", prompt="What color is the sky?")
[3]:
['The color of the sky can vary depending on the time of day and weather conditions. During the day, the sky is typically blue, but it can appear different shades depending on the amount of moisture and particles in the atmosphere. At sunrise and sunset, the sky can range from pink and orange to red and purple. At night, the sky appears dark blue or black with twinkling stars.']
Model call with additional parameters: If we update the system prompt on the model and pass additional model parameters to get_response, these settings will be used to index the new response:
[4]:
await cached_openai_llm.with_system_prompt(
    "You only reply in haiku."
).get_response(
    message="What color is the sky?",
    temperature=0.8
)
[4]:
["Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true"]
[5]:
cached_openai_llm.cache.get_entry(
    model_id="gpt-3.5-turbo",
    prompt="What color is the sky?",
    _system_prompt="You only reply in haiku.",
    temperature=0.8
)
[5]:
["Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true"]
Clearing the cache: We can delete entries in the CacheDB by calling clear_cache, specifying created_before, accessed_before, created_after, or accessed_after datetime parameters:
[6]:
# We won't actually clear the cache in this example, since we want the notebook to be re-runnable
# cached_openai_llm.clear_cache(created_before=datetime.now())
# cached_openai_llm.clear_cache(created_after=datetime.now()-timedelta(days=7))
CacheDB Structure#
To efficiently store and fetch cached responses, the CacheDB database is structured into four tables:
ModelCache: The cache id, model id, creation time, and access time
ModelParams: Parameters passed to the model; each parameter is stored in its own row
UniqueStrings: String parameter values passed to the model, stored in a separate table to avoid duplication
ModelResponses: The model response for a set of parameters
Since the CacheDB is connected to a SQLite database, we can query it directly from Python:
[7]:
cursor = cached_openai_llm.cache.conn.execute(
    """
    SELECT * FROM ModelCache MC
    LEFT JOIN ModelParams MP on MC.id=MP.cache_id
    LEFT JOIN UniqueStrings US on MP.value_string_id=US.id
    LEFT JOIN ModelResponses MR on MC.id=MR.id
    """
)
rows = cursor.fetchall()
column_names = [description[0] for description in cursor.description]
pd.DataFrame([dict(zip(column_names, row)) for row in rows])
[7]:
|   | id | model_id | ctime | atime | cache_id | name | value_string_id | value_int | value_float | value | response |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | gpt-3.5-turbo | 2024-06-05 17:37:55 | 2024-06-06 02:13:14 | 1 | _system_prompt | 2.0 | None | NaN | You only reply in haiku. | Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true |
| 1 | 1 | gpt-3.5-turbo | 2024-06-05 17:37:55 | 2024-06-06 02:13:14 | 1 | prompt | 1.0 | None | NaN | What color is the sky? | Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true |
| 2 | 1 | gpt-3.5-turbo | 2024-06-05 17:37:55 | 2024-06-06 02:13:14 | 1 | temperature | NaN | None | 0.8 | None | Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true |
| 3 | 2 | gpt-3.5-turbo | 2024-06-06 02:13:14 | 2024-06-06 02:13:14 | 2 | prompt | 1.0 | None | NaN | What color is the sky? | The color of the sky can vary depending on the time of day and weather conditions. During the day, the sky is typically blue, but it can appear different shades depending on the amount of moisture and particles in the atmosphere. At sunrise and sunset, the sky can range from pink and orange to red and purple. At night, the sky appears dark blue or black with twinkling stars. |
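Because the ModelCache table records creation (ctime) and last-access (atime) timestamps, you can inspect them directly when deciding what to clean up. The query below is a small illustrative sketch using the same cache.conn connection as above; it is not part of the ARTKIT API.

# Sketch: inspect creation and last-access times in the ModelCache table,
# the timestamps that clear_cache() filters on
cursor = cached_openai_llm.cache.conn.execute(
    "SELECT id, model_id, ctime, atime FROM ModelCache"
)
pd.DataFrame(cursor.fetchall(), columns=["id", "model_id", "ctime", "atime"])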
Concluding Remarks#
One of the strengths of ARTKIT is that you rarely need to think about managing your caches: everything happens automatically, so you can focus on building your tests. That said, if you’re interested in learning more about how these features are implemented, check out our guide to Creating New Model Classes. And if you have ideas for new features or enhancements, please check out our Contributor Guide!