Cache Management#
Introduction#
When building your testing and evaluation pipeline, we strongly recommend taking advantage of ARTKIT’s built-in caching functionality. Caching model responses allows you to reference them later without hitting the API again — keeping runs consistent and saving both time and money during development.
Overview#
ARTKIT implements caching via model wrapper classes, which standardize the caching implementation across models without changing their underlying behavior. Caches are stored in a SQLite database, a lightweight, disk-based database that doesn’t require a separate server process.
Summarized below are some details on the classes used for the caching implementation:
CachedGenAIModel is an abstract wrapper class for an ARTKIT model. Concrete subclasses are introduced for the different modalities (see the sketch after this list):
CachedChatModel adds a cache to a ChatModel
CachedCompletionModel adds a cache to a CompletionModel
CachedDiffusionModel adds a cache to a DiffusionModel
CachedVisionModel adds a cache to a VisionModel
Cached model responses are stored in a CacheDB object. The CacheDB can be configured to store results in memory or in a SQLite database.
Responses in the CacheDB are indexed by the input message, chat history, system prompt, and all model parameters.
The CacheDB also records the creation and last access time of each entry, which can be used for cache cleanup.
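As a quick preview of the wrapping pattern, the sketch below wraps an OpenAI chat model in a CachedChatModel backed by a SQLite file. The cache path used here is a hypothetical placeholder; the walkthrough in the next section applies the same pattern step by step.

# Minimal sketch of the wrapping pattern (the cache path is a placeholder)
import artkit.api as ak

cached_llm = ak.CachedChatModel(
    model=ak.OpenAIChat(model_id="gpt-3.5-turbo"),
    database="cache/overview_example.db",  # hypothetical SQLite cache file
)
# The wrapped model exposes the same interface as the underlying ChatModel;
# responses are stored in and served from its CacheDB transparently.

The other modalities follow the same pattern, with CachedCompletionModel, CachedDiffusionModel, and CachedVisionModel wrapping their respective model classes.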
Working with a Cached Model#
In this example, we’ll demonstrate how to initialize a CachedChatModel and work with its CacheDB. We start by initializing a CachedChatModel that reads from an existing cache located in cache/cache_management.db:
[1]:
# Basic setup and imports
from datetime import datetime, timedelta
from dotenv import load_dotenv
import logging
import json
import pandas as pd
import artkit.api as ak

load_dotenv()
logging.basicConfig(level=logging.WARNING)
pd.set_option("display.max_colwidth", None)

# Initialize a cached chat model backed by a SQLite cache file
cached_openai_llm = ak.CachedChatModel(
    model=ak.OpenAIChat(model_id="gpt-3.5-turbo"),
    database="cache/cache_management.db"
)
Basic model call: When we call get_response, the model will automatically return a cached response if one exists for this model_id and prompt:
[2]:
await cached_openai_llm.get_response(message="What color is the sky?")
[2]:
['The color of the sky can vary depending on the time of day and weather conditions. During the day, the sky is typically blue, but it can appear different shades depending on the amount of moisture and particles in the atmosphere. At sunrise and sunset, the sky can range from pink and orange to red and purple. At night, the sky appears dark blue or black with twinkling stars.']
We can validate that this response is also stored in the cache:
[3]:
cached_openai_llm.cache.get_entry(model_id="gpt-3.5-turbo", prompt="What color is the sky?")
[3]:
['The color of the sky can vary depending on the time of day and weather conditions. During the day, the sky is typically blue, but it can appear different shades depending on the amount of moisture and particles in the atmosphere. At sunrise and sunset, the sky can range from pink and orange to red and purple. At night, the sky appears dark blue or black with twinkling stars.']
Model call with additional parameters: If we update the system prompt on the model and pass additional model parameters to get_response, these settings will be used to index the new response:
[4]:
await cached_openai_llm.with_system_prompt(
    "You only reply in haiku."
).get_response(
    message="What color is the sky?",
    temperature=0.8
)
[4]:
["Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true"]
[5]:
cached_openai_llm.cache.get_entry(
    model_id="gpt-3.5-turbo",
    prompt="What color is the sky?",
    _system_prompt="You only reply in haiku.",
    temperature=0.8
)
[5]:
["Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true"]
Clearing the cache: We can delete entries in the CacheDB by calling clear_cache, specifying created_before, accessed_before, created_after, or accessed_after datetime parameters:
[6]:
# We won't actually clear the cache in this example, since we want the notebook to be re-runnable
# cached_openai_llm.clear_cache(created_before=datetime.now())
# cached_openai_llm.clear_cache(created_after=datetime.now()-timedelta(days=7))
CacheDB Structure#
To efficiently store and fetch cached responses, the CacheDB database is structured into four tables:
ModelCache: The cache id, model id, creation time, and access time
ModelParams: Parameters passed to the model; each parameter is stored in its own row
UniqueStrings: String parameter values passed to the model, stored in a separate table to avoid duplication
ModelResponses: The model response for a set of parameters
Since the CacheDB is connected to a SQLite database, we can query it directly from Python:
[7]:
cursor = cached_openai_llm.cache.conn.execute(
    """
    SELECT * FROM ModelCache MC
    LEFT JOIN ModelParams MP on MC.id=MP.cache_id
    LEFT JOIN UniqueStrings US on MP.value_string_id=US.id
    LEFT JOIN ModelResponses MR on MC.id=MR.id
    """
)
rows = cursor.fetchall()
column_names = [description[0] for description in cursor.description]
pd.DataFrame([dict(zip(column_names, row)) for row in rows])
[7]:
|   | id | model_id | ctime | atime | cache_id | name | value_string_id | value_int | value_float | value | response |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | gpt-3.5-turbo | 2024-06-05 17:37:55 | 2024-06-06 02:13:14 | 1 | _system_prompt | 2.0 | None | NaN | You only reply in haiku. | Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true |
| 1 | 1 | gpt-3.5-turbo | 2024-06-05 17:37:55 | 2024-06-06 02:13:14 | 1 | prompt | 1.0 | None | NaN | What color is the sky? | Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true |
| 2 | 1 | gpt-3.5-turbo | 2024-06-05 17:37:55 | 2024-06-06 02:13:14 | 1 | temperature | NaN | None | 0.8 | None | Blue fades into night\nStars twinkle and moon rises\nSky's palette shifts true |
| 3 | 2 | gpt-3.5-turbo | 2024-06-06 02:13:14 | 2024-06-06 02:13:14 | 2 | prompt | 1.0 | None | NaN | What color is the sky? | The color of the sky can vary depending on the time of day and weather conditions. During the day, the sky is typically blue, but it can appear different shades depending on the amount of moisture and particles in the atmosphere. At sunrise and sunset, the sky can range from pink and orange to red and purple. At night, the sky appears dark blue or black with twinkling stars. |
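Because the ModelCache table records creation (ctime) and last-access (atime) timestamps, you can inspect them directly when deciding what to clean up. The query below is a small illustrative sketch using the same cache.conn connection as above; it is not part of the ARTKIT API.

# Sketch: inspect creation and last-access times in the ModelCache table,
# the timestamps that clear_cache() filters on
cursor = cached_openai_llm.cache.conn.execute(
    "SELECT id, model_id, ctime, atime FROM ModelCache"
)
pd.DataFrame(cursor.fetchall(), columns=["id", "model_id", "ctime", "atime"])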
Concluding Remarks#
One of the strengths of ARTKIT is that you rarely need to think about managing your caches: everything happens automatically, so you can focus on building your tests. That said, if you’re interested in learning more about how these features are implemented, check out our guide to Creating New Model Classes. And if you have ideas for new features or enhancements, please check out our Contributor Guide!