n8's blog

strings matter

I recently contributed to MarshalX/atproto, a Python client for the ATProtocol.

The goal: introduce optional Pydantic validation for ATProtocol-specific string formats like handles, AT-URIs, NSIDs, and more. As far as I currently understand it, the spec defines these formats:

Handle

AT-URI

DateTime

NSID

TID

Record Key

URI

DID:PLC

Why optional validation? Strict checks slow things down, so MarshalX suggested we make them opt-in. By default, models ignore these strict format checks. If you want them, you provide a context flag to enforce validation.

an illustrative example:

from collections.abc import Mapping
from typing import Annotated

from pydantic import BaseModel, BeforeValidator, ValidationInfo

PLS_BE_SERIOUS = "i am being so serious rn"

def maybe_validate_bespoke_format(
    v: str, info: ValidationInfo
) -> str:
    if (
        isinstance(info.context, Mapping)
        and info.context.get(PLS_BE_SERIOUS)
        and "lol" in v.lower()
    ):
        raise ValueError("this is serious business")
    return v

class BskyModel(BaseModel):
    handle: Annotated[str, BeforeValidator(maybe_validate_bespoke_format)]

# Without context: no strict check
BskyModel.model_validate({"handle": "alice.bsky.social"})  # passes


# With context: triggers strict check
BskyModel.model_validate(
    {"handle": "alice.bsky.social"}, context={PLS_BE_SERIOUS: True}
)  # passes

BskyModel.model_validate(
    {"handle": "lol @ whatever"}, context={PLS_BE_SERIOUS: True}
)  # raises ValidationError

the actual validators were a little more involved...
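to give a flavor of "more involved", here's a rough sketch of what an opt-in handle check could look like. this is not the code that landed in the library: the context key `STRICT`, the `Profile` model, and the `is_valid` helper are all made up for illustration, and the regex and the 253-character limit are my reading of the handle rules in the spec, so treat them as an approximation.

```python
import re
from collections.abc import Mapping
from typing import Annotated

from pydantic import BaseModel, BeforeValidator, ValidationError, ValidationInfo

# hypothetical context key, not necessarily what the library uses
STRICT = "strict_string_format"

# handle pattern as i understand it from the spec (an assumption, double-check it)
HANDLE_RE = re.compile(
    r"^([a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+"
    r"[a-zA-Z]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$"
)
MAX_HANDLE_LENGTH = 253


def maybe_validate_handle(v: object, info: ValidationInfo) -> object:
    # skip the expensive check unless the caller opted in via context
    if not (isinstance(info.context, Mapping) and info.context.get(STRICT)):
        return v
    if (
        not isinstance(v, str)
        or len(v) > MAX_HANDLE_LENGTH
        or not HANDLE_RE.fullmatch(v)
    ):
        raise ValueError(f"invalid handle: {v!r}")
    return v


class Profile(BaseModel):
    handle: Annotated[str, BeforeValidator(maybe_validate_handle)]


def is_valid(handle: str, strict: bool) -> bool:
    # small helper so the opt-in behavior is easy to poke at
    try:
        Profile.model_validate({"handle": handle}, context={STRICT: strict})
    except ValidationError:
        return False
    return True
```

the shape is the same as the toy example: bail out early when the flag is missing, and only then pay for the length check and the regex.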


i wrote about this more here. i think i wanna move my writing to this blog and refactor that place to just be zen mode :)

#atproto #bluesky #open-source #pydantic #python