The LLM is just guessing and that's quite okay

Yesterday I was working on a hairy task using tracing-rs. Look, on the programmer's scale of aptitude I'm much closer to Grug than I am to Fabrice Bellard. When I see type signatures that look like this:

struct SerializableContext<'a, 'b, Span, N>(
    &'b crate::layer::Context<'a, Span>,
    std::marker::PhantomData<N>,
)
where
    Span: Subscriber + for<'lookup> crate::registry::LookupSpan<'lookup>,
    N: for<'writer> FormatFields<'writer> + 'static;

I start to wonder if I made the right career choice and whether it's about time to switch to a new life making kebabs. To my own shock and amazement I was actually making progress on my task - I was writing a custom event formatter for a tracing layer, but I had an unexpected error in the output. Like any good colleague, as soon as I hit a speed bump I immediately - without making any attempt of my own to look at the issue - decided to throw the problem over the cubicle wall to ChatGPT to see what it thought. Just like a cold interviewer in a dreaded tech interview, I just copied the output and the function and wrote a terse prompt to the LLM to get it to do something:

why are my formatted fields getting garbled, is this because im missing phantom data??

[common_server\src\server\logging\json_formatter.rs:130:17] &data = FormattedFields {
    fields: "\u{1b}[3mchannel_id\u{1b}[0m\u{1b}[2m=\u{1b}[0m1 \u{1b}[3mname\u{1b}[0m\u{1b}[2m=\u{1b}[0m\"Rose Online Game\" \u{1b}[3mip\u{1b}[0m\u{1b}[2m=\u{1b}[0m\"127.0.0.1\" \u{1b}[3mport\u{1b}[0m\u{1b}[2m=\u{1b}[0m29200",
    formatter: tracing_subscriber::fmt::format::DefaultFields,
    was_ansi: true,
}

struct SerializeFields<'a, 'b, S, N>
where
    S: Subscriber + for<'lookup> LookupSpan<'lookup>,
    N: for<'writer> FormatFields<'writer> + 'static,
{
    event: &'a Event<'a>,
    ctx: &'b FmtContext<'b, S, N>,
}

impl<'a, 'b, S, N> serde::ser::Serialize for SerializeFields<'a, 'b, S, N>
where
    S: Subscriber + for<'lookup> LookupSpan<'lookup>,
    N: for<'writer> FormatFields<'writer> + 'static,
{
    fn serialize<Ser>(&self, serializer: Ser) -> Result<Ser::Ok, Ser::Error>
    where
        Ser: serde::ser::Serializer,
    {
        ...
    }
}

Also, just like in an interview, the answer was very obviously present in the question if one could read, but, alas, it was late in the evening and I was yearning for bed, so I did not in fact read the output closely. Not knowing how or why or when to use PhantomData, and then seeing it strung around in the code I was referencing, was a red herring that threw me off. Of course my issue must be due to this intense type magic that I don't understand; it can't possibly be connected to the very simple English was_ansi statement...

Well, thankfully, Chat Gippity is good at reading! It correctly guessed that the problem with the parsing step was related to the ANSI color codes in the string. I appreciate that it was polite in its response, but I still couldn't help but think it was mocking me somehow for mentioning PhantomData:

Your FormattedFields data appears garbled (e.g. \u{1b}[3mchannel_id\u{1b}[0m) because it's being rendered with ANSI escape codes for formatting, like italic ([3m) and reset ([0m). This is unrelated to PhantomData

It then provided me with an elegant, detailed explanation of the problem ("...That’s not valid JSON — it’s console output with ANSI codes."). Helpfully, it then guessed up a couple of options that I could try to implement. Options 2 & 3 were completely wrong - they referenced functions that don't exist. Seems tracing is too hard even for a computer; maybe I can postpone my kebab-retirement after all. Option 1 included the correct answer, but it was kind of accidental:

Option 1: 🧹 Remove .with_ansi(true) if you're using tracing_subscriber::fmt()

tracing_subscriber::fmt()
    .with_ansi(false) // <-- disables color codes
    .json()           // <-- forces structured format
    ...

…but even then, FormattedFields is still just a flat string, not structured JSON.

I should have known this would be the best answer since it has a helpful broom emoji. Technically, the with_ansi(false) here is completely redundant because the json() call sets it to false anyway. The correct answer IS to call json(), but more specifically, I need to call it BEFORE I set my event formatter, because that builder method also sets the field formatter, which is what ensures that FormattedFields is JSON-deserializable in my event formatter. So the correct call looks something like:

tracing_subscriber::fmt()
    .json()                       // installs the JSON field formatter (and turns off ANSI)
    .event_format(KebabFormatter) // then swap in the custom event formatter
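
Just to make the contrast concrete, here's a quick illustrative sketch (not my actual formatter, and it assumes serde_json is on hand): the ANSI string is lifted straight from the debug dump above, and the JSON line is roughly what the JSON field formatter produces for the same fields (more on that formatter in a moment).

fn main() {
    // What DefaultFields + ANSI was handing my formatter (from the debug dump above).
    let ansi = "\u{1b}[3mchannel_id\u{1b}[0m\u{1b}[2m=\u{1b}[0m1";
    // Not valid JSON, so any parsing attempt falls over.
    assert!(serde_json::from_str::<serde_json::Value>(ansi).is_err());

    // Roughly what the JSON field formatter stores for the same field instead.
    let json = r#"{"channel_id":1}"#;
    assert!(serde_json::from_str::<serde_json::Value>(json).is_ok());
}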

I actually didn't use this answer initially because the whole point of my effort was that I was implementing my own JSON formatter and didn't want to call json() to use the default event formatter. I just assumed ChatGPT was wrong and ignored this option (it was wrong for 2/3 of its responses, so why not 3/3, right?!). Actually, the part of the response that isolated the issue to ANSI codes + JSON parsing was enough of a clue for me to dig into the tracing code and discover that the json() call was doing a bit more to the layer than I had initially understood. Specifically, while my event formatter was fine, I was not setting the correct field formatter! Looking at the json() function, we can see that it uses format::JsonFields::new() in conjunction with its JSON event formatter, which is exactly what solves the issue I was having.

    pub fn json(self) -> Layer<S, format::JsonFields, format::Format<format::Json, T>, W> {
        Layer {
            fmt_event: self.fmt_event.json(),
            fmt_fields: format::JsonFields::new(),
            fmt_span: self.fmt_span,
            make_writer: self.make_writer,
            // always disable ANSI escapes in JSON mode!
            is_ansi: false,
            log_internal_errors: self.log_internal_errors,
            _inner: self._inner,
        }
    }
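
And once that field formatter is installed, a custom event formatter can pull each span's FormattedFields out of the span extensions and actually parse it as JSON. Here's a rough sketch of the shape of that, following the FormatEvent example in the tracing-subscriber docs - KebabFormatter is a hypothetical stand-in for my formatter, and the serde_json parse is only there to show the point:

use serde_json::Value;
use tracing::{Event, Subscriber};
use tracing_subscriber::fmt::{
    format::{self, FormatEvent, FormatFields},
    FmtContext, FormattedFields,
};
use tracing_subscriber::registry::LookupSpan;

struct KebabFormatter;

impl<S, N> FormatEvent<S, N> for KebabFormatter
where
    S: Subscriber + for<'lookup> LookupSpan<'lookup>,
    N: for<'writer> FormatFields<'writer> + 'static,
{
    fn format_event(
        &self,
        ctx: &FmtContext<'_, S, N>,
        mut writer: format::Writer<'_>,
        event: &Event<'_>,
    ) -> std::fmt::Result {
        // Walk the event's span scope, root first.
        if let Some(scope) = ctx.event_scope() {
            for span in scope.from_root() {
                let ext = span.extensions();
                if let Some(fields) = ext.get::<FormattedFields<N>>() {
                    // With JsonFields as the field formatter this string is a JSON
                    // object, so it parses; with DefaultFields + ANSI it's exactly
                    // the garbled mess from the debug dump above.
                    if let Ok(json) = serde_json::from_str::<Value>(&fields.fields) {
                        write!(writer, "{}={} ", span.name(), json)?;
                    }
                }
            }
        }
        // Let the configured field formatter write the event's own fields.
        ctx.field_format().format_fields(writer.by_ref(), event)?;
        writeln!(writer)
    }
}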

In a twist of irony, the only way I could find to actually set the field formatter was... to call json() first. ChatGPT, you lucky bastard, you got it right. So even though the LLM was just guessing, it turns out that's quite okay. The guess was good enough to help get me on the right track, and now I can draw one more X on my countdown calendar to the day I start my new kebabing life.