26 Comments

Not sure if it's smart to champion json schemas as a way to go, since its a resource format that brought challenges in the past leading to the emergence of GraphQL — that eventually established capability oriented design over resource oriented imperative implementations. Solving a long standing integration nightmare. So a neat way for now to "get started" with the ChatGPT API specifically, but should not be over invested. OpenAI is aware of this and the guys are working on an implementation that is more similar to https://lmql.ai/#cot ; being able to express natural language prompts that also contains code.

Expand full comment

Dino, it's cool to see you here!

I agree with your sentiment. JSON Schemas is a cool new tool in the toolbox for OpenAI API users; and you can even "hack" it to make it do some "programming"; but at the end of the day, it's a hack.

LMQL does feel like magic from time to time, it's really cool!

Expand full comment

Just a big thanks for an easy guide on how to use this. :) OpenAIs own documentation is lacking.

Expand full comment

Great post. I'm intrigued about how you'd use the schema and function to do chain of thought prompting?

Expand full comment

The idea would be something like this. - I would recommend to test the impact this has on the quality of results. (It might make things worse! or better!)

"steps": {

"type": "array",

"description": "Lets think step by step, before outputting the result",

"items": { "type": "string" }

},

"result": { "type": "string" }

Expand full comment

Thanks. That’s a creative idea. That was my #1 concern with functions in not being able to give the model space to think. 👍

Expand full comment

If you tell GPT to think before it starts to output, will it actually think trough its output before it starts outputting?

I dont think so. Thats not how LLMs work, no?

Expand full comment

GPT can only „think“ by outputting, and it outputs each token after spending the same amount of compute on it (regardless of the cognitive complexity of the task).

The intuition here is that you can give GPT a scratchboard to output it’s ideas (result.steps), which is ignored by us but is used by the attention mechanism of GPT to make a conclusion and later output it to result.result.

This is conceptually similar to how we humans can hold an inner monologue with ourselves before answering a question.

See also https://platform.openai.com/docs/guides/gpt-best-practices/tactic-instruct-the-model-to-work-out-its-own-solution-before-rushing-to-a-conclusion

Expand full comment

Oh.

So basically it's "thinking" by writing/outputting to a space that the user never sees?

Thank you :D

Expand full comment

Yup!

Expand full comment

Did you create the image of the Recipe Creator app yourself? It's very well-drawn and aesthetically pleasing.

Expand full comment

Yes, I made it with Excalidraw. https://excalidraw.com/

Check it out, it's awesome!

Expand full comment

Oh wow. Thank you. I didn’t realize a tool like this existed. It’s just what I’ve been looking for. 🙏🏼

Expand full comment

is there a way to produce json in different languages ? i used to be able to do so with prompt engineering, but now how can we do that ?

Expand full comment

Looks like OpenAI added a note since your writing:

> (note: the model may generate invalid JSON or hallucinate parameters)

This suggests that it’s not masking the token posterior with the schema, and just relying on the system message and improved steering.

Sad. I was getting excited at having access to something other than just bulk inference out of these most advanced models

Expand full comment

Thanks! Updated the post to reflect this. I can't manage to make it generate invalid JSON, even on 3.5 (but can easily do so if I just pass the json schema). Very interesting!

Expand full comment

There is no evidence they are using jsonformers. The performance is exactly the same as if you were to just feed it a json schema and tell it to format the output. This is literally exactly the same as prompt engineering they just do it for you.

Expand full comment

I think it's unlikely that they are literally pasting the JSON Schema into the prompt:

- The naive approach of literally pasting a json schema would use up 371 tokens for the schema alone (whereas I was billed 126)

- Adding multiple functions does not increase the token usage by as much as you'd expect if they were pasting the json schemas in the prompt

- There are specific JSON Schema features that are unsupported in the API. For example: consts, if/else cases (these are ignored by functions)

- My tests show that the model is pretty robust against malicious attempts to make it not output json or break the syntax. GPT 3.5 did not withstand the same tests. However, I believe more testing is needed to rule this out.

I am not stating with 100 % confidence that they are using the same approach as jsonformers, but it would be my best guess given my observations.

Expand full comment

You JSON schema is 134 tokens when minified

https://imgur.com/a/pQi7xtO

Here is a similar size prompt that does the same thing on 3.5 with just prompt engineering.

https://chat.openai.com/share/cdfbe292-bb6f-4f45-ae26-0a8d61c48f6c

And GPT4 was always robust against malicious attempts

https://chat.openai.com/share/a7fe1531-a504-4e36-a727-10cf0d0743ad

Expand full comment

Thanks! I have updated the post to reflect the uncertainty about the method used

By the way, the fact that we can now rely on GPT 3.5 for JSON generation is cool, even if GPT 4 could already do it!

Expand full comment

How could JSON schema be a Turing complete language?

Expand full comment

Other type systems like Typescript's type system have been proven to be Turing complete. https://github.com/microsoft/TypeScript/issues/14833, https://github.com/ronami/meta-typing

I don't think JSON Schema on its own is turing complete (I haven't seen any examples of it), but: My intuition tells me that the necessary control flow primitives for Turing completeness are in JSON Schema with the help of #ref and anyOf / allOf primitives. As for memory, you might be able to get a helping hand from GPT. That would get you pretty close to having a Turing complete execution engine inside the JSON Schema.

The motivation here would be to pass down strategies with several branches and loops in a single API call. The more expressive the JSON Schema language is, the more complex algorithms you can run in a single API call.

Expand full comment
Comment deleted
Jun 15, 2023Edited
Comment deleted
Expand full comment

There is a `function_call` parameter which lets you demand a specific function to be called.

Expand full comment
Comment deleted
Jun 15, 2023
Comment deleted
Expand full comment

Yup. You might get gibberish if its completely irrelevant.

Expand full comment
Comment deleted
Jun 15, 2023
Comment deleted
Expand full comment

Well you _can_ give it the choice of picking a function. In this example, there was no choice to be made so I set `function_call` to a specific function. You can avoid setting `function_call` to let the model decide :)

Expand full comment
Comment deleted
Jun 15, 2023Edited
Comment deleted
Expand full comment