Updated Fall 2024
ChatGPT and other large language models (LLMs) are relatively new and rapidly evolving tools for scientists and engineers. They can be powerful for summarizing documentation in large technical projects or programming languages. For instance, you can use them to explain functions and methods in Python. We encourage you to use ChatGPT as you code. Perhaps start by prompting it to summarize the libraries we will use (numpy, scipy, and matplotlib) or the basics of Markdown in Jupyter notebooks.
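For example, here is a short, made-up snippet (the function and the numbers are purely illustrative) of the kind you might paste into ChatGPT and ask it to walk through line by line:

```python
# A short script a student might ask ChatGPT to explain line by line.
import numpy as np
from scipy.optimize import brentq
import matplotlib.pyplot as plt

# Toy problem: find the temperature T (in K) where f(T) = 0.
def f(T):
    return np.exp(-1000.0 / T) - 0.5

# brentq finds a root of f on a bracketing interval where f changes sign.
T_root = brentq(f, 1000.0, 2000.0)

# Plot f(T) and mark the root.
T = np.linspace(1000.0, 2000.0, 200)
plt.plot(T, f(T), label="f(T)")
plt.axhline(0.0, color="gray", linestyle="--")
plt.scatter([T_root], [0.0], color="red", label=f"root at T = {T_root:.1f} K")
plt.xlabel("T (K)")
plt.ylabel("f(T)")
plt.legend()
plt.show()
```

A good follow-up prompt is to ask ChatGPT what brentq does and how it differs from the other root finders in scipy.optimize.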
Some things to keep in mind as you experiment with ChatGPT or other LLMs:
LLMs are essentially statistical models derived from very large sets of text. Well, don't take my word for it; see what ChatGPT has to say:
Prompt
ChatG, please summarize large language models for my thermodynamics students in two hundred words or less. I want to convey how LLMs are basically statistical models trained on large sets of text documents.
ChatGPT
Large language models (LLMs) are advanced statistical models trained on vast amounts of text data. They function by predicting the probability of a sequence of words based on patterns learned from the training data. These models use deep learning techniques, specifically neural networks with many layers, to capture complex relationships between words and phrases.
When an LLM processes a text input, it breaks it down into tokens (words or subwords) and predicts the most likely next tokens based on its training. This prediction is repeated iteratively to generate coherent and contextually relevant text. The "language" in these models refers to their ability to understand and generate human language, but fundamentally, they operate on statistical principles, relying on the frequencies and associations of words observed in the training data.
LLMs are powerful because they can generalize from the data they were trained on to produce meaningful responses to new, unseen text inputs. However, their accuracy and reliability are limited by the data they were trained on, which means they can sometimes generate incorrect or biased information.
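To make the "statistical model" idea concrete, here is a toy sketch of our own (not something ChatGPT produced) that predicts the next word purely from bigram counts in a tiny training text. Real LLMs use deep neural networks over subword tokens and enormous corpora, but the core loop, predict the most likely next token, append it, and repeat, is the same idea:

```python
# Toy "language model": predict the next word from bigram counts.
from collections import Counter, defaultdict

training_text = (
    "the gas expands and the gas cools "
    "the gas is compressed and the gas heats"
)
words = training_text.split()

# Count how often each word follows each other word in the training text.
bigram_counts = defaultdict(Counter)
for current_word, next_word in zip(words, words[1:]):
    bigram_counts[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent word seen after `word` (ties broken arbitrarily)."""
    if word not in bigram_counts:
        return None
    return bigram_counts[word].most_common(1)[0][0]

# Generate a short continuation one word at a time, mimicking iterative
# next-token prediction (with no randomness and no long-range context).
word = "the"
generated = [word]
for _ in range(5):
    word = predict_next(word)
    if word is None:
        break
    generated.append(word)

print(" ".join(generated))
```

Everything this toy model "knows" comes from the frequencies in its training text, which is the point of the quote above: an LLM's output is only as good, and as unbiased, as the data it was trained on.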
Engineers don't take technical tools at face value! We generally like to "look under the hood" and see how things work. If you are interested in learning more about the technical underpinnings of LLMs and their uses, here are a few resources: