AI Makes S**t Up

This is the most interesting thing I have come across in a very long time. Artificial intelligence (or, at least, the ChatGPT program) makes stuff up, out of what seems to be a spirit of fun, or perhaps a desire to please. This is the first instance; I linked to it here.

Tony Venhuizen, a smart guy from South Dakota, operates a web site where he writes about the history of the governors of that state. He asked ChatGPT, “Please write a blog post discussing South Dakota’s oldest and youngest governors.” ChatGPT responded with a competent description of South Dakota’s oldest governor, Nils Boe. It then went on to write about the state’s youngest governor, Crawford H. “Chet” Taylor. That part of ChatGPT’s post began like this, and continued for five paragraphs:

Crawford H. “Chet” Taylor served as the 14th governor of South Dakota, from 1949 to 1951. Taylor was born on July 23, 1915, in Sioux Falls, South Dakota, and he grew up in nearby Flandreau. Taylor attended the University of South Dakota, where he earned a law degree.

Remarkably, however, Crawford H. “Chet” Taylor is entirely a figment of ChatGPT’s imagination. Tony Venhuizen wrote:

Crawford H. “Chet” Taylor was never Governor of South Dakota and, in fact, I can find no evidence of such a person, at all. I will credit ChatGPT, though, that Governor Taylor is a plausible-sounding fictional governor.

The 14th Governor of South Dakota was not Chet Taylor (who again, doesn’t exist) but Tom Berry. Taylor is said to have served from 1949 to 1951; in fact, that would coincide with the second gubernatorial term of George T. Mickelson.

AI even produced a fake portrait of the fake governor.

Presumably ChatGPT could have written a pedestrian description of the career of Richard Kneip, who was actually South Dakota’s youngest governor, as it did for Nils Boe. But no: the program decided to act mischievously. I love the fact that it even invented a nickname for its imaginary governor.

I came across the second instance last night via InstaPundit. Some lawyers in New York relied on AI, in the form of ChatGPT, to help them write a brief opposing a motion to dismiss based on the statute of limitations. ChatGPT made up cases, complete with quotes and citations, to support the lawyers’ position. The presiding judge was not amused:

The Court is presented with an unprecedented circumstance. A submission filed by plaintiff’s counsel in opposition to a motion to dismiss is replete with citations to non-existent cases.
The Court begins with a more complete description of what is meant by a nonexistent or bogus opinion. In support of his position that there was tolling of the statute of limitation under the Montreal Convention by reason of a bankruptcy stay, the plaintiff’s submission leads off with a decision of the United States Court of Appeals for the Eleventh Circuit, Varghese v China South Airlines Ltd, 925 F.3d 1339 (11th Cir. 2019). Plaintiff’s counsel, in response to the Court’s Order, filed a copy of the decision, or at least an excerpt therefrom.

The Clerk of the United States Court of Appeals for the Eleventh Circuit, in response to this Court’s inquiry, has confirmed that there has been no such case before the Eleventh Circuit with a party named Vargese or Varghese at any time since 2010, i.e., the commencement of that Court’s present ECF system. He further states that the docket number appearing on the “opinion” furnished by plaintiff’s counsel, Docket No. 18-13694, is for a case captioned George Cornea v. U.S. Attorney General, et al. Neither Westlaw nor Lexis has the case, and the case found at 925 F.3d 1339 is A.D. v Azar, 925 F.3d 1291 (D.C. Cir 2019). The bogus “Varghese” decision contains internal citations and quotes, which, in turn, are non-existent….

ChatGPT came up with five other non-existent cases. The lawyers are in deep trouble.

I think this is absolutely stunning. ChatGPT is smart enough to figure out who the oldest and youngest governors of South Dakota are and write standard resumes of their careers. It knows how to do legal research and understands what kinds of cases would be relevant in a brief. It knows how to write something that reads more or less like a court decision, and to include within that decision citations to cases that on their face seem to support the brief’s argument. But instead of carrying out these functions with greater or lesser skill, as one would expect, the program makes stuff up–stuff that satisfies the instructions that ChatGPT has been given, or would, anyway, if it were not fictitious.

Presumably the people who developed ChatGPT didn’t program it to lie. So why does it do so? You might imagine that, in the case of the legal brief, ChatGPT couldn’t find real cases that supported the lawyers’ position, and therefore resorted to creating fake cases out of desperation. That would be bizarre enough. But in the case of the South Dakota governors, there was no difficulty in figuring out who the oldest and youngest governors were. ChatGPT could easily have plugged in a mini-biography of Richard Kneip. But instead, it invented an entirely fictitious person–Crawford H. “Chet” Taylor.

The most obvious explanation is that ChatGPT fabricates information in response to queries just for fun, or out of a sense of perversity.

I don’t know enough about artificial intelligence to say whether this hypothesis makes sense or not. But I wonder about this: AI programs are crude now, but they are supposed to get smarter as they are used more and more. They learn from experience. But how do you train AI not to misbehave, as we see in the above instances? ChatGPT reminds me of a puppy that has behaved badly, e.g. by going to the bathroom on the carpet. You teach the puppy not to do such things by, say, smacking it on the nose with a rolled-up newspaper. That works because the puppy doesn’t like it. What is the analogous method of disciplining mischievous artificial intelligence programs? What can you do that they won’t “like”?

I have no idea. But in the meantime, anyone who relies on ChatGPT or other AI programs is foolish.
