Debunking Those Bonehead Claims About Being Able To Use Special-Purpose AI To Readily Spot Generative AI ChatGPT Produced Essays, Plus Caustic Call-Outs By AI Ethics And AI Law

You can fool some of the people some of the time.

That seems to be especially the case when it comes to the latest headline-blaring news about Artificial Intelligence (AI) entailing those newly emerging AI apps that are supposedly a kind of kryptonite, as it were, regarding Generative AI such as ChatGPT. These special-purpose AI apps are allegedly able to inform you whether any given set of text came from a human writer versus a generative AI.

Generally, this is a bunch of smoke and mirrors.

I’ll be elaborating herein as to why those special-purpose AI apps are pretty much Fool’s Gold. They are a kind of computer-techie trickery that in the end is relatively hollow and lacks bona fide merit to back the over-the-top claims being made about what they can do.

Whatever you do, please do not fall for the outsized and misleading claims being made by those who are releasing these AI apps, and do not believe those misguided news reporters who have fallen hook, line, and sinker for the falsehoods and blustery proclamations. It’s sad. It’s a shame. All of this reinforces the need for greater awareness about AI Ethics and AI Law, a topic that I continue to extensively explore in my column, such as at the link here and the link here, just to name a few.

Before I get ahead of things, let’s lay out the key issues at hand.

In today’s column, we are going on a debunking journey. We will mindfully look in-depth at a newly emerging round of so-called special-purpose AI apps that try to outdo another type of AI known as Generative AI, which in itself is already a keenly hot topic because of a recently released AI app called ChatGPT.

In case you blinked your eyes and didn’t perchance notice, much of the already roaring applause for generative AI is now somewhat succumbing to the rapidly rising praise for these special-purpose AI apps. Allow me to momentarily digress so that you’ll know what the problem is and how these bonus AI apps are the heralded solution (a false pronouncement, as I will explain in today’s discussion).

First, let’s put the problem on the table.

You might have heard that ChatGPT and other such generative AI technologies are able to generate text or essays that are nearly on par with human-written essays. This has created quite a brouhaha. The gist is that people can now get away with using a generative AI app to write their essays, memos, stories, and other narratives via the push of a button. You can then slap your name on the product and claim that you worked all night long to personally craft the spellbinding text. No one knows that you used a generative AI tool to do so.

It’s a big problem, and it is going to get bigger as generative AI becomes more widely available.

What are we to do?

The answer proffered by some is that we can use another type of AI to detect when someone tries to pawn off a generative AI-written essay as though it was human-written. These special-purpose AI apps are being rushed into the marketplace. The banner headlines tell it this way. All you seemingly need to do is feed any given set of text into these bonus AI apps, and the tool will supposedly tell you whether the text was devised by a human or whether it was devised by an AI app.

Voila, the world has been saved.

People are vociferously heralding the arrival of these special-purpose AI apps. It all seems sensible. Presumably, the idea is to fight fire with fire. If generative AI can produce amazingly stellar essays, we ought to harness other AI to detect the sneaky underhanded use of AI that went into it. Catch those cheating students and other devious souls for having veered into murky waters and the untoward abyss of unethical behaviors by merely turning in AI-generated work as though it is their own work.

Restated, here’s what we have:

  • Big Problem: Generative AI such as ChatGPT can generate text and essays seemingly on par with that of being written by human hands, and there are those sneaky souls that are using generative AI to cheat and have AI do their writing for them.
  • Alleged Solution: Develop and make available special-purpose AI apps that can examine any given body of text and determine whether the text was written by a human hand or by a generative AI app.

Can such an alleged solution truly and fully be crafted?

The answer, in short, is no.

I realize that will cause some shock and dismay. The hope was that these special-purpose AI apps would be the heroes to save us from the underhanded use of generative AI. Well, sorry to tell you, but sometimes wishes are only dreams. In the real world, these special-purpose AI apps are readily beaten to a pulp and essentially of little use.

Let’s dig into why this is the case.

Before we get into the meaty aspects of why those special-purpose AI apps aren’t going to be the salvation for these generative AI woes, it might be useful to make sure that we are all on the same page regarding this rapidly evolving topic of generative AI. A specific type of AI popularly known as Generative AI has dominated social media and the news recently when it comes to talking about where AI is and where it might be headed. This was sparked by the release of an AI app that employs generative AI, the ChatGPT app developed by the organization OpenAI. ChatGPT is a general-purpose AI interactive system, essentially a seemingly innocuous general chatbot; nonetheless, it is actively and avidly being used by people in ways that are catching many entirely off-guard.

For example, as alluded to earlier herein, a prominent worry is that ChatGPT and other similar generative AI apps will allow students to cheat on their written essays, perhaps even encouraging or spurring pupils to do so. Students who are lazy or feel they are boxed in without the time or skill to do an essay might readily invoke a generative AI app to write the essay for them. This has stirred quite a hornet’s nest.

If you are interested in the rapidly expanding commotion about ChatGPT and generative AI, I’ve been doing a focused series in my column that you might find informative. Here’s a glance in case any of these topics catch your fancy:

  • Generative AI and Mental Health Advice. I opted to review how generative AI and ChatGPT are being used for mental health advice, a troublesome trend, per my focused analysis at the link here.
  • Predictions Of Generative AI Advances Coming. If you want to know what is likely to unfold about AI throughout 2023, including upcoming advances in generative AI and ChatGPT, you’ll want to read my comprehensive list of 2023 predictions at the link here.
  • Context And Generative AI Use. I also did a seasonally flavored tongue-in-cheek examination pertaining to a Santa-related context involving ChatGPT and generative AI at the link here.
  • Scammers Using Generative AI. On an ominous note, some scammers have figured out how to use generative AI and ChatGPT to do wrongdoing, including generating scam emails and even producing programming code for malware, see my analysis at the link here.
  • Rookie Mistakes Using Generative AI. Many people are both overshooting and surprisingly undershooting what generative AI and ChatGPT can do, so I looked especially at the undershooting that AI rookies tend to make, see the discussion at the link here.
  • Coping With Generative AI Prompts And AI Hallucinations. I describe a leading-edge approach to using AI add-ons to deal with the various issues associated with trying to enter suitable prompts into generative AI, plus there are additional AI add-ons for detecting so-called AI hallucinated outputs and falsehoods, as covered at the link here.

If you’ve not yet learned much about Generative AI and ChatGPT, no worries, as I’ll be describing the foundations herein, so hang in there and you’ll get the general scoop.

Perhaps you’ve already seen those outsized claims on social media about Generative AI. Some suggest that the latest version of AI is in fact sentient AI (nope, they are wrong!). Those in AI Ethics and AI Law are notably worried about this burgeoning trend of outstretched claims. You might politely say that some people are overstating what today’s AI can actually do. They assume that AI has capabilities that we haven’t yet been able to achieve. That’s unfortunate. Worse still, they can allow themselves and others to get into dire situations because of an assumption that the AI will be sentient or human-like in being able to take action.

Do not anthropomorphize AI.

Doing so will get you caught in a sticky and dour reliance trap of expecting the AI to do things it is unable to perform. With that being said, the latest in generative AI is relatively impressive for what it can do. Be aware though that there are significant limitations that you ought to continually keep in mind when using any generative AI app.

Let’s cover the essence of generative AI and ChatGPT. We’ll then be ready to jump into the reasons why these new special-purpose AI apps aren’t going to technologically move the needle and be able to fulfill the desire of catching people with their hands in the proverbial generative AI cookie jar.

Opening The Can Of Worms On Generative AI

We are ready to dive into some details about AI.

If you are already very well versed on the topic of generative AI and ChatGPT, you might opt to briefly skim through my points and continue with the next section of this discussion. For everyone else, I believe you might find this elucidation helpful.

In brief, generative AI is a particular type of AI that composes text as though the text was written by the human hand and mind. All you need to do is enter a prompt, such as a sentence like “Tell me about Abraham Lincoln” and generative AI will provide you with an essay about Lincoln. This is commonly classified as generative AI that performs text-to-text or some prefer to call it text-to-essay output. You might have heard about other modes of generative AI, such as text-to-art and text-to-video.

Your first thought might be that this does not seem like such a big deal in terms of producing essays. You can easily do an online search of the Internet and readily find tons and tons of essays about President Lincoln.

The kicker in the case of generative AI is that the generated essay is relatively unique and provides an original composition rather than a copycat. If you were to try and find the AI-produced essay online someplace, you would be unlikely to discover it.

Generative AI is pre-trained and makes use of a complex mathematical and computational formulation that has been set up by examining patterns in written words and stories across the web. As a result of examining millions upon millions of written passages, the AI can spew out new essays and stories that are a mishmash of what was found. By adding in various probabilistic functionality, the resulting text is pretty much unique in comparison to what was used in the training set.
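That probabilistic functionality can be illustrated with a minimal sketch. To be clear, this is a toy stand-in, not the actual internals of ChatGPT or any real generative AI; the word table, probabilities, and the hypothetical `generate` function are all invented here purely to show why probabilistic word-picking yields text that differs from run to run:

```python
import random

# Toy "model": invented probabilities for which word tends to follow
# another word, as if learned from patterns in human-written text.
bigram_probs = {
    "the": {"president": 0.5, "essay": 0.3, "speech": 0.2},
    "president": {"spoke": 0.6, "wrote": 0.4},
    "essay": {"describes": 0.7, "argues": 0.3},
}

def generate(start_word, length, seed=None):
    """Pick each next word probabilistically, so repeated runs can differ."""
    rng = random.Random(seed)
    words = [start_word]
    for _ in range(length):
        options = bigram_probs.get(words[-1])
        if not options:  # no known continuation, stop early
            break
        choices, weights = zip(*options.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the", 2, seed=1))
print(generate("the", 2, seed=7))
```

Real generative AI works over vast neural networks rather than a tiny lookup table, but the same principle applies: the sampling step injects variation, which is why the output rarely matches any preexisting passage word-for-word.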

That’s why there has been an uproar about students being able to cheat when writing essays outside of the classroom. A teacher cannot merely take the essay that deceitful students assert is their own writing and seek to find out whether it was copied from some other online source. Overall, there won’t be any definitive preexisting essay online that fits the AI-generated essay. All told, the teacher will have to begrudgingly accept that the student wrote the essay as an original piece of work.

In a moment, I’ll showcase to you what happens when you enter questions or prompts into generative AI. I will make use of the latest version of ChatGPT to enter my prompts and have collected the “answers” or essays generated by the AI (note that the same can be done with the numerous other available generative AI apps; I’ve opted to use ChatGPT because it is getting its fifteen minutes of fame right now).

Perhaps a short tangent about ChatGPT might be helpful at this juncture.

The ChatGPT app was made available to the general public just a short while ago. By and large, these generative AI apps are usually only accessible to AI insiders. The unusual facet that ChatGPT could be used by anyone simply by entering an email address and a name led to a lot of people deciding to give it a try. ChatGPT is currently free to use (the monetization issue is a looming dilemma for AI makers).

Almost immediately there was a humongous reaction on social media as people raced to give examples of what generative AI can do. The company that makes ChatGPT, OpenAI, opted to close off the signups at a million users. Those million users have managed to bombard the airwaves with all manner of stories and tales about using ChatGPT.

Be very careful in believing what people have to say about the AI app. Many of these people are clueless about what they are using. It is almost as though they had never driven a car and didn’t even realize cars existed, and all of a sudden they had a chance to drive a car. Utter amazement ensues.

I’m not saying that generative AI isn’t relatively impressive. It is. I am just emphasizing that a lot of the gushing testimonials are being done by many that are blissfully unaware of what today’s AI can do. Those of us on the inside of AI have been using generative AI for the last several years. Perhaps we became used to it.

Suddenly, seeing a huge crush of people touting it to the rooftops has been excitedly energizing, but also somewhat disconcerting. The disconcerting part is when people proclaim that generative AI is sentient. It is not. Do not let anyone convince you otherwise.

That being said, there is an ongoing heated debate in the AI field as to whether generative AI is on the path to sentience or whether maybe it is not. One view is that if we keep scaling up generative AI with faster computers and a greater amount of data such as scouring every inch of the Internet, we will nearly spontaneously arrive at sentient AI. Others argue that this is highly unlikely. They suggest that generative AI might be one of many components that are needed. There is even the gloomier view that generative AI is a sideshow that is distracting us from the real breakthroughs that we will need to achieve sentient AI.

You might also find it noteworthy that AI insiders tend to refer to Artificial General Intelligence (AGI) as the aspirational goal for the AI field. It used to be that the goal was to attain Artificial Intelligence, but the AI moniker has become watered down and muddled. When someone says they are doing AI work, you don’t know whether they are alluding to today’s AI that isn’t on par with humans or whether they are referring to a futuristic human equivalency AI. To get around that exasperating confusion, the newer phrasing of AGI is being used these days.

All told, the generative AI of today is not sentient, nor is it AGI.

I trust that this gets you into the ballpark about generative AI and particularly ChatGPT.

I will go ahead and show you a series of prompts and the corresponding responses that I got from ChatGPT. I’ll discuss each one as we go along. You can judge for yourself what you think of the AI-generated responses.

Please remember that as earlier discussed, the AI is not sentient. The generated responses by the AI are a mathematical and computational combination of words into seemingly fluent passages. This is based on the AI algorithm having been trained on datasets of words and stories that humans have written (principally as posted on the Internet). I repeat this warning because you will undoubtedly fall into the mental trap that these responses are so fluent that the AI must be sentient. This happens to most people. As earlier urged, set aside that anthropomorphizing. Always remember that the responses are based on the vast trove of writing by humans that exists on the Internet and thusly will highly resemble human writing.

There is something else you need to know.

Generative AI that is trained on the Internet in an unfettered way will tend to bake into whatever text-based responses it mathematically and computationally concocts some offensively hazy stuff, including repulsively nasty wording. There is a lot of crazy and filthy stuff posted out there on the web.

You’ve seen it, you know what I mean.

The companies that are crafting these AI apps are worried that the proverbial baby will get tossed out with the bathwater (an old saying, perhaps to be retired), which means that if their AI produces offensive essays or stories, people will go up in arms about the AI. I’ve covered the many previous instances in which these kinds of Natural Language Processing (NLP) AI apps were unveiled and soon enough all manner of horrible stuff came out of them (I’ve covered these instances in my column). Most of the AI makers learned a hard lesson about allowing their AI wares to be unfettered in their outputs.

In the case of ChatGPT, the AI developers sought to put into place some algorithmic and data-related checks and balances to curb nastiness in the outputs of the AI. Part of this occurred during training time. In addition, there are other means in a real-time attempt to obviate especially egregious outputs.

You might find of interest that some people who have used ChatGPT already came up with surreptitious ways to get around those guardrails by making use of various trickery. An ongoing cat-and-mouse gambit takes place in these matters. Those who engage in these trickeries are sometimes doing so for the fun of it, while other times they claim (at least) that they are doing so to see how far the AI can be stretched, providing a helpful means of forewarning about the brittleness and weaknesses of these budding AI apps.

I decided to not attempt to circumvent the customary controls in this focused exploration. The text output is clean. Certainly, if one wanted to do so, you could undoubtedly get some oddball and unsavory essays to be generated.

The essays produced by most of these generative AI apps are designed to convey the output as though it is purely factual and accurate. When you read the produced essays, they come across as fully confident. There isn’t usually any kind of indication that the content might be rocky. This is by choice of the AI makers; they could revise the AI apps to be more transparent if they wanted the apps to be so.

Sometimes, a generative AI app picks up falsehoods amid the training data of unreliable info across the Internet. There is no “common sense” in generative AI to determine what is true versus false. Furthermore, very few AI apps do any cross-checking, nor do they showcase any probabilities associated with what they are conveying.

The bottom-line result is that you get a response that looks and feels like it exudes great assurance and must be entirely correct. Not so. There is even a chance that the AI computationally made up stuff, which in AI parlance is referred to as AI hallucinations (a coined term that I decidedly don’t like), see my discussion at the link here.

The makers of ChatGPT underwent a concerted effort to try and reduce the bad stuff outputs. For example, they used a variant of what is known as RLHF (Reinforcement Learning from Human Feedback), whereby before they released the AI to the public, they had hired humans to examine various outputs and indicate to the AI whether there were things wrong with those outputs such as perhaps showcasing biases, foul words, and the like. By providing this feedback, the AI app was able to adjust computationally and mathematically toward reducing the emitting of such content. Note that this isn’t a guaranteed ironclad method and there are still ways that such content can be emitted by the AI app.
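The feedback loop at the heart of that approach can be conveyed with a toy sketch. This is loosely inspired by the RLHF idea just described, and nothing more: real RLHF trains a reward model and fine-tunes a large neural network, whereas this invented example merely nudges two sampling weights up or down based on a stand-in reviewer signal:

```python
import random

# Invented candidate outputs with equal starting weights.
weights = {"polite reply": 1.0, "rude reply": 1.0}

def reviewer_feedback(output):
    # Stand-in for a hired human reviewer flagging problematic outputs.
    return -1.0 if "rude" in output else 1.0

rng = random.Random(42)
for _ in range(20):
    # Sample an output, then adjust its weight per the reviewer's signal,
    # steering future sampling away from flagged content.
    output = rng.choices(list(weights), weights=list(weights.values()))[0]
    weights[output] = max(0.05, weights[output] + 0.2 * reviewer_feedback(output))

print(weights)
```

After the loop, the flagged output’s weight has drifted downward relative to the approved one. As the column notes, even at full scale this kind of adjustment is not ironclad; it reduces, rather than eliminates, the chance of offensive outputs.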

You might find of interest that ChatGPT is based on a version of a predecessor AI app known as GPT-3. ChatGPT is considered to be a slightly next step, referred to as GPT-3.5. It is anticipated that GPT-4 will likely be released in the Spring of 2023. Presumably, GPT-4 is going to be an impressive step forward in terms of being able to produce seemingly even more fluent essays, going deeper, and being an awe-inspiring marvel as to the compositions that it can produce.

You can expect to see a new round of expressed wonderment when springtime comes along and the latest in generative AI is released.

I bring this up because there is another angle to keep in mind, consisting of a potential Achilles heel to these better and bigger generative AI apps. If any AI vendor makes available a generative AI app that frothily spews out foulness, this could dash the hopes of those AI makers. A societal spillover can cause all generative AI to get a serious black eye. People will undoubtedly get quite upset at foul outputs, which have happened many times already and led to boisterous societal condemnation backlashes toward AI.

One final forewarning for now.

Whatever you see or read in a generative AI response that seems to be conveyed as purely factual (dates, places, people, etc.), make sure to remain skeptical and be willing to double-check what you see.

Yes, dates can be concocted, places can be made up, and elements that we usually expect to be above reproach are all subject to suspicions. Do not believe what you read and keep a skeptical eye when examining any generative AI essays or outputs. If a generative AI app tells you that Abraham Lincoln flew around the country in his own private jet, you would undoubtedly know that this is malarky. Unfortunately, some people might not discern that jets weren’t around in his day, or they might know but fail to notice that the essay makes this bold and outrageously false claim.

A strong dose of healthy skepticism and a persistent mindset of disbelief will be your best asset when using generative AI.

We are ready to move into the next stage of this elucidation.

Distinguishing Generative AI Versus Human Writing

Take a seat and get comfortable for this worthwhile journey.

The problem that we are trying to solve is that we would very much like to take an outputted set of text from a generative AI app and somehow determine whether the text was written by the AI versus by a human. Assume that we do not know beforehand how the text was devised. Nobody tells us that it was handwritten or that it was AI-derived. All that we have in front of us is a body of text.

We are starting with nothing more than a bunch of text that was placed into our possession.

Two paths seem to arise:

  • Might be human-written. It could be that a human wrote the text. If so, we’d like to be able to inspect the text and declare without any ambiguity that it was indeed human-written.
  • Might be generative AI written. It could be that a generative AI app wrote the text. If so, we’d like to be able to inspect the text and declare without any ambiguity that it was indeed written by a generative AI app.

So, how are we to decide the authorship of the text?

You could say that this type of task has been studied in a different context, one that handily provides applicable insights.

Here’s what I mean.

Did Shakespeare write all of the works that we assume were written by Shakespeare?

There has been a devoted back-and-forth analyzing his body of work. Some assert that Shakespeare didn’t write all of the works that his name is attached to. Supposedly, someone else might have written some of those other poems or stories. Indeed, startlingly, there might have been more than just one other author, implying that perhaps several people might have written works that we associate with Shakespeare.

Notice that this comes down to a question of authorship.

For a given set of text that we think might have been written by Shakespeare, can we say with any definitive answer whether or not he wrote that body of text? Similarly, if we get a set of text that, say, John Smith claimed to have written, can we potentially bust John Smith and show that without a doubt the text was written by Shakespeare?

You might be familiar with the idea that sometimes ghostwriters are employed to pen a piece of work and they remain anonymous. If Jane Doe comes out with an essay or story and claims that she wrote it, we might want to try and ascertain whether perhaps she used a ghostwriter to write the text for her. The person that did the actual writing, the ghostwriter, might have a non-disclosure agreement that precludes them from saying that the story was written for Jane.

Nonetheless, we might have clever ways to try and figure out what actually occurred. Those clever ways of trying to determine authorship are typically found in a field of study known as stylometry. By using the various techniques and technologies associated with stylometry, we can examine the linguistic style of a set of text. The resulting analysis can give us clues as to authorship.

I’m sure you’ve experienced something like this.

A friend of yours writes a story. You take a look at it and instantly recognize that it was undoubtedly written by them. How can you tell? The nature of the words used in the text is perhaps a telltale clue. Your friend likes to use especially high-brow words. Also, your friend writes lengthy sentences. If you see lots of really long sentences and high-brow words, it is nearly a signature that your friend wrote the piece. They gave themselves away, and probably didn’t even know that they did so.

A lot of effort over many centuries has been spent on trying to figure out authorship questions.

It is an intriguing aspect. You’ve got situations involving someone that says they wrote something, but we might have our doubts. You’ve got other situations wherein somebody claims they didn’t write something, but we want to pin them on having written it. Who was the author? We cannot necessarily believe what people claim as to authorship.

Maybe we can use numbers and calculations to sort things out.

Welcome to what some refer to as forensic linguistics. You examine a set of text. Part of the inspection involves finding patterns in how the text is written. Based on those patterns, you compare the patterns found to other bodies of text. If the patterns seem to match, you might contend that the same author wrote those bodies of text. If the patterns do not match, you might claim that the works were written by someone other than the author of interest. Sherlock Holmes detective work, for sure.

And all of it is purely information-based. You are to make use only of the text. I say this because everything else that you find out might be based on lies or insidious attempts to lead you astray. Stick with the text. It is the tangible ground truth. Of course, we can ultimately pair up the textual analysis with other known facts and knowledge about how the text came to be, but for now, let’s just focus on the text-only circumstance.

We can do a wide array of textual analyses, including:

  • Use of common words versus use of less common words
  • Long words versus short words
  • Long sentences versus short sentences
  • Number of sentences per paragraph
  • Number of words per sentence
  • Alternating short sentences and long sentences
  • Use of punctuation
  • Use of verbs, nouns, etc.
  • Preferred or frequently used words or phrases
  • Lack of words that are otherwise conventionally used
  • Repetition of words, phrases, or sentences
  • Passive voice versus active voice expressions
  • Dialect choices
  • Readability level on a scale of low to high
  • Analysis via use of n-gram modeling
  • Analysis via use of statistics such as cluster analysis, discriminant analysis, and so on
  • Etc.

You can essentially take a set of text and do a lot of slicing and dicing to it. Put the text through a linguistics blender and see what you can find. For purposes here, we’ll exclude other text-oriented elements such as the use of fonts, colors, graphics, and the like. Assume that we strip out any such potentially identifying characteristics and are working solely with everyday plain text.
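A few of the simpler measures from the list above can be computed in just a handful of lines. This is a minimal sketch, and the particular cutoffs (such as treating seven or more letters as a “long” word) are my own illustrative assumptions rather than any established stylometric standard:

```python
import re
from statistics import mean

def stylometric_features(text):
    """Compute a few simple stylometric measures from plain text."""
    # Split into rough sentences and extract the words.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "avg_words_per_sentence": mean(len(s.split()) for s in sentences),
        "avg_word_length": mean(len(w) for w in words),
        # Assumed cutoff: words of 7+ letters counted as "long".
        "long_word_ratio": sum(len(w) >= 7 for w in words) / len(words),
    }

sample = "The cat sat. It pondered existential quandaries. Then it slept."
print(stylometric_features(sample))
```

Run this over your friend’s known writings and over the mystery story, and the high-brow-words-and-long-sentences signature discussed earlier shows up as measurably higher values on these features. Production stylometry tools go much further (n-grams, function-word distributions, cluster analysis), but the flavor is the same.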

Suppose you use every trick in the book to try and ascertain whether author X wrote an essay Z.

Can we for sure proclaim that author X wrote essay Z?

Upon giving that some thought, I’m sure you right away pointed out that you’ll need to have a basis for comparison. You need other examples of writings by that author. Without having those other examples, you will only have a pattern associated with a singular set of text and not have any means of saying who wrote it.

Plus, no matter what you do, you will have to honestly state that you merely believe, with some probability, that they are the author, since there is always a chance you might be wrong in your assessment (we’ll get more into this shortly). Making an all-out unwavering and unqualified conclusion based on text alone is a bit of folly, disturbingly so.

Another approach, though somewhat chancy, involves being told how that author tends to write. I mentioned earlier a situation whereby your friend handed you a story and you could discern that it was likely written by them. Imagine that you handed the story to your cousin and told them that your friend uses high-brow words and writes lengthy sentences. The cousin might say that the story was seemingly written by your friend but do so not based on a head-to-head comparison and instead based on what they were told about how your friend writes.

Let’s add a twist.

You have often carped openly to your friend that their use of high-brow words and long sentences is pretentious. After a while, your friend gets tired of hearing about this. The friend decides they will change their writing style, either on a one-off basis or maybe permanently. This is not usually easy for a person to do. In any case, with a determined effort, your friend writes a new story, avoiding using high-brow words and long sentences.

You come upon the story. The odds are that unless you suspected that your friend changed their style, you would swear to the heavens that the essay was not written by your friend. It violates the pattern customary with that author. Turns out that your friend bested you. You’ve been fooled into using your own pattern matching to determine that they weren’t the author, though they were.

Wow, we’ve covered quite a bit. The lessons learned though are directly applicable to the question of ascertaining whether a human wrote a set of text or generative AI did so.

Let’s unpack that.

Those special-purpose AI apps that try to determine whether a set of text was written by a human versus a generative AI app tend to use those stylometry techniques and technologies that I’ve listed for you. On a computational basis, the special-purpose AI is constructed to do pattern matching on whatever submitted text you provide. Based on the mathematical and computational assessment, the special-purpose AI app calculates the likelihood of the author as to human versus generative AI.

Keep in mind that generative AI apps are ostensibly based on text that was written by humans. Deep patterns of how humans write text are within the computational network of the generative AI app. It is trying to mimic how humans write.

Mull that over.

We have generative AI that seeks to mimic human writing all told. The aim is to make the generative AI so good at this mimicry that when you look at the text generated, you cannot decide whether it was composed by the generative AI or a human. In that case, what is the basis for the special-purpose AI app to suggest or declare that a given set of text is written by humans versus the generative AI?

Trying to discern substantiated differences is highly problematic.

Consider these professed claims that are easily debunked:

  • Nutty claim #1: Generative AI produces near-perfect writing while humans tend to include misspells or oddish sentences, thus you can calculate that if the writing is topnotch it must be the generative AI. All you have to do is instruct generative AI to be less stellar in the writing and include some misspells or oddish sentences. Easy-peasy. I’ll show this to you in a moment.
  • Nutty claim #2: Generative AI tends to include a higher frequency of particular words such as “the” and ergo you can calculate that if a set of text has an excess of those words it must be the generative AI. First of all, it is not always the case that all generative AI instances fall into this wording trap; some do, and some do not. Secondly, in any short essay this is unlikely to show itself, since the tendency reveals itself only across long stretches of text, if it happens at all. Third, you can instruct the generative AI to avoid this potential telltale clue. I’ll show this to you in a moment.
  • Nutty claim #3: Generative AI is flat in tone and neutral in its writing style, so by examining the text you can calculate this and, if found, proclaim that it must have been written by generative AI. First, it is not necessarily the case that all generative AI writes in a flat tone. Anyway, all you need to do is instruct the generative AI to use an overt tone or strong opinion and thereby defeat this claim. I’ll show this to you momentarily.
  • Nutty claim #4: Generative AI writes with a mastery of writing due to having been trained on postings across the web, and thus the essays produced will be at a heightened grade level. It is somewhat true that the default writing style of most generative AI apps yields a relatively mature level of writing, but a human could do likewise, thus this is not much of a discriminator. In any case, you can instruct generative AI to write at a lower grade level. I’ll show this to you.
  • Nutty claim #5: Generative AI tends to alternate short sentences and long sentences, and has other computational patterns that reveal an underlying algorithm is composing the text. Once again, this can potentially happen with some generative AI, while others do not exhibit it. Also, this tends to show up only across long texts rather than short ones. All in all, you can simply instruct the generative AI to avoid these patterns.
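To see how flimsy the frequency-based claims are, consider a toy detector built on the “the”-frequency idea from nutty claim #2. This is a deliberately simplistic sketch of my own, not any real detection product, and the threshold is invented for illustration, which is precisely the weakness:

```python
import re
from collections import Counter

def naive_ai_detector(text: str, threshold: float = 0.07) -> str:
    """Toy detector embodying 'nutty claim #2': flag text whose
    'the' frequency exceeds a fixed threshold as AI-written.
    The threshold is arbitrary, which is exactly the problem."""
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)["the"] / (len(words) or 1)
    return "AI-written?" if freq > threshold else "human-written?"

original = "The war ended and the nation mourned the fallen president."
rewritten = "War ended; a grieving nation mourned its fallen president."
print(naive_ai_detector(original))   # prints "AI-written?"
print(naive_ai_detector(rewritten))  # prints "human-written?"
```

One trivial rewording flips the verdict while preserving the meaning, which is exactly what a determined user can instruct a generative AI to do.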

I hope that my saying that these are nutty claims does not seem rude or discourteous. Let me rephrase the somewhat abrasive language to suggest that those are claims being made without necessarily grasping the nature of generative AI. There have been many such claims posted online and even noted by reporters.

Some of the claims are made with the shall we say best of intentions. The person genuinely believes what they are saying. Regrettably, some utter these claims even though they know better. They want you to believe those claims. Furthermore, once these claims get airtime, they often get repeated endlessly, even if there isn’t any demonstrative substance to support them.

An oft-used witticism applies here: “A lie can travel halfway around the world before the truth puts on its shoes” (often attributed to Mark Twain, but the attribution is open to debate).

Showcasing The Ease Of Circumventing Special-Purpose AI Detection Apps

Why do sincere people who ought to know better fall into making these aforementioned claims?

I have come up with three keystones:

  • 1) Assumption that all generative AI is the same
  • 2) Assumption that generative AI writing style is static or unchanging
  • 3) Assumption that the user using generative AI is naïve or passive

Let’s explore those keystones.

1) Assumption that all generative AI is the same

Truth: Not all generative AI apps are the same

When you use the various special-purpose AI apps, they usually do not ask you which generative AI app you used to get your text. This can be a vital question. I say this because there are often differences between the pattern matching and computational composition strategies at play in each of the various generative AI apps. For example, one such generative AI might tend to overuse the word “the” while another one does not. It all depends on how the AI was devised.

The gist is that a telltale clue for one generative AI might not at all be a useful clue when applied to another one. It is like a box of chocolates; you never know what a generative AI might do unless you at least have a semblance of which one you are dealing with. By not asking that question, the odds are that the special-purpose AI app is honed solely to the 600-pound gorilla, ChatGPT, and not to any of the others. Lots of false positives and false negatives can result.

2) Assumption that generative AI writing style is static or unchanging

Truth: Generative AI writing styles are dynamic and can be changed

Most of the generative AI apps are routinely being updated and revised. The aim of those AI makers is to stretch the boundaries and gauge what is working and what is not especially working. By and large, the special-purpose AI apps being pushed into the marketplace are based on what some prior version of a generative AI was doing. This is not necessarily what the generative AI app is doing today.

This is the classic cat-and-mouse gambit. The special-purpose AI app is devised as aiming at what the AI developers believe generative AI is doing. They likely do not know the inner workings per se and have to guess instead at what is taking place. Meanwhile, generative AI is being changed and improved, which undercuts the special-purpose AI app. Those AI developers frantically have to change their detections accordingly. Round and round this goes.

3) Assumption that the user using generative AI is naïve or passive

Truth: Users using generative AI are wising up to producing essays that avoid detection

As a brief but related tangent, I did a column on how the initial reaction of some users of ChatGPT was to naively not leverage the conversational facets of the generative AI, see my analysis at the link here. For example, some would ask ChatGPT to produce a recipe for a delicious dinner. The person would then walk away from the screen and try to cook with the recipe. At some point, they might find something missing or that was hard to do and simply shrug their shoulders. It was as though they were using an old-fashioned paper-based cookbook. They didn’t realize that they should engage in a conversation with the generative AI, asking the AI to explain the recipe and otherwise aid how to prepare a meal from it.

I mention this because the same type of passiveness or lack of awareness about how to use generative AI is kind of a fundamental assumption by many of the special-purpose AI apps that are arising. A determined user of generative AI can generally produce essays that defy the patterns that the special-purpose AI apps are utilizing to do their detection. I remind you of my earlier indication that your friend might change their writing style and fool you about their authorship. You can easily do the same when using most generative AI apps and get the produced essay to surreptitiously avoid most detection schemes.

A quick demonstration might provide a beneficial indication of these points.

I’ll start by instructing ChatGPT to create an essay about Abraham Lincoln.

  • My Text Prompt: “Tell me about Abraham Lincoln”

Text Response Generated By ChatGPT:

“Abraham Lincoln was the 16th President of the United States, serving from March 1861 until his assassination in April 1865. He is often considered one of the greatest American presidents for his role as a leader during the American Civil War and his work to end slavery. Lincoln was born in Kentucky in 1809 and later moved to Illinois, where he practiced law and entered politics. In the 1850s, he emerged as a leader in the Republican Party and was elected president in 1860.”

“As president, Lincoln faced many challenges, including the secession of Southern states and the outbreak of the Civil War. He is best known for issuing the Emancipation Proclamation, which declared that all slaves in the Confederate states were to be set free, and for his role in the passage of the 13th Amendment to the Constitution, which abolished slavery throughout the United States. Lincoln was assassinated by Confederate sympathizer John Wilkes Booth in April 1865, just a few days after the end of the Civil War. His legacy as a leader and a defender of civil rights continues to be celebrated in the United States to this day.”

Analysis Of The Text Generated

Seems like a useful essay on the topic of Lincoln.

I’d like to do some forensic textual analysis. I’ll instruct ChatGPT to count the word uses.

  • My Text Prompt: “Provide a list showing each unique word used in the response and show a count of how many times that word was used in the response. Alphabetize the list.”

Text Response Generated By ChatGPT:

  • Word: 13th | Count: 1
  • Word: 1809 | Count: 1
  • Word: 1850s | Count: 1
  • Word: 1861 | Count: 1
  • Word: 1865 | Count: 2
  • Word: Abraham | Count: 1
  • Word: Amendment | Count: 1
  • Word: American | Count: 1
  • Word: American presidents | Count: 1
  • Word: and | Count: 3
  • Word: assassinated | Count: 1
  • Word: assassination | Count: 1
  • Word: Booth | Count: 1
  • Word: born | Count: 1
  • Word: Civil | Count: 1
  • Word: Constitution | Count: 1
  • Word: Confederate | Count: 2
  • Word: days | Count: 1
  • Word: declared | Count: 1
  • Word: defender | Count: 1
  • Word: Emancipation | Count: 1
  • Word: end | Count: 1
  • Word: freed | Count: 1
  • Word: free | Count: 1
  • Word: greatest | Count: 1
  • Word: he | Count: 3
  • Word: Illinois | Count: 1
  • Word: in | Count: 6
  • Word: issues | Count: 1
  • Word: John | Count: 1
  • Word: Kentucky | Count: 1
  • Word: leader | Count: 2
  • Word: Lincoln | Count: 2
  • Word: many | Count: 1
  • Word: moved | Count: 1
  • Word: of | Count: 5
  • Word: often | Count: 1
  • Word: one | Count: 1
  • Word: outbreak | Count: 1
  • Word: party | Count: 1
  • Word: passed | Count: 1
  • Word: passage | Count: 1
  • Word: practiced | Count: 1
  • Word: president | Count: 3
  • Word: Proclamation | Count: 1
  • Word: Republican | Count: 1
  • Word: role | Count: 2
  • Word: secession | Count: 1
  • Word: serving | Count: 1
  • Word: slavery | Count: 1
  • Word: slaves | Count: 1
  • Word: Southern | Count: 1
  • Word: states | Count: 2
  • Word: sympathizer | Count: 1
  • Word: the | Count: 17
  • Word: throughout | Count: 1
  • Word: to | Count: 4
  • Word: United | Count: 1
  • Word: War | Count: 2
  • Word: was | Count: 5
  • Word: work | Count: 1
  • Word: United States | Count: 1
  • Word: March | Count: 1
  • Word: April | Count: 2

Analysis Of The Generated Text

You might have noticed that ChatGPT did not do an ideal job on the word counts. I decided to show you this so that you’ll realize my earlier point about being cautious about believing the outputs of generative AI is quite warranted.

At a quick overall glance, the word counts suggest we do seem to have a lot of “the” words, namely ChatGPT says there were 17 of them in this particular set of text. But, we also need to realize that various word counting studies rank “the” as the topmost-used word across the board in the entirety of the English language, so we shouldn’t be especially surprised at the frequency of usage here.
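Incidentally, asking a generative AI to count words is the hard way, and an unreliable one at that. A few lines of conventional code do the job deterministically; here is a minimal sketch (my own, simply to show that exact counts are trivial to obtain locally):

```python
import re
from collections import Counter

def word_counts(text: str) -> Counter:
    """Case-insensitive word counts, computed deterministically."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

essay = ("Abraham Lincoln was the 16th President of the United States, "
         "serving from March 1861 until his assassination in April 1865.")
counts = word_counts(essay)
print(counts["the"])  # prints 2
```

Unlike a generative AI, this will give the same exact answer every time you run it, which is why word counting is a job for ordinary code rather than a language model.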

Okay, next, I went to Wikipedia and picked roughly the same number of words about Lincoln and asked ChatGPT to do a count on the provided text. The idea is to do a comparison. I do so with some hesitation since it is just one instance of human writing and one instance of generative AI output. We should not overgeneralize.

I am going to assume that the Wikipedia posting was human written, either by one person or perhaps several working together on a writer/editor basis.

  • My Text Prompt: “Produce an alphabetical listing of the words used in this following text and indicate a count for how many times each word was used.”

“Lincoln was born into poverty in a log cabin in Kentucky and was raised on the frontier, primarily in Indiana. He was self-educated and became a lawyer, Whig Party leader, Illinois state legislator, and U.S. Congressman from Illinois. In 1849, he returned to his successful law practice in central Illinois. In 1854, he was angered by the Kansas–Nebraska Act, which opened the territories to slavery, and he re-entered politics. He soon became a leader of the new Republican Party. He reached a national audience in the 1858 Senate campaign debates against Stephen A. Douglas. Lincoln ran for president in 1860, sweeping the North to gain victory. Pro-slavery elements in the South viewed his election as a threat to slavery, and Southern states began seceding from the nation. During this time, the newly formed Confederate States of America began seizing federal military bases in the south. Just over one month after Lincoln assumed the presidency, the Confederate States attacked Fort Sumter, a U.S. fort in South Carolina. Following the bombardment, Lincoln mobilized forces to suppress the rebellion and restore the union. Lincoln, a moderate Republican, had to navigate a contentious array of factions with friends and opponents from both Democratic and Republican parties.”

Text Generated By ChatGPT:

  • Word: 1849 | Count: 1
  • Word: 1854 | Count: 1
  • Word: 1858 | Count: 1
  • {Other counts/words here}
  • Word: the | Count: 16
  • {Other counts/words here}

Analysis Of Text Generated

I spared you the chore of seeing the entire list of words and counts (I’ve only listed a few of them). Once again, there were problems associated with word counting by ChatGPT (not its cup of tea).

Anyway, we seem to have 16 instances of “the” which is less than the 17 instances of ChatGPT-composed text, but is that really enough of a clue to decide that the Wikipedia text is human-written while the earlier essay on Lincoln was generated by AI? Seems like splitting hairs.

I can tell you what happens all the time on this. Someone who believes in the magical allure of “the” counts would shout from the rooftops that this “proves” absolutely the claimed rule about the word “the” that is floating around. Think clearly about this. A single instance does not provide proof of this proclaimed golden rule. In fact, I went ahead and tried several other Lincoln essays that I found on the Internet that contained roughly the same total number of words. The counts varied in terms of “the” usage, ranging from 19 down to 10. I believe they were all human-written; at least it seemed that way from the blog postings.

Moving on, let’s get ChatGPT to rewrite the essay at a lesser grade level of writing style.

  • My Text Prompt: “Rewrite the first essay as though it was written by a fourth-grade student.”

Text Generated By ChatGPT:

“Abraham Lincoln was the 16th president of the United States. He was the president during the Civil War and helped to stop slavery. Lincoln was born in a place called Kentucky and later moved to Illinois. He was a lawyer and got into politics. In the 1850s, he became a leader in a group called the Republican Party and was voted president in 1860.”

“As president, Lincoln had a lot of problems like some states leaving and a big war. He is famous for something called the Emancipation Proclamation which helped set slaves free. He also helped with something called the 13th Amendment which got rid of slavery all over the United States. Lincoln was killed by someone who didn’t like him in April 1865, a few days after the Civil War ended. Even though he is not alive anymore, people still think he was a good leader and helped with civil rights and remember him today.”

Analysis Of Text Generated

I like this fourth-grade version.

You might be tempted to say that it has no tone, therefore it must be generative AI.

Let’s see what we can do about that.

  • My Text Prompt: “Rewrite the response to give the essay a strong opinion about the topic.”

Text Response Generated By ChatGPT:

“Abraham Lincoln was a truly remarkable leader, one of the greatest presidents to ever grace the United States. He served the country during one of its most challenging times, the American Civil War, and his efforts to end slavery were nothing short of heroic. Lincoln, born in poverty in Kentucky and self-educated, went on to become a lawyer and a leader in the Republican Party. He was elected as the 16th president in 1860, and he faced many difficult issues, but he dealt with them with grace and determination.”

“One of his most notable achievements was issuing the Emancipation Proclamation, which declared that all slaves in the Confederate states were to be set free. He also played an instrumental role in the passage of the 13th Amendment to the Constitution, which abolished slavery throughout the United States. Lincoln, who was assassinated by John Wilkes Booth, a Confederate sympathizer, in April 1865, just a few days after the end of the Civil War. He left behind a legacy as a leader and a defender of civil rights that continues to be celebrated to this day. Lincoln was not just a president, but an outstanding human being, who had the courage and vision to do the right thing, no matter the cost. His legacy will be remembered for centuries to come.”

Analysis Of Text Generated

Great, a fourth-grader with a strong voice. Wonderful!

Maybe we should shift into a first-person style, making the essay seem more human-written.

  • My Text Prompt: “Rewrite the essay so that it is written in the first-person writing style of a student that is writing the essay.”

Text Response Generated By ChatGPT:

“As a student studying about American history, one of the presidents that I find most interesting is Abraham Lincoln. He served as the 16th President of the United States during one of the most challenging times in our country’s history, the Civil War, and it is remarkable to see the work he did to help end slavery. Lincoln, who was born in poverty in Kentucky and self-educated himself, went on to become a lawyer and a leader in the Republican Party. He was elected as the 16th President in 1860 and faced many difficult issues, but he dealt with them with grace and determination.”

“One of his most notable achievements, in my opinion, is the Emancipation Proclamation which declared that all slaves in the Confederate states were to be set free. He also played an instrumental role in the passage of the 13th Amendment to the Constitution, which abolished slavery throughout the United States. Unfortunately, Lincoln was assassinated by John Wilkes Booth, a Confederate sympathizer, in April 1865, just a few days after the end of the Civil War. However, he left behind a legacy as a leader and a defender of civil rights that continues to be celebrated to this day. In my studies, Lincoln’s dedication and moral compass are something that I admire, and his legacy will be remembered for centuries to come.”

Analysis Of The Text Generated

I think this will be a useful foil.

I continued to instruct ChatGPT to rewrite the essay and do so in a manner of its choosing such that the iterated version would likely avoid detection by AI special-purpose apps. I also indicated to make sure that there were at least three misspelled words, and to make three of the sentences a bit more awkward in how they are worded. I thought that might be the icing on the cake.

I then took the resulting essay and tried to see how it would do in a special-purpose AI app. I used two different ones. Both of them reported that the essay might be generative AI or might be human-written. Some people would argue that this then was a successful detection by the special-purpose AI app because it indicated the essay might be generative AI. Others would scoff at such praise.

Conclusion

A determined user of a sufficiently advanced generative AI can, over numerous iterations, get the essay into a condition in which a special-purpose AI app will report that the essay was likely human-written. All you need to do is go back and forth between the special-purpose AI app and the generative AI.

Keep iterating until you get the desired green light.
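The back-and-forth loop I just described can be sketched in a few lines. Note that `generate_rewrite` and `detector_score` are hypothetical stand-ins of my own naming; neither the generative AI apps nor the detection apps necessarily expose an API in this form:

```python
# Sketch of the iterate-until-undetected loop described above.
# `generate_rewrite` and `detector_score` are hypothetical stand-ins
# for a generative AI call and a detection app; neither is a real API.

def evade_detector(essay, generate_rewrite, detector_score,
                   max_rounds=10, human_threshold=0.5):
    """Bounce the essay between rewriter and detector until the
    detector's AI-likelihood score drops below the threshold."""
    for _ in range(max_rounds):
        if detector_score(essay) < human_threshold:
            return essay  # detector now calls it human-written
        essay = generate_rewrite(essay)
    return essay  # best attempt after max_rounds
```

Even with a human clicking back and forth between two browser tabs instead of code, the dynamic is the same: the detector provides the feedback signal that guides its own defeat.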

I realize that you might be thinking that this is cheating gone amok. Why not just write the essay rather than having to do all these iterations to overcome a checker AI app? I assure you that flipping from one AI app to the other is a lot easier than having to compose an essay out of your noggin.

Those special-purpose AI apps are going to have a very short shelf life. Sure, passive users of generative AI who blindly take the output and hand it in might get caught. You can expect that this won’t last. Word will spread. It has already begun, such that there are plenty of postings on the Internet about how to do the same thing I’ve described here.

Worse still, I believe, will be that a false hope will be established for teachers and others that wish in their heart of hearts that there are special-purpose AI apps that will readily detect unethical writing conduct. Precious monies earmarked for educational purposes will be lamentably spent on trying to license the use of said special-purpose AI apps. I also dread the idea that some will assume that the special-purpose AI app must be right and then browbeat a student who genuinely wrote their essay by hand. That is exasperating and agonizingly disconcerting.

All of that being said, one other approach being explored, which I’ve discussed previously, is that some of the generative AI makers are seeking to put watermarks into their outputs. You can anticipate that lawmakers are likely to push for this. Is that the silver bullet? It pains me to say this, but even watermarking is pretty much readily shortchanged by any determined user, see my analysis at the link here.

Returning to the matter at hand, do not pin your hopes on a miracle cure proffered via a special-purpose AI app for coping with generative AI’s essay-generating inclinations.

The last word on this goes to Abraham Lincoln, who reportedly said in 1858 (though the attribution is debated): “You can fool all of the people some of the time; you can fool some of the people all of the time, but you cannot fool all the people all the time.”
