AI FAQ for Confused Academics

Having worked in what was a relatively niche field (text generation) for a while, I was surprised to find that it had become the major topic of conversation in both the media and academic circles. I find that people’s discussions of AI are often based on misconceptions (both positive and negative), so this is a quick FAQ to provide an overview for academics and administrators who are now being asked to make decisions about AI in their work.

The goal is to explain things quickly so that you have about a 90% understanding of the topic. All of the answers below are intentionally brief, to the point, and lacking in nuance. They deal with text generation, which is the most common use of AI. Image generation and other types of AI are similar but not identical. Perhaps most importantly, this FAQ does not cover the ethical implications of AI, which are very important but outside its scope.

1. How does AI work?

It predicts the most likely next word in a sequence of words. That word is added to the end of the sequence, the longer sequence is fed back into the model, the model predicts the next word, and so on.
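To make the loop concrete, here is a minimal sketch in Python. The probability table is invented for illustration and stands in for a real model, which would compute these probabilities with a neural network. The names `NEXT_WORD_PROBS`, `predict_next`, and `generate` are hypothetical.

```python
# Toy stand-in for a language model: for each word, the probability of each
# possible next word. These numbers are made up for illustration.
NEXT_WORD_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"down": 0.9, "<end>": 0.1},
    "ran": {"away": 0.9, "<end>": 0.1},
    "down": {"<end>": 1.0},
    "away": {"<end>": 1.0},
}


def predict_next(words):
    """Return the most likely next word given the sequence so far."""
    # Simplification: this toy model only looks at the last word.
    probs = NEXT_WORD_PROBS[words[-1]]
    return max(probs, key=probs.get)


def generate(prompt, max_words=10):
    """Predict a word, append it, and feed the longer sequence back in."""
    words = prompt.split()
    for _ in range(max_words):
        next_word = predict_next(words)
        if next_word == "<end>":
            break
        words.append(next_word)
    return " ".join(words)


print(generate("the"))  # prints "the cat sat down"
```

A real model looks at the whole sequence rather than just the last word, but the outer loop is the same idea.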

2. What is a token?

A token is a unit of text (a whole word, part of a word, or a punctuation mark) that the model predicts one at a time. Where I said “next word” earlier, “next token” is more accurate. AI providers usually charge by the number of tokens in the input and the output.
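As a rough illustration, the sketch below splits text into tokens using a tiny made-up vocabulary and then estimates a bill. Real tokenizers learn vocabularies of tens of thousands of entries and use more sophisticated rules. The vocabulary, the function names, and the prices here are all assumptions for the example.

```python
# Hypothetical vocabulary; a real one is learned from data and is far larger.
VOCAB = {"token", "ization", " is", " un", "usual", "."}


def tokenize(text):
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    start = 0
    while start < len(text):
        for end in range(len(text), start, -1):
            if text[start:end] in VOCAB:
                tokens.append(text[start:end])
                start = end
                break
        else:
            # Nothing matched: fall back to a single character.
            tokens.append(text[start])
            start += 1
    return tokens


def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Estimate a bill from token counts and (hypothetical) prices per 1,000 tokens."""
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k


tokens = tokenize("tokenization is unusual.")
print(tokens)       # ['token', 'ization', ' is', ' un', 'usual', '.']
print(len(tokens))  # 6
```

Note that three words became six tokens, which is why token counts are usually higher than word counts.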

3. Wouldn’t that be repetitive?

It would be if the model always chose the single most probable token. There is a setting called “temperature” that controls the randomness of the output. A low temperature makes the model more likely to choose the most probable next token, while a high temperature makes it more likely to choose a less probable token.
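One common way to implement temperature is to divide the model's raw scores by the temperature before converting them to probabilities. The sketch below assumes that approach. The scores and the names `SCORES`, `apply_temperature`, and `sample` are invented for the example.

```python
import math
import random

# Hypothetical raw scores a model might give three candidate next tokens.
SCORES = {"cat": 2.0, "dog": 1.0, "axolotl": 0.0}


def apply_temperature(scores, temperature):
    """Convert raw scores to probabilities; temperature must be greater than 0."""
    scaled = [score / temperature for score in scores.values()]
    highest = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return {token: e / total for token, e in zip(scores, exps)}


def sample(scores, temperature, rng):
    """Pick one token at random according to the temperature-adjusted probabilities."""
    probs = apply_temperature(scores, temperature)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


for temperature in (0.2, 1.0, 5.0):
    probs = apply_temperature(SCORES, temperature)
    print(temperature, {token: round(p, 3) for token, p in probs.items()})

rng = random.Random(0)  # fixed seed so the demo is repeatable
print([sample(SCORES, 5.0, rng) for _ in range(5)])
```

At a temperature of 0.2 almost all of the probability goes to "cat"; at 5.0 the three options are much closer to equally likely.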

4. How does the chat part work?

There is a system prompt in the background that tells the model how to respond. For instance, it might give the model examples of a chat conversation, and the model will try to follow that example when it generates its response.
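A minimal sketch of the idea: the chat interface assembles the system prompt and the conversation so far into one block of text, and the model simply continues it. The layout below (the "User:" and "Assistant:" labels) is an assumption for illustration; real services use their own formats.

```python
# Hypothetical system prompt; real ones are usually much longer.
SYSTEM_PROMPT = "You are a helpful assistant. Answer the user's questions politely."


def build_prompt(system_prompt, messages):
    """Combine the system prompt and the chat history into one text for the model."""
    lines = [system_prompt, ""]
    for speaker, text in messages:
        lines.append(f"{speaker}: {text}")
    # End with the assistant's label so the most likely continuation is a reply.
    lines.append("Assistant:")
    return "\n".join(lines)


conversation = [
    ("User", "What is a token?"),
    ("Assistant", "A small unit of text that the model predicts."),
    ("User", "Why do providers charge by the token?"),
]

print(build_prompt(SYSTEM_PROMPT, conversation))
```

Every time you send a message, the whole conversation is assembled and sent again, which is why long chats use more tokens.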

5. What is a context window?

It is the maximum number of tokens the model can process at once. If your input exceeds that limit, the earliest part is typically dropped, so the model effectively forgets the beginning of the input.
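The forgetting can be sketched as simple truncation. This example assumes the service keeps only the most recent tokens, which is one common strategy but not the only one, and it treats each word as one token for simplicity. The function name `fit_to_context` is hypothetical.

```python
def fit_to_context(tokens, context_window):
    """Keep only the most recent tokens that fit in the context window."""
    if len(tokens) <= context_window:
        return tokens
    # Drop the oldest tokens from the front of the list.
    return tokens[len(tokens) - context_window:]


# Simplification: one word per token.
conversation = "my name is Ada and I study medieval history what is my name".split()

visible = fit_to_context(conversation, context_window=6)
print(visible)           # ['medieval', 'history', 'what', 'is', 'my', 'name']
print("Ada" in visible)  # False: the model can no longer see the name
```

In this toy case the question survives but the answer to it has fallen out of the window.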

6. What is Agentic AI?

You create a software layer (called a harness) that triggers actions such as searching online. You tell the model it has access to specific words that can trigger those actions. The model generates text that includes those words. The harness detects them, triggers the action, and feeds the result back into the model so it can continue.
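The loop described above can be sketched as follows. Everything here is a stand-in: `fake_model` is a scripted function rather than a real language model, `fake_search` returns placeholder text rather than calling a search engine, and the `SEARCH[...]` trigger format is invented for the example.

```python
import re

# Hypothetical trigger format the model is told it can use.
TOOL_PATTERN = re.compile(r"SEARCH\[(.+?)\]")


def fake_search(query):
    """Hypothetical tool: a real harness would call a search engine here."""
    return f"(pretend search results for '{query}')"


def fake_model(transcript):
    """Scripted stand-in for a language model, just for this demo."""
    if "RESULT:" not in transcript:
        return "I need to look this up. SEARCH[capital of Australia]"
    return "The capital of Australia is Canberra."


def run_agent(question, max_steps=5):
    """The harness: run the model, detect trigger words, feed results back."""
    transcript = (
        "You can search the web by writing SEARCH[your query].\n"
        f"Question: {question}\n"
    )
    for _ in range(max_steps):
        output = fake_model(transcript)
        transcript += output + "\n"
        match = TOOL_PATTERN.search(output)
        if match is None:
            return output  # no trigger word, so treat this as the final answer
        result = fake_search(match.group(1))
        transcript += f"RESULT: {result}\n"
    return "Stopped: too many steps."


print(run_agent("What is the capital of Australia?"))
```

The model itself never searches anything. It only writes text, and the harness does the rest.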

7. What is a reasoning model?

A model that generates its internal thought process before producing a final answer. It is still predicting tokens, but it is allowed to write out intermediate steps.

8. AI doesn’t seem very capable

The free versions you have likely used are capable, but they are rarely the most powerful models available. The most advanced models usually require a paid subscription, and the field changes rapidly.

9. I’m worried about AI training on my data

If you have published text online, it is probably in the training data. The bigger concern is that AI services may retain your queries and outputs for future training or review. Read the privacy policy of the service you are using.

10. Are Chinese models riskier than American models?

Most Chinese models are open source and can be run on servers anywhere. The risk of a model being used for nefarious purposes depends less on the country in which it was developed than on where it is hosted.

11. Why does AI hallucinate?

Because it has no built-in fact-checking mechanism. It generates text by predicting which tokens are likely to come next, so it often produces information that sounds correct but is false.

12. Can we detect AI-generated text?

No. The only way to know if something was generated by AI is if the person who created it tells you.

13. I heard that [Insert Tool Here] can detect AI-generated text. Is that true?

No. These tools are not reliable. They are based on the idea that AI-generated text exhibits certain patterns that can be detected, but these patterns can be easily circumvented by changing the prompt or temperature, or by light paraphrasing.

14. I heard that [Insert Tool Here] is so bad at detecting AI-generated text that it thought the Declaration of Independence and [Insert Famous Document Here] were written by AI. Is that true?

Yes and no. Detectors flag highly predictable text. Famous documents have fixed, well-known wording, so the model assigns a very high probability to their exact phrasing. As a result, detectors misfire on such documents, but that particular failure is specific to famous texts and does not by itself tell you how they perform on anything else.
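Here is a rough sketch of how a predictability-based detector might work, assuming it scores text by perplexity (a standard measure of how surprised a model is by each token) and flags anything below a threshold. The per-token probabilities and the threshold are invented for illustration; I am not describing any specific tool.

```python
import math


def perplexity(token_probs):
    """Lower values mean the model found the text more predictable."""
    average_log_prob = sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(-average_log_prob)


def looks_ai_generated(token_probs, threshold=2.0):
    """Hypothetical detector rule: flag text that is too predictable."""
    return perplexity(token_probs) < threshold


# Made-up probabilities a model might assign to each token of a text.
famous_document = [0.95, 0.90, 0.99, 0.97, 0.98]  # memorized wording, very predictable
student_essay = [0.20, 0.05, 0.40, 0.10, 0.30]    # ordinary human writing, less predictable

print(looks_ai_generated(famous_document))  # True: a false positive
print(looks_ai_generated(student_essay))    # False
```

The famous document is flagged because the model has seen its exact wording many times, not because a machine wrote it.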

15. Is AI plagiarism?

No. It is more like a calculator. The model is not “copying” anything. It generates new text based on patterns. Your university’s plagiarism policy likely doesn’t cover it. It could still be considered academic dishonesty, though.

16. Should students cite the prompts they used to generate text?

No. The prompts are not the source of the content; they are just instructions for the model. In addition, using the same prompt will not necessarily produce the same output, so it is not a reliable way to reproduce the content.

17. Can I ban AI in my class?

You could, but enforcement is difficult. If a student claims they did not use AI, you would need strong evidence to prove otherwise.

18. Is AI profitable?

Not for most AI companies. The cost of training and running large language models is very high, and many AI divisions are not yet profitable on their own. It is profitable for chip manufacturers and cloud providers in the same way that selling shovels to gold miners is profitable.

19. Will AI replace jobs?

The hallucination problem makes it unlikely that AI will replace jobs that require a high level of accuracy and reliability. The jobs most at risk involve repetitive tasks and tasks that a human already has to check for accuracy. If a human has to check your work already, they can just check the AI’s work instead.

20. Can AI access the internet?

Not unless it is explicitly given tools to do so. Most large language models (LLMs) cannot browse the web on their own and need an agentic harness to do it for them (see question 6).