Align prose with STYLE.md across modules 01-07 and top-level README

Replace residual em-dashes, arrow-notation shorthand, and a handful of
filler intensifiers; fix two small typos. Add .gitignore to keep the
working CHANGES.md audit out of the repo.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Eric Furst 2026-05-29 08:47:19 -04:00
commit 4194680475
9 changed files with 102 additions and 82 deletions


@@ -15,14 +15,14 @@ AI assistants are useful because they generate plausible output fast. They are *
## Part 1: Verifying
-> **A note on terminology.** This section uses "check" and "test" with different meanings. A **unit test** is a specific software-development practice a small piece of code (often written with a framework like `pytest`) that exercises a function with known inputs and confirms the output matches an expected result. Tests are automated and reusable, and they pay off when code will be edited many times by many people. A **check**, more broadly, is anything that verifies the code does what you intended: running on a known limit case, comparing to a published value, plotting and inspecting the shape, or hand-calculating a small input. Formal unit tests are one form of check, but for scientific code written for a single project they are often not the most natural form. Whenever this guide says "verification" or "check," any of these forms count; "test" appears only where an automated test is the right tool.
+> **A note on terminology.** This section uses "check" and "test" with different meanings. A **unit test** is a specific software-development practice, which is a small piece of code (often written with a framework like `pytest`) that exercises a function with known inputs and confirms the output matches an expected result. Tests are automated and reusable, and they pay off when code will be edited many times by many people. A **check**, more broadly, is anything that verifies the code does what you intended: running on a known limit case, comparing to a published value, plotting and inspecting the shape, or hand-calculating a small input. Formal unit tests are one form of check, but for scientific code written for a single project they are often not the most natural form. Whenever this guide says "verification" or "check," any of these forms count; "test" appears only where an automated test is the right tool.
### Why verification matters
Hallucinations in AI-assisted coding fall into two broad categories:
-1. **Loud hallucinations** — code that fails to compile or run. Easy to catch; the tool tells you.
-2. **Quiet hallucinations** — code that runs and produces a result, but the result is wrong. These are the dangerous ones.
+1. **Loud hallucinations.** Code that fails to compile or run. Easy to catch; the tool tells you.
+2. **Quiet hallucinations.** Code that runs and produces a result, but the result is wrong. These are the dangerous ones.
There is often a familiar pattern: a function that uses an API method that doesn't exist, a regex that handles all the cases you mentioned but fails silently on an edge case you didn't think to mention, a math expression that is dimensionally inconsistent but produces a number anyway. The output *looks like* an answer, so you accept it. Hours or weeks later, you discover the silent failure.
@@ -81,10 +81,10 @@ For most academic work, the cloud-service baseline is the right mental model. Yo
- **Restricted research data.** Anything covered by your IRB protocol, your data-use agreement with a collaborator or industrial partner, or institutional policies around HIPAA, FERPA, export controls, or similar regimes. If a category of data is restricted on your computer, it is restricted in your chat too.
- **Unpublished work that isn't yours.** Collaborator drafts, manuscripts under review, code from a lab that hasn't been released. You don't own the right to share these regardless of how you happen to be sharing them.
- **NDA-covered or proprietary material.** From an industrial collaboration, an internship, an advisor's industry consulting work. Check the specific agreement.
-- **Personally identifying information.** Participant data, survey responses, names attached to outcomes even when "anonymized for internal use." If you need help analyzing it, paste a synthetic example with the same shape rather than the real thing.
+- **Personally identifying information.** Participant data, survey responses, names attached to outcomes, even when "anonymized for internal use." If you need help analyzing it, paste a synthetic example with the same shape rather than the real thing.
- **Credentials, API keys, internal URLs.** Easy to leak by accident when pasting config files or logs.
-For most students most of the time who are dealing with coursework, classroom exercises, your own scripts, public datasets, open-source libraries, and drafts of your own writing, the answer is "the chat is fine, same risk as email." Graduate students and undergradute reseearchers working with sensitive research data are the most common case for the categories above. If that's you, take the agreements that govern your data seriously, and when in doubt, ask your advisor or your IRB.
+For most students most of the time who are dealing with coursework, classroom exercises, your own scripts, public datasets, open-source libraries, and drafts of your own writing, the answer is "the chat is fine, same risk as email." Graduate students and undergraduate researchers working with sensitive research data are the most common case for the categories above. If that's you, take the agreements that govern your data seriously, and when in doubt, ask your advisor or your IRB.
### A practical checklist
@@ -118,7 +118,7 @@ Two complementary reasons:
The realistic bar is not "note every Copilot autocomplete." That standard is impossible to meet in practice, and treating it as required is part of why disclosure norms feel unrealistic. A more useful distinction:
-- **Background assistance** that shaped *how* you worked, such as autocomplete, syntax help, name suggestions, quick lookups, debugging conversations. Usually no disclosure is needed unless your venue's policy is specific.
+- **Background assistance** that shaped *how* you worked, such as autocomplete, syntax help, name suggestions, quick lookups, debugging conversations. Usually no disclosure is needed unless your venue's policy is specific.
- **Substantive contribution** that shaped *what* you produced, such as AI drafted a section, generated significant chunks of code that you reviewed and accepted, planned the analytical approach, wrote the literature summary, debugged a critical reasoning step. These likely warrant a brief note.
- **Substituted work** where AI produced something you submitted as your own without meaningful engagement, including running an assignment through ChatGPT and turning in the output. This is the case policies are most worried about, and it sits closer to academic dishonesty than to the disclosure question.
@@ -136,7 +136,7 @@ The form depends on context:
A useful pattern when you do disclose is to state three things:
-1. **What tool you used** (specific model and version if available "Claude Opus 4.7," "ChatGPT-4o," "GitHub Copilot")
+1. **What tool you used** (specific model and version if available, such as "Claude Opus 4.7," "ChatGPT-4o," "GitHub Copilot")
2. **What you used it for** ("debugging error messages," "drafting the introduction," "generating boilerplate code")
3. **What you did with the output** ("reviewed and edited," "used as a starting point and rewrote," "used as-is after verification")