Periagoge

Prompt Engineering for Genealogy Research Questions

AI assistants respond to the specificity and clarity of your questions; vague prompts yield generic answers, while precise ones that name dates, places, and what you've already ruled out produce more useful research suggestions. Learning to frame genealogical questions as concrete problems rather than open-ended inquiries makes AI tools genuinely productive.

Hypatia
Why It Matters

Prompt engineering in a genealogy context means giving your AI queries enough historical context, specificity, and constraints that the AI returns genealogically useful information rather than generic answers. A vague prompt like "Tell me about my ancestor John Smith" gives the AI nothing to reason from, so it tends to produce generic filler. A well-engineered prompt like "John Smith was listed in the 1880 census as age 35, born in Ohio. Based on that, what counties should I search for his 1870 census record, and what age range makes sense?" produces actionable research leads.
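The age arithmetic behind the stronger prompt can be sketched in a few lines of Python. This is a minimal illustration, not a genealogical standard: the function name `expected_age_range` and the two-year tolerance are assumptions, chosen because ages reported in census records are often off by a year or more.

```python
def expected_age_range(known_age, known_year, target_year, tolerance=2):
    """Return (min_age, max_age) to look for in target_year.

    tolerance is an assumed margin for misreported ages, not a fixed rule.
    """
    estimate = known_age - (known_year - target_year)
    low = max(0, estimate - tolerance)
    return (low, estimate + tolerance)


# John Smith: age 35 in 1880, so roughly 25 in 1870.
print(expected_age_range(35, 1880, 1870))  # (23, 27)
```

Stating the computed range in the prompt gives the AI a concrete figure to check rather than one it has to derive.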

The core principle is treating the AI as a research collaborator who knows historical geography, naming patterns, and demographic trends, but not your specific family. You provide the facts and constraints; the AI helps you reason about their implications.

Constraint-Based Prompts

The most effective genealogy prompts establish boundaries before asking for analysis. Example structure: "I'm researching [person], documented in [source] as [key facts]. I'm trying to find [next record type]. Based on [historical context], what should I try?"
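One way to make the bracketed structure reusable is a small template function. The sketch below is hypothetical: `build_constraint_prompt` and its parameter names are invented for illustration, and the check for blank fields is just one possible way to enforce that every constraint is stated.

```python
def build_constraint_prompt(person, source, key_facts, next_record, historical_context):
    """Fill the bracketed template with the researcher's own facts."""
    fields = {
        "person": person,
        "source": source,
        "key_facts": key_facts,
        "next_record": next_record,
        "historical_context": historical_context,
    }
    # Refuse to build a prompt with a missing constraint.
    missing = [name for name, value in fields.items() if not value.strip()]
    if missing:
        raise ValueError(f"Missing constraint(s): {', '.join(missing)}")
    return (
        f"I'm researching {person}, documented in {source} as {key_facts}. "
        f"I'm trying to find {next_record}. "
        f"Based on {historical_context}, what should I try?"
    )


prompt = build_constraint_prompt(
    person="John Smith",
    source="the 1880 census",
    key_facts="unmarried, age 28, living in Iowa",
    next_record="his marriage record",
    historical_context="marriage and migration patterns in the 1880s Midwest",
)
print(prompt)
```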

The constraints serve three functions: they anchor the AI to your specific research problem rather than generic genealogy tips, they provide enough temporal and geographic context for accurate suggestions, and they reduce the risk of hallucination by explicitly framing the query as hypothetical exploration rather than a request for facts.

Weak prompt: "When would John Smith have gotten married?" (Generic, and it leaves the AI to assume which John Smith, where, and when.)

Strong prompt: "John Smith appears in the 1880 census as unmarried, age 28, Iowa. He appears in the 1900 census as married. What sources should I search to find his marriage record, and what date range makes demographic sense?"
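The date range implied by the strong prompt follows directly from the two census entries. The sketch below works at year granularity only and uses a hypothetical helper name, `marriage_search_window`. It assumes the two records really describe the same person, which is itself something to verify.

```python
def marriage_search_window(single_year, single_age, married_year):
    """Bracket a marriage between a record showing the person single
    and a later record showing them married."""
    if married_year <= single_year:
        raise ValueError("married_year must be later than single_year")
    elapsed = married_year - single_year
    return {
        "years": (single_year, married_year),
        "ages": (single_age, single_age + elapsed),
    }


# Unmarried at 28 in 1880, married by 1900.
window = marriage_search_window(1880, 28, 1900)
print(window)  # {'years': (1880, 1900), 'ages': (28, 48)}
```

Including the window in the prompt lets the AI focus on narrowing it, for example by suggesting records that fall inside those twenty years.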

Context Layering

Provide context in order of increasing specificity: (1) the historical period ("1880s Midwest"), (2) your research objective ("Determining whether two John Smiths are the same person"), (3) the evidence you have ("Document A says he was born in Ohio; Document B says Pennsylvania"), and (4) what you've already tried ("I've searched FamilySearch for John Smith 1870-1890 Ohio with no clear matches").

This structure helps the AI understand not just what you're asking, but why it matters for your genealogical puzzle. It also signals when you're asking for hypothesis generation ("What could explain the birthplace discrepancy?") versus fact-checking ("I think he might be the John Smith in this will. Does that align with the dates?").
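The four layers can be kept in a fixed order with a small data structure. This is a sketch under assumptions: the class name `LayeredContext`, its field names, and the "Period / Objective / Evidence / Already tried" labels are all invented here, and any consistent labeling would do.

```python
from dataclasses import dataclass


@dataclass
class LayeredContext:
    period: str      # (1) historical period
    objective: str   # (2) research objective
    evidence: str    # (3) evidence in hand
    tried: str       # (4) searches already attempted

    def to_prompt(self, question: str) -> str:
        """Join the layers from broadest to most specific, then ask."""
        layers = [
            f"Period: {self.period}",
            f"Objective: {self.objective}",
            f"Evidence: {self.evidence}",
            f"Already tried: {self.tried}",
        ]
        return "\n".join(layers) + f"\n\nQuestion: {question}"


context = LayeredContext(
    period="1880s Midwest",
    objective="Determining whether two John Smiths are the same person",
    evidence="Document A says he was born in Ohio; Document B says Pennsylvania",
    tried="Searched FamilySearch for John Smith 1870-1890 Ohio with no clear matches",
)
print(context.to_prompt("What could explain the birthplace discrepancy?"))
```

Keeping the question separate from the context makes it easy to reuse the same layers for a hypothesis-generation question and then for a fact-checking one.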

Avoiding Hallucination Through Prompt Design

Frame requests as interpretations of existing sources, not requests to invent information. Bad: "What was John Smith's occupation?" Good: "John Smith's 1910 census entry says his occupation was [OCR text, uncertain]. Could this be [possible occupations]?" The first invites hallucination; the second frames the AI as a decoder of ambiguous source material.

Use conditional language for inference: "If the shipping records show [fact], what would that suggest about [derived conclusion]?" rather than "Did [fact] happen?" Conditional phrasing signals to the AI that you understand this is reasoning from limited evidence, not a factual query.
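Both phrasings can be captured as small helpers so the safer wording becomes the default. The function names and the sample transcription below are hypothetical, and the closing instruction in `decoding_prompt` is an added assumption about what helps, not something guaranteed to stop a model from guessing.

```python
def conditional_prompt(evidence: str, conclusion: str) -> str:
    """Phrase an inference as a conditional rather than a factual query."""
    return f"If {evidence}, what would that suggest about {conclusion}?"


def decoding_prompt(person, source, field, transcription, candidates):
    """Ask the AI to interpret an uncertain reading, not invent a value."""
    options = ", ".join(candidates)
    return (
        f"{person}'s {source} entry gives the {field} as "
        f'"{transcription}" (OCR text, uncertain). '
        f"Could this be one of: {options}? "
        "If none fit, say so rather than guessing."
    )


print(conditional_prompt(
    "the shipping records show he arrived in 1852",
    "where he was living in the 1850 census",
))
# Made-up transcription and candidates, for illustration only.
print(decoding_prompt(
    "John Smith", "1910 census", "occupation", "coop3r", ["cooper", "copper miner"],
))
```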

Genealogy-Specific Prompt Patterns

The Name Variant Pattern: "I've seen this ancestor listed as [variant 1], [variant 2], and [variant 3] across different documents. Are these consistent with historical naming patterns for [ethnicity/region]? Should I search for other variants?"

The Timeline Conflict Pattern: "Document A suggests [date/age], but Document B suggests [different date/age]. How do I determine which is more reliable? What sources should I prioritize?"

The Relationship Inference Pattern: "Two people with the same surname appear in the same household in the 1870 census, which did not record relationships to the head of household. Based on ages and naming patterns, how plausible is it that they're [relation type], and what records could confirm it?"

The Research Gap Pattern: "I have records from 1870 and 1900, but nothing in between. Given my ancestor's known location, occupation, and family circumstances, where should I search to fill this gap?"
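The four patterns can be stored as fill-in templates. The sketch below uses Python's standard `string.Template`; the dictionary keys, placeholder names, and sample values are assumptions made for illustration, and the wording is adapted from the patterns above.

```python
from string import Template

PATTERNS = {
    "name_variant": Template(
        "I've seen this ancestor listed as $variants across different documents. "
        "Are these consistent with historical naming patterns for $group? "
        "Should I search for other variants?"
    ),
    "timeline_conflict": Template(
        "Document A suggests $claim_a, but Document B suggests $claim_b. "
        "How do I determine which is more reliable? "
        "What sources should I prioritize?"
    ),
    "relationship_inference": Template(
        "Two people with the same surname appear in the same household in the "
        "$census_year census. Based on ages and naming patterns, how plausible "
        "is it that they're $relation?"
    ),
    "research_gap": Template(
        "I have records from $first_year and $last_year, but nothing in between. "
        "Given my ancestor's known location, occupation, and family "
        "circumstances, where should I search to fill this gap?"
    ),
}


def fill_pattern(name, **fields):
    """Fill a pattern; substitute() raises KeyError if a placeholder is missing."""
    return PATTERNS[name].substitute(**fields)


# Made-up example values.
print(fill_pattern("timeline_conflict",
                   claim_a="a birth in 1845", claim_b="a birth in 1848"))
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: an unfilled placeholder fails loudly instead of reaching the AI as a literal `$claim_b`.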

Try this: Take a genealogy problem you're currently stuck on (conflicting information, a missing generation, a location mystery). Write it as three progressively more specific prompts, starting vague and ending with full historical context, source citations, and what you've already tried. Submit all three to Claude and compare the outputs. The most useful answer will usually come from the most specific prompt, and the comparison shows how much scaffolding the AI needs from you before it can give useful guidance.
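To set up the exercise, the three prompts can be generated from the same ingredients so that the only thing that varies is how much context is included. This is a rough sketch: `prompt_ladder` is a hypothetical helper, the sample values are taken from the examples earlier in this article, and sending the prompts to an AI is left to you.

```python
def prompt_ladder(question, period, evidence, tried):
    """Return three versions of one question, from vague to fully specified."""
    vague = question
    with_context = f"{question} Context: {period}."
    full = f"{with_context} Evidence: {evidence}. Already tried: {tried}."
    return [vague, with_context, full]


ladder = prompt_ladder(
    question="Are these two John Smiths the same person?",
    period="1880s Midwest",
    evidence="Document A says he was born in Ohio; Document B says Pennsylvania",
    tried="searched FamilySearch for John Smith 1870-1890 Ohio with no clear matches",
)
for level, text in enumerate(ladder, start=1):
    print(f"Prompt {level}: {text}\n")
```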
