Skills/research-paper-presenter/SKILL.md

# Research Paper → LibreOffice Impress Explainer

## Purpose
Turn a research-paper PDF into a **detailed, image-rich `.odp` presentation** that
explains the work clearly enough for a learner to understand everything — the
problem, the core idea, the method, the math intuition, the results, and why it
matters. Output is a **native LibreOffice Impress file (`.odp`)** — never `.pptx`,
never PowerPoint.

## When to use
The user gives you a paper (a `.pdf`, or an arXiv link or other URL they want
explained) and wants to *learn it* via slides. Typical requests: "make slides
from this paper", "explain this paper", "turn this into a presentation", "deck I
can study from".

## What "good" looks like
- **Teaches, doesn't summarize.** Every claim from the paper is unpacked into
  plain language with intuition first, formalism second. Assume the learner is
  smart but new to the subfield.
- **Image-rich.** Most content slides carry a visual. Two image sources:
  1. **The paper's own figures**, extracted automatically (highest fidelity —
     use these for the real architecture diagrams, result plots, tables).
  2. **Generated explainer visuals** from `gpt-image-2` — schematics, analogies,
     step-by-step diagrams, intuition pictures that the paper *doesn't* contain.
- **Detailed.** A typical paper becomes **18–32 slides**, not 8. Break the method
  into multiple slides. One idea per slide.
- **Coherent look.** Generated images share a style so the deck feels designed.

---

## Pipeline (run in order)

All scripts live in `scripts/`. Work inside a scratch dir, e.g. `workdir/`.

### 1 — Extract the paper
```bash
python3 scripts/extract_paper.py PAPER.pdf --out workdir
```
Produces `workdir/paper_text.md` (page-delimited text with detected section
headers), `workdir/figures/*.png` (the paper's real figures),
`workdir/figures.json` (manifest with page + dimensions), and
`workdir/meta.json` (title, page/word counts).
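
Before reading, it can help to shortlist the figures that are large enough to be legible on a slide. The sketch below is only illustrative: it assumes the manifest sits in the output dir as a JSON list whose entries carry `file`, `page`, `width`, and `height` keys. Those key names and the size thresholds are guesses, so adjust them to whatever `extract_paper.py` actually writes.

```python
import json
from pathlib import Path


def load_manifest(workdir="workdir"):
    """Read <workdir>/figures.json (ASSUMED to be a JSON list of dicts)."""
    return json.loads(Path(workdir, "figures.json").read_text(encoding="utf-8"))


def shortlist_figures(entries, min_width=400, min_height=300):
    """Return entries big enough for a slide, largest area first.

    ASSUMPTION: each entry has 'file', 'page', 'width', 'height' keys.
    The default thresholds are arbitrary starting points.
    """
    keep = [
        e for e in entries
        if e.get("width", 0) >= min_width and e.get("height", 0) >= min_height
    ]
    return sorted(keep, key=lambda e: e["width"] * e["height"], reverse=True)
```

Calling `shortlist_figures(load_manifest("workdir"))` would then give a first-pass list to review by eye; small icons and stray rule lines tend to fall below the thresholds.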

### 2 — Read and understand
Read `paper_text.md` end to end. Build a mental model: What problem? What was
broken before? What's the key insight? How does the method work mechanically?
What do the experiments show? What are the limits? **Do not start slides until
you can explain the paper to a beginner without looking.**

Inspect the extracted figures (`workdir/figures/`). Decide which are worth putting
on slides directly (architecture diagrams, key result plots, tables).

### 3 — Plan the deck + write image prompts
Draft two files:

- `workdir/prompts.json` — visuals to generate (see schema below). Write a prompt
  for each concept that benefits from a picture the paper lacks: the core
  analogy, a simplified mechanism diagram, before/after comparisons, a "how data
  flows" schematic, an intuition pump for the math. Aim for **roughly one
  generated image per 1–2 content slides**, on top of reused paper figures.
- `workdir/deck.json` — the full deck spec (schema below). Reference generated
  images as `<id>.png` and reused paper figures by their path
  (`figures/fig03.png`).
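
One way to check the "one generated image per 1–2 content slides" target once both files are drafted is sketched below. It takes the two files already parsed into Python objects. Treating `content` and `bigimage` slides as the "content slides" is an assumption of this sketch, not something the scripts define.

```python
def generated_image_ratio(deck, prompts):
    """Return (content_slides, generated_images_used, slides_per_image).

    ASSUMPTION: 'content slides' means slides of type 'content' or 'bigimage'.
    A generated image is one whose filename is '<id>.png' for some prompt id.
    """
    generated = {p["id"] + ".png" for p in prompts}
    content = [s for s in deck["slides"] if s.get("type") in ("content", "bigimage")]
    used = {s["image"] for s in content if s.get("image") in generated}
    # No generated images at all is reported as an infinite ratio.
    ratio = len(content) / len(used) if used else float("inf")
    return len(content), len(used), ratio
```

A ratio between 1 and 2 matches the target; a much larger value suggests writing more prompts before moving on.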

### 4 — Generate images
```bash
export OPENAI_API_KEY=...      # Codex/Hermes usually has this in env
python3 scripts/generate_images.py workdir/prompts.json --assets workdir/assets
```
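Writes one PNG per prompt as `workdir/assets/<id>.png`. Also copy any reused
paper figures under the assets dir, keeping the `figures/` prefix so that
references such as `figures/fig03.png` in `deck.json` resolve against `--assets`:
```bash
mkdir -p workdir/assets/figures
cp workdir/figures/*.png workdir/assets/figures/    # optional, keeps paths simple
```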
> If the Hermes runtime has **native gpt-image-2** generation (Codex does), you
> may generate images directly and just save them as `workdir/assets/<id>.png`.
> The script is the portable fallback.

### 5 — Build the .odp
```bash
python3 scripts/build_odp.py workdir/deck.json workdir/output.odp --assets workdir/assets
```
That's the deliverable. It opens directly in LibreOffice Impress on Ubuntu.
(Optional sanity check / PDF preview:
`libreoffice --headless --convert-to pdf --outdir workdir workdir/output.odp`;
without `--outdir` the PDF is written to the current directory.)
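
If LibreOffice is not installed where the build runs, a lighter check is possible because an `.odp` is a ZIP package with a `mimetype` entry and a `content.xml` in which each slide is a `<draw:page>` element. The helper below is a minimal sketch of that idea; `count_slides` is a name made up for this example and is not part of `scripts/`.

```python
import re
import zipfile

ODP_MIMETYPE = "application/vnd.oasis.opendocument.presentation"


def count_slides(path):
    """Return the number of slides in an .odp; raise ValueError if malformed."""
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
        if "mimetype" not in names or "content.xml" not in names:
            raise ValueError("not an OpenDocument package")
        mimetype = zf.read("mimetype").decode("ascii").strip()
        if mimetype != ODP_MIMETYPE:
            raise ValueError(f"unexpected mimetype: {mimetype}")
        content = zf.read("content.xml").decode("utf-8")
    # Match <draw:page ...> but not <draw:page-thumbnail ...> (used on notes pages).
    return len(re.findall(r"<draw:page[\s/>]", content))
```

Comparing `count_slides("workdir/output.odp")` with the number of entries in `deck.json` is a quick way to spot a slide that was silently dropped.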

---

## Deck spec schema (`deck.json`)

```jsonc
{
  "theme": "midnight",          // midnight | paper | forest
  "slides": [ /* slide objects, in order */ ]
}
```

Slide objects by `type`:

```jsonc
// Opening slide
{"type":"title","title":"...","subtitle":"...","eyebrow":"RESEARCH WALKTHROUGH",
 "meta":"Authors, year • one line", "notes":"speaker notes (optional)"}

// Section divider between major parts
{"type":"section","number":"02","title":"The Method","subtitle":"optional"}

// Workhorse slide: bullets, with an optional image on the right
{"type":"content","kicker":"motivation","title":"...",
 "bullets":["point","another point",{"text":"sub-point","level":1}],
 "image":"diagram_attention.png",     // omit for full-width text
 "caption":"Fig 2 — ...", "notes":"..."}

// Full-bleed image with a caption — use for big architecture diagrams / plots
{"type":"bigimage","kicker":"architecture","title":"...",
 "image":"figures/fig03.png","caption":"...","notes":"..."}

// Side-by-side comparison (before/after, baseline/proposed, RNN/Transformer)
{"type":"comparison","title":"...",
 "left":{"heading":"Baseline","bullets":["...","..."]},
 "right":{"heading":"Proposed","bullets":["...","..."]}}

// Pull-quote / key takeaway
{"type":"quote","text":"...","attribution":"paper abstract (paraphrased)"}
```

Notes:
- `bullets` items are strings, or `{"text": "...", "level": 1}` for a sub-point
  indented one level.
- `image` is resolved against `--assets`. A missing image renders a labelled
  placeholder (the deck still builds), so a failed generation never blocks you.
- Put the deeper explanation a learner can read later into `notes` (speaker
  notes) — keep on-slide bullets tight.
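
Because a missing image only produces a placeholder, a typo in a filename can go unnoticed until the deck is opened. The following sketch checks a parsed `deck.json` against the schema above before building. The per-type required keys are inferred from the examples and should be read as an assumption, since `build_odp.py` may be more lenient.

```python
from pathlib import Path

THEMES = {"midnight", "paper", "forest"}
# ASSUMPTION: keys treated as required here are inferred from the examples.
REQUIRED = {
    "title": ["title"],
    "section": ["title"],
    "content": ["title", "bullets"],
    "bigimage": ["title", "image"],
    "comparison": ["title", "left", "right"],
    "quote": ["text"],
}


def validate_deck(deck, assets_dir):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    theme = deck.get("theme", "midnight")  # midnight is the documented default
    if theme not in THEMES:
        problems.append(f"unknown theme: {theme!r}")
    for number, slide in enumerate(deck.get("slides", []), start=1):
        kind = slide.get("type")
        if kind not in REQUIRED:
            problems.append(f"slide {number}: unknown type {kind!r}")
            continue
        for key in REQUIRED[kind]:
            if key not in slide:
                problems.append(f"slide {number} ({kind}): missing {key!r}")
        image = slide.get("image")
        # Images are resolved relative to the assets dir, as with --assets.
        if image and not Path(assets_dir, image).is_file():
            problems.append(f"slide {number}: image not found: {image}")
    return problems
```

Running it as `validate_deck(json.load(open("workdir/deck.json")), "workdir/assets")` after step 4 lists every slide that would otherwise fall back to a placeholder.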

## Image prompts schema (`prompts.json`)
```jsonc
[
  {"id":"attention_schematic",
   "prompt":"Technical diagram: scaled dot-product attention. Show Q, K, V as "
            "labelled boxes, a matrix multiply, a softmax, and a weighted sum. "
            "Clear arrows and labels.",
   "shape":"landscape"},                 // landscape | portrait | square
  {"id":"rnn_bottleneck","prompt":"...","shape":"portrait",
   "transparent": false}
]
```
- `id` becomes the filename (`<id>.png`) → reference it in `deck.json`.
- gpt-image-2 renders **text inside images** well, so labelled diagrams,
  flowcharts and infographics are fair game — lean into them.
- A shared style suffix is auto-appended for visual coherence; override per-run
  with `--style-suffix`.
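
Since each `id` doubles as a filename, a quick check of `prompts.json` before spending API calls can catch duplicates and unsafe names. This is a sketch under two assumptions that the schema above does not state: ids are limited to letters, digits, `_`, and `-`, and `shape` may be omitted.

```python
import re

SHAPES = {"landscape", "portrait", "square"}
# ASSUMPTION: restrict ids to filename-safe characters.
SAFE_ID = re.compile(r"^[A-Za-z0-9_-]+$")


def validate_prompts(prompts):
    """Return a list of problems found in a parsed prompts.json list."""
    problems = []
    seen = set()
    for number, entry in enumerate(prompts, start=1):
        pid = entry.get("id", "")
        if not SAFE_ID.match(pid):
            problems.append(f"entry {number}: id {pid!r} is not filename-safe")
        elif pid in seen:
            problems.append(f"entry {number}: duplicate id {pid!r}")
        seen.add(pid)
        if not entry.get("prompt", "").strip():
            problems.append(f"entry {number}: empty prompt")
        shape = entry.get("shape")
        # ASSUMPTION: a missing shape is allowed and left to the script's default.
        if shape is not None and shape not in SHAPES:
            problems.append(f"entry {number}: unknown shape {shape!r}")
    return problems
```

An empty result means every entry maps to a distinct `<id>.png` and uses one of the three documented shapes.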

---

## Pedagogy checklist (the part that makes it a *learning* deck)
- Open with the **problem and stakes** before any method.
- For every mechanism: **intuition / analogy first**, then the precise version.
- Turn each equation into a sentence ("this just measures how similar two
  vectors are, normalized so big dimensions don't blow up the scale").
- Use `comparison` slides for "old way vs new way".
- Reuse the paper's real result figures; explain *what to look at* in the caption.
- End with: what's genuinely new, what it enables, and stated limitations.
- Prefer more slides over crowded ones. One idea per slide.
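
Two items on this checklist are mechanical enough to lint: the overall slide count and crowded slides. The sketch below flags both on a parsed `deck.json`. The 18–32 range comes from this document, while the limit of six bullets per column is an arbitrary assumption to tune to taste.

```python
def lint_deck(deck, min_slides=18, max_slides=32, max_bullets=6):
    """Return warnings about deck length and overcrowded slides.

    ASSUMPTION: max_bullets=6 is an illustrative threshold, not a rule
    enforced by build_odp.py.
    """
    warnings = []
    slides = deck.get("slides", [])
    if not (min_slides <= len(slides) <= max_slides):
        warnings.append(
            f"{len(slides)} slides; typical range is {min_slides}-{max_slides}"
        )
    for number, slide in enumerate(slides, start=1):
        # Content slides keep bullets at top level; comparison slides per side.
        columns = [slide.get("bullets", [])]
        columns += [slide.get(side, {}).get("bullets", []) for side in ("left", "right")]
        longest = max(len(column) for column in columns)
        if longest > max_bullets:
            warnings.append(
                f"slide {number}: {longest} bullets in one column; consider splitting"
            )
    return warnings
```

The warnings are advisory: a long paper may justify more than 32 slides, but a slide flagged as crowded is usually a sign that it holds more than one idea.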

## Theme choice
`midnight` (dark, technical — default), `paper` (warm light, academic),
`forest` (dark green). Pick one that fits the subject; keep it consistent.