# Research Paper → LibreOffice Impress Explainer

## Purpose

Turn a research-paper PDF into a **detailed, image-rich `.odp` presentation** that explains the work clearly enough for a learner to understand everything — the problem, the core idea, the method, the math intuition, the results, and why it matters.

Output is a **native LibreOffice Impress file (`.odp`)** — never `.pptx`, never PowerPoint.

## When to use

The user gives you a paper (a `.pdf`, or an arXiv/URL they want explained) and wants to *learn it* via slides. Phrases like "make slides from this paper", "explain this paper", "turn this into a presentation", "deck I can study from".

## What "good" looks like

- **Teaches, doesn't summarize.** Every claim from the paper is unpacked into plain language with intuition first, formalism second. Assume the learner is smart but new to the subfield.
- **Image-rich.** Most content slides carry a visual. Two image sources:
  1. **The paper's own figures**, extracted automatically (highest fidelity — use these for the real architecture diagrams, result plots, tables).
  2. **Generated explainer visuals** from `gpt-image-2` — schematics, analogies, step-by-step diagrams, intuition pictures that the paper *doesn't* contain.
- **Detailed.** A typical paper becomes **18–32 slides**, not 8. Break the method into multiple slides. One idea per slide.
- **Coherent look.** Generated images share a style so the deck feels designed.

---

## Pipeline (run in order)

All scripts live in `scripts/`. Work inside a scratch dir, e.g. `workdir/`.

### 1 — Extract the paper

```bash
python3 scripts/extract_paper.py PAPER.pdf --out workdir
```

Produces `workdir/paper_text.md` (page-delimited text with detected section headers), `workdir/figures/*.png` (the paper's real figures), `figures.json` (manifest with page + dimensions), and `meta.json` (title, page/word counts).

### 2 — Read and understand

Read `paper_text.md` end to end. Build a mental model: What problem? What was broken before? What's the key insight? How does the method work mechanically? What do the experiments show? What are the limits? **Do not start slides until you can explain the paper to a beginner without looking.**

Inspect the extracted figures (`workdir/figures/`). Decide which are worth putting on slides directly (architecture diagrams, key result plots, tables).

### 3 — Plan the deck + write image prompts

Draft two files:

- `workdir/prompts.json` — visuals to generate (see schema below). Write a prompt for each concept that benefits from a picture the paper lacks: the core analogy, a simplified mechanism diagram, before/after comparisons, a "how data flows" schematic, an intuition pump for the math. Aim for **roughly one generated image per 1–2 content slides**, on top of reused paper figures.
- `workdir/deck.json` — the full deck spec (schema below). Reference generated images as `<id>.png` and reused paper figures by their path (`figures/fig03.png`).

### 4 — Generate images

```bash
export OPENAI_API_KEY=...   # Codex/Hermes usually has this in env
python3 scripts/generate_images.py workdir/prompts.json --assets workdir/assets
```

Writes one PNG per prompt as `workdir/assets/<id>.png`. Also copy any reused paper figures into the assets dir so everything resolves from one place:

```bash
cp workdir/figures/*.png workdir/assets/   # optional, keeps paths simple
```

> If the Hermes runtime has **native gpt-image-2** generation (Codex does), you
> may generate images directly and just save them as `workdir/assets/<id>.png`.
> The script is the portable fallback.

### 5 — Build the .odp

```bash
python3 scripts/build_odp.py workdir/deck.json workdir/output.odp --assets workdir/assets
```

That's the deliverable. It opens directly in LibreOffice Impress on Ubuntu. (Optional sanity check / PDF preview: `libreoffice --headless --convert-to pdf workdir/output.odp`.)
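
The scripted steps (1, 4 and 5) can be collected into a small helper. This is only a sketch: `stage_commands` and `run_stage` are hypothetical names, not part of `scripts/`, and the argument lists simply mirror the commands shown above. Steps 2 and 3 stay manual, so the stages are exposed one at a time rather than chained.

```python
import subprocess
from pathlib import Path


def stage_commands(pdf: str, workdir: str = "workdir") -> dict[str, list[str]]:
    """Return one argv list per scripted stage (hypothetical helper).

    The commands are copied from steps 1, 4 and 5; nothing here is
    provided by the scripts themselves.
    """
    wd = Path(workdir)
    assets = (wd / "assets").as_posix()
    return {
        # Step 1: extract text, figures and metadata from the PDF.
        "extract": ["python3", "scripts/extract_paper.py", pdf,
                    "--out", wd.as_posix()],
        # Step 4: render one PNG per entry in prompts.json.
        "images": ["python3", "scripts/generate_images.py",
                   (wd / "prompts.json").as_posix(), "--assets", assets],
        # Step 5: assemble the .odp from deck.json and the assets dir.
        "build": ["python3", "scripts/build_odp.py",
                  (wd / "deck.json").as_posix(),
                  (wd / "output.odp").as_posix(), "--assets", assets],
    }


def run_stage(name: str, pdf: str, workdir: str = "workdir") -> None:
    """Run a single stage; raises CalledProcessError if the script fails."""
    subprocess.run(stage_commands(pdf, workdir)[name], check=True)
```

A typical session would call `run_stage("extract", "PAPER.pdf")`, then do the reading and planning by hand, and only afterwards call `run_stage("images", "PAPER.pdf")` and `run_stage("build", "PAPER.pdf")`.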
---

## Deck spec schema (`deck.json`)

```jsonc
{
  "theme": "midnight",          // midnight | paper | forest
  "slides": [ /* slide objects, in order */ ]
}
```

Slide objects by `type`:

```jsonc
// Opening slide
{"type":"title","title":"...","subtitle":"...","eyebrow":"RESEARCH WALKTHROUGH",
 "meta":"Authors, year • one line", "notes":"speaker notes (optional)"}

// Section divider between major parts
{"type":"section","number":"02","title":"The Method","subtitle":"optional"}

// Workhorse slide: bullets, with an optional image on the right
{"type":"content","kicker":"motivation","title":"...",
 "bullets":["point","another point",{"text":"sub-point","level":1}],
 "image":"diagram_attention.png",   // omit for full-width text
 "caption":"Fig 2 — ...", "notes":"..."}

// Full-bleed image with a caption — use for big architecture diagrams / plots
{"type":"bigimage","kicker":"architecture","title":"...",
 "image":"figures/fig03.png","caption":"...","notes":"..."}

// Side-by-side comparison (before/after, baseline/proposed, RNN/Transformer)
{"type":"comparison","title":"...",
 "left":{"heading":"Baseline","bullets":["...","..."]},
 "right":{"heading":"Proposed","bullets":["...","..."]}}

// Pull-quote / key takeaway
{"type":"quote","text":"...","attribution":"paper abstract (paraphrased)"}
```

Notes:

- `bullets` items are strings, or `{"text": "...", "level": 1}` for one indent.
- `image` is resolved against `--assets`. A missing image renders a labelled placeholder (the deck still builds), so a failed generation never blocks you.
- Put the deeper explanation a learner can read later into `notes` (speaker notes) — keep on-slide bullets tight.

## Image prompts schema (`prompts.json`)

```jsonc
[
  {"id":"attention_schematic",
   "prompt":"Technical diagram: scaled dot-product attention. Show Q, K, V as labelled boxes, a matrix multiply, a softmax, and a weighted sum. Clear arrows and labels.",
   "shape":"landscape"},            // landscape | portrait | square
  {"id":"rnn_bottleneck","prompt":"...","shape":"portrait",
   "transparent": false}
]
```

- `id` becomes the filename (`<id>.png`) → reference it in `deck.json`.
- gpt-image-2 renders **text inside images** well, so labelled diagrams, flowcharts and infographics are fair game — lean into them.
- A shared style suffix is auto-appended for visual coherence; override per-run with `--style-suffix`.

---

## Pedagogy checklist (the part that makes it a *learning* deck)

- Open with the **problem and stakes** before any method.
- For every mechanism: **intuition / analogy first**, then the precise version.
- Turn each equation into a sentence ("this just measures how similar two vectors are, normalized so big dimensions don't blow up the scale").
- Use `comparison` slides for "old way vs new way".
- Reuse the paper's real result figures; explain *what to look at* in the caption.
- End with: what's genuinely new, what it enables, and stated limitations.
- Prefer more slides over crowded ones. One idea per slide.

## Theme choice

`midnight` (dark, technical — default), `paper` (warm light, academic), `forest` (dark green). Pick one that fits the subject; keep it consistent.