Research Paper → LibreOffice Impress Explainer
Purpose
Turn a research-paper PDF into a detailed, image-rich .odp presentation that
explains the work clearly enough for a learner to understand everything — the
problem, the core idea, the method, the math intuition, the results, and why it
matters. Output is a native LibreOffice Impress file (.odp) — never .pptx,
never PowerPoint.
When to use
The user gives you a paper (a .pdf, or an arXiv/URL they want explained) and
wants to learn it via slides. Phrases like "make slides from this paper",
"explain this paper", "turn this into a presentation", "deck I can study from".
What "good" looks like
- Teaches, doesn't summarize. Every claim from the paper is unpacked into plain language with intuition first, formalism second. Assume the learner is smart but new to the subfield.
- Image-rich. Most content slides carry a visual. Two image sources:
  - The paper's own figures, extracted automatically (highest fidelity; use these for the real architecture diagrams, result plots, tables).
  - Generated explainer visuals from gpt-image-2: schematics, analogies, step-by-step diagrams, intuition pictures that the paper doesn't contain.
- Detailed. A typical paper becomes 18–32 slides, not 8. Break the method into multiple slides. One idea per slide.
- Coherent look. Generated images share a style so the deck feels designed.
Pipeline (run in order)
All scripts live in scripts/. Work inside a scratch dir, e.g. workdir/.
1 — Extract the paper
python3 scripts/extract_paper.py PAPER.pdf --out workdir
Produces workdir/paper_text.md (page-delimited text with detected section
headers), workdir/figures/*.png (the paper's real figures), figures.json
(manifest with page + dimensions), and meta.json (title, page/word counts).
2 — Read and understand
Read paper_text.md end to end. Build a mental model: What problem? What was
broken before? What's the key insight? How does the method work mechanically?
What do the experiments show? What are the limits? Do not start slides until
you can explain the paper to a beginner without looking.
Inspect the extracted figures (workdir/figures/). Decide which are worth putting
on slides directly (architecture diagrams, key result plots, tables).
3 — Plan the deck + write image prompts
Draft two files:
- workdir/prompts.json: visuals to generate (see schema below). Write a prompt for each concept that benefits from a picture the paper lacks: the core analogy, a simplified mechanism diagram, before/after comparisons, a "how data flows" schematic, an intuition pump for the math. Aim for roughly one generated image per 1–2 content slides, on top of reused paper figures.
- workdir/deck.json: the full deck spec (schema below). Reference generated images as <id>.png and reused paper figures by their path (figures/fig03.png).
4 — Generate images
export OPENAI_API_KEY=... # Codex/Hermes usually has this in env
python3 scripts/generate_images.py workdir/prompts.json --assets workdir/assets
Writes one PNG per prompt as workdir/assets/<id>.png. Also copy any reused
paper figures into the assets dir so everything resolves from one place:
cp workdir/figures/*.png workdir/assets/ # optional, keeps paths simple
If the Hermes runtime has native gpt-image-2 generation (Codex does), you may
generate images directly and just save them as workdir/assets/<id>.png. The
script is the portable fallback.
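Before building, it can help to confirm that every image named in deck.json actually exists in the assets dir. The following is a minimal sketch, not part of the shipped scripts: missing_images is a hypothetical helper, and it assumes each image reference is joined directly onto the --assets directory (how build_odp.py resolves nested paths such as figures/fig03.png is not confirmed here).

```python
from pathlib import Path


def missing_images(deck: dict, assets_dir: str) -> list[str]:
    """Return image references from the deck that are not files under assets_dir."""
    assets = Path(assets_dir)
    missing = []
    for slide in deck.get("slides", []):
        image = slide.get("image")
        # Assumption: a reference resolves as <assets_dir>/<image>, nothing smarter.
        if image and not (assets / image).is_file():
            missing.append(image)
    return missing
```

Load the deck with json.load and call missing_images(deck, "workdir/assets"). An empty list means no slide should fall back to a placeholder, under the path assumption above.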
5 — Build the .odp
python3 scripts/build_odp.py workdir/deck.json workdir/output.odp --assets workdir/assets
That's the deliverable. It opens directly in LibreOffice Impress on Ubuntu.
Optional sanity check / PDF preview:
libreoffice --headless --convert-to pdf workdir/output.odp
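The script steps (1, 4 and 5) can also be driven from Python, one step at a time, since steps 2 and 3 are manual work in between. This is a sketch only: pipeline_commands and run_step are hypothetical helper names, and the command lines are copied from the steps above.

```python
import subprocess


def pipeline_commands(pdf: str, workdir: str = "workdir") -> dict[str, list[str]]:
    """Build the argv list for each scripted step, without running anything."""
    assets = f"{workdir}/assets"
    return {
        "extract": ["python3", "scripts/extract_paper.py", pdf, "--out", workdir],
        "generate": ["python3", "scripts/generate_images.py",
                     f"{workdir}/prompts.json", "--assets", assets],
        "build": ["python3", "scripts/build_odp.py",
                  f"{workdir}/deck.json", f"{workdir}/output.odp",
                  "--assets", assets],
    }


def run_step(name: str, pdf: str, workdir: str = "workdir") -> None:
    """Run one step and raise CalledProcessError if the script exits non-zero."""
    subprocess.run(pipeline_commands(pdf, workdir)[name], check=True)
```

For example, run_step("extract", "PAPER.pdf") runs step 1. Call run_step("generate", ...) and run_step("build", ...) only after prompts.json and deck.json have been written.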
Deck spec schema (deck.json)
{
"theme": "midnight", // midnight | paper | forest
"slides": [ /* slide objects, in order */ ]
}
Slide objects by type:
// Opening slide
{"type":"title","title":"...","subtitle":"...","eyebrow":"RESEARCH WALKTHROUGH",
"meta":"Authors, year • one line", "notes":"speaker notes (optional)"}
// Section divider between major parts
{"type":"section","number":"02","title":"The Method","subtitle":"optional"}
// Workhorse slide: bullets, with an optional image on the right
{"type":"content","kicker":"motivation","title":"...",
"bullets":["point","another point",{"text":"sub-point","level":1}],
"image":"diagram_attention.png", // omit for full-width text
"caption":"Fig 2 — ...", "notes":"..."}
// Full-bleed image with a caption — use for big architecture diagrams / plots
{"type":"bigimage","kicker":"architecture","title":"...",
"image":"figures/fig03.png","caption":"...","notes":"..."}
// Side-by-side comparison (before/after, baseline/proposed, RNN/Transformer)
{"type":"comparison","title":"...",
"left":{"heading":"Baseline","bullets":["...","..."]},
"right":{"heading":"Proposed","bullets":["...","..."]}}
// Pull-quote / key takeaway
{"type":"quote","text":"...","attribution":"paper abstract (paraphrased)"}
Notes:
- bullets items are strings, or {"text": "...", "level": 1} for one indent.
- image is resolved against --assets. A missing image renders a labelled placeholder (the deck still builds), so a failed generation never blocks you.
- Put the deeper explanation a learner can read later into notes (speaker notes); keep on-slide bullets tight.
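A deck spec can be sanity-checked before it is handed to the build script. The sketch below is hypothetical (validate_deck is not one of the shipped scripts), and the required-field table is an assumption inferred from the slide examples above, not a rule confirmed by build_odp.py.

```python
THEMES = {"midnight", "paper", "forest"}

# Assumption: minimum fields per slide type, inferred from the examples above.
REQUIRED = {
    "title": ("title",),
    "section": ("title",),
    "content": ("title", "bullets"),
    "bigimage": ("title", "image"),
    "comparison": ("title", "left", "right"),
    "quote": ("text",),
}


def validate_deck(deck: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    theme = deck.get("theme", "midnight")  # midnight is the documented default
    if theme not in THEMES:
        problems.append(f"unknown theme {theme!r}")
    for i, slide in enumerate(deck.get("slides", []), start=1):
        kind = slide.get("type")
        if kind not in REQUIRED:
            problems.append(f"slide {i}: unknown type {kind!r}")
            continue
        for key in REQUIRED[kind]:
            if key not in slide:
                problems.append(f"slide {i} ({kind}): missing {key!r}")
        for bullet in slide.get("bullets", []):
            # Bullets are plain strings or {"text": ..., "level": ...} objects.
            is_text = isinstance(bullet, str)
            is_obj = isinstance(bullet, dict) and "text" in bullet
            if not (is_text or is_obj):
                problems.append(f"slide {i}: malformed bullet {bullet!r}")
    return problems
```

Since the slide examples above carry // comments, strip those before parsing a copied example with json.loads, which rejects comments.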
Image prompts schema (prompts.json)
[
{"id":"attention_schematic",
"prompt":"Technical diagram: scaled dot-product attention. Show Q, K, V as labelled boxes, a matrix multiply, a softmax, and a weighted sum. Clear arrows and labels.",
"shape":"landscape"}, // landscape | portrait | square
{"id":"rnn_bottleneck","prompt":"...","shape":"portrait",
"transparent": false}
]
- id becomes the filename (<id>.png); reference it in deck.json.
- gpt-image-2 renders text inside images well, so labelled diagrams, flowcharts and infographics are fair game. Lean into them.
- A shared style suffix is auto-appended for visual coherence; override per-run with --style-suffix.
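Because each id turns into a filename, a quick check of prompts.json before generation can catch collisions early. This is a hedged sketch: validate_prompts is a hypothetical helper, the filename-safe pattern is an assumption (the script may accept more), and shape is treated as optional because the source does not say whether it has a default.

```python
import re

SHAPES = {"landscape", "portrait", "square"}
# Assumption: letters, digits, underscore and hyphen are safe for <id>.png.
SAFE_ID = re.compile(r"^[A-Za-z0-9_-]+$")


def validate_prompts(prompts: list[dict]) -> list[str]:
    """Return problems found in a prompts.json list; empty means it looks fine."""
    problems = []
    seen = set()
    for n, entry in enumerate(prompts, start=1):
        pid = entry.get("id", "")
        if not SAFE_ID.match(pid):
            problems.append(f"entry {n}: id {pid!r} is not filename-safe")
        elif pid in seen:
            # Two entries with one id would write to the same <id>.png.
            problems.append(f"entry {n}: duplicate id {pid!r}")
        seen.add(pid)
        if not entry.get("prompt", "").strip():
            problems.append(f"entry {n}: empty prompt")
        if "shape" in entry and entry["shape"] not in SHAPES:
            problems.append(f"entry {n}: unknown shape {entry['shape']!r}")
    return problems
```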
Pedagogy checklist (the part that makes it a learning deck)
- Open with the problem and stakes before any method.
- For every mechanism: intuition / analogy first, then the precise version.
- Turn each equation into a sentence ("this just measures how similar two vectors are, normalized so big dimensions don't blow up the scale").
- Use comparison slides for "old way vs new way".
- Reuse the paper's real result figures; explain what to look at in the caption.
- End with: what's genuinely new, what it enables, and stated limitations.
- Prefer more slides over crowded ones. One idea per slide.
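Parts of this checklist can be checked mechanically against deck.json. The sketch below is one possible lint, not a shipped script: lint_deck is a hypothetical name, the 18–32 range comes from the guidance above, and max_bullets=5 is an arbitrary assumption standing in for "crowded".

```python
def lint_deck(deck: dict, min_slides: int = 18, max_slides: int = 32,
              max_bullets: int = 5) -> list[str]:
    """Return pedagogy warnings for a deck spec; these are hints, not errors."""
    slides = deck.get("slides", [])
    warnings = []
    if not min_slides <= len(slides) <= max_slides:
        warnings.append(
            f"{len(slides)} slides; a typical paper lands in {min_slides}-{max_slides}")
    if slides and slides[0].get("type") != "title":
        warnings.append("deck does not open with a title slide")
    if not any(s.get("type") == "comparison" for s in slides):
        warnings.append("no comparison slide for old way vs new way")
    for i, slide in enumerate(slides, start=1):
        count = len(slide.get("bullets", []))
        # Assumption: more than max_bullets bullets means the slide is crowded.
        if count > max_bullets:
            warnings.append(f"slide {i}: {count} bullets; consider splitting")
    return warnings
```

The warnings are advisory. Intuition-first ordering and equation-to-sentence rewrites still need a human read.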
Theme choice
midnight (dark, technical — default), paper (warm light, academic),
forest (dark green). Pick one that fits the subject; keep it consistent.