AI Native Paper Engineering: Lessons Learned from the Trenches

After spending ~$680 and countless hours with AI agents (Codex, Cursor, Gemini, etc.), I’ve learned a lot about what works—and what painfully doesn’t—when using AI to write a technical paper (thesis/journal).

This is not "best practice." It’s a raw, evolving experience report. Human in the loop is non-negotiable.



🔥 The Hard Problems We Faced

  • AI can’t read local PDFs well

  • No idea how to scaffold experimental code from zero

  • Can’t articulate requirements clearly to AI

  • Model output doesn’t match expectations

  • Output is slow or unstable

  • Token burn is terrifying

  • One-session addiction (hard to restart)

  • AI moves too fast → ADHD-like chaos


🧠 Two Prompting Frameworks That Actually Help

1. SCAFF — for feature/component requests

  • Situation: tech stack, current progress, design style

  • Challenge: exact requirements (validation, behavior)

  • Audience: yourself or maintainers

  • Format: file names, types, styling

  • Foundations: constraints (no UI libs, useState only)

### SCAFF

```text
[Situation]
I am developing an admin backend for a personal blog system.
Tech stack: Next.js 16 + TypeScript + Tailwind CSS
Current progress: homepage and article list page are complete; now I need to add login functionality.
Design style: minimalist, referencing Notion's login page.

[Challenge]
Implement the admin login page:
- Form includes: email input field, password input field, login button
- On clicking login, validate the input (email format, password not empty)
- On error, display a red prompt below the corresponding input field
- No real backend validation needed for now, just the frontend interface and interaction

[Audience]
User: only me (the blog administrator)
Code maintainer: myself, a React beginner
Please add comments at key logic points to help me understand

[Format]
Please provide:
1. The complete login page component code (single file)
2. TypeScript type definitions included
3. Tailwind CSS for styling
4. Filename: LoginPage.tsx

[Foundations]
Do not use any UI component libraries (e.g., shadcn/ui, Ant Design)
Do not use any third-party form libraries (e.g., React Hook Form)
Manage state with React's native useState
Responsive design: adapt to both mobile and desktop
```

2. RGC — for refactoring or coding tasks

  • Role: Python engineer focused on readability

  • Goal: clarify names, extract duplication, add comments

  • Constraints: preserve behavior, no list comprehensions


### RGC

````text
[Role]
You are a Python engineer who focuses on code readability and maintainability.

[Goal]
Refactor the following code to improve readability:
- Make variable names clearer
- Extract duplicate logic into functions
- Add necessary comments

```python
def f(l):
    r = []
    for i in l:
        if i > 0:
            r.append(i * 2)
    return r
```

[Constraints]
Keep functionality exactly the same
Do not use list comprehensions (I haven't learned that part yet)
No need to add type annotations
Use Chinese for comments
````
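For comparison, the RGC prompt above should produce a refactor along these lines (comments shown in English here for readability; the actual prompt asks for Chinese):

```python
def double_positive_numbers(numbers):
    """Same behavior as the original f(l): keep positive numbers, doubled."""
    doubled_values = []
    for number in numbers:
        # Only positive numbers are kept; each one is doubled.
        if number > 0:
            doubled_values.append(number * 2)
    return doubled_values
```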

Garbage in, garbage out.


🛠️ Solutions to Common Nightmares

📄 PDF reading

→ Use MinerU (free desktop/client) to convert PDF → Markdown.
Then extract key points with AI before feeding into context.
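The "extract key points" step can be scripted before any AI is involved. A minimal sketch, assuming MinerU's Markdown output uses standard `#` headings (the excerpt length is an arbitrary choice):

```python
import re

def digest_markdown(md_text, max_chars=400):
    """Split a MinerU-converted Markdown paper by headings and keep a short
    excerpt of each section, producing a compact digest to paste into context."""
    # re.split with one capture group alternates: [pre, heading, body, heading, body, ...]
    sections = re.split(r"^(#{1,3} .+)$", md_text, flags=re.MULTILINE)
    digest = []
    for heading, body in zip(sections[1::2], sections[2::2]):
        digest.append(heading.strip())
        digest.append(body.strip()[:max_chars])
    return "\n".join(digest)
```

Feed the digest to the AI for key-point extraction instead of the full paper; only the sections that matter go into context verbatim.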

🧱 Building experiments from scratch

→ Start by imitating open-source code.
Use SimPy for discrete-event simulation. Let AI refactor later.
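If SimPy feels opaque at first, it helps to know the pattern it wraps: a priority queue of timestamped events. A minimal stdlib sketch of that core loop (this illustrates the idea, not SimPy's actual API):

```python
import heapq
import itertools

def run_simulation(until=10):
    """Tiny discrete-event loop: pop the earliest event, run it, and let it
    schedule future events. SimPy builds this same pattern into Environment."""
    events, log = [], []
    order = itertools.count()  # tie-breaker for events at the same timestamp

    def schedule(time, action):
        heapq.heappush(events, (time, next(order), action))

    def arrival(now):
        log.append(f"arrival at t={now}")
        if now + 3 <= until:
            schedule(now + 3, arrival)  # next arrival in 3 time units

    schedule(0, arrival)
    while events:
        now, _, action = heapq.heappop(events)
        action(now)
    return log
```

`run_simulation(10)` logs arrivals at t=0, 3, 6, 9; in SimPy the same process would be a generator yielding `env.timeout(3)`.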

🎯 Model output off-target

  • Break tasks into tiny PRDs

  • One agent implements, another reviews (repeat)

  • TDD (test-driven development) skills help a lot
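One way to apply the TDD point: write the test yourself, hand both test and tiny PRD to the implementing agent, and accept only code that passes. A minimal sketch (the function and its spec are made up for illustration):

```python
# Step 1 (human): pin down expected behavior as a test, before any implementation.
def test_moving_average():
    assert moving_average([1, 2, 3, 4], window=2) == [1.5, 2.5, 3.5]

# Step 2 (agent): implement until the test passes; the reviewing agent re-runs it.
def moving_average(values, window):
    """Return the mean of each consecutive `window`-sized slice."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

test_moving_average()  # raises AssertionError on a bad implementation
```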

⏱️ Slow experiments + long waits

  • Use git worktree → parallel branches for different chapters/algorithms (DRL, game theory, evolution)

  • Remote desktop (Sunflower) to run AI from phone
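The worktree layout can be set up in a few commands; the branch and directory names below are illustrative (the demo creates a throwaway repo so it runs anywhere; in practice start from your existing paper repo):

```shell
# Demo setup: a throwaway repo with one initial commit.
git init -q paper && cd paper
git -c user.email=you@example.com -c user.name=you commit -q --allow-empty -m "init"

# One worktree per experiment line, each checked out on its own branch:
git worktree add ../paper-drl  -b exp/drl          # DRL chapter
git worktree add ../paper-game -b exp/game-theory  # game-theory chapter
git worktree list   # shows every checkout and its branch
```

Each worktree is an independent working directory sharing one `.git`, so two agents can run long experiments on different branches at the same time without stomping on each other's files.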

🔥 Token burn too fast

  • Know what to change → point AI exactly there

  • Don’t switch models mid-chat (breaks KV cache)

  • Cheap model for simple tasks, strong model for hard ones

  • 200k context = double cost → keep it focused
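A back-of-the-envelope cost check makes the "keep it focused" point concrete. The per-million-token prices below are placeholders, not any vendor's real pricing; plug in your model's actual rates:

```python
def prompt_cost_usd(input_tokens, output_tokens,
                    in_price_per_m=3.00, out_price_per_m=15.00):
    """Rough cost estimate; prices are hypothetical per-million-token rates."""
    return (input_tokens / 1e6 * in_price_per_m
            + output_tokens / 1e6 * out_price_per_m)

# A focused 20k-token context vs. a bloated 200k-token one, same 2k output:
focused = prompt_cost_usd(20_000, 2_000)    # ~$0.09
bloated = prompt_cost_usd(200_000, 2_000)   # ~$0.63
```

Multiply by hundreds of turns per chapter and the difference is most of a $680 bill.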

💬 One-session addiction

  • Use AGENTS.md to define compression rules

  • Ask AI to write a handover doc before hitting limits
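The compression rules in AGENTS.md can be as simple as a few bullet points. A hypothetical excerpt (the file name `HANDOVER.md` is my own convention, not a standard):

```markdown
## Context compression rules
- When the conversation nears the context limit, write a handover doc to
  HANDOVER.md: goal, current state, open TODOs, key file paths, gotchas.
- Summarize finished subtasks in one line each; never paste their full diffs.
- Start every new session by reading HANDOVER.md before touching code.
```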

🌀 AI-induced ADHD + losing control

→ Stop. Think. Restart.
Let AI refactor everything from scratch. Remove redundant components.

Attention is all you need.


🧭 Recommended Workflow (Simplified)

| Phase | Tools | Key Idea |
| --- | --- | --- |
| Literature | IEEE, MinerU, AI | Convert PDF → MD → extract |
| Ideation | Xmind, Co-Scientist | 3 core contributions |
| Coding | Codex, Cursor | TDD + parallel branches |
| Writing | Cursor + Opus 4.6 | Markdown first, logic over beauty |
| Figures | Edraw, PaperBanana | Long sentences → diagrams |
| References | MCP / API | Auto-fetch, no manual entry |

💸 Real Cost Estimate (Personal)

  • Edraw student: $30

  • Cursor Ultra ×3: $500

  • Codex ×4: $50

  • Google AI Studio: $100

  • Total ~$680

Can be much lower if planned well. I learned the hard way.


✅ Final Takeaway

Think → Act → Observe → Repeat.

AI is fast, but you are the architect.
Structure your prompts, your context, and your branches.
When lost — restart. When overwhelmed — pause the AI.

Human attention is still the scarcest resource. 
