Writing — Tom Wang

These are notes, thoughts, progress logs, and ideas I'm still forming. Some will age well. Some won't — that's the point.

Learning Log May 19, 2026

AI Notes — May 19

Meta moves 7,000 employees into AI-native divisions days before layoffs. Humanoid robots as four interconnected systems. When marginal cost goes to zero, value migrates rather than disappears.

Learning Log May 18, 2026

AI Notes — May 18

Drone warfare: Yaroslav Azhnyuk on FPV drones as the new god of war, China's 4 billion-unit drone capacity, why rare earths are a hard constraint, and what the West has not yet built.

Learning Log May 17, 2026

AI Notes — May 17

US vs China frontier model Elo over time. Similar slopes, staggered starts — current gap around 450 Elo and not closing on linear extrapolation.

Learning Log May 16, 2026

AI Notes — May 16

Skepticism about Figure vs broader conviction about robotics acceleration. Sales as a steady role. Cerebras IPO. Codex remote control.

Learning Log May 15, 2026

AI Notes — May 15

Abridge: from AI medical scribe to clinical intelligence layer. Three strategic phases: save time, save money, save lives. Why prototypes beat PRDs.

Learning Log May 14, 2026

AI Notes — May 14

A classic data-leakage training failure: a sepsis-prediction model that cheated with future information and collapsed in real hospitals.

Learning Log May 13, 2026

AI Notes — May 13

Use your longest agent run as a difficulty proxy. Perplexity's skill design. Blackwell as the reference platform for large-MoE serving. Hassabis on AI for health.

Learning Log May 12, 2026

AI Notes — May 12

Few-shot vs zero-shot prompting explained, with selection advice. Plus full-duplex multimodal interaction.

Learning Log May 11, 2026

AI Notes — May 11

Jack Clark's Import AI: a 60%+ chance of human-out-of-the-loop AI R&D. Forward Deployed Engineers. What Chinese AI and robotics companies are building.

Learning Log May 10, 2026

AI Notes — May 10

Code with Claude: Managed Agents — accomplish a goal by giving Claude an outcome and a budget. Anthropic platform team on harness and model path dependence.

Learning Log May 9, 2026

AI Notes — May 9

Anthropic's valuation in light of secondary-market and media reports. Notes on Figure Robotics.

Learning Log May 8, 2026

AI Notes — May 8

Why the data industry is still immature: $1M+ per RL environment, hundreds of millions a year, and labs still prefer to build in-house. Anthropic's Dreaming: agents that review past sessions, rewrite memory, and learn between runs as a team. GPT-Realtime-Translate goes live, translating 70+ input languages into 13 output languages.

Learning Log May 7, 2026

AI Notes — May 7

Claude rate limits doubled after the SpaceX compute deal. Harvey's LAB benchmark covers 1,200 long-horizon legal-agent tasks. Genesis AI ships GENE-26.5 with 100x cheaper data collection hardware. Figure HQ tour: ~1M hours of pre-training data, sim-to-real zero-shot, 50-200 Hz onboard inference, and a high-DOF hand built to learn from human videos. Hugging Face's Reachy Mini App Store points at a desktop-robot category.

Learning Log May 6, 2026

AI Notes — May 6

AI for science: o3 cuts a multi-day physics calculation to 11 minutes. Anthropic JV with Blackstone/H&F/Goldman and OpenAI's Deployment Company push model-makers downstream into B2B consulting. GPT-5.5 Instant becomes the ChatGPT default with memory sources exposed. RL infra shifts from single-shot rewards to long-running action systems; Anthropic Orbit and Manus point to a new proactive-assistant category.

Learning Log May 5, 2026

AI Notes — May 5

From software that gives you tools to software that delivers results. The next-gen data supplier playbook: outcome delivery, lifecycle management, productized service tiers, pricing tied to model metrics. Meta acquires ARI to bet on robotics as a training strategy. Model × harness × context wins: prompt and middleware swaps move gpt-5.2-codex 13.7 points on Terminal-Bench 2.0.

Learning Log May 4, 2026

AI Notes — May 4

Cyber psychosis: builders shipping 163 commits a day, vibe-coding straight to production. What AI cannot copy — premium subscription letters, boutique consulting, curated brands, members clubs, anyone bearing legal responsibility. Cursor's Composer 2: continued pretraining before RL adds 17.1 CursorBench points. Keep Rate as the behavioral north-star. Why PMs become loop designers and product taste is cost judgment. Defending against AI cyberattacks.

Report May 5, 2026

Data Annotation Industry Report

Market landscape, company profiles, pricing models, technical trends and pain points across the AI data annotation industry. Field research and analysis (Chinese).

Learning Log May 3, 2026

AI Notes — May 3

AI-native organizations: why companies see zero gains while individuals get 15-40% faster. Three rebuild patterns: subsidiary spin-off, internal Pods, and laying off everyone who codes. End-to-end ownership, trait-based teams, and context as the moat. Cursor's UI/UX lead on software as concept stacks. Why fine-tuning became continued pretraining, and the new pre/mid/post training pipeline. Bad data, taste at scale, and benchmark leakage.

Learning Log May 2, 2026

AI Notes — May 2

Agent orchestration as a while-loop of tool calls, in five steps. LLM-era distillation: data distillation and CoT distillation. PMs writing only the roadmap and letting Claude do everything else. The six layers of AI products. GPT-5.5, Grok 4.3, DeepSeek V4 Pro and the closing open/closed gap. Six places synthetic data can't replace human annotation.

Learning Log May 1, 2026

AI Notes — May 1

Coding agent shootout: Claude Code, Claude Design, Cursor, Codex on a single landing-page brief. nanochat depth scaling and the FP8 training trick. Cursor SDK leads Terminal-Bench 2.0. Why Apache 2.0 actually matters for enterprise. 2023–2025 AI value flowed to infrastructure: VR NVL72 economics and the neocloud margin compression.

Learning Log April 30, 2026

AI Notes — April 30

Why the agent-era CPU narrative is real but smaller than the GPU one. The CPU player map: AMD vs Intel vs hyperscaler ARM vs Ampere. How much GPU/CPU one humanoid robot actually needs, with Jetson Thor as the de facto onboard monopoly. Mayo's REDMOD catches pancreatic cancer up to 3 years early. Stripe's four-protocol agent payment stack.

Learning Log April 29, 2026

AI Notes — April 29

NVIDIA Nemotron 3 Nano Omni: 30B/A3B multimodal MoE, 256K context, 9x throughput. Mini-SGLang prefix matching with the radix tree. Unsloth LoRA: merged vs non-merged tradeoffs. Mimicking Dream of the Red Chamber style with a 167MB adapter. TRL DPO end-to-end.

Learning Log April 28, 2026

AI Notes — April 28

Sakana's 7B Conductor orchestrates frontier models, hits 83.9% on LiveCodeBench. OpenAI's AI-first phone targeting 2028. GUI Agent annotation needs a totally different paradigm. YC Summer 2026 RFS: 14 directions betting AI is now infrastructure, not feature.

Learning Log April 27, 2026

AI Notes — April 27

Medical-LLM refactor: 4 findings on overnight runs and multi-format interference. Architectural breakdown of Gemma 4, Qwen 3.6, GLM-5.1, Kimi K2.6 and DeepSeek V4-Pro. Anthropic's Project Deal: Opus agents close better trades than Haiku.

Learning Log April 26, 2026

AI Notes — April 26

SkillsBench vs our skillrank: a postmortem on seven mistakes, including LLM-as-judge instead of deterministic verifiers, pairwise instead of pass/fail, no with/without baseline, and too much time on infra.

Book April 26, 2026

Where Sages Agree

A book on where four wisdom traditions converge — Zen, Confucianism, Stoicism, and Adlerian psychology — on what it means to live well in an anxious age.

Learning Log April 25, 2026

AI Notes — April 25

DeepSeek-V4 vs Flash Attention vs MHA — algorithmic vs architectural innovation. CSA/HCA shrinks KV cache 5-10x via low-rank latent compression. GPT-Image 2 + Seedance 2.0 short-film workflow.

Learning Log April 24, 2026

AI Notes — April 24

GPT-5.5 ships: faster, cheaper on net, smarter. swyx on AI-native: skills as the agent unit, app companies outlast infra, Taalas bakes models into silicon. World ID 4.0 hits Tinder, Zoom, DocuSign.

Learning Log April 23, 2026

AI Notes — April 23

Shopify at ~100% internal AI use, critique loops over parallel agents, Tangle/Tangent/SimGym. MacAskill on AI character as the most underrated lever. mini-sglang RadixAttention vs nano-vllm: 7311 tok/s on a single 3090.

Learning Log April 22, 2026

AI Notes — April 22

Claude Design locks in creativity. GPT-Image-2 tops Image Arena by +242 Elo. ChatGPT Images 2.0 adds reasoning before drawing. RankAI's SEO+GEO stack. Google: 75% of new code is AI-generated.

Learning Log April 21, 2026

AI Notes — April 21

RLVR explained via DeepSeek-R1. Hermes agent patterns: stateless units, structured failure traces, directory-scoped AGENTS.md. Alex Imas on the post-commodity economy.

Learning Log April 20, 2026

AI Notes — April 20

Generative Agents (Smallville), OASIS large-scale social simulation, and Love First Know Later — three papers mapping the theoretical base for persona products like Halo.

Learning Log April 19, 2026

AI Notes — April 19

Claude Code terminal shortcuts (Shift+Tab, Esc, @). Fengtian's workflow: two Max plans + voice input + Agent Team mode = 10x productivity.

Learning Log April 18, 2026

AI Notes — April 18

Claude Design pipeline: Pinterest inspiration → AI-generated background and character → Seedance 2.0 animation → motionsites.ai template → Landbook layouts.

Learning Log April 17, 2026

AI Notes — April 17

Overseeing agents is the future, not writing code. Deep dive into nano-vllm attention, preempt, prefix caching. McKinsey on the agentic organization.

Learning Log April 16, 2026

AI Notes — April 16

Energy-Based Models: not new — Hopfield Networks, Boltzmann Machines, diffusion models all trace back here. Yann LeCun's bet against autoregressive LLMs.

Learning Log April 15, 2026

AI Notes — April 15

Local model rankings from Reddit, how to steer AI toward your design style with images, the 2026 AI engineer roadmap, Karpathy on the AI capability gap.

Learning Log April 14, 2026

AI Notes — April 14

nano-vLLM deep dive: prefill vs decode, KV cache, PagedAttention, continuous batching. Plus Notion's Model Behavior Engineer role and software factory design.

Learning Log April 13, 2026

AI Notes — April 13

GLM-5.1 architecture explained (MoE, MLA, DSA). Using Claude for tax filing: what broke. AI writing is harder than it looks. The folder-as-agent pattern.

Learning Log April 12, 2026

AI Notes — April 12

A quiet day. Sometimes letting ideas settle is the work.

Learning Log April 11, 2026

AI Notes — April 11

Consultant-style agent coordination: cheap executor + expensive advisor. Haiku + Opus doubles BrowseComp scores vs Haiku alone.

Learning Log April 10, 2026

AI Notes — April 10

Meta's Muse Spark: 10x efficiency over Llama 4, 16 hidden tools in meta.ai. Two thoughts: AI tools as games, vibe coding as web fiction.

Learning Log April 9, 2026

AI Notes — April 9

Mythos scores 93.9% on SWE-bench — a nuclear weapon. Picotron distributed training: DP naive vs bucket, AFAB vs 1F1B pipeline schedules.

Learning Log April 8, 2026

AI Notes — April 8

Moltbook: AI theater or genuine emergence? Nebius $46B in signed contracts. Ryan Leoplo on harness engineering and zero human-written code.

Learning Log April 7, 2026

AI Notes — April 7

Why changing one character in an image is harder than generating a cyberpunk city. Full diffusion model walkthrough with math and code.

Learning Log April 6, 2026

AI Notes — April 6

Claude's Cowork feature supports Computer Use across devices — control a remote machine's browser without touching your own.

Share February 7, 2026

The Force That Keeps Me Moving

A simple number changed everything. Thirty thousand days. That's roughly how many days a human life has. This realization reshaped how I live, work, and think about time.

Reflection February 1, 2026

Why I'm Building in Public

The decision to document everything publicly wasn't easy. Here's why I chose transparency over polish, and what I hope to gain from it.

Product Coming Soon

What Steplify Taught Me About Product-Market Fit

My startup failed. But the lessons about listening to users, timing, and the gap between conviction and validation are worth more than any success.