Login Machine

One browser agent loop that logs into any website.
Credentials never touch the model. They flow directly into the browser DOM.


Why

If you're building browser agents that need to log into websites, you know the pain. Every website has a different login flow, and traditional automation means writing a dedicated script for each one. That means hardcoded selectors and brittle state machines, and everything breaks when a site ships a redesign.

But login pages are designed for humans. Every screen is self-contained. You can always figure out what to do just by looking at it. An LLM with vision can do the same.

At Anon, the Login Machine replaced hundreds of per-website scripts with a single agent loop that works for any login flow. Multi-step credentials, SSO pickers, MFA prompts, and magic links are all handled by the same code.

Quick Start

cp .env.example .env.local
# Fill in your API keys
npm install
npm run dev

Open http://localhost:3000 and paste a login URL, or try the hosted demo.

Environment Variables

| Variable | Description |
| --- | --- |
| `ANTHROPIC_API_KEY` | Anthropic API key for Claude |
| `BROWSERBASE_API_KEY` | BrowserBase API key |
| `BROWSERBASE_PROJECT_ID` | BrowserBase project ID |
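
A startup check can make a missing key fail fast instead of surfacing later as an API error. The following is a minimal sketch that assumes all three variables are required; `missingEnvVars` is a hypothetical helper and not part of the repository.

```typescript
// The three variables documented above, assumed here to all be required.
const REQUIRED_ENV_VARS = [
  "ANTHROPIC_API_KEY",
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
] as const;

type EnvMap = Record<string, string | undefined>;

// Hypothetical helper: returns the names that are absent or blank.
function missingEnvVars(env: EnvMap): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `missingEnvVars(process.env)` before the server starts, and the result logged if it is not empty.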

How It Works

flowchart TD
    A(["Navigate to login page"]) --> B["Take Screenshot<br/>+ Extract HTML"]
    B --> C["LLM Classifies<br/>Screen Type"]
    C --> D{"Logged in?"}
    D -->|"No"| F["Request Input<br/>from User"]
    D -->|"Yes"| E(["Done"])
    F --> G["Submit Input<br/>in Browser"]
    G --> B
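
The loop in the diagram can be sketched as a single function. This is an illustration of the control flow only, not the repository's code: the `LoopDeps` callbacks (`analyze`, `requestInput`, `submitInput`) and the `maxSteps` cap are hypothetical names introduced here, and the sketch follows the diagram's simplified path in which every non-terminal screen asks the user for input.

```typescript
// Hypothetical shape of one analysis result; the real schemas are richer.
type ScreenAnalysis = { type: string };

// Hypothetical callbacks standing in for screenshot + HTML analysis,
// the prompt shown to the user, and the action performed in the browser.
interface LoopDeps {
  analyze(): Promise<ScreenAnalysis>;
  requestInput(screen: ScreenAnalysis): Promise<unknown>;
  submitInput(screen: ScreenAnalysis, input: unknown): Promise<void>;
}

async function runLoginLoop(deps: LoopDeps, maxSteps = 20): Promise<boolean> {
  for (let step = 0; step < maxSteps; step++) {
    // Re-observe the page on every iteration; never assume the next screen.
    const screen = await deps.analyze();
    if (screen.type === "logged_in_screen") return true;

    const input = await deps.requestInput(screen);
    await deps.submitInput(screen, input);
  }
  return false; // step budget exhausted without reaching a logged-in screen
}
```

Passing the dependencies in as callbacks keeps the loop independent of any particular browser or model client, which also makes it easy to exercise with fakes.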

Stripped HTML + screenshot. Raw page HTML is full of scripts, styles, SVGs, and tracking pixels. The extractor walks the DOM recursively, strips everything except form-relevant tags and attributes, and traverses Shadow DOM boundaries so enterprise SSO widgets aren't missed. This cuts token usage by roughly 10x on complex pages and reduces hallucinated locators.
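
The recursive stripping step can be illustrated without a browser. The sketch below works on a plain-object tree (`ElementLike`, a hypothetical stand-in for a DOM element) and uses assumed allow-lists; the tags and attributes the project actually keeps may differ.

```typescript
// Minimal element shape so the sketch runs without a browser.
interface ElementLike {
  tag: string;
  attrs?: Record<string, string>;
  text?: string;
  children?: ElementLike[];
  shadowChildren?: ElementLike[]; // contents of an open shadow root, if any
}

// Assumed allow-lists, chosen for illustration only.
const KEEP_TAGS = new Set(["form", "input", "button", "label", "select", "option", "a"]);
const KEEP_ATTRS = new Set(["id", "name", "type", "placeholder", "aria-label", "href"]);

const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");

function stripHtml(el: ElementLike): string {
  // Recurse into light-DOM and shadow-DOM children alike.
  const inner = [...(el.children ?? []), ...(el.shadowChildren ?? [])]
    .map(stripHtml)
    .join("");

  // A dropped tag contributes only its form-relevant descendants.
  if (!KEEP_TAGS.has(el.tag)) return inner;

  const attrs = Object.entries(el.attrs ?? {})
    .filter(([name]) => KEEP_ATTRS.has(name))
    .map(([name, value]) => ` ${name}="${escapeHtml(value)}"`)
    .join("");

  if (el.tag === "input") return `<input${attrs}>`; // void element, no closing tag
  const text = escapeHtml((el.text ?? "").trim());
  return `<${el.tag}${attrs}>${text}${inner}</${el.tag}>`;
}
```

Scripts, wrapper `div`s, and styling attributes disappear, while a button nested inside a custom element's shadow root still reaches the output.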

Credential isolation. The LLM analyzes the page and returns structured data describing what fields exist and their Playwright locators. It never sees what the user types. Credentials flow directly from the user into the browser DOM via Playwright.
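
A minimal sketch of that separation follows. The model's output is limited to field descriptions and locators, and a separate function moves the user's values into the page. `FieldDescriptor` and `fillCredentials` are hypothetical names; `PageLike` is a structural subset of Playwright's `Page` (`locator(selector).fill(value)`), declared locally so the sketch runs without a browser.

```typescript
// What the model returns: which fields exist and how to find them. No values.
interface FieldDescriptor {
  name: string; // e.g. "email"
  locator: string; // Playwright selector produced by the model
}

// Structural subset of Playwright's Page used by this sketch.
interface PageLike {
  locator(selector: string): { fill(value: string): Promise<void> };
}

// Hypothetical helper: values come from the user and go straight to the page.
// Nothing in this function builds a prompt or calls the model.
async function fillCredentials(
  page: PageLike,
  fields: FieldDescriptor[],
  userValues: Record<string, string>,
): Promise<void> {
  for (const field of fields) {
    const value = userValues[field.name];
    if (value === undefined) {
      throw new Error(`No value provided for "${field.name}"`);
    }
    await page.locator(field.locator).fill(value);
  }
}
```

Because `userValues` is only ever passed to `fill`, the isolation property can be checked by reading this one function rather than auditing every prompt.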

Self-correcting locators. Every LLM-generated Playwright locator is validated against the live DOM before use. If a locator doesn't match, the error is fed back to the LLM in <error-history> tags for retry with context (up to 3 attempts).
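
The validate-and-retry cycle can be sketched as follows. The `propose` and `countMatches` callbacks are hypothetical: the first stands for the LLM call that receives the prior errors, the second for a live DOM check (Playwright's `locator.count()` would be one way to implement it). Treating exactly one match as valid is an assumption of this sketch.

```typescript
// Hypothetical callback signatures, injected so the sketch runs without a browser.
type ProposeLocator = (errorHistory: string[]) => Promise<string>;
type CountMatches = (locator: string) => Promise<number>;

async function resolveLocator(
  propose: ProposeLocator,
  countMatches: CountMatches,
  maxAttempts = 3,
): Promise<string> {
  const errorHistory: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Each proposal sees every earlier failure.
    const locator = await propose([...errorHistory]);
    const matches = await countMatches(locator);
    if (matches === 1) return locator; // validated against the live DOM
    errorHistory.push(`Attempt ${attempt}: "${locator}" matched ${matches} elements`);
  }
  throw new Error(
    `No valid locator after ${maxAttempts} attempts:\n${errorHistory.join("\n")}`,
  );
}

// One way the history could be rendered into the retry prompt.
function formatErrorHistory(errors: string[]): string {
  return errors.length === 0
    ? ""
    : `<error-history>\n${errors.join("\n")}\n</error-history>`;
}
```

Recording the match count in each error, rather than only "failed", gives the model a way to tell a selector that is too loose from one that matches nothing.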

Screen Types

The LLM classifies every page into one of six types, each with a strict Zod schema:

| Type | What It Is | How It's Handled |
| --- | --- | --- |
| `credential_login_form` | Email, password, OTP fields + submit button | Shows dynamic form → user fills → agent types into DOM |
| `choice_screen` | Account picker, SSO options, workspace selector | Shows buttons → user picks → agent clicks |
| `magic_login_link` | "Check your email" screens | Shows URL input → user pastes link → agent navigates |
| `loading_screen` | Spinners, redirects, Cloudflare challenges | Auto-waits and re-analyzes (max 12 retries) |
| `blocked_screen` | Cookie banners, popups blocking the flow | Auto-dismisses and re-analyzes |
| `logged_in_screen` | Dashboard, homepage | Terminal success state |
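
The six types map naturally onto a discriminated union. The project validates with Zod; to stay dependency-free, the sketch below uses plain TypeScript types of the kind such schemas could infer. Only the six `type` values come from the table above. The other field names (`fields`, `options`, `dismissLocator`, and so on) and the `nextStep` helper are assumptions made for illustration.

```typescript
// Assumed field shapes; only the `type` literals are taken from the table.
type LoginScreen =
  | {
      type: "credential_login_form";
      fields: { name: string; locator: string }[];
      submitLocator: string;
    }
  | { type: "choice_screen"; options: { label: string; locator: string }[] }
  | { type: "magic_login_link" }
  | { type: "loading_screen" }
  | { type: "blocked_screen"; dismissLocator: string }
  | { type: "logged_in_screen" };

type NextStep = "ask_user" | "auto_retry" | "done";

// Hypothetical dispatcher: which screens need the user and which resolve alone.
function nextStep(screen: LoginScreen): NextStep {
  switch (screen.type) {
    case "credential_login_form":
    case "choice_screen":
    case "magic_login_link":
      return "ask_user";
    case "loading_screen":
    case "blocked_screen":
      return "auto_retry";
    case "logged_in_screen":
      return "done";
    default: {
      // Compile-time exhaustiveness check: adding a seventh type fails here.
      const unreachable: never = screen;
      throw new Error(`Unknown screen: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The `never` assignment in the default branch means a new screen type cannot be added to the union without the compiler pointing at every dispatcher that needs a new case.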

Design Principles

Observe, don't assume. Every action is followed by a fresh page analysis. The system never guesses what screen comes next.
Validate before acting. LLM outputs are checked against the live DOM. Hallucinated selectors are caught and corrected before they cause errors.
Fail forward with context. When something goes wrong, the error becomes part of the next attempt's context. The LLM doesn't repeat the same mistake.

Architecture

src/
├── app/api/chat/route.ts           # Single API endpoint (start/submit/close)
├── components/
│   ├── chat.tsx                     # Main UI shell (browser + chat columns)
│   ├── message-bubble.tsx           # Dynamic message rendering
│   ├── credential-form.tsx          # Login form with inline errors
│   ├── choice-buttons.tsx           # SSO / option selection
│   ├── magic-link-input.tsx         # Magic link URL input
│   └── log-panel.tsx                # Terminal-style log viewer
├── hooks/use-login-session.ts       # State management + SSE streaming
└── lib/ai-login/
    ├── agent.ts                     # LLM analysis + screen handlers
    ├── browser.ts                   # BrowserBase + Playwright (stateless)
    ├── prompts.ts                   # System prompt for classification
    └── types.ts                     # Zod schemas for all screen types

Stack

| Component | Technology |
| --- | --- |
| Frontend | Next.js 16, React 19, Tailwind 4 |
| LLM | Claude Sonnet 4.5 via Vercel AI SDK |
| Browser | Playwright over CDP via BrowserBase |
| Validation | Zod schemas + live DOM locator checks |

License

MIT. See LICENSE.

Built by @RichardHruby and @jesse-olympus at Anon
