Building Pluto’s Chrome Extension

Extension Architecture Overview

Pluto’s extension spreads the proof process across five distinct Chrome contexts. Chrome extensions have elevated permissions, so they can use APIs that regular web apps can’t, and we need different contexts because each environment has capabilities the others lack. For example, the background service worker has the privileged extension APIs and coordinates state, but has no DOM. Content scripts can scrape and intercept page data but get only a small subset of extension APIs (mainly messaging and storage), and injected page scripts get none at all. The offscreen document gives us a full DOM plus the ability to spawn Web Workers for heavy WASM-based proof generation. Each context has a specific role, and they communicate via message passing (chrome.runtime.sendMessage, window.postMessage). The background service worker is the main hub, passing messages and maintaining state for the proof generation process.

Background Service Worker: It coordinates the entire proof generation flow, handles messages between all the contexts, captures low-level HTTP metadata at the browser network level, and manages long-lived tasks.
Content Script: Injected into the parent page and into target webpages (e.g. Reddit, Amazon) to interact with each page’s DOM and scripts. Content scripts can read page content, initiate proof flows, and capture full request/response payloads.
Injected Page Script: A helper script injected into the parent page’s execution context. It runs Pluto’s Web SDK in the page’s own JavaScript world, as if it were a native page script, which sidesteps the isolated world that content scripts are confined to.
Offscreen Document: A hidden offscreen page used for heavy computation and multi-threading. This is where the proof generation (the WebAssembly execution) happens without blocking the UI.
Sandboxed Iframe: An isolated, invisible iframe used for tasks that need a fully isolated environment – for example, running content in a context not subject to the page’s Content Security Policy (CSP) or executing untrusted scripts to parse data.

```mermaid
sequenceDiagram
    participant CS as Content Script
    participant PS as Page Script (Injected)
    participant BG as Background Worker
    participant OS as Offscreen Doc
    participant SB as Sandbox Iframe
    participant UI as Sidebar UI
    CS->>PS: Inject Pluto SDK script
    PS-->>CS: Provide Web2 data (DOM, cookies)
    CS->>BG: Request proof generation
    BG->>OS: Run ZK proof (WASM execution)
    OS-->>BG: Proof result ready
    BG->>UI: Update sidebar UI with proof result
    Note over BG,SB: Sandbox iframe used for CSP-bound or isolated flows
```

Context Interaction and Messaging

When a user wants to generate a proof of certain data (e.g. proving their Venmo account balance), the content script first injects Pluto’s SDK into the page. This allows it to execute code in the parent page context (where it can, for example, make authenticated requests). The injection is done by creating a <script> tag whose src points at the SDK file bundled with the extension (the file must be listed under web_accessible_resources in the manifest so the page is allowed to load it):

```
// In content script: inject Pluto SDK into the page context
const script = document.createElement('script')
script.setAttribute('type', 'text/javascript')
script.setAttribute('async', 'true')
const url = chrome.runtime.getURL('parent-page/pluto-sdk.js')
script.setAttribute('src', url)
;(document.head || document.documentElement).appendChild(script)
```

Once injected, the page script (Pluto SDK) runs within the webpage alongside the normal site scripts. It can collect the necessary data (for example, by reading page-rendered content) and then hand it back to the extension’s content script. The content script then sends this data to the background service worker.
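
The hand-off from page script to content script can be sketched as follows. This is a minimal illustration, not Pluto’s actual code: the `pluto-sdk` source tag, the message shape, and the helper names are all assumptions. It shows the kind of filtering a content script would typically apply to `window.postMessage` events before forwarding them to the background.

```javascript
// Hypothetical tag the injected SDK would put on every message it posts.
const SDK_SOURCE = 'pluto-sdk'

// Decide whether a window "message" event plausibly came from the injected script.
// `pageWindow` is the window object the content script shares with the page.
function isSdkMessage(event, pageWindow) {
  return (
    event.source === pageWindow && // ignore messages from iframes or other windows
    event.origin === pageWindow.location.origin &&
    typeof event.data === 'object' &&
    event.data !== null &&
    event.data.source === SDK_SOURCE
  )
}

// Wrap the page payload in an envelope addressed to the background worker.
function toBackgroundMessage(event) {
  return { target: 'Background', action: event.data.action, payload: event.data.payload }
}

// Wiring for the content script. `sendToBackground` is injected so the relay can be
// exercised outside an extension; in practice it would call chrome.runtime.sendMessage.
function relayPageMessages(pageWindow, sendToBackground) {
  pageWindow.addEventListener('message', (event) => {
    if (isSdkMessage(event, pageWindow)) sendToBackground(toBackgroundMessage(event))
  })
}
```

In a content script this would be wired up as `relayPageMessages(window, (m) => chrome.runtime.sendMessage(m))`. Any script on the page can post a message with the same tag, so the background should still treat the payload as untrusted input.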

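The background service worker then opens a dedicated offscreen document: a hidden page that can host a DOM and spawn Web Workers. An extension can have only one offscreen document open at a time, so the code first checks whether one already exists.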

```
// In background: create offscreen document for proof computation if not exists
const offscreenUrl = chrome.runtime.getURL('offscreen/offscreen.html')
const contexts = await chrome.runtime.getContexts({
  contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  documentUrls: [offscreenUrl]
})
if (contexts.length === 0) {
  await chrome.offscreen.createDocument({
    url: offscreenUrl,
    reasons: [chrome.offscreen.Reason.WORKERS, chrome.offscreen.Reason.DOM_PARSER],
    justification: 'Proof generation'
  })
}
```
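
The check above can still race if two proof requests arrive before the first `createDocument` call settles. One way to guard against that, sketched here under assumptions, is to share a single in-flight promise. The function names and the injected `chromeApi` parameter are illustrative (normally it would just be the global `chrome`); the string literals are the values of the `ContextType` and `Reason` enums used above.

```javascript
// In-flight createDocument() promise, if any. Shared by all callers.
let creating = null

// Make sure at most one offscreen document exists, even under concurrent calls.
async function ensureOffscreenDocument(chromeApi, path) {
  const url = chromeApi.runtime.getURL(path)
  const existing = await chromeApi.runtime.getContexts({
    contextTypes: ['OFFSCREEN_DOCUMENT'],
    documentUrls: [url]
  })
  if (existing.length > 0) return // already open: reuse it

  if (!creating) {
    creating = chromeApi.offscreen
      .createDocument({
        url,
        reasons: ['WORKERS', 'DOM_PARSER'],
        justification: 'Proof generation'
      })
      .finally(() => { creating = null })
  }
  await creating // concurrent callers wait on the same promise
}

// Close the document once no proofs are pending, to free its memory.
async function closeOffscreenDocument(chromeApi) {
  await chromeApi.offscreen.closeDocument()
}
```

Passing the API object in keeps the logic testable with a fake; in the background worker the call would be `await ensureOffscreenDocument(chrome, 'offscreen/offscreen.html')`.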

The Role of the Sandboxed Iframe

Every data-collection flow goes through the sandboxed iframe embedded in the offscreen document:

Isolation & CSP safety: The iframe is loaded from sandbox.html with strict <iframe sandbox> flags, so any untrusted prepare.js executes in a lockbox free from site CSPs and with zero access to extension APIs.
Dynamic manifest building: The offscreen page injects three items into the sandbox on each poll:
- the raw HTML of the popup window,
- a JSON blob of current cookies, and
- the user-authored prepare.js.
prepare.js inspects that context (e.g. grabs cookies["api_access_token"]) and mutates a ManifestBuilder instance until it has all required auth headers.
Readiness signalling: If prepare() returns false, the sandbox posts PollManifestBuilderNotReady; the offscreen page keeps polling. When it returns true, the sandbox posts PollManifestBuilderSuccess with the compiled manifest, ending the poll loop.
Security boundary: Because only JSON messages cross the iframe boundary, the extension’s privileged contexts stay safe, while the sandbox shoulders all DOM parsing and token extraction.
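
The polling handshake described above can be sketched roughly as follows. The message names `PollManifestBuilderNotReady` and `PollManifestBuilderSuccess` come from the description; everything else (the `ManifestBuilder` shape, the snapshot fields, the helper names, the retry limits, and the example `prepare` function) is a hypothetical stand-in, since the real interfaces are not shown here.

```javascript
// Hypothetical minimal ManifestBuilder: prepare() mutates it until auth headers are set.
class ManifestBuilder {
  constructor() { this.headers = {} }
  setHeader(name, value) { this.headers[name] = value }
  build() { return { request: { headers: { ...this.headers } } } }
}

// Sandbox side: run the user-authored prepare() once against the current snapshot
// and translate its boolean result into the message posted back to the offscreen page.
function runPrepare(prepare, snapshot, builder) {
  const ready = prepare(snapshot, builder) === true
  return ready
    ? { type: 'PollManifestBuilderSuccess', manifest: builder.build() }
    : { type: 'PollManifestBuilderNotReady' }
}

// Offscreen side: keep sending fresh snapshots until the sandbox reports success.
// `takeSnapshot` and `askSandbox` stand in for the real HTML/cookie capture and
// the postMessage round trip to the iframe.
async function pollUntilReady(takeSnapshot, askSandbox, { intervalMs = 500, maxAttempts = 60 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const reply = await askSandbox(await takeSnapshot())
    if (reply.type === 'PollManifestBuilderSuccess') return reply.manifest
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }
  throw new Error('prepare() never reported ready')
}

// Example of what a user-authored prepare() might look like (illustrative only).
function examplePrepare(snapshot, builder) {
  const token = snapshot.cookies['api_access_token']
  if (!token) return false // user not logged in yet: keep polling
  builder.setHeader('Authorization', `Bearer ${token}`)
  return true
}
```

Bounding the loop with `maxAttempts` is a design choice of this sketch: it lets the offscreen page surface a timeout to the user instead of polling forever when the login never completes.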

With the offscreen document running, the background passes it the data and instructions to generate the ZK proof. The offscreen page, being a full webpage context (albeit hidden), can load the WebAssembly module and even spawn Web Workers for multi-threaded computation. It performs the zero-knowledge proof generation (more on the WebAssembly prover in the next section) and then returns the proof artifact back to the background service worker (often via chrome.runtime.sendMessage targeting the background).
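
To make the routing concrete, here is a small sketch of a target-based message router. The target names mirror the AppTargets values described in the challenges section further down; the action names, the envelope fields, and `createRouter` itself are assumptions made for illustration, not the extension’s real schema.

```javascript
// Target names follow the AppTargets mentioned in this article.
const AppTargets = Object.freeze({
  Background: 'Background',
  Offscreen: 'Offscreen',
  Sidebar: 'Sidebar'
})

// Build a listener for one context. It ignores envelopes addressed to other
// contexts and dispatches the rest by action. The (message, sender, sendResponse)
// signature matches what chrome.runtime.onMessage passes to its listeners.
function createRouter(self, handlers) {
  return function onMessage(message, sender, sendResponse) {
    if (!message || message.target !== self) return false // not for this context
    const handler = handlers[message.action]
    if (!handler) return false // unknown action: let other listeners try
    Promise.resolve()
      .then(() => handler(message.payload, sender))
      .then((result) => sendResponse({ ok: true, result }))
      .catch((error) => sendResponse({ ok: false, error: String(error) }))
    return true // keep the channel open for the asynchronous response
  }
}
```

In the background worker this could be registered with `chrome.runtime.onMessage.addListener(createRouter(AppTargets.Background, { ProofReady: handleProof }))`, where `ProofReady` and `handleProof` are hypothetical. Returning `true` only for handled messages matters because every context receives every runtime message, and a listener that claims a message without answering it leaves the sender waiting.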

Finally, the extension shows the proof results to the user through a Sidebar UI. In Manifest V3, an extension can use Chrome’s Side Panel API (chrome.sidePanel) to display content in a panel alongside the page. Pluto’s extension uses a React-based interface in the sidebar to guide the user through the proof steps and display outcomes. The background sends messages to update the sidebar UI (e.g., when a proof is successfully generated or if an error occurs). The result is a seamless flow: the user clicks “Generate Proof” in the sidebar, and after a short processing period, the proof (e.g., a cryptographic attestation of their Reddit data) is displayed, ready to be shared or verified.

WebAssembly for In-Browser Proof Generation

The core cryptographic proving logic is powered by WebAssembly (WASM). Instead of writing a ZK prover from scratch in JavaScript, the team integrated an existing WASM module that implements the proof protocol (likely compiled from Rust or another systems language). This WASM module, along with its JavaScript/TypeScript bindings, was provided externally (e.g., via Pluto’s Web Proofs SDK), so the extension developers didn’t need to hand-author the cryptographic internals – they could focus on integration and performance.

Running the prover entirely in-browser is a heavy task: it can involve large computations, arithmetic over multi-megabyte data, and possibly multi-threading for speed. That’s why this work is delegated to the offscreen document. The offscreen context can create Web Workers (it was created with the chrome.offscreen.Reason.WORKERS reason, as shown earlier) to parallelize tasks. In practice, the WASM prover uses a Web Worker pool (via libraries like wasm-bindgen-rayon) to utilize multiple CPU cores, and this requires spawning worker scripts, which is not possible from a service worker but is possible in an offscreen page. By running in the offscreen document, the proof generation doesn’t block the extension UI or content script, and can continue even if the user navigates away from the original page.
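
Thread pools of the wasm-bindgen-rayon kind rely on shared WebAssembly memory, which browsers expose only when `SharedArrayBuffer` is available and the page is cross-origin isolated. A hedged sketch of how the offscreen page might pick a pool size is shown below; `chooseThreadCount` and the cap of 8 threads are assumptions, not values taken from Pluto’s code.

```javascript
// Pick how many worker threads to give the WASM prover.
// `env` is normally `globalThis` inside the offscreen document.
function chooseThreadCount(env, maxThreads = 8) {
  const canShareMemory =
    typeof env.SharedArrayBuffer === 'function' && env.crossOriginIsolated === true
  if (!canShareMemory) return 1 // fall back to single-threaded proving

  const cores = (env.navigator && env.navigator.hardwareConcurrency) || 1
  return Math.max(1, Math.min(cores, maxThreads))
}
```

The result would then be handed to the pool initializer, for example wasm-bindgen-rayon’s `initThreadPool(chooseThreadCount(globalThis))`. Capping the count leaves headroom for the browser itself on machines with many cores.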

It’s worth noting that using WebAssembly in an extension requires careful bundling of the WASM binary and its JavaScript glue code. Pluto’s extension caches the WASM binary (and related proving key parameters) in IndexedDB for quick reuse, since these files can be large. The offscreen context loads the WASM module, initializes it (which might fetch or retrieve proving keys), then executes the proof generation function with the collected web data. Once the proof is computed, it’s sent back as a compact artifact (e.g. a proof JSON or blob) to the background script.
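
The caching step can be sketched as a cache-aside lookup with a versioned key. This is an illustration under assumptions: the `store` interface, the key format, and the function names are hypothetical, and the in-memory store below only stands in for a thin wrapper around IndexedDB.

```javascript
// Cache-aside loader for large artifacts (WASM binary, proving keys).
// `store` is any async key-value store with get/set. Including the version in the
// key means a new release never reuses stale bytes from an older one.
async function loadArtifact(store, name, version, fetchBytes) {
  const key = `${name}@${version}`
  const cached = await store.get(key)
  if (cached !== undefined) return cached // cache hit: skip the download

  const bytes = await fetchBytes(name)
  await store.set(key, bytes)
  return bytes
}

// In-memory stand-in for the IndexedDB wrapper, useful for tests.
function createMemoryStore() {
  const map = new Map()
  return {
    get: async (key) => map.get(key),
    set: async (key, value) => { map.set(key, value) }
  }
}
```

A call such as `loadArtifact(store, 'prover.wasm', '1.2.0', download)` (with hypothetical names) would download once and serve later proofs from the cache until the version string changes.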

The decision to run all of this in-browser (instead of offloading to a server) reflects a trust-minimized design: the user doesn’t have to trust an external server with their data or proofs; their browser proves the statement with the provided WASM module, and only the proof (which reveals no sensitive data) leaves the browser.

Challenges and Lessons Learned

Building a complex ZK-proof extension within Chrome’s extension sandbox came with a series of challenges. Here are some notable ones and how they were addressed:

Offscreen Context Lifecycle: Managing the offscreen document was tricky. In Manifest V3, the background service worker is ephemeral: Chrome may suspend it when idle. The extension must create the offscreen document at the right time (when a proof is needed) and destroy it when done to free resources. We implemented a check to reuse an existing offscreen context if one is already open (to avoid duplicates) and to shut it down when proofs are complete. One gotcha was ensuring the service worker doesn’t terminate mid-proof; keeping the offscreen document open and having it message the background while it works helps keep the service worker alive, since incoming events reset the worker’s idle timer. It required careful orchestration to avoid orphaned offscreen instances or lingering service workers.
Communication Across Isolated Contexts: With so many contexts (background, content script, page script, offscreen, sandbox, UI), message passing architecture was critical. We defined a clear message schema with targets (e.g., AppTargets.Background, AppTargets.Offscreen, AppTargets.Sidebar, etc.) and actions, so each context’s listener can filter and handle only relevant messages. For example, the content script would forward page messages to the background with target “Background”, and the offscreen script would post messages with target “Background” when a proof was ready. Likewise, the background directed certain responses to the sidebar UI (target “Sidebar”). Using Chrome’s messaging API and window.postMessage (for page <-> content script) together requires caution with asynchronous responses and ensuring no responses are missed if a context unloads. The team had to work around quirks like the extension messaging runtime throwing errors if a response isn’t sent – by catching and ignoring specific benign errors (e.g., the “message channel closed” error when a context unloads just after responding).
Content Security Policy (CSP) Issues: Many web platforms have strict CSPs that can prevent injected scripts from running or limit resource access. For instance, a site might disallow inline scripts or only allow scripts from certain domains. Our approach had two facets:
1. The initial content script injection of the Pluto SDK uses an extension script URL (chrome-extension://<id>/parent-page/pluto-sdk.js). Scripts loaded from the extension’s own origin are typically exempt from the page’s CSP restrictions, which allows the SDK to load on most sites. However, any inline script injection or dynamic code in the page context could still be blocked by CSP.
2. In cases where CSP posed a problem (or where we wanted a safer isolation), we leveraged the sandboxed iframe in the offscreen document. The extension could fetch the page’s content as text (or use the page’s network responses) and then feed it into the sandbox, where we run a parser and extractor script. Because the sandboxed iframe is not the actual webpage, it isn’t subject to the page’s CSP. This way, we could still retrieve the needed data in a controlled environment. This proved especially useful for complex proof scenarios like zkTLS (TLS notary proofs), where raw HTML and network data need to be processed without interference from the page’s environment.
OAuth and Authentication Quirks: To gather data from certain sites (like Reddit’s API or user account info), the user needs to be authenticated. Integrating an OAuth login flow into a Chrome extension required using Chrome’s identity APIs. We used chrome.identity.launchWebAuthFlow to open an OAuth login window and retrieve a token. For example, to authenticate the user with Reddit, we direct them to Reddit’s OAuth page with a redirect URI pointing back to our extension’s own special URL (Chrome provides a {extensionid}.chromiumapp.org domain for this purpose). Once the user logs in, the redirect URI delivers an access token which the extension can capture:
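```
const redirectUri = `https://${chrome.runtime.id}.chromiumapp.org/`
// Build the provider's authorization URL. AUTHORIZE_ENDPOINT and CLIENT_ID are
// placeholders, assumed to be defined elsewhere with the values registered
// with the OAuth provider.
const authUrl = new URL(AUTHORIZE_ENDPOINT)
authUrl.searchParams.set('client_id', CLIENT_ID)
authUrl.searchParams.set('response_type', 'token') // implicit flow: token returns in the URL fragment
authUrl.searchParams.set('redirect_uri', redirectUri)
chrome.identity.launchWebAuthFlow(
  {
    url: authUrl.toString(),
    interactive: true
  },
  (responseUrl) => {
    if (chrome.runtime.lastError || !responseUrl) {
      // handle error (user closed window or network issue)
      return
    }
    // The token arrives in the fragment, e.g. ...chromiumapp.org/#access_token=...
    const url = new URL(responseUrl)
    const accessToken = new URLSearchParams(url.hash.substring(1)).get('access_token')
    // Store and use the accessToken for authenticated requests
  }
)
```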
One challenge here was ensuring the OAuth window could communicate the result back to the extension correctly, and handling the case where the user might already have a valid token (we stored tokens in localStorage for reuse until expiry). Additionally, because the extension operates in-page, if the user is already logged in to the site normally, we might not need OAuth at all – the content script could potentially use session cookies to fetch data. However, for consistency and broader site support, the OAuth route was used when available, as it gives the extension a token to call APIs directly. We also had to manage the user experience: prompting for login in the sidebar, opening the OAuth popup, then returning to the extension flow once authenticated.
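
Token reuse until expiry can be sketched as below. This assumes the provider follows the standard OAuth 2.0 implicit-grant response (fragment parameters `access_token`, `expires_in`, `state`, `error`) and that the authorization request carried a random `state` value; the function names, the returned object shape, and the 60-second safety margin are illustrative rather than taken from Pluto’s code.

```javascript
// Parse the redirect URL handed to the launchWebAuthFlow callback.
// `expectedState` is the random value the extension put in the authorization
// request; checking it guards against forged or replayed redirects.
function parseAuthResponse(responseUrl, expectedState, now = Date.now()) {
  const params = new URLSearchParams(new URL(responseUrl).hash.substring(1))
  if (params.get('error')) throw new Error(`OAuth error: ${params.get('error')}`)
  if (params.get('state') !== expectedState) throw new Error('OAuth state mismatch')

  const accessToken = params.get('access_token')
  if (!accessToken) throw new Error('No access token in response')

  const expiresIn = Number(params.get('expires_in')) || 0 // seconds; 0 if absent
  return { accessToken, expiresAt: now + expiresIn * 1000 }
}

// A stored token is reusable only while it has some lifetime left.
// `skewMs` leaves a margin so a token does not expire mid-request.
function isTokenUsable(token, now = Date.now(), skewMs = 60000) {
  return Boolean(token) && token.expiresAt - skewMs > now
}
```

Before opening the OAuth window, the extension would check `isTokenUsable(storedToken)` and launch the interactive flow only when it returns false.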

By tackling these challenges, we engineered an extension that leverages Chrome’s latest capabilities (like offscreen documents and side panel UI) to bring zero-knowledge proofs to Web2 data. The result is a complex but elegant dance of isolated contexts – each doing what it’s best at – coming together to prove something useful about data locked behind web interfaces, all without violating user privacy or site security.

Conclusion

Pluto’s Chrome extension for Web Proofs demonstrates how far in-browser technology has come. We can now perform non-trivial cryptographic computations (ZK proofs) purely on the client side and integrate with existing websites securely. This was achieved by carefully composing Chrome extension contexts: a background service worker to orchestrate, content and injected scripts to interface with websites, offscreen documents for heavy lifting with WebAssembly, and sandboxed iframes to navigate around security constraints.

For senior engineers, the key takeaway is the power of modern extension architecture – by splitting responsibilities across context boundaries, we maintain security and performance, while enabling features that would have seemed impossible in-browser just a few years ago. Generating a zero-knowledge proof about, say, your Reddit karma or Amazon purchase, directly from your browser, is no longer science fiction – it’s running live in an extension like Pluto’s. The patterns used here (smart message routing, context-specific processing, and careful use of new Chrome APIs) can be applied to build robust, complex browser extensions that do more than just manipulate the DOM – they can deliver full-fledged decentralized and privacy-preserving capabilities right at the user’s fingertips.