Private AI

Use AI on sensitive client work without sending it into the public cloud.

Wilcoe designs and manages private AI systems for law firms, medical practices, tax and accounting teams, advisers, and therapy groups that need speed without giving up control.

For 5–50 person firms with confidential workflows. Apple-silicon hardware. Managed by Wilcoe.

Three forces are pushing private AI from "nice idea" to "the only way."

Public-cloud AI is fast and convenient. It is also the wrong default for sensitive work in regulated firms. Three things changed at the same time.

🔒

Compliance pressure is now sharp.

HIPAA requires a Business Associate Agreement before a cloud vendor handles ePHI. The ABA's Formal Opinion 512 makes lawyers explicitly responsible for what self-learning tools do with client information. The IRS and FTC point tax and financial firms to written security plans, MFA, logs, and service-provider oversight. Public-tier AI tools were not built for any of that.

💰

Cloud-AI billing is volatile.

Once your team adopts AI in earnest, token consumption jumps. Consumer-tier limits start to bite. Enterprise-tier minimums push the quote past the budget you scoped. Forecasting becomes guesswork. A managed local appliance trades that volatility for a predictable run cost on hardware you control.

🔗

Lock-in and latency hurt.

Vendor terms, regional data routing, model retraining, hidden subcontractors. Even setting compliance aside, "your provider's roadmap is now your roadmap" is a poor architecture for a 10-person firm with a five-year client roster. Private AI keeps the workflow yours.

What private AI actually replaces.

Not all AI moves to the appliance — that would waste time and money. We focus the local stack on the sensitive 20% of workflows where cloud AI is hardest to justify and local AI is finally good enough. The other 80% can stay where it is, with policy.

Public cloud AI
  • Data location: a vendor data center in a region you don't pick.
  • Cost shape: usage-based, hard to forecast, sensitive to model price changes.
  • Control: the vendor's policy is your policy. Subprocessors and retraining are theirs.
  • Latency: internet-dependent.
  • Audit: bring your own logging and hope it's enough for your regulator.
Wilcoe Private AI
  • Data location: on your premises or in your private rack. Your hardware. Your data.
  • Cost shape: fixed pilot + predictable monthly retainer. No surprise bills.
  • Control: your retention rules. Your audit logs. Your exit path.
  • Latency: local network speeds for routine work.
  • Audit: matter-level access logs, model allowlists, prompt-template management — built in.

Want the full comparison across nine dimensions including hybrid? See cloud vs. hybrid vs. on-prem →

The five-layer service.

Hardware alone doesn't make AI usable in a regulated firm. We design and run all five layers so your team gets governed AI on day one — not a science project.

01

Assessment & policy design.

A paid Readiness Sprint. We inventory your sensitive workflows, document systems, retention rules, approval steps, and review obligations. For law and tax we translate ethics and security duties into operating policy. For healthcare we scope PHI exposure and whether any cloud fallback is acceptable. Local inference alone doesn't solve governance — this layer does.

02

Private AI appliance deployment.

A right-sized Apple-silicon system, on your premises or in a private co-located rack you control. For the smallest firms that's one or two higher-memory Macs. For busier offices it's a Mac mini cluster or a Mac Studio node design with encrypted local storage, segmented networking, automated backup, standardized updates, and hardware redundancy. We make the box boring, supportable, and safe.

03

Knowledge layer & retrieval.

Most of the value comes from putting models in controlled contact with your firm's actual knowledge: policies, templates, matter files, intake forms, contracts, SOPs, CRM notes, tax checklists, scheduling data, approved precedents. We build local indexing, role-based retrieval, document tagging, and matter-specific workspaces. We stay model-agnostic so we can swap in approved open models as needed.
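For the technically curious, role-based retrieval with matter-specific workspaces can be sketched in a few lines. This is an illustration only, not Wilcoe's implementation: the `Doc` and `User` types, their fields, and the `visible_docs` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    matter: str               # workspace the document belongs to
    allowed_roles: frozenset  # roles permitted to retrieve it


@dataclass(frozen=True)
class User:
    name: str
    role: str
    matters: frozenset        # matters this user is assigned to


def visible_docs(user: User, docs: list) -> list:
    """Return only documents the user may retrieve.

    A document is visible when the user is assigned to its matter
    AND the user's role is on the document's allowed list.
    """
    return [
        d for d in docs
        if d.matter in user.matters and user.role in d.allowed_roles
    ]


docs = [
    Doc("engagement-letter", "matter-001", frozenset({"partner", "associate"})),
    Doc("fee-dispute-memo", "matter-001", frozenset({"partner"})),
    Doc("intake-form", "matter-002", frozenset({"partner", "associate"})),
]
associate = User("A. Example", "associate", frozenset({"matter-001"}))
visible_ids = [d.doc_id for d in visible_docs(associate, docs)]
```

The filter runs before any model sees a document, so a prompt can only draw on files the requesting user was already allowed to open.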

04

Vertical copilots & limited agents.

Narrow, high-ROI workflows. For legal: matter summarization, clause extraction, deposition or call-note condensation, chronology building, first-draft internal memos with mandatory human review. For clinics: transcription-to-summary, internal documentation support, referral packet preparation, scheduling and follow-up. For tax and accounting: document classification, client checklist generation, workpaper organization, policy lookup. We sell governed jobs, not "AI."

05

Managed governance & training.

The actual moat. We run patching, model allowlists, prompt-template management, logging, access reviews, backup checks, incident response playbooks, and quarterly workflow tuning. Plus team training: how to use the system, what not to put into it, how outputs should be reviewed, when human signoff is required, and how to document use for client or regulator scrutiny.

Who it's for.

Founder-led or partner-led firms with roughly 5 to 50 knowledge workers, real confidentiality concerns, weak in-house IT capacity, and clear pressure to adopt AI.

Four shapes we deploy.

Illustrative starting points. Each engagement is sized in the Readiness Sprint and quoted for your firm afterward. Pilots vary several-fold across firm shapes, so we don't publish prices that are unlikely to match yours.

Solo law, ~5 users
  • Hardware shape: 1× Mac mini M4 Pro, 48–64GB, locked office or closet, UPS, segmented network.
  • Models & fallback: on-device Apple model + a local open model. Cloud off by default.
Small clinic, ~10 users
  • Hardware shape: 1× Mac Studio M4 Max, 64–128GB, encrypted storage, server room or locked cabinet.
  • Models & fallback: local models. Cloud only with a signed BAA on approved workflows.
Accounting firm, ~30 users
  • Hardware shape: 2–3× Mac mini M4 Pro racked, or 1–2× Mac Studio. Department namespaces. Retention by engagement.
  • Models & fallback: local models. Client-specific cloud fallback for non-SSN work.
Therapy group, ~50 users, multi-office
  • Hardware shape: 2× Mac Studio M4 Max or 1× M3 Ultra. Central rack. Per-office partitions.
  • Models & fallback: local models. Cloud disabled for notes by default. Tight breach-response playbook.

90 days from sprint to live.

A pilot is one workflow, one office, one signed-off compliance frame. Then we expand on what worked.

Days 1–14

Risk, workflow, and legal review.

Inventory sensitive systems. Map the first workflow. Coordinate with your counsel or compliance partner.

Days 15–30

Hardware design, procurement, and policy pack.

Right-sized Apple-silicon system. Written policies covering retention, access, and review steps.

Days 31–50

Install, identity, logging, backup, and access rules.

Network segmentation, MFA, role-based access, encrypted backup, audit logging.

Days 51–70

Data connectors, local memory, and one agent workflow.

Local indexing of the workflow's source documents. The first vertical copilot, with mandatory human review gates.

Days 71–90

Training, go-live, audit review, expansion decision.

Team training on what to use the system for and what to keep out of it. Audit log review. Decide what to add next.

What stays local. What stays optional.

Defaults matter more than capability. Sensitive data should be locked down by default, with cloud fallback only if your policy explicitly allows it.

Local by default

  • Sensitive client documents and indexes.
  • Matter, patient, or engagement file workspaces.
  • Audit logs, retention rules, backups.
  • Model inference for approved workflows.
  • Prompt templates and approval rules.

Cloud only with policy

  • Generic non-sensitive drafting (newsletter copy, public-facing posts).
  • Tasks where your client or counsel has explicitly approved a vendor.
  • Capabilities that genuinely need a frontier model — gated, logged, and reviewed.
  • Anything covered by an existing BAA or vendor agreement that fits your obligations.
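
The two lists above amount to a default-deny routing rule: work stays local unless policy explicitly opens a cloud path. A minimal sketch of that rule follows, under stated assumptions. The `Task` fields, the `CLOUD_ALLOWED_KINDS` set, and the `route` function are hypothetical names for illustration, not a description of the shipped system.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    LOCAL = "local"
    CLOUD = "cloud"


@dataclass(frozen=True)
class Task:
    kind: str                      # e.g. "newsletter_draft", "matter_summary"
    sensitive: bool                # touches client, patient, or matter data
    approved_vendor: bool = False  # client or counsel explicitly approved a vendor


# Hypothetical policy: non-sensitive task kinds the firm allows in the cloud.
CLOUD_ALLOWED_KINDS = {"newsletter_draft", "public_post"}


def route(task: Task) -> Route:
    """Default to local; use cloud only when policy explicitly allows it."""
    if task.sensitive and not task.approved_vendor:
        return Route.LOCAL  # sensitive work never leaves without approval
    if task.approved_vendor or task.kind in CLOUD_ALLOWED_KINDS:
        return Route.CLOUD
    return Route.LOCAL      # anything unlisted stays local


decisions = {
    "matter_summary": route(Task("matter_summary", sensitive=True)),
    "newsletter_draft": route(Task("newsletter_draft", sensitive=False)),
    "unlisted_task": route(Task("unlisted_task", sensitive=False)),
}
```

The last branch is the important one: a task nobody thought to classify falls back to local rather than to the cloud.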

Built to fit the rules you already work under.

"Private" isn't automatically "compliant." We design around your obligations and document the architecture so your counsel can sign off.

HIPAA & BAAs

What ePHI is, why a vendor becomes a Business Associate, and how a private appliance simplifies Security Rule compliance.

Read the explainer →

ABA Formal Opinion 512

Competence, confidentiality, communication, supervision, candor — and how matter-level retrieval + audit logs map to those duties.

Read the explainer →

IRS Pub 4557 + FTC Safeguards

Written security plans, MFA, logging, service-provider oversight. A managed appliance simplifies alignment with your written information security plan (WISP).

Read the explainer →

Wilcoe Private AI is designed around your obligations. Final compliance signoff is client-specific and remains with your counsel or compliance officer.

What it connects to.

Private AI doesn't sit alone. It's the governed core of the rest of what Wilcoe builds.

HA5H — offline memory for AI agents

Crystal-clear memory with no API calls. 93.6% accuracy across six indexed dimensions. The retrieval substrate that powers your private-AI knowledge layer when you need agents to remember without phoning home.

Visit ha5h.com →

Wilcoe AI Teams

Role-mapped AI agents for marketing, sales, ops, brand, and client work. The same teams can run in private mode for sensitive workflows.

See the AI Teams →

Wilcoe Audit

The one-time discovery document. Available with a private-AI scope so the recommendations you receive already account for your confidentiality boundaries.

See the Audit →

Common questions.

Is this compliant?

Wilcoe Private AI is designed around your obligations, but compliance is firm-specific. We provide the architecture, policies, and documentation; your counsel or compliance officer signs off on what it means for your firm.

What stays local?

By default: sensitive documents, indexes, matter or patient workspaces, audit logs, retention rules, backups, and model inference for approved workflows. Cloud fallback is opt-in by policy.

Can we still use cloud AI for some things?

Yes — if your policy allows it for specific tasks and contracts. Many firms keep public-cloud AI for non-sensitive drafting and use the private appliance for everything tied to clients, patients, or matter files.

How fast can we start?

Most pilots scope in two weeks and launch inside 90 days. The Readiness Sprint is the first two weeks. Hardware and policy come next, then install, then a single workflow goes live.

What hardware do you use?

Apple silicon. Mac mini M4 Pro for small firms, Mac Studio M4 Max for busier offices, M3 Ultra for the largest deployments. Right-sized to the workload, so pilots are not oversized.

Do we have to replace our current software?

No. We integrate the highest-value workflows first and leave everything else alone. The goal is to move sensitive work into a controlled environment, not to swap your stack.

Is there vendor lock-in?

Not on our side. The hardware is yours. Models are swappable. Indexes are exportable. Documentation belongs to you. If you ever need to leave, we hand it off cleanly.

What do you cost?

It depends on firm size, sensitivity profile, fallback policy, workflow count, and IT-handoff scope. We don't publish a price list because pilots vary several-fold across firm shapes — a price that's right for a 5-person law office is wrong for a 50-person therapy group. See how we think about cost →

Use AI on your sensitive work. Without giving it away.

Book a Readiness Call. We'll walk through your highest-leverage workflow, the compliance frame that applies, and what a 90-day pilot would look like for your firm.

Book a Readiness Call →

or

Take the readiness check →