AI Engine Reference

The @ashim/ai package bridges Node.js to a persistent Python sidecar for all ML operations. The dispatcher process stays alive between requests for fast warm-start performance. GPU is auto-detected at startup and used when available.

The package exposes 13 AI tool routes. All models run locally; no internet connection is required after the initial model download.

Architecture

Node.js Tool Route
      │
      ▼
 @ashim/ai bridge.ts
      │ (stdin/stdout JSON + stderr progress events)
      ▼
 Python dispatcher (persistent process)
      │
      ├─ remove_bg.py        (rembg / BiRefNet)
      ├─ upscale.py          (RealESRGAN)
      ├─ inpaint.py          (LaMa ONNX)
      ├─ ocr.py              (PaddleOCR / Tesseract)
      ├─ detect_faces.py     (MediaPipe)
      ├─ face_landmarks.py   (MediaPipe landmarks)
      ├─ enhance_faces.py    (GFPGAN / CodeFormer)
      ├─ colorize.py         (DDColor)
      ├─ noise_removal.py    (tiered denoising)
      ├─ red_eye_removal.py  (landmark + color analysis)
      ├─ restore.py          (scratch repair + enhancement + denoising)
      └─ seam_carving        (Go caire binary - not Python)

Timeouts: 300 s default; OCR and BiRefNet background removal get 600 s.
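
The timeout rule above can be sketched as a small lookup plus a promise wrapper. This is a minimal illustration, not the package's actual code: `timeoutForRoute`, `withTimeout`, and the "model ID starts with `birefnet`" check are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and the model-prefix check are assumptions.
const DEFAULT_TIMEOUT_MS = 300_000; // 300 s default
const EXTENDED_TIMEOUT_MS = 600_000; // 600 s for OCR and BiRefNet background removal

function timeoutForRoute(route: string, model: string = "birefnet-general"): number {
  if (route === "ocr") return EXTENDED_TIMEOUT_MS;
  if (route === "remove-background" && model.startsWith("birefnet")) {
    return EXTENDED_TIMEOUT_MS;
  }
  return DEFAULT_TIMEOUT_MS;
}

// Rejects if the wrapped work (e.g. a sidecar request) does not settle in time.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`AI task timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```

A caller would wrap each request, for example `withTimeout(sendToSidecar(job), timeoutForRoute("ocr"))`, where `sendToSidecar` stands in for whatever function actually writes to the dispatcher.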

Background Removal

Function: removeBackground
Tool route: remove-background
Model: rembg with BiRefNet (default) or U2-Net variants

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | birefnet-general | Model variant - see table below |
| alphaMattingForeground | number (1–255) | 240 | Foreground threshold for alpha matting |
| alphaMattingBackground | number (1–255) | 10 | Background threshold for alpha matting |
| returnMask | boolean | false | Return the mask instead of the cutout |
| backgroundColor | string | - | Fill removed area (hex color or "transparent") |

Available models:

| Model ID | Best for |
| --- | --- |
| birefnet-general | General purpose (default) |
| birefnet-portrait | People / portraits |
| birefnet-dis | Dichotomous Image Segmentation |
| birefnet-hrsod | High-resolution salient objects |
| birefnet-cod | Camouflaged objects |
| u2net | Fast general purpose |
| u2net_human_seg | Human segmentation |
| isnet-general-use | High quality general |
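
As an illustration of how the parameters and model IDs above fit together, the sketch below fills in the documented defaults and rejects unknown model IDs. The type and function names (`RemoveBgOptions`, `resolveRemoveBgOptions`) are hypothetical, and clamping out-of-range thresholds rather than rejecting them is an assumption.

```typescript
// Hypothetical sketch: type and function names are illustrative only.
const REMOVE_BG_MODELS = [
  "birefnet-general", "birefnet-portrait", "birefnet-dis", "birefnet-hrsod",
  "birefnet-cod", "u2net", "u2net_human_seg", "isnet-general-use",
] as const;
type RemoveBgModel = (typeof REMOVE_BG_MODELS)[number];

interface RemoveBgOptions {
  model: RemoveBgModel;
  alphaMattingForeground: number; // 1–255
  alphaMattingBackground: number; // 1–255
  returnMask: boolean;
  backgroundColor?: string; // hex color or "transparent"
}

interface RemoveBgInput {
  model?: string;
  alphaMattingForeground?: number;
  alphaMattingBackground?: number;
  returnMask?: boolean;
  backgroundColor?: string;
}

const clamp = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

function resolveRemoveBgOptions(input: RemoveBgInput = {}): RemoveBgOptions {
  const model = input.model ?? "birefnet-general";
  if (!REMOVE_BG_MODELS.some((m) => m === model)) {
    throw new Error(`Unknown background-removal model: ${model}`);
  }
  return {
    model: model as RemoveBgModel,
    // Assumption: out-of-range thresholds are clamped, not rejected.
    alphaMattingForeground: clamp(input.alphaMattingForeground ?? 240, 1, 255),
    alphaMattingBackground: clamp(input.alphaMattingBackground ?? 10, 1, 255),
    returnMask: input.returnMask ?? false,
    backgroundColor: input.backgroundColor,
  };
}
```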

Image Upscaling

Function: upscale
Tool route: upscale
Model: RealESRGAN (with Lanczos fallback on CPU-constrained systems)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| scale | 2 \| 4 | 4 | Upscale factor |
| model | string | realesrgan-x4plus | Model variant |
| faceEnhance | boolean | false | Apply GFPGAN face enhancement pass |
| denoise | number (0–1) | 0.5 | Denoising strength |
| format | string | - | Output format override |
| quality | number | 95 | Output quality (for JPEG/WebP) |

OCR / Text Extraction

Function: extractText
Tool route: ocr
Models: Tesseract (fast), PaddleOCR PP-OCRv5 (balanced), PaddleOCR-VL 1.5 (best)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| quality | fast \| balanced \| best | balanced | Processing tier |
| language | string | en | Language code (ISO 639-1) |
| enhance | boolean | false | Pre-process image to improve OCR accuracy |

Returns structured results with bounding boxes, confidence scores, and extracted text blocks.
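
The exact result schema is not documented here, so the following is only a sketch of how a caller might consume such output. The field names (`text`, `confidence`, `box`) and the 0–1 confidence scale are assumptions, as is the `confidentText` helper.

```typescript
// Hypothetical result shape: field names and the 0–1 confidence scale are assumptions.
interface OcrBlock {
  text: string;
  confidence: number;
  box: { x: number; y: number; width: number; height: number };
}

// Keep blocks at or above a confidence cutoff and join them in reading order
// (top to bottom, then left to right).
function confidentText(blocks: OcrBlock[], minConfidence = 0.8): string {
  return blocks
    .filter((b) => b.confidence >= minConfidence) // filter() copies, so sort() below does not mutate the input
    .sort((a, b) => a.box.y - b.box.y || a.box.x - b.box.x)
    .map((b) => b.text)
    .join("\n");
}
```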

Face / PII Blur

Function: blurFaces
Tool route: blur-faces
Model: MediaPipe face detection

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| blurRadius | number | 30 | Gaussian blur radius |
| sensitivity | number (0–1) | 0.5 | Detection confidence threshold |

Face Enhancement

Function: enhanceFaces
Tool route: enhance-faces
Models: GFPGAN, CodeFormer

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | gfpgan \| codeformer | gfpgan | Enhancement model |
| strength | number (0–1) | 0.7 | Enhancement strength |
| sensitivity | number (0–1) | 0.5 | Face detection threshold |
| centerFace | boolean | false | Focus enhancement on center face only |

AI Colorization

Function: colorize
Tool route: colorize
Model: DDColor (with OpenCV DNN fallback)

Converts black-and-white or grayscale photos to full color.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| intensity | number (0–1) | 0.85 | Color saturation strength |
| model | string | ddcolor | Model variant |

Noise Removal

Function: noiseRemoval
Tool route: noise-removal

Three-tier denoising pipeline (fast: OpenCV bilateral filter; balanced: frequency-domain; best: deep learning model).

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| quality | fast \| balanced \| best | balanced | Processing tier |
| strength | number (0–1) | 0.5 | Denoising strength |
| preserveDetail | boolean | true | Edge-preserving mode |
| colorNoise | boolean | false | Target color noise specifically |

Red Eye Removal

Function: removeRedEye
Tool route: red-eye-removal

Detects face landmarks, locates eye regions, and corrects red-channel oversaturation.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sensitivity | number (0–1) | 0.5 | Red pixel detection threshold |
| strength | number (0–1) | 0.9 | Correction strength |
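
To make "corrects red-channel oversaturation" concrete, here is one common per-pixel approach, shown purely as a sketch. The redness test, the way `sensitivity` maps to a ratio threshold, and the constants are all assumptions chosen for illustration; the real `red_eye_removal.py` may work differently.

```typescript
type Rgb = [number, number, number];

// Hypothetical per-pixel correction for pixels already known to lie inside an eye region.
function correctRedEyePixel(
  [r, g, b]: Rgb,
  sensitivity = 0.5,
  strength = 0.9,
): Rgb {
  const avgGB = (g + b) / 2;
  // Assumed mapping: higher sensitivity lowers the red-to-(green, blue) ratio
  // needed to flag a pixel (2.5 at sensitivity 0, 1.0 at sensitivity 1).
  const ratioThreshold = 2.5 - sensitivity * 1.5;
  const isRed = r > 50 && r > avgGB * ratioThreshold;
  if (!isRed) return [r, g, b];
  // Pull the red channel toward the green/blue average by `strength`.
  const corrected = r + (avgGB - r) * strength;
  return [Math.round(corrected), g, b];
}
```

With the defaults, a strongly red pixel such as `[200, 40, 40]` has its red channel pulled most of the way down to the green/blue average, while a neutral skin-tone pixel such as `[120, 110, 100]` is left untouched.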

Photo Restoration

Function: restorePhoto
Tool route: restore-photo

Multi-step pipeline for old or damaged photos: scratch/tear detection and repair → face enhancement → denoising → optional colorization.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mode | auto \| light \| heavy | auto | Restoration intensity |
| scratchRemoval | boolean | true | Detect and repair scratches, tears |
| faceEnhancement | boolean | true | Apply face enhancement pass |
| fidelity | number (0–1) | 0.7 | Face enhancement strength |
| denoise | boolean | true | Apply denoising pass |
| denoiseStrength | number (0–100) | 40 | Denoising strength |
| colorize | boolean | false | Colorize after restoration |
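
The pipeline order and the boolean switches above can be read as a simple step plan. The sketch below shows that reading; `planRestoreSteps` and the step labels are hypothetical names, and `mode`, `fidelity`, and `denoiseStrength` are left out because how they affect each step is not described here.

```typescript
// Hypothetical sketch: step labels and function name are illustrative only.
type RestoreStep = "scratch-repair" | "face-enhancement" | "denoise" | "colorize";

interface RestoreToggles {
  scratchRemoval: boolean;
  faceEnhancement: boolean;
  denoise: boolean;
  colorize: boolean;
}

// Defaults taken from the parameter table.
const RESTORE_DEFAULTS: RestoreToggles = {
  scratchRemoval: true,
  faceEnhancement: true,
  denoise: true,
  colorize: false,
};

// Returns the enabled steps in the documented order:
// scratch repair → face enhancement → denoising → optional colorization.
function planRestoreSteps(opts: Partial<RestoreToggles> = {}): RestoreStep[] {
  const o: RestoreToggles = { ...RESTORE_DEFAULTS, ...opts };
  const steps: RestoreStep[] = [];
  if (o.scratchRemoval) steps.push("scratch-repair");
  if (o.faceEnhancement) steps.push("face-enhancement");
  if (o.denoise) steps.push("denoise");
  if (o.colorize) steps.push("colorize");
  return steps;
}
```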

Passport Photo

Function: Uses detectFaceLandmarks + removeBackground
Tool route: passport-photo
Model: MediaPipe face landmarks

Generates government-compliant ID photos. Supports 37 countries across 6 regions (Americas, Europe, Asia, Africa, Oceania, Middle East). Each spec includes physical dimensions, DPI, head-height ratio, eye-line position, and background color requirements.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| country | string | us | ISO country code (see list in UI) |
| printLayout | 4x6 \| A4 \| none | none | Output as print sheet or standalone |
| backgroundColor | string | country default | Background fill color |
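
Because each spec combines physical dimensions with a DPI, the output size in pixels follows from a unit conversion (25.4 mm per inch). The sketch below shows that arithmetic only; `PhotoSpec` and `pixelSize` are hypothetical names, and the 300 DPI figure is an assumption for the example rather than a value taken from the package's spec data.

```typescript
// Hypothetical spec shape: only the fields needed for the conversion.
interface PhotoSpec {
  widthMm: number;
  heightMm: number;
  dpi: number;
}

const MM_PER_INCH = 25.4;

// Converts physical dimensions to pixel dimensions at the spec's DPI.
function pixelSize(spec: PhotoSpec): { width: number; height: number } {
  const toPx = (mm: number): number => Math.round((mm / MM_PER_INCH) * spec.dpi);
  return { width: toPx(spec.widthMm), height: toPx(spec.heightMm) };
}

// A 2 × 2 inch (50.8 mm) photo at an assumed 300 DPI.
const twoInchSquare = pixelSize({ widthMm: 50.8, heightMm: 50.8, dpi: 300 });
// A 35 × 45 mm photo at an assumed 300 DPI.
const thirtyFiveByFortyFive = pixelSize({ widthMm: 35, heightMm: 45, dpi: 300 });
```

Under these assumptions the first spec works out to 600 × 600 px and the second to 413 × 531 px.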

Object Erasing (Inpainting)

Function: inpaint
Tool route: erase-object
Model: LaMa via ONNX Runtime

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| maskData | string | Yes | Base64-encoded PNG mask (white = erase) |
| maskThreshold | number (0–255) | No | Threshold for mask binarization |

GPU-accelerated when an NVIDIA GPU is available.
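
Mask binarization with `maskThreshold` can be pictured as follows: every grayscale mask pixel becomes either "erase" (255) or "keep" (0). This is a sketch under assumptions: the default of 128 and the use of `>=` for the comparison are not stated in this reference, and `binarizeMask` is a hypothetical helper operating on already-decoded grayscale pixels.

```typescript
// Hypothetical helper: the default threshold (128) and the >= comparison are assumptions.
// Input: one grayscale value (0–255) per pixel, decoded from the PNG mask.
function binarizeMask(gray: Uint8Array, threshold = 128): Uint8Array {
  // White (255) marks pixels to erase; black (0) marks pixels to keep.
  return gray.map((v) => (v >= threshold ? 255 : 0));
}
```

A lower threshold treats soft, anti-aliased mask edges as part of the erase region; a higher one shrinks the region toward the fully white core.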

Smart Crop

Function: Uses MediaPipe + Sharp attention/entropy
Tool route: smart-crop
Model: MediaPipe face detection

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| mode | subject \| face \| trim | subject | Crop strategy |
| width | number | - | Output width |
| height | number | - | Output height |
| facePreset | string | - | Preset framing when mode=face |

Face presets:

| Preset | Head ratio | Best for |
| --- | --- | --- |
| close-up | 1.8× face | Headshots |
| head-and-shoulders | 2.8× face | Profile photos |
| upper-body | 4.5× face | LinkedIn / formal |
| half-body | 7.0× face | Full upper body |
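
One plausible reading of the head ratios above is that the crop height equals the ratio times the detected face height. The sketch below works through that reading; it is an assumption, as are the face-centered placement, the edge clamping, and the names `cropForPreset` and `Box`.

```typescript
type FacePreset = "close-up" | "head-and-shoulders" | "upper-body" | "half-body";

// Ratios from the preset table.
const HEAD_RATIO: Record<FacePreset, number> = {
  "close-up": 1.8,
  "head-and-shoulders": 2.8,
  "upper-body": 4.5,
  "half-body": 7.0,
};

interface Box { x: number; y: number; width: number; height: number }

const clampTo = (v: number, lo: number, hi: number): number =>
  Math.min(hi, Math.max(lo, v));

// Assumption: crop height = ratio × face height, centered on the face,
// then shifted to stay inside the image. `aspect` is output width / height.
function cropForPreset(
  face: Box,
  preset: FacePreset,
  aspect: number,
  image: { width: number; height: number },
): Box {
  let height = Math.min(face.height * HEAD_RATIO[preset], image.height);
  const width = Math.min(height * aspect, image.width);
  height = Math.min(height, width / aspect); // keep the aspect ratio if width was clamped
  const cx = face.x + face.width / 2;
  const cy = face.y + face.height / 2;
  return {
    x: Math.round(clampTo(cx - width / 2, 0, image.width - width)),
    y: Math.round(clampTo(cy - height / 2, 0, image.height - height)),
    width: Math.round(width),
    height: Math.round(height),
  };
}
```

For a 100 × 100 px face, `close-up` yields a 180 px tall crop and `half-body` a 700 px tall crop, shifted as needed so it does not extend past the image edge.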

Content-Aware Resize (Seam Carving)

Function: seamCarve
Tool route: content-aware-resize
Engine: Go caire binary (not Python - no GPU benefit)

Intelligently resizes images by removing or adding low-energy seams, preserving important content.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| width | number | - | Target width |
| height | number | - | Target height |
| protectFaces | boolean | true | Protect detected face regions from seam removal |
| blurRadius | number | 0 | Pre-blur to reduce noise sensitivity |
| sobelThreshold | number | 10 | Edge sensitivity threshold |
| square | boolean | false | Force square output |

Max input edge before auto-downscaling: 1200 px.
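
The auto-downscaling limit can be illustrated with a short calculation: if the longer edge exceeds 1200 px, both edges are scaled by the same factor so the longer one lands on 1200. `fitWithinMaxEdge` is a hypothetical helper, and rounding to the nearest pixel is an assumption.

```typescript
const MAX_INPUT_EDGE = 1200; // px, from the limit stated above

// Hypothetical helper: returns the dimensions after proportional downscaling.
function fitWithinMaxEdge(
  width: number,
  height: number,
  maxEdge: number = MAX_INPUT_EDGE,
): { width: number; height: number; scale: number } {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height, scale: 1 };
  const scale = maxEdge / longest;
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
    scale,
  };
}
```

Under this sketch a 4000 × 3000 px input would be reduced to 1200 × 900 px before carving, while an 800 × 600 px input passes through unchanged.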