Snowflake Architecture
AIQu VEIL is a Snowflake Native App that trains deterministic autoencoders on consumer-specified feature columns and applies the trained encoder transparently via column masking policies. All processing happens inside the consumer’s Snowflake account. No data leaves the account. No external network calls are made.
The app runs two SPCS services (encoding and training), uses three schemas (versioned, runtime, persistent state), and exposes no public endpoints.
Query-time encoding
When a consumer queries a protected column, the masking policy transparently replaces raw features with encoded vectors. The call chain: masking policy calls core.encode(), which delegates to app_runtime.encode() (service-bound), which routes to the encoding service over internal SPCS HTTP.
The encoding service loads ONNX models from an internal stage. Models are never downloaded from external sources. A consumer sees encoded vectors; the raw features are never returned to non-bypass roles.
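A minimal sketch of how the encoding service might load and serve active models, assuming the internal stage is volume-mounted at /models and each active model is a single-output ONNX file; the mount path, function names, and archive-suffix check are illustrative assumptions, not the app's actual code.

```python
# Illustrative sketch only: load active ONNX models from the mounted stage and
# run batched inference with ONNX Runtime. Paths and names are assumptions.
import os
import re

import numpy as np
import onnxruntime as ort

MODEL_DIR = "/models"  # assumed mount point of the internal model stage
ARCHIVE_RE = re.compile(r".+_v\d+\.onnx$")  # archived versions, e.g. prices_v2.onnx
_sessions: dict[str, ort.InferenceSession] = {}

def load_active_models() -> None:
    """Load every active model; archived <encoder>_vN.onnx files are skipped."""
    for name in os.listdir(MODEL_DIR):
        if not name.endswith(".onnx") or ARCHIVE_RE.match(name):
            continue
        _sessions[name[: -len(".onnx")]] = ort.InferenceSession(
            os.path.join(MODEL_DIR, name)
        )

def encode(encoder: str, rows: np.ndarray) -> np.ndarray:
    """Run one batch of feature rows through the named encoder (single-output model)."""
    sess = _sessions[encoder]
    input_name = sess.get_inputs()[0].name
    (vectors,) = sess.run(None, {input_name: rows.astype(np.float32)})
    return vectors
```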
Training
Training starts when a consumer calls train_encoder() directly or when a schedule fires via schedule_training(). The procedure inserts a job record and calls the training service via a service-bound function. The training service reads feature data through a warehouse, trains an autoencoder, exports the ONNX model to the internal stage, and signals the encoding service to reload.
Training runs on CPU by default. GPU acceleration is optional and requires explicit configuration. The training service authenticates to Snowflake via the SPCS OAuth token. The reload signal uses a bearer token for inter-service authentication.
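The token-file pattern below is the standard way SPCS containers authenticate to their own account; the sketch shows how the training service could open its connection, with the warehouse environment variable being an assumption.

```python
# Illustrative sketch: connect to Snowflake from inside an SPCS container using
# the OAuth token the platform writes to /snowflake/session/token.
import os

import snowflake.connector

def connect_from_spcs():
    with open("/snowflake/session/token") as f:
        token = f.read()
    return snowflake.connector.connect(
        host=os.environ["SNOWFLAKE_HOST"],        # injected by SPCS
        account=os.environ["SNOWFLAKE_ACCOUNT"],  # injected by SPCS
        token=token,
        authenticator="oauth",
        warehouse=os.environ.get("TRAINING_WAREHOUSE"),  # assumed variable name
    )
```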
Null handling
| Condition | Action |
|---|---|
| Column has some nulls | Impute with column mean, log count |
| Column is >50% null | Warn (training continues) |
| Column is 100% null | Error (cannot compute mean or std) |
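A sketch of these rules applied to a single feature column, assuming the training service holds features as a float numpy array with NaNs for nulls; the function name and thresholds mirror the table, everything else is illustrative.

```python
# Illustrative sketch of the null-handling rules in the table above.
import logging

import numpy as np

def impute_column(values: np.ndarray, column: str) -> np.ndarray:
    """Impute NaNs with the column mean; warn above 50% nulls; error at 100%."""
    null_mask = np.isnan(values)
    null_fraction = float(null_mask.mean())
    if null_fraction == 1.0:
        # 100% null: mean and std are undefined, so training cannot proceed.
        raise ValueError(f"column {column} is entirely null")
    if null_fraction > 0.5:
        logging.warning("column %s is %.0f%% null; training continues",
                        column, 100 * null_fraction)
    if null_mask.any():
        logging.info("column %s: imputing %d nulls with the column mean",
                     column, int(null_mask.sum()))
        values = np.where(null_mask, values[~null_mask].mean(), values)
    return values
```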
Compute modes
| Mode | Pool family | When to use |
|---|---|---|
| CPU (default) | CPU_X64_XS | Simpler models, smaller datasets, universal availability |
| GPU (opt-in) | Region-dependent | Larger datasets, faster training |
Tested GPU families:
| Cloud | Family | GPU |
|---|---|---|
| AWS | GPU_NV_S | A10G 24GB |
| Azure | GPU_NV_SM | A10 24GB |
| GCP | GPU_GCP_NV_L4_1_24G | L4 24GB |
GPU mode selection uses SHOW COMPUTE POOL INSTANCE FAMILIES at runtime. If no tested family is available in the consumer’s region, GPU mode is rejected with a clear error.
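A sketch of that runtime check, reusing a connection opened as in the SPCS OAuth example above; how the family name is read from each result row (row[0]) is an assumption, and the tested-family strings come from the table above.

```python
# Illustrative sketch: reject GPU mode when no tested family exists in this region.
TESTED_GPU_FAMILIES = {"GPU_NV_S", "GPU_NV_SM", "GPU_GCP_NV_L4_1_24G"}

def pick_gpu_family(conn) -> str:
    """Return an available, tested GPU instance family, or raise a clear error."""
    cur = conn.cursor()
    cur.execute("SHOW COMPUTE POOL INSTANCE FAMILIES")
    # Assumption: the family name is the first column of each result row.
    available = {row[0] for row in cur.fetchall()}
    usable = TESTED_GPU_FAMILIES & available
    if not usable:
        raise RuntimeError(
            "GPU mode requested, but no tested GPU instance family is available "
            f"in this region (available: {sorted(available)})"
        )
    return sorted(usable)[0]
```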
Deployment layout
The app creates three schemas inside the app database.
| Schema | Type | Contents |
|---|---|---|
| core | Versioned | Procedures, wrapper functions, consumer-facing views |
| app_runtime | Non-versioned | Services, service-bound functions, masking policies, feature views |
| app_state | Non-versioned | Encoder registry, training jobs, configuration, ONNX model stage |
core is replaced on upgrade. Services and masking policies live in app_runtime because Snowflake prohibits those object types in versioned schemas. Feature views live in app_runtime by design: they are app-owned rather than created in consumer schemas, which avoids requiring CREATE VIEW privileges on the consumer side. The core.encode() wrapper delegates to the service-bound function in app_runtime.
Information flow
| Flow | Source | Destination | Mechanism |
|---|---|---|---|
| Query-time encoding | Consumer query on masked column | Encoding service | Masking policy calls core.encode(), delegates to app_runtime.encode() (service-bound), calls encoding service via HTTP |
| Training data read | Consumer source table | Training service | SELECT via warehouse, scoped to consumer-specified table |
| Model storage | Training service | Internal stage | ONNX file written to @app_state.encoders via volume mount |
| Model reload | Training service | Encoding service | HTTP POST /reload with bearer token |
| Masking policy attach | protect_column procedure | Consumer column | ALTER via app-owned policy on app-owned view |
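A sketch of the model-reload flow from the table above (training service signalling the encoding service); the internal hostname, port, and environment-variable name for the shared token are assumptions.

```python
# Illustrative sketch: signal the encoding service to reload after publishing a model.
import os

import requests

ENCODING_SERVICE_URL = "http://encoding-service:8080"  # assumed internal SPCS endpoint

def signal_reload() -> None:
    """POST /reload with the shared bearer token used for inter-service auth."""
    token = os.environ["RELOAD_BEARER_TOKEN"]  # assumed variable name
    resp = requests.post(
        f"{ENCODING_SERVICE_URL}/reload",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
```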
Container images
Three containers, all running as non-root users.
| Container | Purpose | GPU |
|---|---|---|
| Encoding service | ONNX Runtime inference | No |
| Training service (CPU) | Autoencoder training, default mode | No |
| Training service (GPU) | Autoencoder training, GPU mode | Yes |
Model lifecycle
Publishing. The training service writes the model to a temporary file, then copies it to the active path. The registry entry is created only after the live file is confirmed on disk.
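A sketch of that publish sequence, assuming the stage is volume-mounted at /models and that registry insertion is handed in as a callback; all names are illustrative.

```python
# Illustrative sketch of publishing: temp file -> copy to active path -> register.
import os
import shutil
import tempfile

MODEL_DIR = "/models"  # assumed mount point of the internal model stage

def publish_model(encoder: str, onnx_bytes: bytes, register) -> str:
    """`register` is an assumed callback that inserts the registry row."""
    active_path = os.path.join(MODEL_DIR, f"{encoder}.onnx")
    fd, tmp_path = tempfile.mkstemp(dir=MODEL_DIR, suffix=".onnx.tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(onnx_bytes)                      # write to a temp file first
        shutil.copyfile(tmp_path, active_path)       # then copy to the active path
    finally:
        os.remove(tmp_path)
    if not os.path.exists(active_path):              # confirm the live file on disk
        raise RuntimeError(f"publish of {encoder} failed: {active_path} is missing")
    register(encoder, active_path)                   # registry entry created last
    return active_path
```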
Archiving. When retraining an existing encoder, the current model is archived as <encoder>_vN.onnx before the new version is published.
Loading. The encoding service loads only active models (files without the archive suffix). On /reload, models whose stage files no longer exist are evicted.
Deletion. Hard delete removes the active model file, all archived versions, the feature view, the feature function, and the registry entry.
Concurrency
One training job runs at a time across all encoders. A database-level slot lock enforces this. The lock is acquired atomically via an UPDATE on a single-row table and released when the job completes, fails, is cancelled, or the training service encounters a startup error.
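A sketch of the slot lock, assuming a single-row table app_state.training_slot with IN_USE and JOB_ID columns (both assumed names); atomicity comes from the conditional UPDATE matching either one row or none.

```python
# Illustrative sketch: acquire/release the single training slot via conditional UPDATE.
def acquire_training_slot(conn, job_id: str) -> bool:
    cur = conn.cursor()
    cur.execute(
        "UPDATE app_state.training_slot "
        "SET in_use = TRUE, job_id = %s "
        "WHERE in_use = FALSE",
        (job_id,),
    )
    return cur.rowcount == 1  # exactly one updated row means the lock was won

def release_training_slot(conn) -> None:
    """Called on completion, failure, cancellation, or training-service startup error."""
    conn.cursor().execute(
        "UPDATE app_state.training_slot SET in_use = FALSE, job_id = NULL"
    )
```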
Performance
| Use case | Approach |
|---|---|
| Ad-hoc queries | Use masking (transparent) |
| Dashboards | Use masking (small result sets) |
| ML training export | Precompute to table, then export |
| Real-time serving | Query protected view directly |
The encoding service processes all rows in a single batched HTTP request (one request with 1,000 rows, not 1,000 separate requests).
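A sketch of what that batched exchange could look like on the service side, using the row-indexed "data" payload shape Snowflake uses for service functions; the endpoint, encoder name, and the imported encode() helper from the earlier sketch are assumptions.

```python
# Illustrative sketch: one request carries every row, one response returns every vector.
import numpy as np
from flask import Flask, jsonify, request

from encoding_service import encode  # the encode() helper sketched earlier (assumed module)

app = Flask(__name__)

@app.post("/encode")
def encode_batch():
    payload = request.get_json()
    rows = payload["data"]                          # [[row_index, f1, f2, ...], ...]
    indices = [r[0] for r in rows]
    features = np.array([r[1:] for r in rows], dtype=np.float32)
    vectors = encode("consumer_encoder", features)  # assumed encoder name
    return jsonify({"data": [[i, v.tolist()] for i, v in zip(indices, vectors)]})
```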