Privacy Boundary

The trust boundary is the consumer’s Snowflake account (or Databricks workspace, or AWS account). All sensitive operations occur inside this boundary. Only compressed, non-invertible latent representations are visible to non-privileged roles within the same environment.

[Figure: Consumer Environment (Trusted). Raw Data flows into the Encoder, which produces a Latent Vector. Raw data, the encoder, and model artifacts stay inside the consumer environment. Only latent vectors cross the role boundary to the Non-Privileged Role, which sees latent vectors and cannot see raw features, column names, or encoder internals.]
Crosses:

  • Latent vectors (to non-privileged roles)
  • Encoder metadata (name, dimensions, status)
  • Billing events (to Snowflake billing system)

Does not cross:

  • Raw feature values
  • Training data
  • Model weights
  • Consumer credentials
  • Any data to the provider

The masking policy checks the caller’s role at query time.

  • Bypass roles (default: ACCOUNTADMIN): see raw feature values
  • All other roles: see latent vectors

Additional bypass roles can be specified when calling protect_column().
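
The role check described above can be illustrated with a small Python model. This is only a sketch of the decision logic, not the actual masking policy: the function name visible_value, the role names ANALYST and DATA_STEWARD, and the sample values are all hypothetical, and role names are upper-cased here purely as a simplifying assumption.

```python
# Hypothetical model of the query-time role check; not the real policy code.
DEFAULT_BYPASS_ROLES = frozenset({"ACCOUNTADMIN"})


def visible_value(role, raw, latent, bypass_roles=DEFAULT_BYPASS_ROLES):
    """Return what a caller holding `role` would see for one protected row."""
    # Assumption: compare role names case-insensitively by upper-casing.
    if role.upper() in bypass_roles:
        return raw      # bypass roles see raw feature values
    return latent       # every other role sees the latent vector


# Illustrative row: raw features and a made-up latent encoding.
row_raw = {"price": 450000, "sqft": 1800, "bedrooms": 3}
row_latent = [0.12, -0.87, 0.44]

admin_view = visible_value("ACCOUNTADMIN", row_raw, row_latent)
analyst_view = visible_value("ANALYST", row_raw, row_latent)

# Adding a bypass role, analogous to passing extra roles to protect_column().
extended = DEFAULT_BYPASS_ROLES | {"DATA_STEWARD"}
steward_view = visible_value("DATA_STEWARD", row_raw, row_latent, extended)

print(admin_view)
print(analyst_view)
print(steward_view)
```

The check runs per query, so the same stored row yields different results depending on the role the caller holds when the query executes.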

Feature views hide individual column names behind a single features array. Queries reference features, not price, sqft, bedrooms. This prevents metadata leakage about which columns are being encoded.
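
As a rough illustration of the idea, the sketch below collapses named columns into a single positional features array. The helper to_feature_view, the id key column, and the sample rows are assumptions made for this example and are not part of the product's API.

```python
# Hypothetical sketch: pack named feature columns into one `features` array.
def to_feature_view(rows, feature_columns, key_column="id"):
    """Return rows that expose only a key and a positional features array."""
    return [
        {key_column: row[key_column],
         "features": [row[col] for col in feature_columns]}
        for row in rows
    ]


source_rows = [
    {"id": 1, "price": 450000, "sqft": 1800, "bedrooms": 3},
    {"id": 2, "price": 320000, "sqft": 1200, "bedrooms": 2},
]

view_rows = to_feature_view(source_rows, ["price", "sqft", "bedrooms"])

for row in view_rows:
    # Only `id` and `features` are visible; the original column names are gone.
    print(sorted(row.keys()), row["features"])
```

A consumer of view_rows can index into features by position but has no way to learn from the view itself that position 0 was price.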

On Snowflake, EXPLAIN output is automatically redacted for secure views.

Non-invertible does not mean non-informative. Latent vectors that are useful for ML contain task-relevant information. If sensitive attributes correlate with the prediction task, partial information about those attributes is present in the encoding.
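
A toy example makes this concrete. The sketch below uses a deliberately simple many-to-one encoder (a sum of two features) on synthetic data; it is not the product's encoder, and the data, the names toy_encoder and pearson, and the effect size are all invented for illustration. The inputs cannot be recovered from the output, yet the output still correlates with a sensitive attribute that influences one of the inputs.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def toy_encoder(x1, x2):
    # Many-to-one map: many different (x1, x2) pairs share the same sum,
    # so the inputs cannot be reconstructed from the output.
    return x1 + x2


rng = random.Random(0)  # fixed seed so the run is repeatable
sensitive = [i % 2 for i in range(2000)]  # synthetic binary attribute

latents = []
for s in sensitive:
    x1 = 3.0 * s + rng.gauss(0.0, 1.0)  # feature influenced by the attribute
    x2 = rng.gauss(0.0, 1.0)            # unrelated feature
    latents.append(toy_encoder(x1, x2))

leak = pearson(latents, sensitive)
print(f"correlation between latent and sensitive attribute: {leak:.2f}")
```

In this synthetic setup the correlation is clearly positive, which is the point: non-invertibility limits reconstruction of raw values, not statistical inference about correlated attributes.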

Access control, not wire-level enforcement. On Snowflake, the boundary is enforced through role-based access control (RBAC), so privileged insiders, such as holders of a bypass role, can still access raw data. Defense in depth therefore requires strict role management, monitoring, and audit logging.