Privacy Boundary
Trust boundary
The trust boundary is the consumer’s Snowflake account (or Databricks workspace, or AWS account). All sensitive operations occur inside this boundary. Only compressed, non-invertible latent representations are visible to non-privileged roles within the same environment.
What crosses the boundary
| Crosses | Does not cross |
|---|---|
| Latent vectors (to non-privileged roles) | Raw feature values |
| Encoder metadata (name, dimensions, status) | Training data |
| Billing events (to Snowflake billing system) | Model weights |
| | Consumer credentials |
| | Any data to the provider |
Role-based enforcement
The masking policy checks the caller’s role at query time.
- Bypass roles (default: ACCOUNTADMIN): see raw feature values
- All other roles: see latent vectors
Additional bypass roles can be specified when calling protect_column().
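The role check can be pictured with a small Python sketch. This is not the actual masking policy, which runs inside the warehouse. `masked_value`, `DEFAULT_BYPASS_ROLES`, and the `extra_bypass_roles` parameter are hypothetical names chosen for illustration, and exact string comparison of role names is an assumption.

```python
from typing import Iterable, Sequence

# Default bypass role taken from the text above; everything else here is illustrative.
DEFAULT_BYPASS_ROLES = frozenset({"ACCOUNTADMIN"})


def masked_value(
    current_role: str,
    raw_features: Sequence[float],
    latent_vector: Sequence[float],
    extra_bypass_roles: Iterable[str] = (),
) -> Sequence[float]:
    """Return what a caller with `current_role` would see for one row."""
    # Assumption: additional bypass roles simply extend the default set.
    bypass = DEFAULT_BYPASS_ROLES | set(extra_bypass_roles)
    if current_role in bypass:
        return raw_features   # bypass roles see raw feature values
    return latent_vector      # every other role sees the latent vector
```

Under these assumptions, `masked_value("ANALYST", raw, latent)` returns `latent`, while the same call with `extra_bypass_roles=["ANALYST"]` returns `raw`.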
Schema protection
Feature views hide individual column names behind a single features array. Queries reference features rather than individual columns such as price, sqft, or bedrooms. This prevents metadata leakage about which columns are being encoded.
On Snowflake, EXPLAIN output is automatically redacted for secure views.
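As a rough illustration of the idea, the sketch below packs named columns into one positional `features` array per row, so that nothing downstream sees the original column names. The helper `to_feature_view` and the sample values are made up for this example. The real feature view is defined in the warehouse, not in Python.

```python
from typing import Dict, List, Sequence


def to_feature_view(
    rows: Sequence[Dict[str, float]],
    feature_columns: Sequence[str],
) -> List[Dict[str, List[float]]]:
    """Pack the named columns of each row into a single `features` array."""
    # Column order is fixed by `feature_columns`; the names themselves are dropped.
    return [{"features": [row[col] for col in feature_columns]} for row in rows]


# Illustrative row with made-up values.
rows = [{"price": 350000.0, "sqft": 1400.0, "bedrooms": 3.0}]
view = to_feature_view(rows, ["price", "sqft", "bedrooms"])

# Each output row exposes only the `features` key.
print(view)
```

A consumer of `view` can index into `features` by position but cannot learn from the schema that the inputs were price, square footage, and bedroom count.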
Limitations
Non-invertible does not mean non-informative. Latent vectors that are useful for ML contain task-relevant information. If sensitive attributes correlate with the prediction task, partial information about those attributes is present in the encoding.
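The point can be made concrete with a toy experiment on synthetic data. A fixed linear projection from two features down to one number stands in for the encoder; it is many-to-one, so the inputs cannot be recovered exactly, yet the output still correlates with a sensitive attribute that influences one of the features. The projection weights, the data-generating process, and all names below are assumptions for illustration, not the product's encoder.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def toy_encoder(x1, x2):
    """Many-to-one projection (2 features -> 1 latent), hence non-invertible."""
    return 0.6 * x1 + 0.8 * x2


rng = random.Random(0)  # fixed seed so the run is repeatable
sensitive, latents = [], []
for _ in range(2000):
    s = 1.0 if rng.random() < 0.5 else 0.0   # synthetic sensitive attribute
    x1 = 2.0 * s + rng.gauss(0.0, 0.5)       # feature influenced by the attribute
    x2 = rng.gauss(0.0, 1.0)                 # feature independent of the attribute
    sensitive.append(s)
    latents.append(toy_encoder(x1, x2))

corr = pearson(latents, sensitive)
print(f"correlation between latent and sensitive attribute: {corr:.2f}")
```

In this setup the correlation is clearly above zero, so anyone who can read the latent values learns something about the sensitive attribute even though the original features are not recoverable.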
Access control, not wire-level enforcement. On Snowflake, protection is enforced through RBAC rather than at the wire level, so privileged insiders can still access raw data. Defense in depth requires strict role management, monitoring, and audit logging.