Abstract
Protecting sensitive data is an essential part of security in cloud
computing. However, only specific privileged individuals are permitted to view
or interact with this data, so relying on those same individuals to maintain
the software does not scale. One solution is to let non-privileged individuals
maintain these systems while masking sensitive information before it egresses.
To this end, we have created a machine-learning model that predicts and redacts
fields containing sensitive data. This work concentrates on Azure PowerShell
and shows how the approach applies to other command-line interfaces and APIs.
Using the F5-score as a weighted metric, we demonstrate different
transformation techniques that map this problem from an unknown field to the
well-researched area of natural language processing.
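As an illustration of the weighted metric named in the abstract, the sketch below computes an F-beta score with beta = 5, assuming that "F5-score" refers to the standard F-beta measure. With beta = 5, recall counts far more than precision, so a missed sensitive field costs more than an unnecessary redaction. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 means 'sensitive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # sensitive and redacted
    fp = sum(1 for t, p in pairs if not t and p)    # harmless but redacted
    fn = sum(1 for t, p in pairs if t and not p)    # sensitive but leaked
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_beta(precision, recall, beta=5.0):
    """Standard F-beta: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical toy labels: 1 = field holds sensitive data, 0 = it does not.
    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 1, 0]
    p, r = precision_recall(y_true, y_pred)
    print(f"precision={p:.2f} recall={r:.2f} F5={f_beta(p, r):.2f}")
```

Under this definition, a model with perfect recall and precision 0.5 scores much higher than one with perfect precision and recall 0.5, which matches the goal of preventing sensitive data from leaving the system.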
External Datasets
A manually labeled data set containing over 60,000 entries derived from 1,420 commands.