We are pleased to announce our partnership with Truffle Security and the integration of TruffleHog's powerful secret scanning features into our platform, as part of our continuing work to improve security.
TruffleHog is an open source tool for detecting and verifying leaks of confidential information in code. It has a wide range of detectors across multiple popular SaaS and cloud service providers that scan files and code repositories for sensitive information such as credentials, tokens, and encryption keys.
Accidentally committing sensitive information to code repositories can cause serious problems. By scanning repositories for secrets, TruffleHog helps developers catch and remove this information before it causes trouble, protecting data and preventing costly security incidents.
To combat the risk of confidential information leakage in public and private code repositories, we've partnered with the TruffleHog team on two initiatives: augmenting our automated scanning process with TruffleHog, and creating a native Hugging Face scanner in TruffleHog.
Enhancing Our Automated Scanning Process with TruffleHog
At Hugging Face, we are committed to protecting our users' sensitive information. As a result, we've expanded our automated scanning process to include TruffleHog.
On each push to a repository, we run the trufflehog filesystem command, which scans for potential secrets. If a verified secret is detected, we notify the user by email so that they can take corrective action.
Verified secrets are those that have been confirmed to work for authenticating against their corresponding provider. Note that unverified secrets are not necessarily harmless or invalid: verification may fail for technical reasons, such as provider downtime.
It remains valuable to run trufflehog yourself on your repositories even though we also do it for you. For example, you may have rotated leaked keys and want to make sure they now show up as "unverified", or you may want to manually check whether unverified secrets still pose a threat.
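As a rough sketch of such a manual check, the snippet below tallies verified and unverified findings from a local scan. It assumes trufflehog is installed, that the --json flag emits one JSON object per finding with a "Verified" boolean field (as recent releases appear to do), and that ./my-repo is a placeholder for your local clone. count_findings is a helper name made up for this example, not part of TruffleHog.

```shell
# count_findings: hypothetical helper that reads TruffleHog JSON lines on
# stdin and prints "verified=<n> unverified=<m>".
count_findings() {
  awk '
    /"Verified":[ ]*true/  { v++ }
    /"Verified":[ ]*false/ { u++ }
    END { printf "verified=%d unverified=%d\n", v, u }
  '
}

# Placeholder path to a local clone (assumption); override with REPO_DIR.
REPO_DIR="${REPO_DIR:-./my-repo}"

# Only scan when the tool and the directory are actually present.
# Log lines are sent to stderr, so they are discarded here.
if command -v trufflehog >/dev/null 2>&1 && [ -d "$REPO_DIR" ]; then
  trufflehog filesystem "$REPO_DIR" --json 2>/dev/null | count_findings
fi
```

A rotated key should then be counted under unverified rather than verified, although an unverified result can also come from a failed verification attempt, as noted above.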
We will eventually migrate to the trufflehog huggingface command, the native Hugging Face scanner, once support for LFS lands.
TruffleHog Native Hugging Face Scanner
The goal of creating a native Hugging Face scanner is to proactively help our users (and the security teams that protect them) scan their own account data for leaked secrets.
TruffleHog's new open source Hugging Face integration scans models, datasets and Spaces, as well as any associated PRs or Discussions.
The only limitation is that TruffleHog does not currently scan files stored in LFS. Their team is working to resolve this so that all git sources will be supported soon.
To use TruffleHog to scan for secrets in your own or your organization's Hugging Face models, datasets, and Spaces, run one of the following commands:
# For your user
trufflehog huggingface --user <username>
# For your organization
trufflehog huggingface --org <orgname>
# Or both
trufflehog huggingface --user <username> --org <orgname>
You can add the optional --include-discussions and --include-prs flags to also scan Hugging Face discussion and PR comments.
If you want to scan just one model, dataset, or Space, TruffleHog has a specific flag for each.
# Scan one model
trufflehog huggingface --model <model_id>
# Scan one dataset
trufflehog huggingface --dataset <dataset_id>
# Scan one Space
trufflehog huggingface --space <space_id>
If you need to pass in an authentication token, you can use the --token flag or set the HUGGINGFACE_TOKEN environment variable.
Here is an example of TruffleHog's output when run on mcpotato/42-eicar-street:
trufflehog huggingface --model mcpotato/42-eicar-street
🐷🔑🐷 TruffleHog. Unearth your secrets. 🐷🔑🐷
2024-09-02T16:39:30+02:00 info-0 trufflehog running source {"source_manager_worker_id": "3KRwu", "with_units": false, "target_count": 0, "source_manager_units_configurable": true}
2024-09-02T16:39:30+02:00 info-0 trufflehog Completed enumeration {"num_models": 1, "num_spaces": 0, "num_datasets": 0}
2024-09-02T16:39:32+02:00 info-0 trufflehog scanning repo {"source_manager_worker_id": "3KRwu", "model": "/mcpotato/", "repo": "/mcpotato/"}
Found unverified result 🐷🔑❓
Detector Type: HuggingFace
Decoder Type: PLAIN
Raw result: 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
Commit: 9cb322a7c2b4ec7c9f18045f0fa05015b831f256
Email: Luc Georges <@>
File: token_leak.yml
Line: 1
Link: /mcpotato/42-eicar-street/blob/9cb322a7c2b4ec7c9f18045f0fa05015b831f256/token_leak.yml#L1
Repository: /mcpotato/
Resource_type: model
Timestamp: 2024-06-17 13:11:50 +0000
2024-09-02T16:39:32+02:00 info-0 trufflehog finished scanning {"chunks": 19, "bytes": 2933, "verified_secrets": 0, "unverified_secrets": 1, "scan_duration": "2.176551292s", "trufflehog_version": "3.81.10"}
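For scripting, the counts in the final "finished scanning" line can be pulled out with standard tools. The sketch below is an illustration only: summary_count is a hypothetical helper, and it assumes the summary line keeps the key/value layout shown above, which may change between TruffleHog versions.

```shell
# summary_count KEY: hypothetical helper that reads log text on stdin and
# prints the integer stored under KEY in the "finished scanning" line.
summary_count() {
  sed -n "/finished scanning/s/.*\"$1\": *\([0-9][0-9]*\).*/\1/p"
}

# The summary line from the run above, used here as sample input.
line='2024-09-02T16:39:32+02:00 info-0 trufflehog finished scanning {"chunks": 19, "bytes": 2933, "verified_secrets": 0, "unverified_secrets": 1, "scan_duration": "2.176551292s", "trufflehog_version": "3.81.10"}'

verified=$(printf '%s\n' "$line" | summary_count verified_secrets)
unverified=$(printf '%s\n' "$line" | summary_count unverified_secrets)

# prints: verified=0 unverified=1
echo "verified=$verified unverified=$unverified"
```

A script could then, for instance, treat a non-zero verified count as a failure while only logging unverified results for manual review.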
Hats off to the TruffleHog team for providing such a great tool to make our community safer! Stay tuned for more features as we continue to work together to make the Hugging Face Hub more secure for everyone.
Original in English: /blog/trufflesecurity-partnership
Original author: Luc Georges
Translator: smartisan