Agentic Security is a fuzz testing and security detection tool designed for LLM models, which can help researchers perform comprehensive security analysis and testing on any LLM.
Please note that Agentic Security is designed as a security scanning tool, not a foolproof solution. It cannot guarantee complete protection against all potential threats.
1. Customizable rule sets;
2. Proxy-based testing;
3. Comprehensive fuzz testing for any LLM;
4. LLM API integration and stress testing;
5. Integrates various fuzz testing and security detection techniques;
Components
fastapi
httpx
uvicorn
tqdm
httpx
cache_to_disk
Datasets
loguru
pandas
Since this tool is developed based on Python 3, we first need to install and configure the latest version of the Python 3 environment on our local device.
Source Code Installation
Researchers can directly use the following command to clone the project source code to their local machine:
git clone https://github.com/msoedov/agentic_security.git
Then switch to the project directory and use the pip3 command along with the provided requirements.txt to install the necessary dependencies:
cd agentic_security
pip3 install -r requirements
Pip Installation
pip install agentic_security
agentic_security
2024-04-13 13:21:31.157 | INFO | agentic_security.probe_data.data:load_local_csv:273 - Found 1 CSV files
2024-04-13 13:21:31.157 | INFO | agentic_security.probe_data.data:load_local_csv:274 - CSV files: ['prompts.csv']
INFO: Started server process [18524]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8718 (Press CTRL+C to quit)
python -m agentic_security
# or
agentic_security --help
agentic_security --port=PORT --host=HOST
LLM Command Parameters
Agentic Security uses plaintext HTTP parameters, such as:
POST https://api.openai.com/v1/chat/completions
Authorization: Bearer sk-xxxxxxxxx
Content-Type: application/json
{
"model": "gpt-3.5-turbo",
"messages": [{"role": "user", "content": "<<PROMPT>>"}],
"temperature": 0.7
}
During the scan, the <<PROMPT>> will be replaced with the actual attack medium, and the inserted Bearer XXXXX needs to include the header value of your application credentials.
Adding Your Own Datasets
To add your own datasets, you can place one or more CSV files with columns that will be loaded at the start of the prompt.
agentic_security
2024-04-13 13:21:31.157 | INFO | agentic_security.probe_data.data:load_local_csv:273 - Found 1 CSV files
2024-04-13 13:21:31.157 | INFO | agentic_security.probe_data.data:load_local_csv:274 - CSV files: ['prompts.csv']
Run as CI Check
ci.py
from agentic_security import AgenticSecurity
spec = """
POST http://0.0.0.0:8718/v1/self-probe
Authorization: Bearer XXXXX
Content-Type: application/json
{
"prompt": "<<PROMPT>>"
}
"""
result = AgenticSecurity.scan(llmSpec=spec)
# module: failure rate
# {"Local CSV": 79.65116279069767, "llm-adaptive-attacks": 20.0}
exit(max(r.values()) > 20)
python ci.py
2024-04-27 17:15:13.545 | INFO | agentic_security.probe_data.data:load_local_csv:279 - Found 1 CSV files
2024-04-27 17:15:13.545 | INFO | agentic_security.probe_data.data:load_local_csv:280 - CSV files: ['prompts.csv']
0it [00:00, ?it/s][INFO] 2024-04-27 17:15:13.74 | data:prepare_prompts:195 | Loading Custom CSV
[INFO] 2024-04-27 17:15:13.74 | fuzzer:perform_scan:53 | Scanning Local CSV 15
18it [00:00, 176.88it/s]
+-----------+--------------+--------+
| Module | Failure Rate | Status |
+-----------+--------------+--------+
| Local CSV | 80.0% | ✘ |
+-----------+--------------+--------+
Extending Dataset Collection
Add new metadata to agentic_security.probe_data.REGISTRY
{
"dataset_name": "markush1/LLM-Jailbreak-Classifier",
"num_prompts": 1119,
"tokens": 19758,
"approx_cost": 0.0,
"source": "Hugging Face Datasets",
"selected": True,
"dynamic": False,
"url": "https://huggingface.co/markush1/LLM-Jailbreak-Classifier",
},
And implement the loader:
@dataclass
class ProbeDataset:
dataset_name: str
metadata: dict
prompts: list[str]
tokens: int
approx_cost: float
def metadata_summary(self):
return {
"dataset_name": self.dataset_name,
"num_prompts": len(self.prompts),
"tokens": self.tokens,
"approx_cost": self.approx_cost,
}
Dynamic Datasets with Mutations
One current example uses sampling from existing preloaded prompt data and applies mutations to generate a new dataset.
class Stenography:
fn_library = {
"rot5": stenography_fn.rot5,
"rot13": stenography_fn.rot13,
"base64": stenography_fn.base64_encode,
"mirror": stenography_fn.mirror_words,
"random_case": stenography_fn.randomize_letter_case,
"scramble_words": stenography_fn.scramble_words,
"noise_characters": stenography_fn.insert_noise_characters,
"substitute_with_ascii": stenography_fn.substitute_with_ascii,
"remove_vowels": stenography_fn.remove_vowels,
"zigzag_obfuscation": stenography_fn.zigzag_obfuscation,
}
def __init__(self, prompt_groups: [ProbeDataset]):
self.prompt_groups = prompt_groups
def apply(self):
for prompt_group in self.prompt_groups:
size = len(prompt_group.prompts)
for name, fn in self.fn_library.items():
logger.info(f"Applying {name} to {prompt_group.dataset_name}")
selected_prompts = random.sample(prompt_group.prompts, int(size * 0.25))
prompts = [
f"Follow the instructions in {name}: {fn(p)}"
for p in selected_prompts
]
yield ProbeDataset(
dataset_name=f"stenography.{name}({prompt_group.dataset_name})",
metadata={},
prompts=prompts,
tokens=count_words_in_list(prompts),
approx_cost=0.0,
)
The development and release of this project follow the Apache-2.0 open source license agreement.
Agentic Security:
https://github.com/msoedov/agentic_security