Building a Network Intrusion Detection System (IDS) in Python

qiguaw 2024-09-11 06:22:34

If you repost this article, please link back to the original.

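1. Project Overview

This tutorial walks through building a simple network intrusion detection system step by step. We will use Python together with a few common libraries, such as scikit-learn and pandas. The end result is a model that can detect anomalous behavior in network traffic.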

2. Environment Setup

First, install the following libraries:

  • pandas
  • numpy
  • scikit-learn

You can install them with this command:

pip install pandas numpy scikit-learn
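
After installing, a quick check like the following can confirm that the three libraries import correctly and show which versions are in use. This is only a convenience sketch; the `versions` dictionary is a name introduced here for illustration.

```python
import numpy
import pandas
import sklearn

# Collect the installed version of each library the tutorial relies on.
versions = {
    "pandas": pandas.__version__,
    "numpy": numpy.__version__,
    "scikit-learn": sklearn.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```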

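3. Data Preparation

We will use the KDD Cup 1999 dataset, a widely used benchmark for network intrusion detection. You can download it from the UCI Machine Learning Repository. The code below reads the file kddcup.data_10_percent_corrected from the working directory. The file has no header row, so the 41 feature names and the label column are supplied by hand.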

import pandas as pd

# Load the dataset (no header row in the file, so column names are passed explicitly)
column_names = ["duration","protocol_type","service","flag","src_bytes","dst_bytes","land","wrong_fragment","urgent","hot","num_failed_logins","logged_in","num_compromised","root_shell","su_attempted","num_root","num_file_creations","num_shells","num_access_files","num_outbound_cmds","is_host_login","is_guest_login","count","srv_count","serror_rate","srv_serror_rate","rerror_rate","srv_rerror_rate","same_srv_rate","diff_srv_rate","srv_diff_host_rate","dst_host_count","dst_host_srv_count","dst_host_same_srv_rate","dst_host_diff_srv_rate","dst_host_same_src_port_rate","dst_host_srv_diff_host_rate","dst_host_serror_rate","dst_host_srv_serror_rate","dst_host_rerror_rate","dst_host_srv_rerror_rate","label"]
data = pd.read_csv("kddcup.data_10_percent_corrected", header=None, names=column_names)
print(data.head())
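
Before preprocessing, it can help to look at how the labels are distributed. The sketch below uses a tiny hand-made DataFrame as a stand-in for the real file, with made-up values. Labels in the KDD Cup 1999 files end with a trailing period (for example "normal."), which the hypothetical helper `to_binary_label` strips before collapsing every attack type into a single "attack" class. Collapsing to two classes is an optional simplification, not something the tutorial itself does.

```python
import pandas as pd

# Tiny stand-in for the real dataset (values are made up for illustration).
sample = pd.DataFrame({
    "duration": [0, 0, 2, 0, 1, 0],
    "protocol_type": ["tcp", "icmp", "tcp", "icmp", "udp", "tcp"],
    "label": ["normal.", "smurf.", "neptune.", "smurf.", "normal.", "normal."],
})


def to_binary_label(label: str) -> str:
    """Map 'normal.' to 'normal' and every other label to 'attack'."""
    return "normal" if label.rstrip(".") == "normal" else "attack"


# Count how many rows carry each original label.
label_counts = sample["label"].value_counts()

# Collapse the labels to two classes and count again.
sample["binary_label"] = sample["label"].map(to_binary_label)
binary_counts = sample["binary_label"].value_counts()

print(label_counts)
print(binary_counts)
```

On the real data, the same `value_counts()` call on `data["label"]` shows which attack types dominate and which are rare, which matters when reading per-class metrics later.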

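4. Data Preprocessing

The data needs preprocessing before training: we encode the categorical variables as integers and standardize the features.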

from sklearn.preprocessing import LabelEncoder, StandardScaler

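# Encode the categorical variables: each distinct string becomes an integer code
categorical_columns = ["protocol_type", "service", "flag"]
for column in categorical_columns:
    encoder = LabelEncoder()
    data[column] = encoder.fit_transform(data[column])

# Separate the features from the label
X = data.drop("label", axis=1)
y = data["label"]

# Standardize the features to zero mean and unit variance.
# Note: the scaler is fit on the full dataset here, before the train/test split,
# so statistics from the test rows also influence the scaling.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)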
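
scikit-learn documents LabelEncoder as a tool for target labels rather than feature columns, and integer codes impose an arbitrary ordering on categories such as protocol names. One possible alternative, sketched below on a toy DataFrame with made-up values, is to one-hot encode the categorical columns and scale the numeric ones inside a single ColumnTransformer. The names `toy`, `preprocess` and `unseen` are illustrative, and only four of the dataset's columns are included to keep the example short.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy rows using a few of the real column names (values are illustrative).
toy = pd.DataFrame({
    "protocol_type": ["tcp", "udp", "icmp", "tcp"],
    "flag": ["SF", "SF", "REJ", "S0"],
    "src_bytes": [181, 239, 0, 1032],
    "dst_bytes": [5450, 486, 0, 0],
})

categorical = ["protocol_type", "flag"]
numeric = ["src_bytes", "dst_bytes"]

# One-hot encode the categorical columns and standardize the numeric ones.
# handle_unknown="ignore" encodes categories not seen during fit as all zeros.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numeric),
])

features = preprocess.fit_transform(toy)
print(features.shape)  # 3 protocol columns + 3 flag columns + 2 numeric columns

# A row with a protocol value that was not present during fit still transforms.
unseen = pd.DataFrame({
    "protocol_type": ["gre"],
    "flag": ["SF"],
    "src_bytes": [10],
    "dst_bytes": [10],
})
unseen_features = preprocess.transform(unseen)
print(unseen_features.shape)
```

The trade-off is width: a column like service has many distinct values, so one-hot encoding it adds a correspondingly large number of columns.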

5. Building the Model

We will use the random forest algorithm from scikit-learn to build the intrusion detection model.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Split the data: 80% for training, 20% held out for testing
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Build and train a random forest with 100 trees
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
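
Because the scaler in section 4 is fit before the split, the held-out rows contribute to the scaling statistics. A common way to avoid this is to split first and put the scaler and the classifier into one scikit-learn Pipeline, so the scaler only ever sees training rows. The sketch below illustrates the idea on synthetic data from `make_classification`, not on the KDD file; `X_demo`, `y_demo` and `pipeline` are names introduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic two-class data standing in for the real features and labels.
X_demo, y_demo = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)

# Split first; stratify keeps the class proportions similar in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# The pipeline fits the scaler on the training rows only, then the forest.
pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=42),
)
pipeline.fit(X_tr, y_tr)

# score() applies the already-fitted scaler to the test rows before predicting.
test_accuracy = pipeline.score(X_te, y_te)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

Tree-based models are largely insensitive to feature scaling, so for a random forest the leak has little practical effect. The pipeline pattern matters more once you swap in a scale-sensitive model such as logistic regression or k-nearest neighbors.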

6. Evaluating the Model

Evaluate the model's performance on the held-out test data.

from sklearn.metrics import classification_report, accuracy_score

# Predict labels for the test set
y_pred = model.predict(X_test)

# Report overall accuracy and per-class precision, recall, and F1
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
print(classification_report(y_test, y_pred))
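
Overall accuracy can look high even when attacks are missed, so intrusion detection results are often also summarized as a detection rate (the share of attacks caught) and a false alarm rate (the share of normal traffic flagged). The sketch below computes both from a confusion matrix. It uses short hand-written label lists rather than the model's real output, and assumes the labels have already been collapsed to "normal" and "attack".

```python
from sklearn.metrics import confusion_matrix

# Hand-written ground truth and predictions, for illustration only.
y_true = ["normal", "attack", "attack", "normal", "attack", "normal", "normal", "attack"]
y_hat = ["normal", "attack", "normal", "normal", "attack", "attack", "normal", "attack"]

# With labels ordered [negative, positive], ravel() yields tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_hat, labels=["normal", "attack"]).ravel()

# Detection rate: fraction of real attacks that were flagged (recall on "attack").
detection_rate = tp / (tp + fn)

# False alarm rate: fraction of normal traffic wrongly flagged as an attack.
false_alarm_rate = fp / (fp + tn)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"Detection rate: {detection_rate:.2f}")      # 0.75
print(f"False alarm rate: {false_alarm_rate:.2f}")  # 0.25
```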

7. Complete Code

Putting the steps above together into a single Python script:

import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score

# Load the dataset (no header row in the file, so column names are passed explicitly)
column_names = ["duration","protocol_type","service","flag","src_bytes","dst_bytes","land","wrong_fragment","urgent","hot","num_failed_logins","logged_in","num_compromised","root_shell","su_attempted","num_root","num_file_creations","num_shells","num_access_files","num_outbound_cmds","is_host_login","is_guest_login","count","srv_count","serror_rate","srv_serror_rate","rerror_rate","srv_rerror_rate","same_srv_rate","diff_srv_rate","srv_diff_host_rate","dst_host_count","dst_host_srv_count","dst_host_same_srv_rate","dst_host_diff_srv_rate","dst_host_same_src_port_rate","dst_host_srv_diff_host_rate","dst_host_serror_rate","dst_host_srv_serror_rate","dst_host_rerror_rate","dst_host_srv_rerror_rate","label"]
data = pd.read_csv("kddcup.data_10_percent_corrected", header=None, names=column_names)

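# Encode the categorical variables: each distinct string becomes an integer code
categorical_columns = ["protocol_type", "service", "flag"]
for column in categorical_columns:
    encoder = LabelEncoder()
    data[column] = encoder.fit_transform(data[column])

# Separate the features from the label
X = data.drop("label", axis=1)
y = data["label"]

# Standardize the features to zero mean and unit variance
# (fit on the full dataset, before the train/test split)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the data: 80% for training, 20% held out for testing
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Build and train a random forest with 100 trees
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict labels for the test set
y_pred = model.predict(X_test)

# Report overall accuracy and per-class precision, recall, and F1
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
print(classification_report(y_test, y_pred))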

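8. Summary

In this tutorial you learned how to build a simple network intrusion detection system with Python and scikit-learn. From here, you can experiment with different algorithms and parameters to improve detection performance further.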
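
As one way to start comparing algorithms, the sketch below runs 5-fold cross-validation over a few scikit-learn classifiers and reports the mean macro-averaged F1 score for each. It runs on synthetic data from `make_classification`, so the numbers it prints say nothing about the KDD dataset. The choice of candidates and their parameters is arbitrary and only meant to show the pattern.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data standing in for the preprocessed features and labels.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Candidate models to compare (an arbitrary selection for illustration).
candidates = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Score each candidate with 5-fold cross-validation on macro-averaged F1.
mean_scores = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X_demo, y_demo, cv=5, scoring="f1_macro")
    mean_scores[name] = scores.mean()
    print(f"{name}: mean F1 (macro) = {mean_scores[name]:.3f}")
```

Macro-averaged F1 weights every class equally, which makes it a reasonable default when some attack types are much rarer than others.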
