Views: 757 | Replies: 1
lionel0822 (Iron Bug, new member)
[Help] Abstract translation (communications, computer science, machine learning)
The main access-control mechanisms in use today are DAC (Discretionary Access Control), MAC (Mandatory Access Control), and RBAC (Role-Based Access Control). This thesis proposes a new approach: using machine learning algorithms to build a model that configures access permissions automatically. In recent years, researchers in machine learning have paid increasing attention to how the original data set is processed, because if feature engineering can uncover more of the data and features hidden in the original data set, the same learning algorithm will achieve better results. Starting from the original data set, this thesis generates many combinations of new data sets and feature sets, and introduces several machine learning algorithms: logistic regression, gradient boosted decision trees, and random forests. These three algorithms are trained on the data-set/feature-set combinations to produce many classifier models. Finally, building on several of these typical classifiers, the thesis studies some common ensemble learning algorithms and uses two of them to combine the classifiers.

Specifically, the main contributions of this thesis are as follows:

(1) From the original data set, 4 new data sets and 5 new feature sets are generated. Several mathematical transformations are introduced and applied selectively to the data sets and feature sets. In particular, when producing the greedy data set, a greedy forward feature-selection algorithm is used to choose the optimal subset from the large pool of candidates.

(2) Logistic regression, gradient boosted decision trees, and random forests are introduced and trained on different training sets, and 14 typical classifier models are ultimately selected (5 logistic regression, 4 gradient boosted decision tree, and 5 random forest). The AUC (Area Under Curve) scores of the logistic regression models range from 0.9109 to 0.9196, those of the gradient boosted decision tree models from 0.8756 to 0.9079, and those of the random forest models from 0.8782 to 0.9047. The three algorithms are also trained separately on three data sets and each model's performance is compared, which shows that feature engineering is essential even for a single classification model. Logistic regression performs well on training sets containing the greedy data set, while gradient boosted decision trees and random forests perform well on training sets containing the tuples data set; overall, logistic regression does best on some training sets.

(3) On top of the classifiers above, the thesis introduces voting and stacked-generalization ensemble learning and uses them to combine the 14 typical classifier models. The voting ensemble reaches an AUC of 0.9244, 0.0048 higher than the best AUC of the 14 individual classifiers, and the stacked-generalization ensemble reaches 0.9247, an improvement of 0.0051. The experiments show that ensemble learning improves the classification ability of the final model.
至尊木蟲 (noted writer)
At present, the main access-control mechanisms are DAC (Discretionary Access Control), MAC (Mandatory Access Control), and RBAC (Role-Based Access Control). This paper proposes a new method that uses machine learning algorithms to build a model for automatically configuring access permissions. In recent years, more and more researchers in machine learning have focused on processing the original data set, because if feature engineering can mine more of the data and features hidden in the original data set, the same learning algorithms will achieve better results. Many new combinations of data sets and feature sets were created from the original data set, and several machine learning algorithms were introduced, including logistic regression, gradient boosted decision trees, and random forests. Many classification models were produced by applying these three algorithms to the data-set/feature-set combinations. Finally, several commonly used ensemble learning algorithms were studied on top of these typical classification models, and two of them were used to combine the models.

Specifically, the main contributions of this paper are as follows:

(1) 4 new data sets and 5 new feature sets were generated from the original data set. Several mathematical treatments were introduced and selectively applied to the data sets and feature sets; in particular, when producing the greedy data set, a greedy forward feature-selection algorithm was used to choose the optimal subset from the large pool of candidates.

(2) Logistic regression, gradient boosted decision trees, and random forests were introduced in this study, and 14 typical classification models were selected after training on different training sets (5 logistic regression, 4 gradient boosted decision tree, and 5 random forest). The AUC (Area Under Curve) scores ranged over 0.9109–0.9196 for logistic regression, 0.8756–0.9079 for gradient boosted decision trees, and 0.8782–0.9047 for random forests. The three algorithms were then trained separately on three data sets to compare each model's performance, demonstrating that feature engineering is essential even for a single classification model. Logistic regression performed well on training sets containing the greedy data set, while gradient boosted decision trees and random forests performed better on training sets containing the tuples data set; overall, logistic regression did best on some training sets.

(3) Voting and stacked-generalization ensemble learning algorithms were introduced on top of the classification models above, and the 14 typical classification models were integrated. The voting ensemble reached an AUC of 0.9244, which is 0.0048 higher than the best AUC among the 14 individual classifiers, and the stacked-generalization ensemble reached 0.9247, an improvement of 0.0051. The results show that ensemble learning improved the classification ability of the final model.
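The greedy forward selection step in contribution (1) can be sketched as follows. This is a minimal illustration, assuming a scikit-learn setup: the thesis does not state its scoring model, validation split, or stopping rule, so the logistic-regression scorer, the 70/30 hold-out, and the stop-when-AUC-stops-improving criterion below are all my assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def greedy_forward_select(X, y, max_features=20):
    """Greedily add, one at a time, the feature that most improves held-out AUC."""
    X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
    selected, best_auc = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        scores = []
        for f in remaining:
            cols = selected + [f]
            model = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
            auc = roc_auc_score(y_va, model.predict_proba(X_va[:, cols])[:, 1])
            scores.append((auc, f))
        auc, f = max(scores)          # best candidate this round
        if auc <= best_auc:           # stop once no feature improves the AUC
            break
        best_auc, selected = auc, selected + [f]
        remaining.remove(f)
    return selected, best_auc

if __name__ == "__main__":
    # Synthetic stand-in for the thesis's "large pool of candidate features".
    X, y = make_classification(n_samples=1000, n_features=25, n_informative=8, random_state=0)
    cols, auc = greedy_forward_select(X, y, max_features=10)
    print(f"selected features {cols}, validation AUC {auc:.4f}")
```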
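Contribution (2) trains the three model families and compares their AUC scores on a data-set/feature-set combination. A minimal sketch, again assuming scikit-learn; the hyperparameters and the synthetic data are placeholders, not the thesis's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one data-set/feature-set combination.
X, y = make_classification(n_samples=2000, n_features=30, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "gradient boosted trees": GradientBoostingClassifier(n_estimators=200),
    "random forest": RandomForestClassifier(n_estimators=300),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    # AUC is computed from the positive-class probability, as in the abstract.
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.4f}")
```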
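The two ensembles in contribution (3) map naturally onto scikit-learn's VotingClassifier and StackingClassifier. This too is a hedged sketch: the abstract combines 14 trained models, while the example below combines one model of each family, and both the soft-voting scheme and the logistic-regression meta-learner are assumptions rather than the thesis's stated configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=30, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Three base models stand in for the thesis's 14 typical classifiers.
base = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("gbdt", GradientBoostingClassifier(n_estimators=200)),
    ("rf", RandomForestClassifier(n_estimators=300)),
]
# Soft voting averages the base models' predicted probabilities.
voting = VotingClassifier(estimators=base, voting="soft").fit(X_train, y_train)
# Stacked generalization trains a meta-learner on the base models' outputs.
stacking = StackingClassifier(estimators=base,
                              final_estimator=LogisticRegression(max_iter=1000)).fit(X_train, y_train)

for name, model in [("voting", voting), ("stacked generalization", stacking)]:
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.4f}")
```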