ZeVigilante: Detecting Zero-Day Malware Using Machine Learning and Sandboxing Analysis Techniques. 2022

Fahd Alhaidari, Nouran Abu Shaib, Maram Alsafi, Haneen Alharbi, Majd Alawami, Reem Aljindan, Atta-Ur Rahman, and Rachid Zagrouba
Saudi Aramco Cybersecurity Chair, Dhahran, Saudi Arabia.

Undocumented malicious software, known as Zero-Day malware, is growing rapidly and can have a severe impact, so specialized practices have been combined to build systems capable of detecting it and averting potentially disastrous consequences. By design, Zero-Day malware uses distinct evasion tactics to remain stealthy. Hence, there is a need for advanced investigation of methods that can identify this kind of malware. Machine learning (ML) is among the promising techniques for such predictions, while a sandbox provides a safe environment for such experiments. After a thorough literature review, carefully chosen ML techniques are proposed for malware detection under the Cuckoo sandboxing (CS) environment. The proposed system, named Zero-Day Vigilante (ZeVigilante), detects malware using both static and dynamic analyses. We used adequate datasets for both analyses, incorporating sufficient samples, in contrast to other studies. The processed datasets are then used to train and test several ML classifiers, including Random Forest (RF), Neural Networks (NN), Decision Tree (DT), k-Nearest Neighbor (kNN), Naïve Bayes (NB), and Support Vector Machine (SVM). RF achieved the best accuracy for both static and dynamic analyses, 98.21% and 98.92%, respectively.
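The train-and-test evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only kNN (one of the listed classifiers), written from scratch with the Python standard library, and it runs on synthetic two-feature samples rather than features extracted from real binaries or Cuckoo reports. The names make_synthetic_samples, knn_predict, and accuracy, the 80/20 split, and k=5 are all assumptions chosen for illustration.

```python
import random
from collections import Counter
from math import dist


def make_synthetic_samples(n_per_class, seed=0):
    """Build labeled points as a hypothetical stand-in for extracted features.

    Label 0 plays the role of "benign", label 1 of "malicious". The two
    Gaussian clusters are illustrative only and carry no real-world meaning.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_per_class):
        samples.append(([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)], 0))
        samples.append(([rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)], 1))
    rng.shuffle(samples)
    return samples


def knn_predict(train, features, k=5):
    """Return the majority label among the k training points nearest to features."""
    nearest = sorted(train, key=lambda sample: dist(sample[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Assumed 80/20 train/test split; the abstract does not state the split used.
samples = make_synthetic_samples(100)
split = int(0.8 * len(samples))
train, test = samples[:split], samples[split:]

acc = accuracy(train, test)
print(f"kNN accuracy on synthetic data: {acc:.2%}")
```

The same accuracy function could be reused to compare other classifiers on identical train and test sets, which is the kind of comparison the abstract reports for RF, NN, DT, kNN, NB, and SVM.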

UI | MeSH Term | Description | Entries
D000069550 | Machine Learning | A type of ARTIFICIAL INTELLIGENCE that enable COMPUTERS to independently initiate and execute LEARNING when exposed to new data. | Transfer Learning; Learning, Machine; Learning, Transfer
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D001499 | Bayes Theorem | A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result. | Bayesian Analysis; Bayesian Estimation; Bayesian Forecast; Bayesian Method; Bayesian Prediction; Analysis, Bayesian; Bayesian Approach; Approach, Bayesian; Approachs, Bayesian; Bayesian Approachs; Estimation, Bayesian; Forecast, Bayesian; Method, Bayesian; Prediction, Bayesian; Theorem, Bayes
D012984 | Software | Sequential operating programs and data which instruct the functioning of a digital computer. | Computer Programs; Computer Software; Open Source Software; Software Engineering; Software Tools; Computer Applications Software; Computer Programs and Programming; Computer Software Applications; Application, Computer Software; Applications Software, Computer; Applications Softwares, Computer; Applications, Computer Software; Computer Applications Softwares; Computer Program; Computer Software Application; Engineering, Software; Open Source Softwares; Program, Computer; Programs, Computer; Software Application, Computer; Software Applications, Computer; Software Tool; Software, Computer; Software, Computer Applications; Software, Open Source; Softwares, Computer Applications; Softwares, Open Source; Source Software, Open; Source Softwares, Open; Tool, Software; Tools, Software
D016571 | Neural Networks, Computer | A computer architecture, implementable in either hardware or software, modeled after biological neural networks. Like the biological system in which the processing capability is a result of the interconnection strengths between arrays of nonlinear processing nodes, computerized neural networks, often called perceptrons or multilayer connectionist models, consist of neuron-like units. A homogeneous group of units makes up a layer. These networks are good at pattern recognition. They are adaptive, performing tasks by example, and thus are better for decision-making than are linear learning machines or cluster analysis. They do not require explicit programming. | Computational Neural Networks; Connectionist Models; Models, Neural Network; Neural Network Models; Neural Networks (Computer); Perceptrons; Computational Neural Network; Computer Neural Network; Computer Neural Networks; Connectionist Model; Model, Connectionist; Model, Neural Network; Models, Connectionist; Network Model, Neural; Network Models, Neural; Network, Computational Neural; Network, Computer Neural; Network, Neural (Computer); Networks, Computational Neural; Networks, Computer Neural; Networks, Neural (Computer); Neural Network (Computer); Neural Network Model; Neural Network, Computational; Neural Network, Computer; Neural Networks, Computational; Perceptron
D060388 | Support Vector Machine | SUPERVISED MACHINE LEARNING algorithm which learns to assign labels to objects from a set of training examples. Examples are learning to recognize fraudulent credit card activity by examining hundreds or thousands of fraudulent and non-fraudulent credit card activity, or learning to make disease diagnosis or prognosis based on automatic classification of microarray gene expression profiles drawn from hundreds or thousands of samples. | Support Vector Network; Machine, Support Vector; Machines, Support Vector; Network, Support Vector; Networks, Support Vector; Support Vector Machines; Support Vector Networks; Vector Machine, Support; Vector Machines, Support; Vector Network, Support; Vector Networks, Support

Related Publications

Fahd Alhaidari, Nouran Abu Shaib, Maram Alsafi, Haneen Alharbi, Majd Alawami, Reem Aljindan, Atta-Ur Rahman, and Rachid Zagrouba
April 2022, Sensors (Basel, Switzerland)